Prominent applications of sentiment analysis are countless, covering areas such as marketing, customer service and communication. The conventional bag-of-words approach for measuring sentiment merely counts term frequencies; however, it neglects the position of the terms within the discourse. As a remedy, we develop a discourse-aware method that builds upon the discourse structure of documents. For this purpose, we utilize rhetorical structure theory to label (sub-)clauses according to their hierarchical relationships and then assign polarity scores to individual leaves. To learn from the resulting rhetorical structure, we propose a tensor-based, tree-structured deep neural network (named Discourse-LSTM) in order to process the complete discourse tree. The underlying tensors infer the salient passages of narrative materials. In addition, we suggest two algorithms for data augmentation (node reordering and artificial leaf insertion) that increase our training set and reduce overfitting. Our benchmarks demonstrate the superior performance of our approach. Moreover, our tensor structure reveals the salient text passages and thereby provides explanatory insights.

Click to Read Paper
Reinforcement learning refers to a group of methods from artificial intelligence where an agent performs learning through trial and error. It differs from supervised learning, since reinforcement learning requires no explicit labels; instead, the agent interacts continuously with its environment. That is, the agent starts in a specific state and then performs an action, based on which it transitions to a new state and, depending on the outcome, receives a reward. Different strategies (e.g. Q-learning) have been proposed to maximize the overall reward, resulting in a so-called policy, which defines the best possible action in each state. Mathematically, this process can be formalized by a Markov decision process and it has been implemented by packages in R; however, there is currently no package available for reinforcement learning. As a remedy, this paper demonstrates how to perform reinforcement learning in R and, for this purpose, introduces the ReinforcementLearning package. The package provides a remarkably flexible framework and is easily applied to a wide range of different problems. We demonstrate its use by drawing upon common examples from the literature (e.g. finding optimal game strategies).

Click to Read Paper
State-of-the-art systems in deep question answering proceed as follows: (1) an initial document retrieval selects relevant documents, which (2) are then processed by a neural network in order to extract the final answer. Yet the exact interplay between both components is poorly understood, especially concerning the number of candidate documents that should be retrieved. We show that choosing a static number of documents -- as used in prior research -- suffers from a noise-information trade-off and yields suboptimal results. As a remedy, we propose an adaptive document retrieval model. This learns the optimal candidate number for document retrieval, conditional on the size of the corpus and the query. We report extensive experimental results showing that our adaptive approach outperforms state-of-the-art methods on multiple benchmark datasets, as well as in the context of corpora with variable sizes.

* EMNLP 2018
Click to Read Paper
The marvel of markets lies in the fact that dispersed information is instantaneously processed and used to adjust the price of goods, services and assets. Financial markets are particularly efficient when it comes to processing information; such information is typically embedded in textual news that is then interpreted by investors. Quite recently, researchers have started to automatically determine news sentiment in order to explain stock price movements. Interestingly, this so-called news sentiment works fairly well in explaining stock returns. In this paper, we design trading strategies that utilize textual news in order to obtain profits on the basis of novel information entering the market. We thus propose approaches for automated decision-making based on supervised and reinforcement learning. Altogether, we demonstrate how news-based data can be incorporated into an investment system.

* Feuerriegel, Stefan, and Helmut Prendinger. "News-based trading strategies." Decision Support Systems 90 (2016): 65-74
Click to Read Paper
Decision analytics commonly focuses on the text mining of financial news sources in order to provide managerial decision support and to predict stock market movements. Existing predictive frameworks almost exclusively apply traditional machine learning methods, whereas recent research indicates that traditional machine learning methods are not sufficiently capable of extracting suitable features and capturing the non-linear nature of complex tasks. As a remedy, novel deep learning models aim to overcome this issue by extending traditional neural network models with additional hidden layers. Indeed, deep learning has been shown to outperform traditional methods in terms of predictive performance. In this paper, we adapt the novel deep learning technique to financial decision support. In this instance, we aim to predict the direction of stock movements following financial disclosures. As a result, we show how deep learning can outperform the accuracy of random forests as a benchmark for machine learning by 5.66%.

* Twenty-Fourth European Conference on Information Systems (ECIS 2016), Istanbul, Turkey, 2016
Click to Read Paper
This paper provides a holistic study of how stock prices vary in their response to financial disclosures across different topics. Thereby, we specifically shed light into the extensive amount of filings for which no a priori categorization of their content exists. For this purpose, we utilize an approach from data mining - namely, latent Dirichlet allocation - as a means of topic modeling. This technique facilitates our task of automatically categorizing, ex ante, the content of more than 70,000 regulatory 8-K filings from U.S. companies. We then evaluate the subsequent stock market reaction. Our empirical evidence suggests a considerable discrepancy among various types of news stories in terms of their relevance and impact on financial markets. For instance, we find a statistically significant abnormal return in response to earnings results and credit rating, but also for disclosures regarding business strategy, the health sector, as well as mergers and acquisitions. Our results yield findings that benefit managers, investors and policy-makers by indicating how regulatory filings should be structured and the topics most likely to precede changes in stock valuations.

Click to Read Paper
Traditional information retrieval (such as that offered by web search engines) impedes users with information overload from extensive result pages and the need to manually locate the desired information therein. Conversely, question-answering systems change how humans interact with information systems: users can now ask specific questions and obtain a tailored answer - both conveniently in natural language. Despite obvious benefits, their use is often limited to an academic context, largely because of expensive domain customizations, which means that the performance in domain-specific applications often fails to meet expectations. This paper presents cost-efficient remedies: a selection mechanism increases the precision of document retrieval and a fused approach to transfer learning is proposed in order to improve the performance of answer extraction. Here knowledge is inductively transferred from a related, yet different, tasks to the domain-specific application, while accounting for potential differences in the sample sizes across both tasks. The resulting performance is demonstrated with an actual use case from a finance company, where fewer than 400 question-answer pairs had to be annotated in order to yield significant performance gains. As a direct implication to management, this presents a promising path to better leveraging of knowledge stored in information systems.

* Submitted to ACM TMIS
Click to Read Paper
The macroeconomic climate influences operations with regard to, e.g., raw material prices, financing, supply chain utilization and demand quotas. In order to adapt to the economic environment, decision-makers across the public and private sectors require accurate forecasts of the economic outlook. Existing predictive frameworks base their forecasts primarily on time series analysis, as well as the judgments of experts. As a consequence, current approaches are often biased and prone to error. In order to reduce forecast errors, this paper presents an innovative methodology that extends lag variables with unstructured data in the form of financial news: (1) we apply a variety of models from machine learning to word counts as a high-dimensional input. However, this approach suffers from low interpretability and overfitting, motivating the following remedies. (2) We follow the intuition that the economic climate is driven by general sentiments and suggest a projection of words onto latent semantic structures as a means of feature engineering. (3) We propose a semantic path model, together with estimation technique based on regularization, in order to yield full interpretability of the forecasts. We demonstrate the predictive performance of our approach by utilizing 80,813 ad hoc announcements in order to make long-term forecasts of up to 24 months ahead regarding key macroeconomic indicators. Back-testing reveals a considerable reduction in forecast errors.

Click to Read Paper
Company disclosures greatly aid in the process of financial decision-making; therefore, they are consulted by financial investors and automated traders before exercising ownership in stocks. While humans are usually able to correctly interpret the content, the same is rarely true of computerized decision support systems, which struggle with the complexity and ambiguity of natural language. A possible remedy is represented by deep learning, which overcomes several shortcomings of traditional methods of text mining. For instance, recurrent neural networks, such as long short-term memories, employ hierarchical structures, together with a large number of hidden layers, to automatically extract features from ordered sequences of words and capture highly non-linear relationships such as context-dependent meanings. However, deep learning has only recently started to receive traction, possibly because its performance is largely untested. Hence, this paper studies the use of deep neural networks for financial decision support. We additionally experiment with transfer learning, in which we pre-train the network on a different corpus with a length of 139.1 million words. Our results reveal a higher directional accuracy as compared to traditional machine learning when predicting stock price movements in response to financial disclosures. Our work thereby helps to highlight the business value of deep learning and provides recommendations to practitioners and executives.

Click to Read Paper
Business analytics refers to methods and practices that create value through data for individuals, firms, and organizations. This field is currently experiencing a radical shift due to the advent of deep learning: deep neural networks promise improvements in prediction performance as compared to models from traditional machine learning. However, our research into the existing body of literature reveals a scarcity of research works utilizing deep learning in our discipline. Accordingly, the objectives of this work are as follows: (1) we motivate why researchers and practitioners from business analytics should utilize deep neural networks and review potential use cases, necessary requirements, and benefits. (2) We investigate the added value to operations research in different case studies with real data from entrepreneurial undertakings. All such cases demonstrate a higher prediction performance in comparison to traditional machine learning and thus direct value gains. (3) We provide guidelines and implications for researchers, managers and practitioners in operations research who want to advance their capabilities for business analytics with regard to deep learning. (4) We finally discuss directions for future research in the field of business analytics.

Click to Read Paper
Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their every day lives. It is thus the mission of researchers to understand how humans process information to reach decisions. In order to facilitate this task, this work proposes a novel method of studying the reception of granular expressions in natural language. The approach utilizes LASSO regularization as a statistical tool to extract decisive words from textual content and draw statistical inferences based on the correspondence between the occurrences of words and an exogenous response variable. Accordingly, the method immediately suggests significant implications for social sciences and Information Systems research: everyone can now identify text segments and word choices that are statistically relevant to authors or readers and, based on this knowledge, test hypotheses from behavioral research. We demonstrate the contribution of our method by examining how authors communicate subjective information through narrative materials. This allows us to answer the question of which words to choose when communicating negative information. On the other hand, we show that investors trade not only upon facts in financial disclosures but are distracted by filler words and non-informative language. Practitioners - for example those in the fields of investor communications or marketing - can exploit our insights to enhance their writings based on the true perception of word choice.

Click to Read Paper
Information systems experience an ever-growing volume of unstructured data, particularly in the form of textual materials. This represents a rich source of information from which one can create value for people, organizations and businesses. For instance, recommender systems can benefit from automatically understanding preferences based on user reviews or social media. However, it is difficult for computer programs to correctly infer meaning from narrative content. One major challenge is negations that invert the interpretation of words and sentences. As a remedy, this paper proposes a novel learning strategy to detect negations: we apply reinforcement learning to find a policy that replicates the human perception of negations based on an exogenous response, such as a user rating for reviews. Our method yields several benefits, as it eliminates the former need for expensive and subjective manual labeling in an intermediate stage. Moreover, the inferred policy can be used to derive statistical inferences and implications regarding how humans process and act on negations.

* 39 pages
Click to Read Paper
Learning to predict solutions to real-valued combinatorial graph problems promises efficient approximations. As demonstrated based on the NP-hard edge clique cover number, recurrent neural networks (RNNs) are particularly suited for this task and can even outperform state-of-the-art heuristics. However, the theoretical framework for estimating real-valued RNNs is understood only poorly. As our primary contribution, this is the first work that upper bounds the sample complexity for learning real-valued RNNs. While such derivations have been made earlier for feed-forward and convolutional neural networks, our work presents the first such attempt for recurrent neural networks. Given a single-layer RNN with $a$ rectified linear units and input of length $b$, we show that a population prediction error of $\varepsilon$ can be realized with at most $\tilde{\mathcal{O}}(a^4b/\varepsilon^2)$ samples. We further derive comparable results for multi-layer RNNs. Accordingly, a size-adaptive RNN fed with graphs of at most $n$ vertices can be learned in $\tilde{\mathcal{O}}(n^6/\varepsilon^2)$, i.e., with only a polynomial number of samples. For combinatorial graph problems, this provides a theoretical foundation that renders RNNs competitive.

Click to Read Paper
Emotions widely affect human decision-making. This fact is taken into account by affective computing with the goal of tailoring decision support to the emotional states of individuals. However, the accurate recognition of emotions within narrative documents presents a challenging undertaking due to the complexity and ambiguity of language. Performance improvements can be achieved through deep learning; yet, as demonstrated in this paper, the specific nature of this task requires the customization of recurrent neural networks with regard to bidirectional processing, dropout layers as a means of regularization, and weighted loss functions. In addition, we propose sent2affect, a tailored form of transfer learning for affective computing: here the network is pre-trained for a different task (i.e. sentiment analysis), while the output layer is subsequently tuned to the task of emotion recognition. The resulting performance is evaluated in a holistic setting across 6 benchmark datasets, where we find that both recurrent neural networks and transfer learning consistently outperform traditional machine learning. Altogether, the findings have considerable implications for the use of affective computing.

* Accepted by Decision Support Systems (DSS)
Click to Read Paper