Learning Interpretable Negation Rules via Weak Supervision at Document Level: A Reinforcement Learning Approach

Negation scope detection is widely performed as a supervised learning task that relies upon negation labels at word level. This suffers from two key drawbacks: (1) such granular annotations are costly, and (2) they are highly subjective, since, in the absence of explicit linguistic resolution rules, human annotators often disagree on the perceived negation scopes. To the best of our knowledge, our work presents the first approach that eliminates the need for word-level negation labels, replacing them with document-level sentiment annotations. For this, we present a novel strategy for learning fully interpretable negation rules via weak supervision: we apply reinforcement learning to find a policy that reconstructs negation rules from sentiment predictions at document level. Our experiments demonstrate that our approach can effectively learn negation rules via weak supervision. Furthermore, an out-of-sample evaluation via sentiment analysis reveals consistent improvements (of up to 4.66 %) over both (i) a sentiment analysis with no negation handling and (ii) the use of word-level annotations from humans. Moreover, the inferred negation rules are fully interpretable.


Introduction
Negations are a frequently utilized linguistic tool for expressing disapproval or framing negative content with positive words. Neglecting negations can lead to false attributions (Morante et al., 2008) and, moreover, impair accuracy when analyzing natural language; e. g., in information retrieval (Cruz Díaz et al., 2012; Rokach et al., 2008) and especially in sentiment analysis (Cruz et al., 2015; Wiegand et al., 2010). Hence, even simple heuristics for identifying negation scopes can yield substantial improvements in such cases (Jia et al., 2009).
Negation scope detection is sometimes implemented as unsupervised learning (e. g., Pröllochs et al., 2016), while better performance is commonly achieved via supervised learning (see our supplements for a detailed overview): the resulting models thus learn to identify negation scopes from word-level annotations (e. g., Li and Lu, 2018; Reitan et al., 2015). We argue that this approach suffers from inherent drawbacks. (1) Such granular annotations are costly and, especially at word level, a considerable number of them are needed.
(2) Negation scope detection is highly subjective (Councill et al., 2010). Due to the absence of explicit linguistic resolution rules, existing corpora often come with annotation guidelines (Morante and Blanco, 2012; Morante and Daelemans, 2012). Yet there are considerable differences: some corpora were labeled such that negation scopes consist of a single text span, while others allowed disjoint spans (Fancellu et al., 2017). More importantly, given the absence of universal rules, human annotators largely disagree on which words should be labeled as negated.
Motivational experiment. Since prevalent corpora were labeled only by a single rater, we now establish the severity of between-rater discrepancies. For this, we carried out an initial analysis of 500 sentences from movie reviews. Each sentence contained at least one explicit negation phrase from the list of Jia et al. (2009), such as "not" or "no." Two human raters were then asked to annotate negation scopes. They could choose an arbitrary selection of words and were not restricted to a single subspan, as recommended by Fancellu et al. (2017). The annotations revealed large differences: only 50.20 % of the words were simultaneously labeled as "negated" by both raters. Based on this experimental evidence, we show that there is no universal definition of negation scopes (rather, human annotations are likely to be noisy or even error-prone) and thus highlight the need for further research.
Contributions. To the best of our knowledge, our work presents the first approach that eliminates the need for word-level annotations of negation labels. Instead, we perform negation scope detection merely by utilizing shallow annotations at document level in the form of sentiment labels (e. g., from user reviews). Our novel strategy learns interpretable negation rules via weak supervision: we apply reinforcement learning to find a policy that reconstructs negation rules based on sentiment predictions at document level (as opposed to conventional word-level annotations).
In our approach, a single document d comes with a sentiment label y_d. The document consists of N_d words w_{d,1}, …, w_{d,N_d}, where the number of words can easily surpass several hundred. Based on the sentiment value, we then need to decide (especially out-of-sample) for each of the N_d words whether or not it should be negated. Here, a single sentiment value is outnumbered by potentially hundreds of negation decisions, pinpointing the difficulty of this task. Formally, the goal is to learn individual labels a_{d,i} ∈ {Negated, ¬Negated} for each word w_{d,i}. Rewards are the errors in sentiment prediction at document level.
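To make the shape of this learning problem concrete, the following minimal sketch (all names are hypothetical, not from our implementation) shows that a single scalar label must supervise one binary decision per word:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Document:
    words: List[str]   # w_{d,1}, ..., w_{d,N_d}
    label: float       # document-level sentiment label y_d

# One document-level label, but one negation decision per word.
doc = Document(words="the plot is not convincing at all".split(), label=-0.5)
n_decisions = len(doc.words)  # number of per-word negation decisions to learn
```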
Strengths. Our approach exhibits several favorable features that overcome shortcomings of prior works. Among them, it eliminates the need for manual word-level labels, thereby avoiding the detrimental influence of subjectivity and misinterpretation. Instead, our model is trained solely on a document-level variable and can thus learn domain-specific particularities of the given prose. The inferred negation rules are fully interpretable, while documents can contain multiple instances of negations with arbitrary complexity, sometimes nested or consisting of disjoint text spans. Despite facing several times more negation decisions than sentiment labels, our experiments demonstrate that this problem can be effectively learned through reinforcement learning.
Evaluation. Given the considerable inconsistencies in human annotations of negation scopes and the lack of universal rules, we regard the "true" negation scopes as unobservable. Hence, we later compare the identified negation scopes with those from raters 1 and 2 only as a sensitivity check, since both raters have only 50.2 % overlap. Instead, we choose the following evaluation strategy: we concentrate on the performance of negation scope detection as a supporting tool in natural language processing, where its primary role is to facilitate more complex learning tasks such as sentiment analysis. Therefore, we report the performance improvements in sentiment analysis resulting from our approach. For a fair comparison, we use baselines that rely only upon the same information as our weak supervision (and thus have no access to word-level negation labels). Our performance is even on par with a supervised classifier that can exploit richer labels during training.

Learning Negation Scope Detection via Weak Supervision
Intuition. The choice of reinforcement learning for weak supervision might not be obvious at first but is, in fact, informed by theory: it imitates the human reading process as stipulated by cognitive reading theory (Just and Carpenter, 1980), where readers iteratively process information word-by-word.
States and actions. In each learning iteration, the reinforcement learning agent observes the current state s_i = (w_i, a_{i−1}), which we engineer as the combination of the i-th word w_i in a document and the previous action a_{i−1}. This specification establishes a recurrent architecture whereby the previous negation decision can carry over to the next word. At the same time, this allows for nested negations, as one word can introduce a negation scope and a subsequent negation can potentially revert it.
After observing the current state, the agent chooses an action a_i from two possibilities: (1) it can set the current word to negated or (2) it can mark it as not negated. Hence, we obtain the set of possible actions A = {Negated, ¬Negated}. Based on the selected action, the agent receives a reward r_i, which updates the knowledge in the state-action function Q(s_i, a_i). This state-action function is then used to infer the best possible action a_i in each state s_i, i. e., the optimal policy π*(s_i, a_i).
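The state-action table and a greedy rollout over a document can be sketched as follows (a minimal illustration; the names and the tie-breaking convention are our own, not the paper's implementation):

```python
from collections import defaultdict

NOT_NEGATED, NEGATED = "NotNegated", "Negated"
ACTIONS = (NOT_NEGATED, NEGATED)  # NOT_NEGATED first: unseen states default to no negation

# State-action values Q(s_i, a_i); a state pairs the current word with the
# previous action, which is what makes the architecture recurrent.
Q = defaultdict(float)

def greedy_action(word, prev_action):
    """Pick the action with the highest Q-value for state (word, prev_action)."""
    return max(ACTIONS, key=lambda a: Q[((word, prev_action), a)])

def label_document(words):
    """Roll out the current greedy policy word-by-word over a document."""
    actions, prev = [], NOT_NEGATED
    for w in words:
        prev = greedy_action(w, prev)
        actions.append(prev)
    return actions
```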
Reward function. The reward r_i depends upon the agreement between a given document-level label (e. g., a rating in movie reviews) and the sentiment of a document. We predict the sentiment S_d of document d using a widely-used sentiment routine based on the occurrences of positively- and negatively-opinionated terms (see Taboada et al., 2011). If a term is negated by the policy, the polarity of the corresponding term is inverted, i. e., positively-opinionated terms are counted as negative and vice versa. In the following, S^0_d denotes the document sentiment without considering negations and S^π_d the sentiment when incorporating negations based on policy π.
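A minimal sketch of such a sentiment routine with polarity inversion (the lexicon and helper names are illustrative placeholders, not the actual resource from Taboada et al., 2011):

```python
# Illustrative polarity lexicon (a stand-in for a real opinion-word resource).
POLARITY = {"good": 1.0, "fantastic": 1.0, "bad": -1.0}

def document_sentiment(words, negated=None):
    """Sum term polarities; invert the polarity of terms the policy negates."""
    negated = negated or [False] * len(words)
    score = 0.0
    for word, is_neg in zip(words, negated):
        polarity = POLARITY.get(word, 0.0)
        score += -polarity if is_neg else polarity
    return score

words = "the movie is not good".split()
s_0 = document_sentiment(words)  # S^0_d: negations untreated
s_pi = document_sentiment(words, [False, False, False, False, True])  # S^pi_d: policy negates "good"
```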
When processing a document, we cannot actually compute the reward until we have processed all words. Therefore, we set the reward before the last word to a constant c ≈ 0, i. e., r_i = c for i = 1, …, N_d − 1, where c (we use c = 0.005) adds a small reward for default (i. e., non-negating) actions to avoid overfitting. For the final word, the agent compares its performance in predicting the document label based on the sentiment without considering negations, S^0_d, to the sentiment when incorporating negations based on the current policy π. The former is defined by the absolute difference between the document label y_d and the predicted sentiment without negations, S^0_d, whereas the latter is defined by the absolute difference between y_d and the adjusted sentiment using the current policy, S^π_d. The difference between these values yields the terminal reward

r_{N_d} = |y_d − S^0_d| − |y_d − S^π_d|.

Q-learning. During the learning process, the agent successively observes a sequence of words, in each step either exploring new actions or taking the current optimal one. This choice is made via ε-greedy selection, according to which the agent explores the environment by selecting a random action with probability ε or, alternatively, exploits the current knowledge with probability 1 − ε.
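The reward schedule and ε-greedy selection described above can be sketched as follows (a simplified reconstruction from the prose, not the authors' implementation; in particular, we read the constant c as rewarding only non-negating actions):

```python
import random

def reward(i, n_words, action, y_d, s_0, s_pi, c=0.005):
    """Small constant c for default (non-negating) actions before the last
    word; at the final word, the reduction in prediction error achieved by
    the policy's negations: |y_d - S^0_d| - |y_d - S^pi_d|."""
    if i < n_words - 1:
        return c if action == "NotNegated" else 0.0
    return abs(y_d - s_0) - abs(y_d - s_pi)  # terminal reward r_{N_d}

def epsilon_greedy(q_values, epsilon=0.001):
    """Explore a random action with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.choice(list(q_values))
    return max(q_values, key=q_values.get)
```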

Experiments
Datasets. We use the following benchmark datasets with document-level annotations from the literature (cf. Hogenboom et al., 2011; Pröllochs et al., 2016; Wiegand et al., 2010):
IMDb: movie reviews from the Internet Movie Database archive, each annotated with an overall rating at document level (Pang and Lee, 2005).
Airport: user reviews of airports from Skytrax, each annotated with an overall rating at document level (Pérezgonzález and Gilbey, 2011).
Ad hoc: financial announcements with complex, domain-specific language (Pröllochs et al., 2016), labeled with the daily abnormal return of the corresponding stock.
Learning parameters. We perform 4000 learning iterations with a higher exploration rate, as given by the following parameters: exploration ε = 0.001, discount factor γ = 0, and learning rate α = 0.005. In a second phase, we run 1000 iterations for fine-tuning with exploration ε = 0.0001, discount factor γ = 0, and learning rate α = 0.001.
Policy learning. For each dataset, the reinforcement learning process converges to a stationary policy with reward fluctuations below 0.05 %. As a benchmark, we use the mean squared error (MSE) between y_d and the predicted sentiment S^0_d when leaving negations untreated. For all datasets, the in-sample MSE decreases substantially (see Figure 1), demonstrating the effectiveness of our learning approach. The reductions amount to 14.93 % (IMDb), 16.77 % (airport), and 0.91 % (ad hoc); the latter is a result of the considerably more complex language in financial statements.
Performance in Sentiment Analysis. We use 10-fold cross-validation to compare the out-of-sample performance in sentiment analysis of reinforcement learning to benchmarks from previous works that require no word-level labels. The benchmarks consist of rules (Hogenboom et al., 2011; Taboada et al., 2011) that search for the occurrence of specific cues based on pre-defined lists and then invert the meaning of a fixed number of surrounding words.

Figure 1: MSE between the document label and predicted sentiment across different learning iterations using 10-fold cross-validation. Additional lines in black from smoothing.

IMDb: Negating a fixed window of the next 4 words achieves the lowest error among all rules, similar to Dadvar et al. (2011). This rule reduces the MSE of the benchmark with no negation handling by 1.05 %. Our approach works even more accurately and dominates all rules, reducing the out-of-sample MSE by at least 4.60 %.
Airport: Our method decreases the MSE by 4.66 % compared to the best-performing rule (negating a fixed window of the next 4 words).
Ad hoc: Even for complex financial language, reinforcement learning outperforms the best rule by 0.19 % in terms of out-of-sample MSE.
Altogether, our weak supervision improves sentiment analysis consistently across all datasets.

Comparison to human raters. For reasons of completeness, our supplements report the overlap with both human raters from our motivational experiment, which is in the range of 18.8 % to 25.2 %. However, these numbers should be treated with caution: there is no universal definition of negation scopes, and even the two human annotations overlap on only 50.2 %. Moreover, our approach was not trained to reconstruct these human annotations, since we focused on rules that achieve the greatest benefit in sentiment analysis. We also experimented with performance comparisons in a classification task, where our approach likewise yields consistent improvements, and we investigated the relationship between prediction performance and text length, finding only minor effects.
Comparison to word-level classifiers. We also compared our weak supervision against a supervised HMM classifier from Pröllochs et al. (2016) that draws upon granular word-level negation labels. Here we report the sentiment analysis on IMDb, since this allows us to use the domain-specific negation labels from the IMDb text snippets of our initial experiment. In comparison to our reinforcement learning, the supervised classifier results in a 5.79 % higher (and thus inferior) MSE. Moreover, our weak supervision circumvents costly word-level annotations.
Interpretability. Our method yields negation rules that are fully interpretable: one simply has to inspect the state-action function Q(s_i, a_i). Table 2 provides an example excerpt for the document "this beautiful movie isn't good but fantastic." The agent then starts by observing the first state, given by the combination of the first word w_1 and the previous action a_0, i. e., s_1 = (this, ¬Negated). According to the state-action table, the best action for the agent is to mark this state as not negated (a_1 = ¬Negated). This pattern continues until it observes state s_4 = (isn't, ¬Negated), in which the policy implies initiating a negation scope (a_4 = Negated). Subsequently, the negation scope is carried forward until the agent observes s_6 = (but, Negated), upon which it terminates the negation scope (a_6 = ¬Negated). Finally, the agent observes s_7 = (fantastic, ¬Negated), in which it takes action a_7 = ¬Negated.
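This walk-through amounts to reading the greedy decision off the table for each state; the following compact sketch hard-codes those decisions for the example (the rules are an illustrative rendering of the table excerpt, not the learned Q-values themselves):

```python
NEG, NOT_NEG = "Negated", "NotNegated"

def table_policy(word, prev_action):
    """Greedy decisions read off the (illustrative) state-action table:
    "isn't" opens a negation scope, "but" closes it, and all other words
    carry the previous decision forward."""
    if word == "isn't":
        return NEG
    if word == "but":
        return NOT_NEG
    return prev_action

words = "this beautiful movie isn't good but fantastic".split()
actions, prev = [], NOT_NEG
for w in words:
    prev = table_policy(w, prev)
    actions.append(prev)
```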

Related Work
State-of-the-art methods for detecting, handling, and interpreting negations can be grouped into different categories (cf. Pröllochs et al., 2015, 2016; Rokach et al., 2008).
Rule-based approaches are among the most common due to their ease of implementation and solid out-of-the-box performance. These usually suppose a forward influence of negation cues, based on which they invert the meaning of the whole sentence or a fixed number of subsequent words (Hogenboom et al., 2011). Furthermore, they can also incorporate syntactic information in order to account for subject and object (Padmaja et al., 2014; Chowdhury and Lavelli, 2013). Negation rules have been found to work effectively across different domains and rarely need fine-tuning (Taboada et al., 2011). However, rule-based approaches entail several drawbacks: the list of negations must be pre-defined, and the criterion according to which a rule is chosen is usually arbitrary or determined via cross-validation. In addition, rules cannot effectively cope with implicit expressions or particular, domain-specific characteristics.
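For concreteness, a typical fixed-window rule of this kind can be sketched as follows (the cue list is an illustrative placeholder for the pre-defined lists used in the cited works):

```python
NEGATION_CUES = {"not", "no", "never", "isn't", "don't"}  # illustrative pre-defined cue list

def fixed_window_scope(words, window=4):
    """Mark the `window` words following each negation cue as negated."""
    negated = [False] * len(words)
    for i, w in enumerate(words):
        if w.lower() in NEGATION_CUES:
            for j in range(i + 1, min(i + 1 + window, len(words))):
                negated[j] = True
    return negated
```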
Probabilistic sequence models (e. g., hidden Markov models or conditional random fields) can partially overcome these shortcomings (Li and Lu, 2018; Reitan et al., 2015; Rokach et al., 2008), such as the difficulty of recognizing implicit negations. These process narrative language word-by-word and move between hidden states representing negated and non-negated parts. Such models can adapt to domain-specific language, but require more computational resources and rely upon ex-ante transition probabilities. Although variants based on unsupervised learning avoid the need for any labels, practical applications reveal inferior performance compared to supervised approaches (Pröllochs et al., 2015). The latter usually depend on manual labels at a granular level, which are not only costly but also suffer from subjective interpretations (Fancellu et al., 2017).
A third category of methods links the polarity-shift effect of negations more closely to sentiment analysis tasks at sentence or document level. For example, text parts can be classified into a polarity-unshifted part and a polarity-shifted part according to certain rules (Li and Huang, 2009). Sentiment classification models are then trained using both parts (Li et al., 2010). Alternatively, rule-based algorithms can extract sentences with inconsistent sentiment and omit them from standard sentiment analysis procedures (Orimaye et al., 2012). Conversely, antonym dictionaries have been used to generate sentiment-inverted texts in order to classify polarity in pairs (Xia et al., 2016). Although such data expansion techniques usually enhance the performance of sentiment analysis, they require either complex linguistic knowledge or extra human annotations (Xia et al., 2015).
Research gap. In contrast to these methods, we propose a novel strategy for learning negation rules via weak supervision. Our model uses reinforcement learning to reconstruct negation rules based on a document-level variable and does not require any kind of manual word-level labeling or pre-coded linguistic patterns. It is able to recognize explicit as well as implicit negations, while avoiding the influence of subjective interpretations.

Conclusion
This paper proposes the first approach for negation scope detection based on weak supervision. Our proposed reinforcement learning strategy circumvents the need for word-level annotations of negation scopes, as it reconstructs negation rules based on document-level sentiment labels. Our experiments show that our weak supervision is effective in negation scope detection; it yields consistent improvements (of up to 4.66 %) over a sentiment analysis without negation handling.
Our work suggests important implications. Our findings are in line with a growing literature (e. g., Fancellu et al., 2017) that reports the challenges humans face in resolving negation scopes; beyond prior works, our experiment reveals between-rater inconsistencies. While negation scope detection is widely studied as an isolated task, it could be beneficial to link its evaluation more closely to context-specific uses such as sentiment analysis.