Team "DaDeFrNi" at CASE 2021 Task 1: Document and Sentence Classification for Protest Event Detection
Open access
Date
2021
Type
- Conference Paper
ETH Bibliography
yes
Abstract
This paper accompanies our top-performing submission to the CASE 2021 shared task, which is hosted at the workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text. Subtasks 1 and 2 of Task 1 concern the classification of newspaper articles and sentences into "conflict" versus "not conflict"-related in four different languages. Our model performs competitively in both subtasks (up to 0.8662 macro F1), obtaining the highest score of all contributions for subtask 1 on Hindi articles (0.7877 macro F1). We describe all experiments conducted with the XLM-RoBERTa (XLM-R) model and report results obtained in each binary classification task. We propose supplementing the original training data with additional data on political conflict events. In addition, we provide an analysis of unigram probability estimates and geospatial references contained within the original training corpus.
Permanent link
https://doi.org/10.3929/ethz-b-000508467
Publication status
published
Book title
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)
Publisher
Association for Computational Linguistics
Funding
787478 - Nationalist State Transformation and Conflict (EC)