Learning a Cost-Effective Annotation Policy for Question Answering
OPEN ACCESS
Date
2020-11
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
State-of-the-art question answering (QA) relies on large amounts of training data for which labeling is time-consuming and thus expensive. For this reason, customizing QA systems is challenging. As a remedy, we propose a novel framework for annotating QA datasets that entails learning a cost-effective annotation policy together with a semi-supervised annotation scheme. The latter reduces human effort: it leverages the underlying QA system to suggest potential candidate annotations, and human annotators then simply provide binary feedback on these candidates. Our system is designed so that past annotations continuously improve future performance and thus reduce the overall annotation cost. To the best of our knowledge, this is the first paper to address the problem of annotating questions with minimal annotation cost. We compare our framework against traditional manual annotation in an extensive set of experiments and find that our approach can reduce annotation cost by up to 21.1%.
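The annotation scheme sketched in the abstract can be illustrated with a toy loop. This is a minimal sketch, not the paper's method: the cost values, the toy QA model, the confidence threshold standing in for the learned policy, and all names (`GOLD`, `qa_model`, `annotate`) are illustrative assumptions. The idea it shows: when the QA system is confident, the annotator only verifies a suggested answer (cheap binary feedback); otherwise the annotator labels the question from scratch.

```python
COST_VERIFY = 1.0   # assumed cost of a binary yes/no check on a suggestion
COST_MANUAL = 4.0   # assumed cost of labeling an answer from scratch

# Toy gold labels standing in for a real QA dataset.
GOLD = {f"q{i}": f"answer{i}" for i in range(20)}

def qa_model(question):
    """Toy stand-in for the underlying QA system.

    Returns (candidate answer, confidence); deterministic for the demo:
    correct and confident on 70% of questions, wrong and unconfident
    on the rest.
    """
    i = int(question[1:])
    if i % 10 < 7:
        return GOLD[question], 0.9
    return "wrong", 0.3

def annotate(questions, policy_threshold=0.5):
    """Semi-supervised annotation loop (sketch).

    A fixed confidence threshold plays the role of the annotation
    policy; the paper instead *learns* when suggesting candidates
    is worthwhile.
    """
    total_cost, labels = 0.0, {}
    for q in questions:
        candidate, confidence = qa_model(q)
        if confidence >= policy_threshold:
            total_cost += COST_VERIFY          # human gives binary feedback
            if candidate == GOLD[q]:           # stand-in for human judgment
                labels[q] = candidate
                continue
            # Rejected suggestion falls back to manual labeling below.
        total_cost += COST_MANUAL              # human labels from scratch
        labels[q] = GOLD[q]
    return labels, total_cost

labels, cost = annotate(list(GOLD))
baseline = COST_MANUAL * len(GOLD)
print(f"annotation cost {cost:.1f} vs. fully manual {baseline:.1f}")
# prints "annotation cost 38.0 vs. fully manual 80.0"
```

Under these assumed costs, verifying cheap suggestions on the 70% of questions the toy model gets right cuts total cost from 80.0 to 38.0; the real savings depend on the actual verify/label cost ratio and model accuracy.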
Publication status
published
Book title
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Pages / Article No.
3051 - 3062
Publisher
Association for Computational Linguistics
Event
Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
Organisational unit
09623 - Feuerriegel, Stefan (ehemalig) / Feuerriegel, Stefan (former)
Notes
Due to the Coronavirus (COVID-19) the conference was conducted virtually.