Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation
OPEN ACCESS
Date
2024-11
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
High-quality Machine Translation (MT) evaluation relies heavily on human judgments. Comprehensive error classification methods, such as Multidimensional Quality Metrics (MQM), are expensive: they are time-consuming and can only be carried out by experts, whose availability may be limited, especially for low-resource languages. On the other hand, simply assigning overall scores, as in Direct Assessment (DA), is faster and can be done by translators of any level, but it is less reliable. In this paper, we introduce Error Span Annotation (ESA), a human evaluation protocol which combines the continuous rating of DA with the high-level error severity span marking of MQM. We validate ESA by comparing it to MQM and DA for 12 MT systems and one human reference translation (English to German) from WMT23. The results show that ESA offers faster and cheaper annotations than MQM at the same quality level, without requiring expensive MQM experts.
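As described in the abstract, an ESA annotation pairs MQM-style error spans (with a severity label) with a single DA-style continuous score per segment. The following Python sketch illustrates one plausible way to represent such a record; the field names and example values are hypothetical, not the schema of the linked repository.

from dataclasses import dataclass, field
from typing import List, Literal

# Hypothetical representation of one ESA segment annotation.
# Field names are illustrative; see the linked WMT repository
# for the actual annotation format.
@dataclass
class ErrorSpan:
    start: int                          # character offset into the translation
    end: int                            # end offset (exclusive)
    severity: Literal["minor", "major"] # ESA uses severity only, no error category

@dataclass
class ESAAnnotation:
    source: str
    translation: str
    spans: List[ErrorSpan] = field(default_factory=list)
    score: float = 100.0                # DA-style overall rating on a 0-100 scale

# Example: the annotator marks one error span, then rates the segment.
ann = ESAAnnotation(
    source="The cat sat on the mat.",
    translation="Die Katze sass auf der Matte.",
)
ann.spans.append(ErrorSpan(start=10, end=14, severity="minor"))  # "sass"
ann.score = 86.0

Unlike full MQM, no error typology is assigned to each span, which is what allows ESA to be done by non-expert translators while retaining span-level information.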
Publication status
published
Book title
Proceedings of the Ninth Conference on Machine Translation
Pages / Article No.
1440 - 1453
Publisher
Association for Computational Linguistics
Event
9th Conference on Machine Translation (WMT 2024)
Organisational unit
09684 - Sachan, Mrinmaya
Related publications and datasets
Is supplemented by: https://github.com/wmt-conference/ErrorSpanAnnotation