Please Mind the Root: Decoding Arborescences for Dependency Parsing
OPEN ACCESS
Date
2020-11
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
The connection between dependency trees and spanning trees is exploited by the NLP community to train and to decode graph-based dependency parsers. However, the NLP literature has missed an important difference between the two structures: only one edge may emanate from the root in a dependency tree. We analyzed the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank: although these parsers are often able to learn that trees which violate the constraint should be assigned lower probabilities, their ability to do so unsurprisingly degrades as the size of the training set decreases. In fact, the worst constraint-violation rate we observe is 24%. Prior work has proposed an inefficient algorithm to enforce the constraint, which adds a factor of n to the decoding runtime. We adapt an algorithm due to Gabow and Tarjan (1984) to dependency parsing, which satisfies the constraint without compromising the original runtime.
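To make the root constraint concrete, the following is a minimal sketch of how a constraint-violation rate like the one mentioned in the abstract could be measured. It is not the paper's code and not the Gabow and Tarjan algorithm. It assumes a CoNLL-U-style encoding in which each parse is a list of head indices, tokens are numbered from 1, and head 0 denotes the artificial root; the names `ROOT`, `root_edge_count`, `violates_root_constraint`, and `violation_rate` are hypothetical.

```python
from typing import Sequence

# Assumption: head index 0 marks the artificial root, as in CoNLL-U.
ROOT = 0


def root_edge_count(heads: Sequence[int]) -> int:
    """Count the edges that emanate from the root in one parse.

    heads[i] is the head of token i + 1.
    """
    return sum(1 for h in heads if h == ROOT)


def violates_root_constraint(heads: Sequence[int]) -> bool:
    """True if more than one edge emanates from the root.

    The input is assumed to already be a spanning arborescence,
    so only the "more than one root edge" case is checked here.
    """
    return root_edge_count(heads) > 1


def violation_rate(parses: Sequence[Sequence[int]]) -> float:
    """Fraction of parses that violate the single-root-edge constraint."""
    if not parses:
        return 0.0
    violations = sum(violates_root_constraint(p) for p in parses)
    return violations / len(parses)


if __name__ == "__main__":
    ok_tree = [2, 0, 2]   # token 2 is the only child of the root
    bad_tree = [0, 0, 2]  # tokens 1 and 2 both attach to the root
    print(violation_rate([ok_tree, bad_tree]))
```

A check like this only detects violations after decoding; enforcing the constraint during decoding is the problem the paper addresses.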
Publication status
published
Book title
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Pages / Article No.
4809 - 4819
Publisher
Association for Computational Linguistics
Event
Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)
Organisational unit
09682 - Cotterell, Ryan / Cotterell, Ryan
Notes
Due to the coronavirus (COVID-19) pandemic, the conference was conducted virtually.