Multilingual language models predict human reading behavior
OPEN ACCESS
Author / Producer
Hollenstein, Nora; Pirovano, Federico; Zhang, Ce; Jäger, Lena; Beinborn, Lisa
Date
2021-06
Publication Type
Conference Paper
ETH Bibliography
yes
Abstract
We analyze whether large language models can predict patterns of human reading behavior. We compare the performance of language-specific and multilingual pretrained transformer models in predicting reading time measures that reflect natural human sentence processing on Dutch, English, German, and Russian texts. The resulting models of human reading behavior are accurate, which indicates that transformer models implicitly encode relative importance in language in a way that is comparable to human processing mechanisms. We find that BERT and XLM models successfully predict a range of eye tracking features. In a series of experiments, we analyze the cross-domain and cross-language abilities of these models and show how they reflect human sentence processing.
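The setup the abstract describes, predicting token-level eye-tracking features from a pretrained multilingual transformer, can be sketched as a regression head on top of the encoder. The following is a minimal illustration, not the authors' implementation: it uses the Hugging Face transformers library with multilingual BERT, and the feature count (n_features=5) and the Dutch example sentence are assumptions made for illustration.

import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

class EyeTrackingRegressor(nn.Module):
    """Multilingual encoder with a linear head that predicts one value per
    eye-tracking feature (e.g., first fixation duration, total reading time)
    for every token. The set of five features is a placeholder assumption."""

    def __init__(self, model_name="bert-base-multilingual-cased", n_features=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, n_features)

    def forward(self, input_ids, attention_mask):
        # Contextual token representations from the pretrained encoder.
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # One regression output per token per eye-tracking feature.
        return self.head(hidden)  # shape: (batch, seq_len, n_features)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = EyeTrackingRegressor()

# Illustrative Dutch input, matching one of the four languages studied.
batch = tokenizer(["De kat zit op de mat."], return_tensors="pt")
preds = model(batch["input_ids"], batch["attention_mask"])
print(preds.shape)  # torch.Size([1, seq_len, 5])

Fine-tuning such a head against word-level eye-tracking measures from a reading corpus would then be a standard regression training loop (e.g., mean squared error between predicted and recorded reading times).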
Publication status
published
Book title
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Pages / Article No.
106–123
Publisher
Association for Computational Linguistics
Event
2021 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021)
Organisational unit
09588 - Zhang, Ce (former)