Multilingual language models predict human reading behavior


Date

2021-06

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

We analyze whether large language models are able to predict patterns of human reading behavior. We compare the performance of language-specific and multilingual pretrained transformer models in predicting reading time measures that reflect natural human sentence processing of Dutch, English, German, and Russian texts. This yields accurate models of human reading behavior, indicating that transformer models implicitly encode relative importance in language in a way that is comparable to human processing mechanisms. We find that BERT and XLM models successfully predict a range of eye-tracking features. In a series of experiments, we analyze the cross-domain and cross-language abilities of these models and show how they reflect human sentence processing.
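
The abstract describes a regression setup in which pretrained transformers predict token-level eye-tracking measures. The sketch below illustrates one way such a setup could look, assuming the Hugging Face transformers API; the model name, the EyeTrackingRegressor class, and the five-dimensional feature vector are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch: a pretrained multilingual transformer with a token-level
# regression head predicting eye-tracking features (e.g., first fixation
# duration, total reading time). Model choice and feature count are
# assumptions for illustration, not the paper's exact configuration.
import torch
from transformers import AutoModel, AutoTokenizer

class EyeTrackingRegressor(torch.nn.Module):
    """Predict per-token eye-tracking features from contextual embeddings."""

    def __init__(self, model_name="bert-base-multilingual-cased", n_features=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        # Linear head mapping each token embedding to a vector of
        # eye-tracking features.
        self.head = torch.nn.Linear(self.encoder.config.hidden_size, n_features)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.head(hidden)  # shape: (batch, seq_len, n_features)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = EyeTrackingRegressor()
batch = tokenizer(["The quick brown fox jumps over the lazy dog."],
                  return_tensors="pt")
with torch.no_grad():
    preds = model(batch["input_ids"], batch["attention_mask"])
print(preds.shape)  # torch.Size([1, seq_len, 5])
```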

Publication status

published

Book title

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Pages / Article No.

106–123

Publisher

Association for Computational Linguistics

Event

2021 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021)

Organisational unit

09588 - Zhang, Ce (former)
