Towards Data-Centric Automated Fact Checking
OPEN ACCESS
Author / Producer
Date
2024
Publication Type
Doctoral Thesis
ETH Bibliography
yes
Data
Rights / License
Abstract
In the digital age, we face a steady stream of mis- and disinformation. Automated fact checking aims to detect factually wrong claims by comparing them against trustworthy facts found in a dependable knowledge base. Such methods can assist fact checkers and content moderators, and can increase online safety by making online discourse more truthful.
This is a cumulative thesis. The individual projects are concerned with explainable claim verification, evidence retrieval, the knowledge bases from which we retrieve evidence, and finally environmental claim detection. The recurring theme is a focus on data, so the work can loosely be described as data-centric automated fact checking. The contributions are as follows. First, we present an automatically generated dataset for explainable claim verification, created with few-shot prompting, and show how such new technology can be used to tackle problems that were previously thought too expensive to even approach. Second, recent advances in sparse transformer models enable us to model data in evidence retrieval using more context. We show that this approach leads to better performance on all metrics considered when retrieving evidence for claim verification from Wikipedia pages. Third, we expand the definition of data-centric in automated fact checking to cover all data dependencies: not only should the individual datasets be of high quality, but so should the knowledge bases used. Finally, we introduce the task of environmental claim detection, and annotate and release an expert-annotated dataset for this task that is data-centric in the strict sense.
Thus, this thesis tackles automated fact checking in the fast-paced field of Natural Language Processing. Three years is a long time in this field, where new methods and best practices seem to emerge every other month. We have tried to do justice to these challenging circumstances.
Permanent link
Publication status
published
Contributors
Examiner: Ash, Elliott
Examiner: Sachan, Mrinmaya
Examiner: Vlachos, Andreas
Publisher
ETH Zurich
Subject
Automated Fact Checking; Natural Language Processing (NLP); Machine Learning (Artificial Intelligence)
Organisational unit
09627 - Ash, Elliott / Ash, Elliott