Justifying our Credences in the Trustworthiness of AI Systems: A Reliabilistic Approach


METADATA ONLY

Date

2023-07-28

Publication Type

Working Paper

ETH Bibliography

yes

Abstract

We address an open problem in the epistemology of artificial intelligence (AI), namely, the justification of the epistemic attitudes we have towards the trustworthiness of AI systems. We start from a key consideration: the trustworthiness of an AI is a time-relative property of the system, with two distinct facets. One is the actual trustworthiness of the AI, and the other is the perceived trustworthiness of the system as assessed by its users while interacting with it. We show that credences, namely, beliefs we hold with a degree of confidence, are the appropriate attitude for capturing both facets of the trustworthiness of an AI over time. We then introduce a reliabilistic account of when a credence in the trustworthiness of an AI is justified, which we derive from Tang’s probabilistic theory of justified credence. Our account stipulates that a credence in the trustworthiness of an AI system is justified if and only if it is caused by an assessment process that tends to result in a high proportion of credences for which the actual and perceived trustworthiness of the AI are calibrated. Our approach informs research on human-AI interactions and trustworthy AI by providing actionable recommendations on how to measure both the reliability of the process through which users perceive the trustworthiness of the system and the calibration of that perception to the system's actual level of trustworthiness. It also makes it possible to investigate the relation between reliability and appropriate reliance on the system.
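
The justification condition stated in the abstract can be read as a proportion: of all credences an assessment process produces, how many are calibrated to the system's actual trustworthiness? The Python sketch below is one possible reading under stated assumptions, not the authors' formalization. The `Assessment` record, the `tolerance` that decides when perceived and actual trustworthiness count as calibrated, and the `threshold` for a "high proportion" are all hypothetical choices that do not appear in the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Assessment:
    """One outcome of a user's assessment process (hypothetical record)."""
    perceived: float  # user's credence that the system is trustworthy, in [0, 1]
    actual: float     # measured trustworthiness at the same time, in [0, 1]


def is_calibrated(a: Assessment, tolerance: float = 0.1) -> bool:
    # Assumption: "calibrated" means perceived and actual trustworthiness
    # differ by at most `tolerance`. The abstract does not fix this criterion.
    return abs(a.perceived - a.actual) <= tolerance


def process_reliability(assessments: Sequence[Assessment],
                        tolerance: float = 0.1) -> float:
    """Proportion of credences produced by the process that are calibrated."""
    if not assessments:
        raise ValueError("need at least one assessment to estimate reliability")
    calibrated = sum(is_calibrated(a, tolerance) for a in assessments)
    return calibrated / len(assessments)


def is_justified(assessments: Sequence[Assessment],
                 tolerance: float = 0.1,
                 threshold: float = 0.8) -> bool:
    # Assumption: a "high proportion" is modelled as reliability >= threshold.
    return process_reliability(assessments, tolerance) >= threshold


# Illustrative, made-up observations of one user's assessment process.
history = [
    Assessment(perceived=0.90, actual=0.85),  # calibrated
    Assessment(perceived=0.60, actual=0.62),  # calibrated
    Assessment(perceived=0.30, actual=0.70),  # under-trust, not calibrated
    Assessment(perceived=0.50, actual=0.50),  # calibrated
]
reliability = process_reliability(history)
```

On this reading, a credence inherits its justification from the track record of the process that produced it rather than from whether that single credence happens to be accurate, which is why the functions take the whole assessment history as input.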

Publication status

published

Publisher

Social Science Research Network

Subject

Artificial intelligence; Trustworthiness; Trustworthy AI; Epistemology; Justification; Reliabilism; Belief; Credence

Organisational unit

03995 - von Wangenheim, Florian / von Wangenheim, Florian
