Data-Centric Factors in Algorithmic Fairness


Date

2022-07

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

Notwithstanding the widely held view that data generation and data curation processes are prominent sources of bias in machine learning algorithms, there is little empirical research seeking to document and understand the specific data dimensions affecting algorithmic unfairness. In contrast to previous work, which has focused on modeling with simple, small-scale benchmark datasets, we hold the model constant and methodically intervene on relevant dimensions of a much larger, more diverse dataset. For this purpose, we introduce a new dataset on recidivism in 1.5 million criminal cases from courts in the U.S. state of Wisconsin, covering 2000 to 2018. From this main dataset, we generate multiple auxiliary datasets to simulate different kinds of biases in the data. Focusing on algorithmic bias toward different race/ethnicity groups, we assess the relevance of training data size, base rate difference between groups, representation of groups in the training data, temporal aspects of data curation, the inclusion of race/ethnicity or neighborhood characteristics as features, and the training of separate classifiers by race/ethnicity or crime type. We find that, holding the classifier specification constant, these factors often influence fairness metrics without having a corresponding effect on accuracy metrics. The methodology and the results in the paper provide a useful reference point for a data-centric approach to studying algorithmic fairness in recidivism prediction and beyond.
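
The kind of comparison described in the abstract (change the data, keep the classifier fixed, then compare a fairness metric against an accuracy metric) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy labels, the group names "a" and "b", the helper names, and the choice of the false positive rate gap as the fairness metric. The abstract does not state which metrics or sampling procedures the paper uses.

```python
import random
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate_by_group(y_true, y_pred, groups):
    """False positive rate per group: FP / (all true negatives in the group)."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 0:
            negatives[g] += 1
            if p == 1:
                false_pos[g] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}


def fpr_gap(y_true, y_pred, groups):
    """Largest difference in false positive rate between any two groups
    (an illustrative fairness metric, assumed here)."""
    rates = false_positive_rate_by_group(y_true, y_pred, groups)
    return max(rates.values()) - min(rates.values())


def subsample_group(rows, group, keep_fraction, seed=0):
    """Hypothetical helper: build an auxiliary training set in which one
    group is under-represented by keeping each of its rows with
    probability keep_fraction. Rows are dicts with a "group" key."""
    rng = random.Random(seed)
    return [r for r in rows
            if r["group"] != group or rng.random() < keep_fraction]


# Toy predictions from some fixed classifier (labels are made up).
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

toy_accuracy = accuracy(y_true, y_pred)      # 0.75
toy_gap = fpr_gap(y_true, y_pred, groups)    # 0.5

# Toy training rows, then an auxiliary set with group "b" removed entirely.
rows = [{"group": g, "label": t} for g, t in zip(groups, y_true)]
rows_without_b = subsample_group(rows, "b", keep_fraction=0.0)
```

In a data-centric experiment of this shape, one would retrain the same classifier specification on each auxiliary training set and record both numbers on a common test set, so that a change in the gap with little change in accuracy can be attributed to the data intervention.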

Publication status

published

Book title

AIES '22: Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society

Pages / Article No.

396 - 410

Publisher

Association for Computing Machinery

Event

AAAI/ACM Conference on AI, Ethics, and Society (AIES 2022)

Subject

Algorithmic Fairness; Datasets; Recidivism Prediction; Machine Learning

Organisational unit

09627 - Ash, Elliott / Ash, Elliott
