Metadata only
Date
2022
Type
- Conference Paper
ETH Bibliography
yes
Abstract
We consider two key issues faced by many clustering methods when used for data summarization, namely (a) an unfair representation of "demographic groups" and (b) distorted summarizations, where data points in the summary represent subsets of the original data of vastly different sizes. Previous work made important steps towards handling each of these two issues separately in the context of the fundamental k-Center clustering objective through the study of fast algorithms for natural models that address them. We show that it is possible to effectively address both (a) and (b) simultaneously by presenting a clustering procedure that works for a canonical combined model and (i) is fast, both in theory and practice, (ii) exhibits a worst-case constant-factor guarantee, and (iii) gives promising computational results showing that there can be significant benefits in addressing both issues together instead of sequentially.
Publication status
published
External links
Book title
Proceedings of the 39th International Conference on Machine Learning
Journal / series
Proceedings of Machine Learning Research
Volume
Pages / Article No.
Publisher
PMLR
Event
Organisational unit
09487 - Zenklusen, Rico / Zenklusen, Rico
Funding
184622 - Toward Stronger Approximation Algorithms for Fundamental Network Design and Optimization Problems (SNF)
174117 - Theory and Applications of Linear and Semidefinite Relaxations for Combinatorial Optimization Problems (SNF)
817750 - Fundamental Problems at the Interface of Combinatorial Optimization with Integer Programming and Online Optimization (EC)