Robustifying Independent Component Analysis by Adjusting for Group-Wise Stationary Noise
Open access
Date
2019-10
Type
- Journal Article
Abstract
We introduce coroICA, confounding-robust independent component analysis, a novel ICA algorithm which decomposes linearly mixed multivariate observations into independent components that are corrupted (and rendered dependent) by hidden group-wise stationary confounding. It extends the ordinary ICA model in a theoretically sound and explicit way to incorporate group-wise (or environment-wise) confounding. We show that our proposed general noise model allows ICA to be performed in settings where other noisy ICA procedures fail. Additionally, it can be used for applications with grouped data by adjusting for different stationary noise within each group. Our proposed noise model has a natural relation to causality and we explain how it can be applied in the context of causal inference. In addition to our theoretical framework, we provide an efficient estimation procedure and prove identifiability of the unmixing matrix under mild assumptions. Finally, we illustrate the performance and robustness of our method on simulated data, provide audible and visual examples, and demonstrate the applicability to real-world scenarios by experiments on publicly available Antarctic ice core data as well as two EEG data sets. We provide a scikit-learn compatible pip-installable Python package coroICA as well as R and Matlab implementations, accompanied by documentation at https://sweichwald.de/coroICA/
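The abstract does not spell out the model or the estimator, so the following is only a minimal NumPy sketch under stated assumptions, not the coroICA implementation or its API. It assumes observations of the form X = A(S + H), where S has independent components whose variances change over time and H is confounding noise that is dependent across components but has a fixed covariance within each group. The mixing matrix `A`, the noise matrix `C`, the two time windows per group, and all numeric values are hypothetical choices made for illustration. Under these assumptions, the difference of two covariance matrices taken within the same group no longer contains the noise covariance, so the true unmixing matrix approximately diagonalizes that difference even though it does not diagonalize either covariance on its own.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
n_half = 20000  # samples per time window within a group

# Hypothetical mixing matrix A and its inverse V (the target unmixing matrix).
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.1, 1.0]])
V = np.linalg.inv(A)


def simulate_group(noise_scale):
    """Return two time windows of X = A (S + H) for one group."""
    # Confounding H: dependent across components, same covariance in both windows.
    C = noise_scale * (np.ones((d, d)) + np.eye(d))
    windows = []
    for source_std in (np.array([1.0, 1.0, 1.0]), np.array([2.0, 3.0, 4.0])):
        S = rng.normal(size=(n_half, d)) * source_std  # independent, non-stationary
        H = rng.normal(size=(n_half, d)) @ C.T         # stationary within the group
        windows.append((S + H) @ A.T)
    return windows


def offdiag_ratio(M):
    """Norm of the off-diagonal part relative to the norm of the diagonal part."""
    diag = np.diag(np.diag(M))
    return np.linalg.norm(M - diag) / np.linalg.norm(diag)


results = []
for noise_scale in (1.0, 2.0):  # a different noise level in each group
    X_a, X_b = simulate_group(noise_scale)
    cov_a = np.cov(X_a, rowvar=False)
    cov_b = np.cov(X_b, rowvar=False)
    plain = offdiag_ratio(V @ cov_a @ V.T)           # noise covariance still present
    diff = offdiag_ratio(V @ (cov_a - cov_b) @ V.T)  # noise covariance cancels
    results.append((plain, diff))
    print(f"noise scale {noise_scale}: plain {plain:.2f}, difference {diff:.2f}")
```

A small off-diagonal ratio for the covariance difference, next to a large one for the plain covariance, illustrates why adjusting for stationary noise within each group can make the unmixing matrix recoverable in this toy setting. For the actual estimator and its interface, the package documentation linked in the abstract is the authoritative source.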
Permanent link
https://doi.org/10.3929/ethz-b-000374036
Publication status
published
External links
Journal / series
Journal of Machine Learning Research
Volume
Pages / Article No.
Publisher
MIT Press
Subject
blind source separation; causal inference; confounding noise; group analysis; heterogeneous data; independent component analysis; non-stationary signal; robustness
Organisational unit
03502 - Bühlmann, Peter L. / Bühlmann, Peter L.
09664 - Schölkopf, Bernhard / Schölkopf, Bernhard
Funding
786461 - Statistics, Prediction and Causality for Large-Scale Data (EC)