Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox
OPEN ACCESS
Date
2021-03-30
Publication Type
Journal Article
ETH Bibliography
yes
Abstract
The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers using machine learning (ML). However, metagenomics-specific software is scarce, and overoptimistic evaluation and limited cross-study generalization are prevailing issues. To address these, we developed SIAMCAT, a versatile R toolbox for ML-based comparative metagenomics. We demonstrate its capabilities in a meta-analysis of fecal metagenomic studies (10,803 samples). When naively transferred across studies, ML models lost accuracy and disease specificity, which could however be resolved by a novel training set augmentation strategy. This reveals some biomarkers to be disease-specific, with others shared across multiple conditions. SIAMCAT is freely available from siamcat.embl.de.
Publication status
published
Volume
22 (1)
Pages / Article No.
93
Publisher
BioMed Central
Subject
Microbiome data analysis; Machine learning; Statistical modeling; Microbiome-wide association studies (MWAS); Meta-analysis
Organisational unit
09583 - Sunagawa, Shinichi / Sunagawa, Shinichi