BnpC: Bayesian non-parametric clustering of single-cell mutation profiles


Date

2020-10-01

Publication Type

Journal Article

ETH Bibliography

yes

Abstract

Motivation: The high resolution of single-cell DNA sequencing (scDNA-seq) offers great potential to resolve intratumor heterogeneity (ITH) by distinguishing clonal populations based on their mutation profiles. However, the increasing size of scDNA-seq datasets and technical limitations, such as high error rates and a large proportion of missing values, complicate this task and limit the applicability of existing methods.
Results: Here, we introduce BnpC, a novel non-parametric method to cluster individual cells into clones and infer their genotypes based on their noisy mutation profiles. We benchmarked our method comprehensively against state-of-the-art methods on simulated data using various data sizes, and applied it to three cancer scDNA-seq datasets. On simulated data, BnpC compared favorably against current methods in terms of accuracy, runtime and scalability. Its inferred genotypes were the most accurate, especially on highly heterogeneous data, and it was the only method able to run and produce results on datasets with 5000 cells. On tumor scDNA-seq data, BnpC was able to identify clonal populations missed by the original cluster analysis but supported by supplementary experimental data. With ever-growing scDNA-seq datasets, scalable and accurate methods such as BnpC will become increasingly relevant, not only to resolve ITH but also as a preprocessing step to reduce data size.
Availability and implementation: BnpC is freely available under the MIT license at https://github.com/cbg-ethz/BnpC.
Supplementary information: Supplementary data are available at Bioinformatics online.

Publication status

published

Volume

36 (19)

Pages / Article No.

4854 - 4859

Publisher

Oxford University Press

Organisational unit

03790 - Beerenwinkel, Niko / Beerenwinkel, Niko

Funding

766030 - Computational ONcology TRaining Alliance (EC)
609883 - Mechanisms of Evasive Resistance in Liver Cancer (EC)
