Open access
Author
Date
2020
Type
- Doctoral Thesis
ETH Bibliography
yes
Abstract
This thesis encompasses three projects devoted to gaining biomedical insight from data gathered using high-throughput assays, such as next-generation sequencing, mass spectrometry (MS), and nuclear magnetic resonance (NMR) spectroscopy. The analyses are based on high-dimensional statistics and graphical models, with an emphasis on robustness and interpretability. In two of the projects, biological validation experiments were performed in order to assess the causal nature of relevant predictions made by the models.
The first project aimed to estimate signal progression from gene expression data obtained in gene perturbation experiments as well as an unperturbed system sampled over time. To this end, we extended the framework of a nested-effects model to incorporate data from perturbation experiments with a non-interventional time series (Cardner, Meyer-Schaller et al., 2019). Jointly analysing the two types of experiment in this manner yields an estimate of how signals progress through a pathway in response to a receptor stimulus. The method's development was motivated by experiments performed by Meyer-Schaller et al. (2019), and it was applied to the corresponding sets of gene expression data. Parts of the inferred signalling pathway were validated in luciferase reporter assays.
The second project concerned the function and composition of high-density lipoprotein (HDL), as well as its role in coronary heart disease (CHD) and type 2 diabetes mellitus (T2DM). Based on functional and compositional measurements of HDL in 51 healthy volunteers and 98 patients with CHD or T2DM, we aimed to understand how disease-relevant functions of HDL are determined by its composition, in terms of the hundreds of proteins and lipids which constitute HDL particles. To this end, we used a robust Gaussian graphical model to infer conditional dependence between HDL functions measured in bioassays and compositional features assayed using MS and NMR spectroscopy. We found several clinically relevant candidates, and experimentally validated novel causal links between certain HDL functions and compositional constituents (Cardner, Yalcinkaya et al., 2020).
The third and final project is devoted to liquid biopsies from cancer patients, which provide a non-invasive means of monitoring systemic tumour burden and assessing treatment response. Here we analysed data from shallow whole-genome sequencing of cell-free DNA (cfDNA) in blood plasma samples taken from 118 cancer patients. By leveraging certain biochemical properties of cfDNA, we developed a method for predicting the proportion of tumour-derived cfDNA in a sample. This prediction is useful in its own right for quantifying the tumour content of a sample, for instance during relapse monitoring. In addition, we used our method to quantify copy-number aberrations across the genome, thereby identifying deletions, gains, and focal amplifications of genetic material in the circulating tumour DNA.
Permanent link
https://doi.org/10.3929/ethz-b-000468190
Publication status
published
Contributors
Examiner: Beerenwinkel, Niko
Examiner: Maathuis, Marloes H.
Examiner: Claassen, Manfred
Examiner: Ruiz, Christian
Publisher
ETH Zurich
Subject
Statistical learning; High-dimensional data; Variable selection; Probabilistic graphical models; Computational biology
Organisational unit
03790 - Beerenwinkel, Niko / Beerenwinkel, Niko