Search

Now showing items 1-10 of 12

Efficient graph-color compression with neighborhood-informed Bloom filters

Schilken, Ingo; Mustafa, Harun; Rätsch, Gunnar; et al. (2017)

bioRxiv

Technological advancements in high throughput DNA sequencing have led to an exponential growth of sequencing data being produced and stored as a byproduct of biomedical research. Despite its public availability, a majority of this data remains inaccessible to the research community through a lack of efficient data representation and indexing solutions. One of the available techniques to represent read data on a more abstract level is its ...

Working Paper

Metannot: A succinct data structure for compression of colors in dynamic de Bruijn graphs

Mustafa, Harun; Kahles, André; Karasikov, Mikhail; et al. (2017)

bioRxiv

Much of the DNA and RNA sequencing data available is in the form of high-throughput sequencing (HTS) reads and is currently unindexed by established sequence search databases. Recent succinct data structures for indexing both reference sequences and HTS data, along with associated metadata, have been based on either hashing or graph models, but many of these structures are static in nature, and thus, not well-suited as backends for dynamic ...

Working Paper

Aligning Distant Sequences to Graphs using Long Seed Sketches

Joudaki, Amir; Meterez, Alexandru; Mustafa, Harun; et al. (2022)

bioRxiv

Sequence-to-graph alignment is an important step in applications such as variant genotyping, read error correction and genome assembly. When a query sequence requires a substantial number of edits to align, approximate alignment tools that follow the seed-and-extend approach require shorter seeds to get any matches. However, in large graphs with high variation, relying on a shorter seed length leads to an exponential increase in spurious ...

Working Paper

SECEDO: SNV-based subclone detection using ultra-low coverage single-cell DNA sequencing

Rozhoňová, Hana; Danciu, Daniel; Stark, Stefan; et al. (2021)

bioRxiv

Recently developed single-cell DNA sequencing technologies enable whole-genome, amplification-free sequencing of thousands of cells at the cost of ultra-low coverage of the sequenced data (< 0.05x per cell), which mostly limits their usage to the identification of copy number alterations (CNAs) in multi-megabase segments. Aside from CNA-based subclone detection, single-nucleotide variant (SNV)-based subclone detection may contribute ...

Working Paper

Lossless Indexing with Counting de Bruijn Graphs

Karasikov, Mikhail; Mustafa, Harun; Rätsch, Gunnar; et al. (2021)

bioRxiv

High-throughput sequencing data is rapidly accumulating in public repositories. Making this resource accessible for interactive analysis at scale requires efficient approaches for its storage and indexing. There have recently been remarkable advances in solving the experiment discovery problem and building compressed representations of annotated de Bruijn graphs where k-mer sets can be efficiently indexed and interactively queried. However, ...

Working Paper

Using Genome Graph Topology to Guide Annotation Matrix Sparsification

Danciu, Daniel; Karasikov, Mikhail; Mustafa, Harun; et al. (2020)

bioRxiv

Since the amount of published biological sequencing data is growing exponentially, efficient methods for storing and indexing this data are more needed than ever to truly benefit from this invaluable resource for biomedical research. Labeled de Bruijn graphs are a frequently-used approach for representing large sets of sequencing data. While significant progress has been made to succinctly represent the graph itself, efficient methods for ...

Working Paper

MetaGraph: Indexing and Analysing Nucleotide Archives at Petabase-scale

Karasikov, Mikhail; Mustafa, Harun; Danciu, Daniel; et al. (2020)

bioRxiv

The amount of biological sequencing data available in public repositories is growing exponentially, forming an invaluable biomedical research resource. Yet, making all this sequencing data searchable and easily accessible to life science and data science researchers is an unsolved problem. We present MetaGraph, a versatile framework for the scalable analysis of extensive sequence repositories. MetaGraph efficiently indexes vast collections ...

Working Paper

Sparse Binary Relation Representations for Genome Graph Annotation

Karasikov, Mikhail; Mustafa, Harun; Joudaki, Amir; et al. (2018)

bioRxiv

High-throughput DNA sequencing data is accumulating in public repositories, and efficient approaches for storing and indexing such data are in high demand. In recent research, several graph data structures have been proposed to represent large sets of sequencing data and allow for efficient query of sequences. In particular, the concept of colored de Bruijn graphs has been explored by several groups. While there has been good progress ...

Working Paper

MetaGraph-MLA: Label-guided alignment to variable-order De Bruijn graphs

Mustafa, Harun; Karasikov, Mikhail; Rätsch, Gunnar; et al. (2022)

bioRxiv

The amount of data stored in genomic sequence databases is growing exponentially, far exceeding traditional indexing strategies’ processing capabilities. Many recent indexing methods organize sequence data into a sequence graph to succinctly represent large genomic data sets from reference genome and sequencing read set databases. These methods typically use De Bruijn graphs as the graph model or the underlying index model, with auxiliary ...

Working Paper

A comprehensive ML-based Respiratory Monitoring System for Physiological Monitoring & Resource Planning in the ICU

Hüser, Matthias; Lyu, Xinrui; Faltys, Martin; et al. (2024)

medRxiv

Respiratory failure (RF) is a frequent occurrence in critically ill patients and is associated with significant morbidity and mortality as well as resource use. To improve the monitoring and management of RF in intensive care unit (ICU) patients, we used machine learning to develop a monitoring system covering the entire management cycle of RF, from early detection and monitoring, to assessment of readiness for extubation and prediction ...

Working Paper
