SimReadUntil for Benchmarking Selective Sequencing Algorithms on ONT Devices
Open access
Date
2023-11-04
Type
- Working Paper
ETH Bibliography
yes
Abstract
Motivation: The Oxford Nanopore Technologies (ONT) ReadUntil API enables selective sequencing, which aims to reduce the time spent sequencing uninteresting reads in favor of more interesting ones, e.g., to deplete or enrich certain genomic regions. The performance gain depends on the selective sequencing decision-making algorithm (SSDA), which decides whether to reject a read, stop receiving a read, or wait for more data. Since real runs are time-consuming and costly (at scale), simulating the ONT device with support for the ReadUntil API is highly beneficial for comparing SSDAs and optimizing their parameters. Existing software like MinKNOW and UNCALLED only returns raw signal data, is memory-intensive, requires huge and often unavailable multi-fast5 files (≥ 100 GB), and is not clearly documented.
Results: We present the ONT device simulator SimReadUntil, which takes a set of full (real or simulated) reads as input, distributes them to channels, and plays them back in real time, including mux scans, channel gaps and blockages. It allows unblocking (rejecting) reads as well as stopping the receipt of data from them, imitating the ReadUntil API. Our modified ReadUntil API provides basecalled reads rather than the raw signal, which reduces computational load and keeps the focus on the SSDA rather than on basecalling. Tuning the parameters of tools like ReadFish and ReadBouncer becomes easier because a GPU is no longer required for basecalling. We offer various methods to extract simulation parameters from a sequencing summary file and compare them. SimReadUntil's gRPC interface allows standardized interaction with a wide range of programming languages.
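The interaction described in the abstract, where a channel plays back a full read chunk by chunk while an SSDA decides to reject it, stop receiving it, or wait, can be illustrated with a toy sketch. This is not the SimReadUntil or ReadUntil API: `ToyChannel`, `toy_ssda`, `run`, the target motif, and all parameters are hypothetical names chosen for illustration, and the sketch ignores real-time playback, mux scans, channel gaps, blockages, and the gRPC layer.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    UNBLOCK = "unblock"          # reject the read and free the pore
    STOP_RECEIVING = "stop"      # keep sequencing, but send no more data
    WAIT = "wait"                # ask for more data before deciding


@dataclass
class ToyChannel:
    """Hypothetical single channel that replays full reads in chunks."""
    reads: list                  # (read_id, sequence) pairs still to play
    chunk_size: int = 4          # bases emitted per step
    pos: int = 0                 # bases of the current read emitted so far
    stopped: bool = False        # set by stop_receiving() for the current read
    sequenced: dict = field(default_factory=dict)  # read_id -> bases sequenced

    def step(self):
        """Advance one chunk; return (read_id, prefix) or None if no data is sent."""
        if self.reads and self.pos >= len(self.reads[0][1]):
            self._next_read()    # the previous read finished naturally
        if not self.reads:
            return None
        read_id, seq = self.reads[0]
        self.pos = min(self.pos + self.chunk_size, len(seq))
        self.sequenced[read_id] = self.pos
        return None if self.stopped else (read_id, seq[:self.pos])

    def unblock(self):
        self._next_read()        # the rest of the read is never sequenced

    def stop_receiving(self):
        self.stopped = True      # the read continues, but silently

    def _next_read(self):
        self.reads.pop(0)
        self.pos = 0
        self.stopped = False


def toy_ssda(prefix, target="ACGT", min_bases=8):
    """Toy rule: accept on a motif hit, reject once enough bases were seen."""
    if target in prefix:
        return Decision.STOP_RECEIVING
    if len(prefix) >= min_bases:
        return Decision.UNBLOCK
    return Decision.WAIT


def run(channel, ssda):
    """Drive the channel until all reads are played or rejected."""
    while channel.reads:
        chunk = channel.step()
        if chunk is None:
            continue
        _, prefix = chunk
        decision = ssda(prefix)
        if decision is Decision.UNBLOCK:
            channel.unblock()
        elif decision is Decision.STOP_RECEIVING:
            channel.stop_receiving()
    return channel.sequenced
```

In this sketch, a read containing the motif is sequenced to its end once the SSDA stops receiving it, while a read without the motif is rejected after `min_bases` bases. Comparing the per-read sequenced lengths is a simple stand-in for the kind of SSDA comparison the simulator is meant to support.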
Permanent link
https://doi.org/10.3929/ethz-b-000653797
Publication status
published
Journal / series
bioRxiv
Publisher
Cold Spring Harbor Laboratory
Subject
Oxford nanopore technologies; Simulator; ReadUntil API; NanoSim
Organisational unit
09568 - Rätsch, Gunnar / Rätsch, Gunnar
Funding
200550 - Dynamic reference indexes for selective sequencing with application to diagnostics (SNF)
Related publications and datasets
Is supplemented by: https://github.com/ratschlab/sim_read_until