Search
Results
-
AirLift: A Fast and Comprehensive Technique for Remapping Alignments between Reference Genomes
(2021)arXivAs genome sequencing tools and techniques improve, researchers are able to incrementally assemble more accurate reference genomes, which enable sensitivity in read mapping and downstream analysis such as variant calling. A more sensitive downstream analysis is critical for a better understanding of the genome donor (e.g., health characteristics). Therefore, read sets from sequenced samples should ideally be mapped to the latest available ...Working Paper -
TargetCall: Eliminating the Wasted Computation in Basecalling via Pre-Basecalling Filtering
(2022)arXivBasecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally-inefficient and memory-hungry; bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads ...Working Paper -
Packaging, containerization, and virtualization of computational omics methods: Advances, challenges, and opportunities
(2022)arXivOmics software tools have reshaped the landscape of modern biology and become an essential component of biomedical research. The increasing dependence of biomedical scientists on these powerful tools creates a need for easier installation and greater usability. Packaging, virtualization, and containerization are different approaches to satisfy this need by wrapping omics tools in additional software that makes the omics tools easier to ...Working Paper -
Utopia: Efficient Address Translation using Hybrid Virtual-to-Physical Address Mapping
(2022)arXivThe conventional virtual-to-physical address mapping scheme enables a virtual address to flexibly map to any physical address. This flexibility necessitates large data structures to store virtual-to-physical mappings, which incurs significantly high address translation latency and translation-induced interference in the memory hierarchy, especially in data-intensive workloads. Restricting the address mapping so that a virtual address can ...Working Paper -
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-Efficient Genome Analysis
(2022)arXivProfile hidden Markov models (pHMMs) are widely used in many bioinformatics applications to accurately identify similarities between biological sequences (e.g., DNA or protein sequences). PHMMs use a commonly-adopted and highly-accurate method, called the Baum-Welch algorithm, to calculate these similarities. However, the Baum-Welch algorithm is computationally expensive, and existing works provide either software- or hardware-only solutions ...Working Paper