MARS: Processing-In-Memory Acceleration of Raw Signal Genome Analysis Inside the Storage Subsystem


Date

2025

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

Conventional genome analysis relies on translating the noisy raw electrical signals generated by DNA sequencing technologies into nucleotide bases (i.e., A, C, G, and T) through a computationally intensive process called basecalling. Raw signal genome analysis (RSGA) has emerged as a promising approach towards enabling real-time genome analysis by directly analyzing raw electrical signals without the need for basecalling. However, rapid advancements in sequencing technologies make it increasingly difficult for software-based RSGA to match the throughput of raw signal generation. Hardware-based RSGA acceleration has the potential to bridge the gap between software-based RSGA and sequencing throughput.
This paper demonstrates that while (i) conventional hardware acceleration techniques (e.g., specialized ASICs) in tandem with (ii) memory-centric approaches (e.g., Processing-In-Memory) can significantly accelerate RSGA, the high volume of genomic data shifts the performance and energy bottleneck from computation to I/O data movement. As sequencing throughput increases, I/O overhead becomes the dominant contributor to both runtime and energy consumption, limiting the scalability of both processor-centric and main-memory-centric accelerators. Therefore, there is a pressing need to design a high-performance, energy-efficient system for RSGA that can both alleviate the data movement bottleneck and provide large acceleration capabilities.
We propose MARS, a storage-centric system that leverages the heterogeneous resources available within modern storage systems (e.g., storage-internal DRAM, storage controller, flash chips) alongside their large storage capacity to tackle both data movement and computational overheads of RSGA in an area-efficient and low-cost manner. MARS accelerates RSGA through a novel hardware/software co-design approach using three major techniques. First, MARS modifies the RSGA pipeline via a previously unexplored combination of two filtering mechanisms and a quantization scheme, reducing hardware demands and optimizing for in-storage execution. Second, MARS accelerates the modified RSGA steps directly within the storage device by leveraging both Processing-Near-Memory and Processing-Using-Memory paradigms, tailored to the internal architecture of the storage system. Third, MARS orchestrates the execution of all steps via a streamlined control and data flow to fully exploit in-storage parallelism and minimize data movement. Our evaluation shows that MARS outperforms basecalling-based software and hardware-accelerated state-of-the-art read mapping pipelines by 93× and 40×, respectively, on average across different datasets, while reducing their energy consumption by 427× and 72×. MARS improves the performance of the state-of-the-art RSGA-based read mapping pipeline by 28× while reducing its energy consumption by 180× on average across different datasets.

Publication status

published

Book title

ICS '25: Proceedings of the 39th ACM International Conference on Supercomputing

Pages / Article No.

513 - 534

Publisher

Association for Computing Machinery

Event

39th ACM International Conference on Supercomputing (ICS 2025)

Subject

Processing-in-memory; Genome analysis; In-storage processing; Processing-near-memory

Organisational unit

09483 - Mutlu, Onur / Mutlu, Onur
