Maximus: A Modular Accelerated Query Engine for Data Analytics on Heterogeneous Systems
Date
2025-06-17
Publication Type
Journal Article
ETH Bibliography
yes
Abstract
Several trends are changing the underlying fabric for data processing in fundamental ways. On the hardware side, machines are becoming heterogeneous with smart NICs, TPUs, DPUs, etc., and especially with GPUs taking a more dominant role. On the software side, the diversity in workloads, data sources, and data formats has given rise to the notion of composable data processing, where data is processed across a variety of engines and platforms. Finally, on the infrastructure side, different storage types, disaggregated storage, disaggregated memory, networking, and interconnects are all rapidly evolving, which demands a degree of customization to optimize data movement well beyond established techniques. To tackle these challenges, we present Maximus, a modular data processing engine that embraces heterogeneity from the ground up. Maximus can run queries on CPUs and GPUs, split execution between CPUs and GPUs, import and export data in a variety of formats, interact with a wide range of query engines through Substrait, and efficiently manage the execution of complex data processing pipelines. Through the concept of operator-level integration, Maximus can use operators from third-party engines and achieve even better performance with these operators than when they are used with their native engines. The current version of Maximus supports all TPC-H queries on both the GPU and the CPU and optimizes the data movement and kernel execution between them, enabling the overlap of communication and computation to achieve performance comparable to that of the best systems available, but with a far higher degree of completeness and flexibility.
Journal / series
Proceedings of the ACM on Management of Data
Volume
3 (3)
Organisational unit
03506 - Alonso, Gustavo / Alonso, Gustavo
