Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration
METADATA ONLY
Date
2022-03-01
Publication Type
Journal Article
ETH Bibliography
yes
Abstract
Parallelism patterns (e.g., map or reduce) have proven to be effective tools for parallelizing high-performance applications. In this article, we study the recursive registration of a series of electron microscopy images, a time-consuming and imbalanced computation necessary for nanoscale microscopy analysis. We show that by translating the image registration into a specific instance of the prefix scan, we can convert this seemingly sequential problem into a parallel computation that scales to over a thousand cores. We analyze a variety of scan algorithms that behave similarly for common low-compute operators and propose a novel work-stealing procedure for a hierarchical prefix scan. Our evaluation shows that by identifying a suitable and well-optimized prefix scan algorithm, we reduce time-to-solution on a series of 4,096 images spanning ten seconds of microscopy acquisition from over 10 hours to less than 3 minutes (using 1,024 Intel Haswell cores), enabling derivation of material properties at the nanoscale for long microscopy image series.
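The abstract's central idea, that a chain of pairwise results can be combined by a prefix scan once the combining operator is associative, can be illustrated with a small sketch. Everything below is an assumption made for illustration and not the article's implementation: the "deformation" is a toy 1D affine map stored as a pair (a, b), and the names compose, sequential_scan, and blocked_scan are hypothetical. The blocked version runs a local scan per block, scans the block totals, and then applies the carried prefix, which is one simple form of hierarchical scan; the work-stealing procedure the article proposes is not shown.

```python
from itertools import accumulate


def compose(f, g):
    """Compose two toy affine maps x -> a*x + b, applying f first, then g.

    Composition is associative but not commutative, which is all a
    prefix scan requires of its operator.
    """
    fa, fb = f
    ga, gb = g
    return (ga * fa, ga * fb + gb)


def sequential_scan(items, op):
    """Inclusive scan: result[i] = items[0] op items[1] op ... op items[i]."""
    return list(accumulate(items, op))


def blocked_scan(items, op, num_blocks):
    """Inclusive scan in three phases over contiguous blocks.

    Phases 1 and 3 are independent per block and could run in parallel;
    here they run serially to keep the sketch short.
    """
    n = len(items)
    if n == 0:
        return []
    size = -(-n // num_blocks)  # ceiling division
    blocks = [items[i:i + size] for i in range(0, n, size)]

    # Phase 1: local inclusive scan inside every block.
    local = [list(accumulate(block, op)) for block in blocks]

    # Phase 2: inclusive scan over the last (total) element of each block.
    prefixes = list(accumulate((scanned[-1] for scanned in local), op))

    # Phase 3: prepend the carried prefix of all preceding blocks.
    result = list(local[0])
    for k in range(1, len(local)):
        carry = prefixes[k - 1]
        result.extend(op(carry, value) for value in local[k])
    return result


# Toy stand-ins for pairwise registrations between neighbouring images.
pairwise = [(2, 1), (3, -1), (1, 4), (-1, 2), (5, 0), (2, 3), (1, 1)]
assert blocked_scan(pairwise, compose, 3) == sequential_scan(pairwise, compose)
```

In this sketch the operator is trivially cheap and uniform. The abstract describes the registration operator as time-consuming and imbalanced, which is the situation in which a static split into equal blocks leaves some workers idle and a work-stealing scheme becomes attractive.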
Publication status
published
Volume
33 (3)
Pages / Article No.
523 - 535
Publisher
IEEE
Organisational unit
03950 - Hoefler, Torsten / Hoefler, Torsten
Funding
170415 - Automatic Performance Modeling of HPC Applications with Multiple Model Parameters (SNF)