Abstract
We present a novel strategy to improve load balancing for large-scale Bayesian inference problems. Load imbalance can be particularly destructive in generation-based uncertainty quantification (UQ) methods, since all compute nodes in a large-scale allocation have to synchronize after every generation and therefore remain idle until the longest model evaluation finishes. Our strategy relies on the concurrent scheduling of independent Bayesian inference experiments that share a group of worker nodes, reducing the destructive effects of workload imbalance in population-based sampling methods.
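As a rough illustration of why sharing workers helps, the following toy sketch compares node usage when two experiments each run one generation back to back (with a barrier after each) against pooling both generations on the same workers. The durations, the worker count, and the greedy scheduler are assumptions made for this example; it is not the scheduler used in the paper and it ignores multi-generation dependencies.

```python
import heapq


def makespan(durations, workers):
    """Greedy list scheduling: each task goes to the worker that frees up first."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)


def efficiency(durations, workers, span):
    """Fraction of the available worker time spent on model evaluations."""
    return sum(durations) / (workers * span)


WORKERS = 4
# Hypothetical model-evaluation times for one generation of two experiments.
exp_a = [1.0, 1.0, 1.0, 4.0]
exp_b = [4.0, 1.0, 1.0, 1.0]

# Back to back: every worker waits for the slowest sample of each generation.
sequential_span = makespan(exp_a, WORKERS) + makespan(exp_b, WORKERS)
sequential_eff = efficiency(exp_a + exp_b, WORKERS, sequential_span)

# Concurrent: both generations share the same pool, so idle gaps get filled.
shared_span = makespan(exp_a + exp_b, WORKERS)
shared_eff = efficiency(exp_a + exp_b, WORKERS, shared_span)

print(f"sequential efficiency: {sequential_eff:.2f}")
print(f"shared-pool efficiency: {shared_eff:.2f}")
```

With these made-up numbers the shared pool finishes in 5 time units instead of 8, raising worker usage from roughly 0.44 to 0.70.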
To demonstrate the efficiency of our method, we infer parameters of a red blood cell (RBC) model. We perform a data-driven calibration of the RBC's membrane viscosity by applying hierarchical Bayesian inference methods. To this end, we employ a computational model to simulate the relaxation of an initially stretched RBC towards its equilibrium state. The results of this work advance the state of the art towards realistic blood flow simulations by providing inferred parameters for the RBC membrane viscosity.
We show that our strategy achieves a notable reduction in imbalance and significantly improves effective node usage on 512 nodes of the CSCS Piz Daint supercomputer. Our results show that, by enabling multiple independent sampling experiments to run concurrently on a given allocation of supercomputer nodes, our method sustains high computational efficiency in a large-scale supercomputing setting. © 2020 ACM.
Publication status
published
Book title
PASC '20: Proceedings of the Platform for Advanced Scientific Computing Conference
Publisher
ACM
Subject
Bayesian Inference; Uncertainty Quantification; Load Balancing; High Performance Computing; Erythrocyte Membrane Viscosity
Organisational unit
03499 - Koumoutsakos, Petros (former)
02803 - Collegium Helveticum
Funding
341117 - Fluid Mechanics in Collective Behaviour: Multiscale Modelling and Applications (EC)
Notes
Due to the coronavirus (COVID-19) pandemic, the PASC '20 Conference was postponed to 2021. The conference lecture was held on July 9, 2021.