Prediction Horizon vs. Efficiency of Optimal Dynamic Thermal Control Policies in HPC Nodes
- Conference Paper
Rights / licenseIn Copyright - Non-Commercial Use Permitted
We are entering the era of thermally-bound computing: Advanced and costly cooling solutions are needed to sustain the high computing densities of high-performance computing equipment. To reduce cooling costs and cooling overprovisioning, dynamic thermal management (DTM) strategies aim at controlling the device temperature by modulating online the performance of processing elements. While operating systems allow the migration of threads between cores, in HPC systems the threads of parallel applications are pinned to the allocated cores at start-time to avoid job-migration overheads. In this scenario state-of-the-art DTM solutions, which use thermal models to map jobs to cores, are based on long-term predictions to map the most critical job to the coldest core. Instead, turbo-mode and DVFS controllers are based on short-term predictions to squeeze the thermal capacitance allowing for short period performance boosts which are thermally unsustainable. In this work we propose an integer-linear programming formulation and a fast solver for controlling, at the same time, the job mapping and cores frequency selections in HPC nodes, tested with real supercomputer workload. Our approach can be integrated with the MPI runtimes and OpenMP libraries and is capable of assigning high-performance cores to performance-critical threads. We show that by combining long and short term predictions with information of the programming model we can significantly improve the performance of final application w.r.t. state-of-the-art DTM solutions Show more
Book title2017 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC)
Pages / Article No.
PublisherIEEE; Curran Associates.
Organisational unit03996 - Benini, Luca / Benini, Luca
MoreShow all metadata