Abstract
C is the lingua franca of programming and almost any device can be programmed using C. However, programming modern heterogeneous architectures such as multi-core CPUs and GPUs requires explicitly expressing parallelism as well as device-specific properties such as memory hierarchies. The resulting code is often hard to understand, debug, and modify for different architectures. We propose to lift C programs to a parametric dataflow representation that lends itself to static data-centric analysis and enables automatic high-performance code generation. We separate writing code from optimizing for different hardware: simple, portable C source code is used to generate efficient specialized versions with a click of a button. Our approach can identify parallelism when no other compiler can, and outperforms a bespoke parallelized version of a scientific proxy application by up to 21%.
Publication status
published
Book title
ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing
Publisher
Association for Computing Machinery
Subject
Parallelism; Dataflow analysis; Automatic parallelization
Organisational unit
03950 - Hoefler, Torsten / Hoefler, Torsten
Funding
101034126 - Pilot using Independent Local & Open Technologies (EC)
101002047 - Productive Spatial Accelerator Programming (EC)
955776 - Network Solution for Exascale Architectures (EC)
955606 - DEEP- Software for Exascale Architectures (EC)