
Open access
Author
Date
2021
Type
Bachelor Thesis
ETH Bibliography
yes
Abstract
The Graphics Processing Unit (GPU) is designed to manipulate large amounts of memory quickly.
Using its full capacity requires a deeper understanding of the underlying architecture.
This thesis presents a simple yet flexible Copy API for efficiently moving N-dimensional data fragments between memory spaces on a GPU.
We introduce different strategies for distributing fine-grained parallelism over a user-given workload.
These strategies are then benchmarked to show how widely their performance can vary.
Finally, we demonstrate the Copy API in several algebraic applications, highlighting the advantages of having access to simple and flexible data movement functions.
Permanent link
https://doi.org/10.3929/ethz-b-000490790
Publication status
published
Publisher
ETH Zurich
Subject
GPUs; Parallel computing; Data movement; CUDA
Organisational unit
03950 - Hoefler, Torsten / Hoefler, Torsten