Open access
Author
Date
2023-10-14
Type
Master Thesis
ETH Bibliography
yes
Abstract
The spatial join is one of the most expensive operations in modern spatial database systems. Over the years, a number of indices, join algorithms, and other approaches have been proposed, yet no clearly optimal strategy for performing the join has been found. Recent research has focused heavily on system-level optimisations and hardware acceleration to speed up the spatial join.
In this work we set out to better understand the spatial join landscape, in order to best support current research into the spatial join carried out at the ETH Systems Group. We benchmark a number of existing open-source CPU- and GPU-based systems, examining their design decisions and tradeoffs. We also implement our own single- and multi-threaded versions of two well-known spatial algorithms that have been shown to be the best performing in their class, namely R-tree Synchronous Traversal and Partition Based Sweep Merge. We then benchmark all these systems on datasets consisting of millions of entries, including both synthetic and real-world data. Ultimately, we find that our multi-threaded Synchronous Traversal implementation is the best-performing algorithm.
Permanent link
https://doi.org/10.3929/ethz-b-000637149
Publication status
published
Publisher
ETH Zurich
Subject
database; Geospatial; computer architecture
Organisational unit
03506 - Alonso, Gustavo / Alonso, Gustavo