Note
This is not the latest version of this item. The latest version is available at: https://www.research-collection.ethz.ch/handle/20.500.11850/284907
Open access
Author
Date
2017-09-27
Type
- Master Thesis
ETH Bibliography
yes
Abstract
Distributed joins over a network have been researched for decades, usually with a focus on adapting the join to the network that connects the nodes holding the relations. Most research has gone into optimizing the join itself, i.e. the identification of matching tuples; however, the effective materialization of the join result is equally important. The main performance issue identified for materialization strategies is that the network performs significantly worse than the local processing nodes, i.e. the transfer speed between nodes is the limiting factor. The conclusion drawn from this is that a materialization approach should reduce the amount of transmitted data by spending CPU time on creating optimal transfer schedules. In this thesis, we explore how this materialization approach changes when a high-performance network is considered. We propose a late-materialization approach with two different strategies for the exchange of data. We focus on optimizing CPU time and interleave communication and computation during the data exchange. We then perform experiments over a wide range of parameters. The results show that, despite the interleaving of communication and computation, the implementation is network bound. We therefore conclude that the data transfer has to be optimized even in high-performance networks.
Persistent link
https://doi.org/10.3929/ethz-b-000284907
Publication status
published
Volume
Publisher
Systems Group, Department of Computer Science, ETH Zurich
Organisational unit
03506 - Alonso, Gustavo / Alonso, Gustavo