Fault-Tolerant Distributed Directories
OPEN ACCESS
Loading...
Author / Producer
Date
2024
Publication Type
Conference Paper
ETH Bibliography
yes
Citations
Altmetric
OPEN ACCESS
Data
Rights / License
Abstract
Many fundamental distributed computing problems require coordinated access to a shared resource. A distributed directory is an overlay data structure on an asynchronous graph G that helps to access a shared token t. The directory supports three basic operations: publish, to initialize the directory, lookup, to read the contents of the token, and move, to get exclusive update access to the token. There are known directory schemes that achieve message complexity within polylog factors of the optimal cost with respect to the number of nodes n and the diameter D of G. Motivated by fault-tolerant distributed computing implementations, we consider the impact of edge failures on distributed directories. We give a distributed directory overlay data structure that can tolerate edge failures without disrupting the directory operations. The directory can be repaired concurrently while it processes directory operations. We analyze the impact of the faults on the amortized cost of the three directory operations compared to the optimal cost. We show that f edges failures increase the amortized competitive ratio of the operations by at most factor f. We also analyze the message complexity to repair the overlay structure, in terms of the number of messages that are sent and the maximum distance a message traverses. For an edge failure, the repair mechanism uses messages of size O(log n) that traverse distance at most D′, the graph diameter after the fault. To our knowledge, this is the first asymptotic analysis of a fault-tolerant distributed directory.
Permanent link
Publication status
published
External links
Book title
3rd Symposium on Algorithmic Foundations of Dynamic Networks (SAND 2024)
Volume
292
Pages / Article No.
5
Publisher
Schloss Dagstuhl – Leibniz-Zentrum für Informatik
Event
3rd Symposium on Algorithmic Foundations of Dynamic Networks (SAND 2024)
Edition / version
Methods
Software
Geographic location
Date collected
Date created
Subject
distributed directory; sparse partition; fault tolerance; message complexity; path dilation
Organisational unit
03604 - Wattenhofer, Roger / Wattenhofer, Roger