A Framework To Fast Reroute Traffic Upon Remote Outages
dc.contributor.author
Holterbach, Thomas
dc.contributor.supervisor
Vanbever, Laurent
dc.contributor.supervisor
Perrig, Adrian
dc.contributor.supervisor
Claffy, Kimberly
dc.date.accessioned
2021-12-21T05:35:50Z
dc.date.available
2021-12-20T23:01:23Z
dc.date.available
2021-12-21T05:35:50Z
dc.date.issued
2021-12-03
dc.identifier.uri
http://hdl.handle.net/20.500.11850/521609
dc.identifier.doi
10.3929/ethz-b-000521609
dc.description.abstract
Nowadays, so many services – including critical ones – rely on the Internet to work that even a few minutes of connectivity disruption make customers unhappy and cause sizeable financial loss for companies. Ensuring that customers are always connected to the Internet is thus a top priority for Internet service providers. However, this is harder than one may think because the Internet is often subject to network outages.
Network outages are a headache for network operators because they are unpredictable, can occur in any of the 70,000 independently operated networks composing the Internet, and can affect users’ connectivity network-wide. Far too often, the only way to restore connectivity upon an outage is to wait until (i) BGP, the glue of the Internet, converges; and (ii) the routers update their forwarding decisions accordingly. Unfortunately, these two processes work on a per-destination basis and are thus inherently slow given the ever-increasing number of destinations in the Internet. It is therefore no surprise that network operators still experience minutes of downtime upon outages.
In this dissertation, we tackle the problem of fast connectivity recovery upon outages occurring in remote networks, without requiring network operators to change the standards, manufacture new devices, or cooperate with each other. The final result of our work is Snap, a framework that network operators can deploy on their routers to quickly detect outages and reroute traffic to working alternative paths that comply with the configured routing policies. Snap’s design follows a two-step recipe. First, it uses an outage inference algorithm that builds on new fundamental results and analyzes the fast data-plane signals instead of waiting for the slow control-plane (BGP) notifications. Second, it uses a rerouting scheme that allows routers to quickly reroute all the affected traffic to alternative paths circumventing the outage.
Snap’s design takes advantage of the recent advances in network programmability and relies on a hardware-software codesign. To be fast, Snap collects data-plane signals at line rate using programmable switches (e.g., Tofino). The switches then mirror the signals to a controller, which accurately infers remote outages and triggers traffic rerouting. We implemented Snap in P4_16 and Python and show its effectiveness in many real-world situations. Our results indicate that Snap can restore connectivity within a few seconds, which is much faster than the few minutes often needed by traditional routers.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.subject
Fast Reroute
en_US
dc.subject
Internet Routing
en_US
dc.title
A Framework To Fast Reroute Traffic Upon Remote Outages
en_US
dc.type
Doctoral Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
dc.date.published
2021-12-21
ethz.journal.title
TIK Schriftenreihe
ethz.journal.volume
191
en_US
ethz.size
165 p.
en_US
ethz.code.ddc
DDC - DDC::0 - Computer science, information & general works::004 - Data processing, computer science
en_US
ethz.identifier.diss
27857
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02640 - Inst. f. Technische Informatik und Komm. / Computer Eng. and Networks Lab.::09477 - Vanbever, Laurent / Vanbever, Laurent
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02660 - Institut für Informationssicherheit / Institute of Information Security::03975 - Perrig, Adrian / Perrig, Adrian
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02660 - Institut für Informationssicherheit / Institute of Information Security::03975 - Perrig, Adrian / Perrig, Adrian
en_US
ethz.date.deposited
2021-12-20T23:01:28Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2021-12-21T05:35:57Z
ethz.rosetta.lastUpdated
2022-03-29T16:49:03Z
ethz.rosetta.versionExported
true
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=A%20Framework%20To%20Fast%20Reroute%20Traffic%20Upon%20Remote%20Outages&rft.jtitle=TIK%20Schriftenreihe&rft.date=2021-12-03&rft.volume=191&rft.au=Holterbach,%20Thomas&rft.genre=unknown&