Full-System Evaluation of the sPIN In-Network-Compute Architecture
dc.contributor.author
Xu, Pengcheng
dc.contributor.supervisor
Hoefler, Torsten
dc.contributor.supervisor
Khalilov, Mikhail
dc.contributor.supervisor
Schneider, Timo
dc.date.accessioned
2023-10-20T10:11:26Z
dc.date.available
2023-10-20T08:00:14Z
dc.date.available
2023-10-20T10:11:26Z
dc.date.issued
2023-09
dc.identifier.uri
http://hdl.handle.net/20.500.11850/637676
dc.identifier.doi
10.3929/ethz-b-000637676
dc.description.abstract
In-network computing with SmartNICs is gaining popularity in high-performance networking because SmartNICs can offload packet processing tasks from the CPU and offer a latency advantage: they sit close to the network traffic, which does not have to cross PCIe. The sPIN in-network-computing paradigm developed at ETH Zürich aims to provide a programming model for developers to build high-performance packet processing routines for on-path SmartNICs. While the paradigm has been evaluated with use cases from diverse scenarios in software and hardware simulation, it has yet to see a full end-to-end, system-level evaluation that exercises the entire packet processing loop on real hardware. In this thesis, we perform an end-to-end analysis of the sPIN paradigm by building a full-system prototype of sPIN on FPGA based on PsPIN, a cycle-accurate simulation prototype of sPIN, and Corundum, an open-source FPGA-based Ethernet NIC. We show that the resulting system, FPsPIN, facilitates the development and testing of sPIN handlers, allowing real-world performance and computation/communication overlap evaluations that were not possible with the earlier cycle-accurate simulation models because of their slow simulation speed and the absence of a host CPU. We present several suggested improvements to the sPIN specification, discovered in the process of building FPsPIN. In addition, we conduct a detailed performance evaluation of FPsPIN through three benchmarks implemented for the platform, showing a 50 µs latency advantage, over 99% computation/communication overlap, and throughput of 6.4 Gbps and 1.2 Gbps in simple and complex synthetic benchmarks, respectively. The lower application throughput exposes the deficiency of the packet processing cores used in FPsPIN and points to an opportunity for future research on desirable architectural features for SmartNIC cores.
en_US
dc.format
application/pdf
en_US
dc.language.iso
en
en_US
dc.publisher
ETH Zurich
en_US
dc.rights.uri
http://rightsstatements.org/page/InC-NC/1.0/
dc.title
Full-System Evaluation of the sPIN In-Network-Compute Architecture
en_US
dc.type
Master Thesis
dc.rights.license
In Copyright - Non-Commercial Use Permitted
ethz.size
91 p.
en_US
ethz.publication.place
Zurich
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02666 - Institut für Hochleistungsrechnersysteme / Inst. f. High Performance Computing Syst::03950 - Hoefler, Torsten / Hoefler, Torsten
en_US
ethz.date.deposited
2023-10-20T08:00:14Z
ethz.source
FORM
ethz.eth
yes
en_US
ethz.availability
Open access
en_US
ethz.rosetta.installDate
2023-10-20T10:11:27Z
ethz.rosetta.lastUpdated
2024-02-03T05:27:28Z
ethz.rosetta.versionExported
true