dc.contributor.author
Jiang, Wenqi
dc.contributor.author
Li, Shigang
dc.contributor.author
Zhu, Yu
dc.contributor.author
de Fine Licht, Johannes
dc.contributor.author
He, Zhenhao
dc.contributor.author
Shi, Runbin
dc.contributor.author
Renggli, Cedric
dc.contributor.author
Zhang, Shuai
dc.contributor.author
Rekatsinas, Theodoros
dc.contributor.author
Hoefler, Torsten
dc.contributor.author
Alonso, Gustavo
dc.date.accessioned
2024-01-31T14:46:56Z
dc.date.available
2024-01-31T14:46:56Z
dc.date.issued
2023-11
dc.identifier.isbn
979-8-4007-0109-2
en_US
dc.identifier.other
10.1145/3581784.3607045
en_US
dc.identifier.uri
http://hdl.handle.net/20.500.11850/656938
dc.description.abstract
Vector search has emerged as the foundation for large-scale information retrieval and machine learning systems, with search engines like Google and Bing processing tens of thousands of queries per second on petabyte-scale document datasets by evaluating vector similarities between encoded query texts and web documents. As performance demands for vector search systems surge, accelerated hardware offers a promising solution in the post-Moore's Law era. We introduce FANNS, an end-to-end and scalable vector search framework on FPGAs. Given a user-provided recall requirement on a dataset and a hardware resource budget, FANNS automatically co-designs hardware and algorithm, subsequently generating the corresponding accelerator. The framework also supports scale-out by incorporating a hardware TCP/IP stack in the accelerator. FANNS attains up to 23.0× and 37.2× speedup compared to FPGA and CPU baselines, respectively, and demonstrates superior scalability to GPUs, achieving 5.5× and 7.6× speedup in median and 95th percentile (P95) latency within an eight-accelerator configuration. The remarkable performance of FANNS lays a robust groundwork for future FPGA integration in data centers and AI supercomputers.
en_US
dc.language.iso
en
en_US
dc.publisher
Association for Computing Machinery
en_US
dc.subject
Approximate nearest neighbor search
en_US
dc.subject
hardware acceleration
en_US
dc.subject
FPGA
en_US
dc.title
Co-design Hardware and Algorithm for Vector Search
en_US
dc.type
Conference Paper
dc.date.published
2023-11-11
ethz.book.title
SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
en_US
ethz.pages.start
87
en_US
ethz.size
15 p.
en_US
ethz.event
International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2023)
en_US
ethz.event.location
Denver, CO, USA
en_US
ethz.event.date
November 12-17, 2023
en_US
ethz.identifier.scopus
ethz.publication.place
New York, NY
en_US
ethz.publication.status
published
en_US
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02663 - Institut für Computing Platforms / Institute for Computing Platforms::03506 - Alonso, Gustavo / Alonso, Gustavo
ethz.leitzahl
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02666 - Institut für Hochleistungsrechnersysteme / Inst. f. High Performance Computing Syst::03950 - Hoefler, Torsten / Hoefler, Torsten
en_US
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02663 - Institut für Computing Platforms / Institute for Computing Platforms::03506 - Alonso, Gustavo / Alonso, Gustavo
ethz.leitzahl.certified
ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02150 - Dep. Informatik / Dep. of Computer Science::02666 - Institut für Hochleistungsrechnersysteme / Inst. f. High Performance Computing Syst::03950 - Hoefler, Torsten / Hoefler, Torsten
ethz.date.deposited
2023-12-19T10:42:30Z
ethz.source
SCOPUS
ethz.source
BATCH
ethz.eth
yes
en_US
ethz.availability
Metadata only
en_US
ethz.rosetta.installDate
2024-01-31T14:46:59Z
ethz.rosetta.lastUpdated
2024-01-31T14:46:59Z
ethz.rosetta.versionExported
true
dc.identifier.olduri
http://hdl.handle.net/20.500.11850/648577
dc.identifier.olduri
http://hdl.handle.net/20.500.11850/656521
ethz.COinS
ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Co-design%20Hardware%20and%20Algorithm%20for%20Vector%20Search&rft.date=2023-11&rft.spage=87&rft.au=Jiang,%20Wenqi&Li,%20Shigang&Zhu,%20Yu&de%20Fine%20Licht,%20Johannes&He,%20Zhenhao&rft.isbn=979-8-4007-0109-2&rft.genre=proceeding&rft_id=info:doi/10.1145/3581784.3607045&rft.btitle=SC%20'23:%20Proceedings%20of%20the%20International%20Conference%20for%20High%20Performance%20Computing,%20Networking,%20Storage%20and%20Analysis

Files in this item

There are no files associated with this item.
