Stella Nera: A Differentiable Maddness-Based Hardware Accelerator for Efficient Approximate Matrix Multiplication



Date

2025

Publication Type

Conference Paper

ETH Bibliography

yes



Abstract

Artificial intelligence has surged in recent years, with advances in machine learning rapidly impacting nearly every area of life. However, the growing complexity of these models has far outpaced advances in available hardware accelerators, leading to significant computational and energy demands, dominated by matrix multiplications. MADDNESS (Multiply-ADDitioN-lESS) is a hash-based variant of product quantization that renders matrix multiplications into lookups and additions, eliminating the need for multipliers entirely. We present STELLA NERA, the first MADDNESS-based accelerator, achieving an energy efficiency of 161 TOp/s/W at 0.55 V, 25x better than conventional MatMul accelerators thanks to its small components and reduced computational complexity. We further enhance MADDNESS with a differentiable approximation that enables gradient-based fine-tuning, achieving an end-to-end 92.5% Top-1 accuracy on CIFAR-10.
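The core idea from the abstract — replacing a matrix multiplication with per-subspace prototype lookups and additions — can be sketched as follows. This is a minimal product-quantization illustration, not the paper's actual MADDNESS pipeline: all shapes are hypothetical, prototypes are simply sampled from the data (MADDNESS learns hash trees instead of computing distances), and the encoding step here still uses distances for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: A (N x D) activations, B (D x M) weights.
N, D, M = 64, 32, 16
C, K = 4, 16            # C codebooks over D/C-dim subspaces, K prototypes each
A = rng.standard_normal((N, D))
B = rng.standard_normal((D, M))
d = D // C

# Offline: pick K prototypes per subspace (k-means in real product
# quantization; plain row sampling here for brevity) and precompute
# lookup tables of partial dot products between each prototype and B.
protos = np.stack(
    [A[rng.choice(N, K, replace=False), c * d:(c + 1) * d] for c in range(C)]
)                                                        # (C, K, d)
luts = np.einsum('ckd,cdm->ckm', protos, B.reshape(C, d, M))  # (C, K, M)

def approx_matmul(A):
    """Approximate A @ B via table lookups and additions only."""
    out = np.zeros((A.shape[0], M))
    for c in range(C):
        sub = A[:, c * d:(c + 1) * d]
        # Encode: nearest prototype per row (MADDNESS replaces this
        # distance computation with cheap learned hash functions).
        dists = ((sub[:, None, :] - protos[c][None]) ** 2).sum(-1)
        idx = dists.argmin(1)
        out += luts[c][idx]          # lookup + add, no multiplies online
    return out

# Relative error of the approximation versus the exact product.
err = np.linalg.norm(approx_matmul(A) - A @ B) / np.linalg.norm(A @ B)
```

The hardware win comes from the online path: once the tables are built, each output element is a sum of C table reads, so the accelerator needs only small LUT memories and adders rather than multipliers.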


Book title

2025 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)


Pages / Article No.

11130225

Publisher

IEEE

Event

28th IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2025)


Subject

Hardware acceleration; Approximate MatMul; AI

Organisational unit

03996 - Benini, Luca
