Show simple item record

dc.contributor.author: Rutishauser, Georg
dc.contributor.author: Cavigelli, Lukas Arno Jakob
dc.contributor.author: Benini, Luca
dc.date.accessioned: 2022-08-31T09:07:03Z
dc.date.available: 2020-01-09T09:58:33Z
dc.date.available: 2020-01-09T10:02:06Z
dc.date.available: 2022-08-31T09:07:03Z
dc.date.issued: 2020-01
dc.identifier.uri: http://hdl.handle.net/20.500.11850/388819
dc.identifier.doi: 10.3929/ethz-b-000388819
dc.description.abstract: Specialized hardware architectures and dedicated accelerators allow the application of Deep Learning directly within sensing nodes. With compute resources highly optimized for energy efficiency, a large part of the power consumption of such devices is caused by transfers of intermediate feature maps to and from large memories. Moreover, a significant share of the silicon area is dedicated to these memories to avoid highly expensive off-chip memory accesses. Extended Bit-Plane Compression (EBPC), a recently proposed compression scheme targeting DNN feature maps, offers an opportunity to increase energy efficiency by reducing both the data transfer volume and the size of large background memories. Besides exhibiting state-of-the-art compression ratios, it also has a small, simple hardware implementation. In post-layout power simulations, we show an energy cost between 0.27 pJ/word and 0.45 pJ/word, three orders of magnitude lower than the cost of off-chip memory accesses. EBPC reduces off-chip access energy by factors of 2.2x for MobileNetV2 and 4x for VGG16, and can reduce on-chip access energy by up to 45%. We further propose a way to integrate the EBPC hardware blocks, which perform on-the-fly compression and decompression on 8-bit feature map streams, into an embedded ultra-low-power processing system, and show how the challenges arising from a variable-length compressed representation can be navigated in this context.
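The abstract describes on-the-fly compression of sparse 8-bit feature-map streams and a per-word codec energy cost far below that of off-chip memory accesses. As an illustrative sketch only (this is not the actual EBPC algorithm, which combines zero-run-length and bit-plane coding; the off-chip energy figure below is an assumption, with only the 0.45 pJ/word codec cost taken from the abstract), a minimal zero-run-length encoder shows how activation sparsity yields a compression ratio, and how that ratio translates into net access-energy savings:

```python
# Illustrative sketch only -- not the published EBPC scheme.
# A simple zero-run-length codec for an 8-bit feature-map stream,
# showing why ReLU-induced sparsity compresses well and how a cheap
# codec translates into off-chip access energy savings.

def zrle_encode(stream):
    """Encode bytes as (zero_run_length, literal) pairs.

    Each pair is a run of zeros (<= 255) followed by one non-zero
    literal; a trailing zero run is emitted as (run, None).
    """
    out, run = [], 0
    for b in stream:
        if b == 0 and run < 255:
            run += 1
        else:
            out.append((run, b))
            run = 0
    if run:
        out.append((run, None))
    return out

def zrle_decode(pairs):
    stream = []
    for run, lit in pairs:
        stream.extend([0] * run)
        if lit is not None:
            stream.append(lit)
    return stream

# Toy "feature map" with ReLU-style sparsity (~75% zeros).
fmap = [0, 0, 0, 17, 0, 42, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0]
enc = zrle_encode(fmap)
assert zrle_decode(enc) == fmap  # lossless round trip

# Each (run, literal) pair costs 2 bytes in this toy scheme.
compressed_bytes = 2 * len(enc)
ratio = len(fmap) / compressed_bytes

# Hypothetical energy model: compression pays off whenever the codec
# cost per word is far below the off-chip access cost per word.
E_OFFCHIP_PJ = 300.0  # ASSUMED off-chip access cost, pJ/word
E_CODEC_PJ = 0.45     # worst-case EBPC codec cost from the abstract
baseline = len(fmap) * E_OFFCHIP_PJ
with_codec = compressed_bytes * E_OFFCHIP_PJ + len(fmap) * E_CODEC_PJ
print(f"compression ratio: {ratio:.2f}x")
print(f"energy saving:     {baseline / with_codec:.2f}x")
```

Because the codec cost is three orders of magnitude below the off-chip access cost, the net saving tracks the compression ratio almost exactly, which is the core argument of the abstract.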
dc.format: application/pdf
dc.language.iso: en
dc.publisher: ETH Zurich
dc.rights.uri: http://rightsstatements.org/page/InC-NC/1.0/
dc.subject: Edge AI
dc.subject: Feature Map Compression
dc.subject: Deep Learning
dc.subject: Hardware Acceleration
dc.title: An On-the-Fly Feature Map Compression Engine for Background Memory Access Cost Reduction in DNN Inference
dc.type: Working Paper
dc.rights.license: In Copyright - Non-Commercial Use Permitted
dc.date.published: 2020-01-09
ethz.size: 15 p.
ethz.publication.place: Zurich
ethz.publication.status: published
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02636 - Institut für Integrierte Systeme / Integrated Systems Laboratory::03996 - Benini, Luca / Benini, Luca
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02636 - Institut für Integrierte Systeme / Integrated Systems Laboratory::03996 - Benini, Luca / Benini, Luca
ethz.date.deposited: 2020-01-09T09:58:41Z
ethz.source: FORM
ethz.eth: yes
ethz.availability: Open access
ethz.rosetta.installDate: 2020-01-09T10:02:16Z
ethz.rosetta.lastUpdated: 2023-02-07T05:53:19Z
ethz.rosetta.exportRequired: true
ethz.rosetta.versionExported: true
