Show simple item record

dc.contributor.author: Carreras, Marco
dc.contributor.author: Deriu, Gianfranco
dc.contributor.author: Raffo, Luigi
dc.contributor.author: Benini, Luca
dc.contributor.author: Meloni, Paolo
dc.date.accessioned: 2020-10-13T15:27:43Z
dc.date.available: 2020-10-10T02:45:02Z
dc.date.available: 2020-10-13T15:27:43Z
dc.date.issued: 2020-09
dc.identifier.issn: 2156-3357
dc.identifier.other: 10.1109/JETCAS.2020.3014503 [en_US]
dc.identifier.uri: http://hdl.handle.net/20.500.11850/445393
dc.description.abstract: Convolutional Neural Networks (CNNs) are extensively used in a wide range of applications, commonly including computer vision tasks such as image and video classification, recognition and segmentation. Recent research results demonstrate that multi-layer (deep) networks involving mono-dimensional convolutions and dilation can be used effectively for time-series and sequence classification and segmentation, as well as for tasks involving sequence modeling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), represent an extremely promising alternative to the recurrent architectures commonly used across a broad range of sequence-modeling tasks. While FPGA-based inference accelerators for classic CNNs are widespread, the literature lacks a quantitative evaluation of their usability for inference on TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96% of the theoretical peak performance. Overall we achieve up to 111.8 GOPS/s and a power efficiency of 33.8 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10x speedup and 3x power-efficiency improvement with respect to a pure software implementation). [en_US]
dc.language.iso: en [en_US]
dc.publisher: Institute of Electrical and Electronics Engineers [en_US]
dc.subject: Field programmable gate arrays [en_US]
dc.subject: Task analysis [en_US]
dc.subject: Computer architecture [en_US]
dc.subject: Acceleration [en_US]
dc.subject: Kernel [en_US]
dc.subject: Quantization (signal) [en_US]
dc.subject: Neural networks [en_US]
dc.subject: Temporal convolutional network [en_US]
dc.subject: TCN [en_US]
dc.subject: hardware accelerator [en_US]
dc.subject: FPGA [en_US]
dc.subject: embedded systems [en_US]
dc.title: Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators [en_US]
dc.type: Journal Article
dc.date.published: 2020-08-05
ethz.journal.title: IEEE Journal on Emerging and Selected Topics in Circuits and Systems
ethz.journal.volume: 10 [en_US]
ethz.journal.issue: 3 [en_US]
ethz.journal.abbreviated: IEEE j. emerg. sel. top. circuits syst.
ethz.pages.start: 348 [en_US]
ethz.pages.end: 361 [en_US]
ethz.grant: software framework for runtime-Adaptive and secure deep Learning On Heterogeneous Architectures [en_US]
ethz.identifier.wos:
ethz.publication.place: New York, NY [en_US]
ethz.publication.status: published [en_US]
ethz.leitzahl: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02636 - Institut für Integrierte Systeme / Integrated Systems Laboratory::03996 - Benini, Luca / Benini, Luca
ethz.leitzahl.certified: ETH Zürich::00002 - ETH Zürich::00012 - Lehre und Forschung::00007 - Departemente::02140 - Dep. Inf.technologie und Elektrotechnik / Dep. of Inform.Technol. Electrical Eng.::02636 - Institut für Integrierte Systeme / Integrated Systems Laboratory::03996 - Benini, Luca / Benini, Luca
ethz.grant.agreementno: 780788
ethz.grant.fundername: EC
ethz.grant.funderDoi: 10.13039/501100000780
ethz.grant.program: H2020
ethz.date.deposited: 2020-10-10T02:45:12Z
ethz.source: WOS
ethz.eth: yes [en_US]
ethz.availability: Metadata only [en_US]
ethz.rosetta.installDate: 2020-10-13T15:27:56Z
ethz.rosetta.lastUpdated: 2020-10-13T15:27:56Z
ethz.rosetta.exportRequired: true
ethz.rosetta.versionExported: true
ethz.COinS: ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Optimizing%20Temporal%20Convolutional%20Network%20Inference%20on%20FPGA-Based%20Accelerators&rft.jtitle=IEEE%20Journal%20on%20Emerging%20and%20Selected%20Topics%20in%20Circuits%20and%20Systems&rft.date=2020-09&rft.volume=10&rft.issue=3&rft.spage=348&rft.epage=361&rft.issn=2156-3357&rft.au=Carreras,%20Marco&Deriu,%20Gianfranco&Raffo,%20Luigi&Benini,%20Luca&Meloni,%20Paolo&rft.genre=article&
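
The abstract above makes two claims that a small back-of-the-envelope calculation can illustrate: that operational intensity (operations per byte moved) limits TCN inference on an accelerator, and that batching several inputs over the same weights raises it. The Python sketch below is not taken from the paper; the layer shape (64 input and 64 output channels, kernel size 3, 32 output time steps) and the 8-bit data size are illustrative assumptions chosen only to show the trend.

def conv1d_operational_intensity(c_in, c_out, k, t_out, batch=1, bytes_per_elem=1):
    """Rough MAC and traffic estimate for one dilated 1D convolution layer.

    c_in, c_out    -- input/output channels
    k              -- kernel size (dilation changes reach, not the MAC count)
    t_out          -- output time steps per sequence
    batch          -- sequences processed per weight load
    bytes_per_elem -- datatype size; 1 assumes 8-bit quantization
    """
    macs = batch * t_out * c_out * c_in * k        # multiply-accumulate operations
    weights = c_out * c_in * k                     # fetched once, reused across the batch
    activations = batch * t_out * (c_in + c_out)   # inputs read plus outputs written (approx.)
    bytes_moved = (weights + activations) * bytes_per_elem
    return macs, bytes_moved, macs / bytes_moved

if __name__ == "__main__":
    # Hypothetical TCN layer; only the batch size varies.
    for batch in (1, 8, 32):
        macs, nbytes, oi = conv1d_operational_intensity(64, 64, 3, 32, batch=batch)
        print(f"batch={batch:2d}: {macs / 1e6:6.2f} MMAC, {nbytes / 1e3:7.1f} kB moved, "
              f"~{oi:5.1f} MAC/byte")

With these assumed numbers the intensity grows from roughly 24 to 88 MAC/byte as the batch widens, because weight traffic is amortized over more inputs; this is the qualitative effect that the batch-based convolution scheduling described in the abstract exploits. The 96% efficiency, 111.8 GOPS/s and 33.8 GOPS/s/W figures come from the paper's own accelerator and benchmarks, not from this estimate.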

Files in this item

There are no files associated with this item.
