EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference


METADATA ONLY

Date

2020

Publication Type

Conference Paper

ETH Bibliography

yes

Abstract

This paper presents EdgeDRNN, a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator designed for portable edge computing. EdgeDRNN adopts the spiking-neural-network-inspired delta network algorithm to exploit temporal sparsity in RNNs, which reduces off-chip memory access by a factor of up to 10x with tolerable accuracy loss. Experimental results on a 10-million-parameter, 2-layer GRU-RNN with weights stored in DRAM show that EdgeDRNN computes the network in under 0.5 ms. With 2.42 W wall-plug power on an entry-level, USB-powered FPGA board, it achieves latency comparable to that of a 92 W NVIDIA 1080 GPU. It outperforms the NVIDIA Jetson Nano, Jetson TX2, and Intel Neural Compute Stick 2 in latency by 6x. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall-plug power efficiency that is over 4x higher than that of all other platforms.
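
The abstract names the delta network algorithm but does not spell it out. The following is a minimal sketch of the general idea only, not the EdgeDRNN hardware implementation: a matrix-vector product is updated incrementally, and a weight column is fetched only when its input has changed by more than a threshold since it was last propagated. The function name `delta_matvec_step`, the threshold `theta`, and the list-based layout are assumptions made for illustration.

```python
def delta_matvec_step(weights, x, x_ref, acc, theta):
    """One time step of a delta-encoded matrix-vector product (illustrative).

    weights: list of rows, shape (m, n)
    x:       current input vector, length n
    x_ref:   last propagated value of each input, length n
    acc:     running result, equal to weights @ x_ref, length m
    theta:   delta threshold (assumed name); changes at or below it are skipped
    Returns the updated (acc, x_ref) and the number of weight columns fetched.
    """
    new_acc = list(acc)
    new_ref = list(x_ref)
    fetched = 0
    for j, (xj, rj) in enumerate(zip(x, x_ref)):
        delta = xj - rj
        if abs(delta) > theta:
            # Only now is column j of the weight matrix needed.
            fetched += 1
            new_ref[j] = xj
            for i, row in enumerate(weights):
                new_acc[i] += row[j] * delta
    return new_acc, new_ref, fetched
```

With `theta = 0` the accumulator tracks the exact product. A larger threshold skips more columns, so fewer weights are read from memory, at the cost of an approximation error in the accumulated result. The sparsity this yields depends on how slowly the inputs change over time.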

Publication status

published

Book title

2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)

Pages / Article No.

41 - 45

Publisher

IEEE

Event

2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2020) (virtual)

Subject

Edge computing; FPGA; Embedded system; Deep learning; RNN; GRU; Delta network

Organisational unit

02533 - Institut für Neuroinformatik / Institute of Neuroinformatics

Notes

The conference was postponed due to the coronavirus (COVID-19) pandemic and was conducted virtually.
