Efficient Convolutional Dataflows on Low-Power Neural Network Accelerators
METADATA ONLY
Loading...
Author / Producer
Date
2024-09
Publication Type
Journal Article
ETH Bibliography
yes
Citations
Altmetric
METADATA ONLY
Data
Rights / License
Abstract
Dilated and transposed convolutions are widely used in modern convolutional neural networks (CNNs). These kernels are used extensively during CNN training and inference of applications such as image segmentation and high-resolution image generation. We find that commonly-used low-power CNN inference accelerators are not optimized for both these convolutional kernels. Dilated and transposed convolutions introduce significant zero padding when mapped to the underlying spatial architecture, significantly degrading performance and energy efficiency. Existing approaches that address this issue require significant design changes to the otherwise simple, efficient, and well-adopted architectures used to compute direct convolutions. To address this challenge, we propose EcoFlow, a new set of dataflows and mapping algorithms for dilated and transposed convolutions. These algorithms are tailored to execute efficiently on existing low-cost, small-scale spatial architectures and requires minimal changes to existing accelerators. At its core, EcoFlow eliminates zero padding through careful dataflow orchestration and data mapping tailored to the spatial architecture. We evaluate EcoFlow on CNN training workloads and Generative Adversarial Network (GAN) workloads. Experiments in our new cycle-accurate simulator show that, using a common CNN inference accelerator, EcoFlow 1) reduces end-to-end CNN training time between 7-85%, and 2) improves end-to-end GAN training performance between 29-42%, compared to state-of-the-art CNN dataflows.
Permanent link
Publication status
published
External links
Editor
Book title
Journal / series
Volume
73 (9)
Pages / Article No.
2275 - 2289
Publisher
IEEE
Event
Edition / version
Methods
Software
Geographic location
Date collected
Date created
Subject
Convolutional neural network; Hardware accelerators
Organisational unit
09483 - Mutlu, Onur / Mutlu, Onur
Notes
Funding
Related publications and datasets
Is new version of: