Late Breaking Results: FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization



Date

2022-07

Publication Type

Conference Paper

ETH Bibliography

yes


Abstract

Vision transformers (ViTs) have emerged with significantly improved accuracy on computer vision tasks. However, their complex architecture and enormous computation and storage demands create an urgent need for new hardware accelerator design methodologies. This work proposes an FPGA-aware automatic ViT acceleration framework based on a proposed mixed-scheme quantization. To the best of our knowledge, this is the first FPGA-based ViT acceleration framework exploring model quantization. Compared with state-of-the-art ViT quantization work (an algorithmic approach only, without hardware acceleration), our quantization achieves 0.31% to 1.25% higher Top-1 accuracy at the same bit-width. Compared with the 32-bit floating-point baseline FPGA accelerator, our accelerator improves the frame rate by around 5.6× (56.4 FPS vs. 10.0 FPS) with a 0.83% accuracy drop for DeiT-base.
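
The abstract does not define "mixed-scheme quantization". As a purely illustrative sketch, the Python below assumes it means applying two different weight quantizers, uniform fixed-point and power-of-two, to different rows of one weight matrix. The function names, the row-wise split, the `pot_ratio` parameter, and the 4-bit setting are all hypothetical and are not taken from the paper.

```python
import math

def quantize_fixed_point(row, bits=4):
    """Symmetric uniform quantization: snap each weight to a multiple of `scale`."""
    max_abs = max(abs(w) for w in row)
    if max_abs == 0:
        return list(row)
    levels = 2 ** (bits - 1) - 1          # e.g. 7 positive levels for 4 bits
    scale = max_abs / levels
    return [round(w / scale) * scale for w in row]

def quantize_power_of_two(row, bits=4):
    """Snap each weight to sign * max_abs * 2^(-k), or to zero if it is tiny."""
    max_abs = max(abs(w) for w in row)
    if max_abs == 0:
        return list(row)
    num_exponents = 2 ** (bits - 1) - 1   # k ranges over 0 .. num_exponents - 1
    out = []
    for w in row:
        x = abs(w) / max_abs              # normalized magnitude in [0, 1]
        if x < 2.0 ** (-num_exponents):   # too small for any level: map to zero
            out.append(0.0)
            continue
        k = min(num_exponents - 1, max(0, round(-math.log2(x))))
        out.append(math.copysign(max_abs * 2.0 ** (-k), w))
    return out

def quantize_mixed(weights, pot_ratio=0.5, bits=4):
    """Assumed mixing rule: the first `pot_ratio` share of rows uses power-of-two
    quantization, the remaining rows use fixed-point quantization."""
    n_pot = int(len(weights) * pot_ratio)
    return [
        quantize_power_of_two(row, bits) if i < n_pot else quantize_fixed_point(row, bits)
        for i, row in enumerate(weights)
    ]

if __name__ == "__main__":
    w = [[0.9, -0.3, 0.05, 0.0], [0.7, -0.2, 0.1, 0.0]]
    for row in quantize_mixed(w):
        print(row)
```

Under these assumptions, multiplying by a power-of-two weight reduces to a bit shift in hardware, while fixed-point weights need ordinary multipliers. Mixing the two would let a design use both kinds of FPGA resource, which is one plausible reading of "FPGA-aware" here.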

Publication status

published

Book title

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference

Pages / Article No.

1394 - 1395

Publisher

Association for Computing Machinery

Event

59th ACM/IEEE Design Automation Conference (DAC 2022)
