JOURNAL ARTICLE

Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration

Abstract

Data sparsity plays an important role in accelerating deep neural networks (DNNs). However, beyond zero-valued data, the bit sparsity, especially in activations, is often left unexploited by conventional DNN accelerators. In this paper, we present a DNN accelerator that exploits bit sparsity by dynamically skipping the zero bits in activations. To this end, we first replace each multiply-and-accumulate (MAC) unit with multiple serial shift-and-accumulate units to sustain computing parallelism. Because the number and positions of zero bits vary randomly across activations, naive bit-serial processing is inefficient; we therefore propose activation grouping, in which activations within the same group process their non-zero bits across channels independently, and synchronization is required only between groups. We implement the proposed accelerator with 16 processing units (PUs), each containing 16 processing elements (PEs), on an FPGA built upon VTA (Versatile Tensor Accelerator), which integrates seamlessly with the TVM compilation stack. Evaluated on the convolutional layers of ResNet-18, our design achieves over 3.2x speedup on average compared with the baseline VTA design. For the whole network, it achieves over 2.26x speedup and over 2.0x improvement in area efficiency.
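
The abstract gives no code, so the following is a minimal Python sketch of the two ideas it describes: shift-and-accumulate over only the non-zero activation bits, and a group-level cost model in which lanes inside a group proceed independently and synchronize only at group boundaries. All function names and the cycle model here are illustrative assumptions, not the paper's actual hardware design.

    def nonzero_bit_positions(x):
        """Yield positions of set bits in a non-negative integer activation."""
        pos = 0
        while x:
            if x & 1:
                yield pos
            x >>= 1
            pos += 1

    def bit_serial_dot(activations, weights):
        """Dot product via shift-and-accumulate over non-zero activation bits.

        Each activation a contributes sum over its set bits b of (w << b), so
        the number of shift-add steps equals a's popcount, not the full bit
        width.
        """
        acc = 0
        steps = 0
        for a, w in zip(activations, weights):
            for b in nonzero_bit_positions(a):
                acc += w << b  # one shift-and-accumulate step
                steps += 1
        return acc, steps

    def group_cycles(activations, group_size):
        """Assumed cycle model for activation grouping: lanes in a group run
        over their own non-zero bits independently, the group finishes with
        its slowest lane, and synchronization happens only between groups."""
        total = 0
        for i in range(0, len(activations), group_size):
            group = activations[i:i + group_size]
            total += max(bin(a).count("1") for a in group)
        return total

    acts = [0b00010010, 0b10000000, 0b00000000, 0b01111111]  # 8-bit activations
    wts = [3, -5, 7, 2]
    result, steps = bit_serial_dot(acts, wts)
    assert result == sum(a * w for a, w in zip(acts, wts))
    print(f"dot = {result}, shift-add steps = {steps} (dense bit-serial: {8 * len(acts)})")
    print("cycles with group size 2:", group_cycles(acts, group_size=2))

With these inputs, zero-bit skipping needs 10 shift-add steps instead of the 32 a dense 8-bit bit-serial scheme would take, and grouping pays only the slowest lane per group (2 + 7 = 9 cycles here), which is why synchronization is deferred to group boundaries.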

Keywords:
Speedup; Computer science; Convolutional neural network; Field-programmable gate array; Synchronization; Acceleration; Hardware acceleration; Process (computing); Parallel computing; Artificial neural network; Computer hardware; Exploit; Artificial intelligence; Channel; Computer network

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.06
Refs: 13
Citation Normalized Percentile: 0.35

Topics

Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Parallel Computing and Optimization Techniques (Physical Sciences → Computer Science → Hardware and Architecture)
Tensor decomposition and applications (Physical Sciences → Mathematics → Computational Mathematics)

Related Documents

BOOK-CHAPTER

Deep Neural Network Acceleration Method Based on Sparsity

Ming He, Haiwu Zhao, Guozhong Wang, Yu Chen, Linlin Zhu, Yuan Gao

Book Series: Communications in Computer and Information Science  Year: 2019  Pages: 133-145
JOURNAL ARTICLE

Dynamic Regularization on Activation Sparsity for Neural Network Efficiency Improvement

Qing Yang, Jiachen Mao, Zuoguan Wang, Hai "Helen" Li

Journal: ACM Journal on Emerging Technologies in Computing Systems  Year: 2021  Vol: 17 (4)  Pages: 1-16
JOURNAL ARTICLE

A Precision-Scalable Deep Neural Network Accelerator With Activation Sparsity Exploitation

Wenjie Li, Aokun Hu, Ningyi Xu, Guanghui He

Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  Year: 2023  Vol: 43 (1)  Pages: 263-276