JOURNAL ARTICLE

Exploiting Dynamic Bit Sparsity in Activation for Deep Neural Network Acceleration

Abstract

Data sparsity plays an important role in accelerating deep neural networks (DNNs). However, beyond zero-valued data, the bit sparsity, especially in activations, is often left unexploited by conventional DNN accelerators. In this paper, we present a DNN accelerator that exploits bit sparsity by dynamically skipping the zero bits in activations. To this end, we first replace each multiply-and-accumulate (MAC) unit with multiple serial shift-and-accumulate units to sustain computing parallelism. Because the number and positions of zero bits vary randomly across activations, naive bit-serial processing is inefficient; we therefore propose activation grouping, in which activations within the same group process their non-zero bits across channels independently, and synchronization is required only between groups. We implement the proposed accelerator with 16 processing units (PUs), each containing 16 processing elements (PEs), on an FPGA built upon VTA (Versatile Tensor Accelerator), which integrates seamlessly with the TVM compilation stack. Evaluated on the convolutional layers of ResNet-18, our design achieves over 3.2x speedup on average compared with the baseline VTA design. For the whole network, it achieves over 2.26x speedup and over 2.0x improvement in area efficiency.
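
The abstract gives no code, so the following is a minimal Python sketch of the two ideas it describes: shift-and-accumulate over only the non-zero activation bits, and a group-level cost model in which lanes inside a group proceed independently and synchronize only at group boundaries. All function names and the cycle model here are illustrative assumptions, not the paper's actual hardware design.

    def nonzero_bit_positions(x):
        """Yield positions of set bits in a non-negative integer activation."""
        pos = 0
        while x:
            if x & 1:
                yield pos
            x >>= 1
            pos += 1

    def bit_serial_dot(activations, weights):
        """Dot product via shift-and-accumulate over non-zero activation bits.

        Each activation a contributes sum over its set bits b of (w << b), so
        the number of shift-add steps equals a's popcount, not the full bit
        width.
        """
        acc = 0
        steps = 0
        for a, w in zip(activations, weights):
            for b in nonzero_bit_positions(a):
                acc += w << b  # one shift-and-accumulate step
                steps += 1
        return acc, steps

    def group_cycles(activations, group_size):
        """Assumed cycle model for activation grouping: lanes in a group run
        over their own non-zero bits independently, the group finishes with
        its slowest lane, and synchronization happens only between groups."""
        total = 0
        for i in range(0, len(activations), group_size):
            group = activations[i:i + group_size]
            total += max(bin(a).count("1") for a in group)
        return total

    acts = [0b00010010, 0b10000000, 0b00000000, 0b01111111]  # 8-bit activations
    wts = [3, -5, 7, 2]
    result, steps = bit_serial_dot(acts, wts)
    assert result == sum(a * w for a, w in zip(acts, wts))
    print(f"dot = {result}, shift-add steps = {steps} (dense bit-serial: {8 * len(acts)})")
    print("cycles with group size 2:", group_cycles(acts, group_size=2))

With these inputs, zero-bit skipping needs 10 shift-add steps instead of the 32 a dense 8-bit bit-serial scheme would take, and grouping pays only the slowest lane per group (2 + 7 = 9 cycles here), which is why synchronization is deferred to group boundaries.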

Keywords:
Speedup; Computer science; Convolutional neural network; Field-programmable gate array; Synchronization; Acceleration; Hardware acceleration; Process (computing); Parallel computing; Artificial neural network; Computer hardware; Exploit; Artificial intelligence; Channel; Computer network

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.06
Refs: 13
Citation Normalized Percentile: 0.35

Topics

Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Parallel Computing and Optimization Techniques (Physical Sciences → Computer Science → Hardware and Architecture)
Tensor decomposition and applications (Physical Sciences → Mathematics → Computational Mathematics)

Related Documents

BOOK-CHAPTER

Deep Neural Network Acceleration Method Based on Sparsity

Ming He, Haiwu Zhao, Guozhong Wang, Yu Chen, Linlin Zhu, Yuan Gao

Book Series: Communications in Computer and Information Science  Year: 2019  Pages: 133-145
JOURNAL ARTICLE

Dynamic Regularization on Activation Sparsity for Neural Network Efficiency Improvement

Qing Yang, Jiachen Mao, Zuoguan Wang, Hai "Helen" Li

Journal: ACM Journal on Emerging Technologies in Computing Systems  Year: 2021  Vol: 17 (4)  Pages: 1-16
JOURNAL ARTICLE

A Precision-Scalable Deep Neural Network Accelerator With Activation Sparsity Exploitation

Wenjie Li, Aokun Hu, Ningyi Xu, Guanghui He

Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems  Year: 2023  Vol: 43 (1)  Pages: 263-276