JOURNAL ARTICLE

A Computing Efficient Hardware Architecture for Sparse Deep Neural Network Computing

Abstract

Convolutional Neural Networks (CNNs) have demonstrated significant performance in artificial intelligence (AI) systems. However, CNNs often require tens or even hundreds of neural layers with millions of parameters to achieve state-of-the-art performance, which hinders their deployment in resource-limited scenarios. Meanwhile, these parameters and the associated data are usually sparse, which leads to useless computation as well as unbalanced computation. To solve these problems, we propose a computing-efficient hardware architecture. To reduce computational redundancy, we filter out zero-valued weights and zero-valued feature maps. To reduce redundant memory consumption, we propose a memory division scheme and a data reuse mechanism. To resolve load imbalance, we implement a near-zero-cost scheduling switching strategy. Experimental results show that our architecture saves, on average, 22.6% of memory access time and 60.5% of computing time over a state-of-the-art NN accelerator.
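The abstract does not give implementation details of the architecture; the following is a minimal software sketch of the zero-skipping idea it describes: zero-valued weights are compressed out up front, and zero-valued feature-map entries are skipped at compute time, so only useful products are accumulated. The function names (compress_weights, sparse_dot) and the data layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def compress_weights(weights):
    """Drop zero-valued weights, keeping (index, value) pairs.

    Mimics the weight-filtering step: only nonzero weights are stored
    and later multiplied, so zero weights cost neither storage nor
    multiply cycles.
    """
    idx = np.flatnonzero(weights)
    return idx, weights[idx]

def sparse_dot(compressed_w, activations):
    """Zero-skipping dot product.

    Weights equal to zero were already filtered out; zero-valued
    activations (feature-map entries) are skipped here, so only
    nonzero-by-nonzero products are accumulated.
    """
    idx, vals = compressed_w
    acc = 0.0
    for i, w in zip(idx, vals):
        a = activations[i]
        if a != 0.0:  # skip zero-valued feature-map entries
            acc += w * a
    return acc

# Toy usage: sparse weight vector against a sparse activation vector.
w = np.array([0.0, 0.5, 0.0, -1.2, 0.0, 0.3])
x = np.array([1.0, 0.0, 2.0,  4.0, 0.0, 0.0])
print(sparse_dot(compress_weights(w), x))  # only the w[3]*x[3] product fires
```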

Keywords:
Computer science; Redundancy (engineering); Convolutional neural network; Artificial neural network; Architecture; Computer engineering; Reuse; Scheduling (production processes); Computer architecture; Distributed computing; Parallel computing; Embedded system; Computer hardware; Artificial intelligence; Operating system
