Abstract

This paper presents the design and FPGA implementation of a convolutional neural network accelerator (CNNA). Two kinds of sparsity, zero-valued weights and zero-valued input feature maps, are exploited to save power. The design features a hierarchical memory organization that reduces external memory accesses, and bandwidth compression and decompression are proposed to further reduce external memory bandwidth. A unified scratch memory can be reconfigured dynamically, layer by layer, to maximize memory utilization. The proposed CNNA is designed using Xilinx high-level synthesis (HLS) and implemented on the ZCU102 board. With a total of 2048 multiply-and-accumulate (MAC) units, the design delivers 1 TOPS of computing power when running at 250 MHz.
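The two power- and bandwidth-saving ideas named in the abstract, skipping zero operands in the MAC array and compressing runs of zeros before they cross the external memory interface, can be illustrated with a short software model. The sketch below is hypothetical: the function names and the simple (value, run-length) encoding are illustrative assumptions, not the paper's actual scheme. C++ is used since Xilinx HLS designs are written in C/C++.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Zero-skipping MAC: the multiply is performed only when both operands are
// nonzero, modelling how a hardware MAC can be gated on zero-valued weights
// or activations to save dynamic power.
int32_t sparse_mac(const std::vector<int8_t>& w, const std::vector<int8_t>& a) {
    int32_t acc = 0;
    const size_t n = std::min(w.size(), a.size());
    for (size_t i = 0; i < n; ++i)
        if (w[i] != 0 && a[i] != 0)  // skip both kinds of sparsity
            acc += int32_t(w[i]) * int32_t(a[i]);
    return acc;
}

// Toy bandwidth compression: runs of zeros become (0, run_length) pairs,
// nonzero values become (value, 1). A real scheme would pack bits tightly.
std::vector<std::pair<int8_t, uint8_t>> compress(const std::vector<int8_t>& x) {
    std::vector<std::pair<int8_t, uint8_t>> out;
    for (size_t i = 0; i < x.size();) {
        if (x[i] == 0) {
            uint8_t run = 0;
            while (i < x.size() && x[i] == 0 && run < 255) { ++run; ++i; }
            out.push_back({0, run});
        } else {
            out.push_back({x[i], 1});
            ++i;
        }
    }
    return out;
}

// Inverse transform, as would run on the read path from external memory.
std::vector<int8_t> decompress(const std::vector<std::pair<int8_t, uint8_t>>& c) {
    std::vector<int8_t> out;
    for (auto [v, run] : c)
        out.insert(out.end(), run, v);
    return out;
}
```

The payoff of such a scheme grows with sparsity: a mostly-zero feature map collapses into a handful of run-length pairs, while the zero-skipping MAC avoids the corresponding multiplies entirely.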

Keywords:
Field-programmable gate array, Convolutional neural network, Computer science, Artificial neural network, Embedded system, Computer architecture, Artificial intelligence

Metrics

Cited by: 7 · References: 4 · FWCI (Field-Weighted Citation Impact): 0.46 · Citation Normalized Percentile: 0.67

Topics

- CCD and CMOS Imaging Sensors (Physical Sciences → Engineering → Electrical and Electronic Engineering)
- Neural Networks and Applications (Physical Sciences → Computer Science → Artificial Intelligence)
- Advanced Memory and Neural Computing (Physical Sciences → Engineering → Electrical and Electronic Engineering)
