Abstract

This paper presents a barrel RISC-V processor designed to control a deep neural network accelerator. Our design has a 5-stage pipelined datapath with 8 hardware threads (harts). The threads are executed under a strict round-robin scheduler, and each thread is responsible for providing data and control signals to a neural network processing element (PE). Each PE is capable of arbitrary-precision GEneral Matrix Vector (GEMV) operations. Each thread executes independently of the others, and any communication between threads goes through shared memory under software control. To reduce the area required for implementation, our processor implements the RV32I base instruction set plus a set of custom control and status registers (CSRs) for controlling the PEs. Our design passes all of the riscv_test tests, which are written in assembly and compiled with RISC-V gcc. Our 8-hart barrel processor runs at 250 MHz with a CPI of 1 and consumes 0.372 W. To demonstrate the capabilities of our design, we computed a GEMV operation with an input matrix of size 8 by 128 and a weight matrix of size 128 by 128 at two-bit precision in only 16 clock cycles.
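
The strict round-robin scheduling described above can be illustrated with a small behavioral sketch. It is not the authors' implementation: only the hart count (8) and the pipeline depth (5) come from the abstract, while the function names and the assumption that one instruction enters the pipeline every cycle with no stalls are hypothetical choices made for illustration. Under those assumptions, the sketch shows that the five stages always hold instructions from five different harts, which is the general property that lets a barrel processor avoid hazards between instructions of the same thread.

```python
NUM_HARTS = 8   # number of hardware threads, from the abstract
NUM_STAGES = 5  # pipeline depth, from the abstract


def hart_issued_at(cycle):
    """Strict round robin: hart k issues on cycles k, k + 8, k + 16, ..."""
    return cycle % NUM_HARTS


def pipeline_occupancy(cycle):
    """Return the hart id held by each stage at `cycle` (index 0 = first stage).

    Assumes (hypothetically) one issue per cycle and no stalls. None marks a
    stage that has not been filled yet during pipeline start-up.
    """
    occupancy = []
    for stage in range(NUM_STAGES):
        issue_cycle = cycle - stage  # when the instruction now in `stage` was issued
        occupancy.append(hart_issued_at(issue_cycle) if issue_cycle >= 0 else None)
    return occupancy


def simulate(cycles):
    """Pipeline occupancy for cycles 0 .. cycles - 1."""
    return [pipeline_occupancy(c) for c in range(cycles)]


if __name__ == "__main__":
    for cycle, stages in enumerate(simulate(10)):
        print(cycle, stages)
```

Because each hart issues only once every 8 cycles and an instruction spends 5 cycles in the pipeline in this model, a hart never has more than one instruction in flight, while the pipeline as a whole still accepts one instruction per cycle.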

Keywords:
Computer science, Thread (computing), Reduced instruction set computing, Instruction set, Network processor, Embedded system, Pipeline (software), Parallel computing, Computer hardware, Operating system

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 0.92
Refs: 24
Citation Normalized Percentile: 0.75

Topics

CCD and CMOS Imaging Sensors
Physical Sciences →  Engineering →  Electrical and Electronic Engineering
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Parallel Computing and Optimization Techniques
Physical Sciences →  Computer Science →  Hardware and Architecture