Convolutional Neural Networks (CNNs) are the gold standard for computer vision. Running CNNs on embedded hardware with limited computational capability is an area of active investigation and optimization. In this paper, we investigate the potential of extending the RISC-V Instruction Set Architecture (ISA) to accelerate CNN inference using in-pipeline hardware blocks and custom instructions. Our preliminary designs have a small footprint and minimal impact on the maximum core frequency. The newly designed instructions were used to extend an existing soft-core processor, which was then synthesized for an FPGA to enable cycle-accurate testing and performance evaluation.