Convolutional Neural Network (CNN) accelerators are highly beneficial for mobile and resource-constrained devices. One of the research challenges is to design a power-efficient accelerator. This paper proposes a CNN accelerator with low power consumption and acceptable performance. The proposed method pipelines the kernels used in the convolution process and shares a single multiplication and accumulation (MAC) block among them. The available kernels work consecutively, each performing a different operation in sequence. The proposed method applies a series of operations between the kernels and the weights stored in memory to speed up the convolution process. The proposed accelerator is implemented in VHDL on an Altera Arria 10 GX FPGA. The results show that the proposed method achieves an energy efficiency of 26.37 GOPS/W, consuming less energy than the existing method, with acceptable resource usage and performance. The proposed method is well suited for small and constrained devices.
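As a rough software illustration of several kernels sharing one multiply-accumulate unit, the Python sketch below routes every product through a single accumulation step and counts how many MAC operations are issued. The function name `shared_mac_convolve`, the list-based data layout, and the cycle counter are assumptions made for illustration. The abstract does not specify the hardware schedule, so this is not the authors' VHDL design.

```python
def shared_mac_convolve(image, kernels):
    """Apply each square kernel to a 2D image using one shared MAC step.

    Hypothetical model: kernels take turns on the same MAC, one
    multiply-accumulate per simulated cycle. No kernel flip is applied
    (cross-correlation, as is common in CNN layers).
    """
    k = len(kernels[0])                 # kernel side length
    rows = len(image) - k + 1           # output height (no padding)
    cols = len(image[0]) - k + 1        # output width (no padding)
    outputs = [[[0] * cols for _ in range(rows)] for _ in kernels]
    mac_cycles = 0

    for r in range(rows):
        for c in range(cols):
            for i in range(k):
                for j in range(k):
                    pixel = image[r + i][c + j]  # read once, reused by all kernels
                    for n, kernel in enumerate(kernels):
                        # The single shared MAC: multiply, then accumulate.
                        outputs[n][r][c] += pixel * kernel[i][j]
                        mac_cycles += 1
    return outputs, mac_cycles


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],
           [[1, 1], [1, 1]]]
outputs, mac_cycles = shared_mac_convolve(image, kernels)
```

In this toy model the pixel is fetched once and reused by every kernel, which is one plausible reading of how a shared MAC block could reduce memory traffic. The actual benefit depends on the pipeline described in the paper.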
Truong Quang Vinh, Dinh Viet Hai
Hong Wang, Xiao Zhang, Dehui Kong, Guoning Lu, Degen Zhen, Fang Zhu, Ke Xu
Shuanglong Liu, Xinyu Niu, Wayne Luk
YU Zijian, MA De, YAN Xiaolang, SHEN Juncheng
Jincheng Zou, Qing Tang, Congcong He