Convolutional neural networks (CNNs) have been shown to achieve excellent results across a wide range of applications in object recognition and image segmentation. However, the large network scale and number of weight parameters of a CNN make its computational complexity very high, and the required computing and storage resources grow rapidly as network layers are added. This makes CNNs difficult to deploy in embedded computing systems with strict resource and power constraints, and it restricts the development of embedded computing systems toward higher intelligence. To meet the demand for ultra-lightweight intelligent computing in resource-constrained embedded systems, an optimized fully pipelined CNN model acceleration strategy is proposed. An ultra-lightweight CNN hardware accelerator is designed for the algorithm model after ultra-light processing, and the inference process of the network model is verified on an FPGA. The experimental results show that the accelerator designed in this paper significantly reduces hardware resource utilization while achieving a good algorithm speedup ratio, which is of important technical significance for the design of embedded intelligent computing systems.
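The abstract does not specify which "ultra-light processing" steps the model undergoes; a common choice for FPGA deployment is fixed-point quantization of the weights. The sketch below is purely illustrative of that general idea, assuming symmetric per-tensor int8 quantization (the function names and the quantization scheme are assumptions, not the paper's method):

```python
import numpy as np

def quantize_weights_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127]."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for accuracy checking."""
    return q.astype(np.float32) * scale

# Example: quantize a small 3x3 convolution kernel and bound the error.
rng = np.random.default_rng(0)
w = rng.standard_normal((3, 3)).astype(np.float32)
q, s = quantize_weights_int8(w)
max_err = float(np.abs(w - dequantize(q, s)).max())
print(q.dtype, max_err <= 0.5 * s)  # rounding error is at most half a step
```

Storing int8 weights instead of float32 cuts weight memory by 4x, which is one way such a design could reduce on-chip resource occupancy.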
SHI Tianjie, LIU Feiyang, ZHANG Xiao
Xinran Ma, Ruiyong Zhao, Jianyang Zhou
Guangchao Xiang, Jinxue Sui, Xia Zhang