Kun-Chih Chen, Yi-Sheng Liao, Cheng-Kang Tsai
In recent years, the Convolutional Neural Network (CNN) has shown its superiority in solving classification and recognition problems. However, implementing CNNs in hardware is challenging because of the high computational complexity and the highly diverse dataflows of different CNN models. To mitigate this design challenge, many studies build the CNN accelerator around a dataflow dedicated to specific CNN models or layers, which is not a systematic design flow and therefore lacks design flexibility. Because different CNN models consist of similar computing functions arranged in different permutations, we propose a novel Lego-based Convolutional Neural Network on Chip (CNNoC) design methodology in this work. We define a set of common neural computing units, such as multiply-accumulation and pooling, called Lego processing elements (LegoPEs). We then adopt a highly flexible Network-on-Chip (NoC) interconnection to connect the involved LegoPEs and construct different CNN models. In this way, different kinds of LegoPEs can be combined to implement various CNN models. In addition, we propose a computing flow that reuses the involved LegoPEs, which helps to mitigate the area overhead. Compared with related works, the proposed CNNoC design methodology improves throughput by 8% to 5,004%, depending on the target CNN model.
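The composition idea in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design: the names `mac_pe`, `pool_pe`, and `run_model` are hypothetical, the data are 1-D for brevity, and the NoC is modeled only as an ordered route of PE names. The route visits the same multiply-accumulate PE twice, which loosely mirrors the reuse of LegoPEs across layers.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
PE = Callable[[Vector], Vector]


def mac_pe(weights: Sequence[float]) -> PE:
    """Hypothetical multiply-accumulate PE: 1-D valid convolution with fixed weights."""
    k = len(weights)

    def run(x: Vector) -> Vector:
        return [sum(w * x[i + j] for j, w in enumerate(weights))
                for i in range(len(x) - k + 1)]

    return run


def pool_pe(size: int) -> PE:
    """Hypothetical max-pooling PE over non-overlapping windows."""

    def run(x: Vector) -> Vector:
        return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

    return run


def run_model(pes: Dict[str, PE], route: Sequence[str], x: Vector) -> Vector:
    """Send data through the PEs in route order (a stand-in for NoC routing)."""
    for name in route:
        x = pes[name](x)
    return x


# Two PE instances; the "mac" PE is reused for the first and third stage.
pes = {"mac": mac_pe([1.0, 1.0]), "pool": pool_pe(2)}
out = run_model(pes, ["mac", "pool", "mac"], [1.0, 2.0, 3.0, 4.0, 5.0])
print(out)  # [14.0]
```

Changing only the `route` list yields a different layer ordering without defining new PEs, which is the flexibility argument of the abstract in miniature.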
Julia E. Akimova, Dmitry O. Budanov
Yingang Wang, Zuocheng Ma, Huaxiang Lu, Shoujue Wang
Xinran Ma, Ruiyong Zhao, Jianyang Zhou