Abstract

Recent work on neural network architectures has focused on bridging the gap between performance and efficiency on one hand and programmability on the other. We consider implementations of three popular neural networks, ResNet, AlexNet, and the ASGD weight-dropped recurrent neural network (AWD RNN), on a low-power programmable architecture, Transformer. The architecture consists of lightweight cores interconnected by caches and crossbars that support run-time reconfiguration between shared and private cache modes. We present efficient implementations of key neural network kernels and evaluate the performance of each kernel under the different cache modes. The best-performing cache mode for each kernel is then used in the implementation of the end-to-end network. Simulation results show superior performance, with ResNet, AlexNet, and AWD RNN achieving 188.19 GOPS/W, 150.53 GOPS/W, and 120.68 GOPS/W, respectively, at the 14 nm technology node.
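
As a rough illustration of the profile-then-select methodology the abstract describes, the sketch below times each kernel under both cache modes and assembles an end-to-end execution plan from the best mode per kernel. The kernel names, cycle counts, and the PROFILE table are hypothetical placeholders for exposition, not the paper's toolchain or measurements.

```python
# Minimal sketch, assuming a profile-then-select flow: each kernel is
# profiled in both cache modes, and the end-to-end network then runs each
# kernel in its best-performing mode via run-time reconfiguration.
# Kernel names and cycle counts below are hypothetical.
from enum import Enum

class CacheMode(Enum):
    SHARED = "shared"    # cores operate on one shared cache
    PRIVATE = "private"  # each core uses its own private cache slice

# Hypothetical per-kernel profiling results (cycles per invocation).
PROFILE = {
    "conv3x3":   {CacheMode.SHARED: 120_000, CacheMode.PRIVATE: 95_000},
    "fc":        {CacheMode.SHARED: 40_000,  CacheMode.PRIVATE: 55_000},
    "lstm_step": {CacheMode.SHARED: 70_000,  CacheMode.PRIVATE: 68_000},
}

def best_mode(kernel: str) -> CacheMode:
    """Return the cache mode with the fewest profiled cycles for a kernel."""
    modes = PROFILE[kernel]
    return min(modes, key=modes.get)

# Reconfiguration plan for the end-to-end network: one mode per kernel.
plan = {kernel: best_mode(kernel) for kernel in PROFILE}
for kernel, mode in plan.items():
    print(f"{kernel}: run in {mode.value} cache mode")
```

For the reported efficiency figures, GOPS/W is sustained giga-operations per second divided by average power draw; as a purely illustrative example, a design sustaining 376.4 GOPS at an average of 2 W would score 188.2 GOPS/W.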

Keywords:
Computer architecture, Microarchitecture, Network architecture, Cache, Embedded systems, Run-time reconfiguration, Artificial neural networks, Neural network kernels, Parallel computing, Distributed computing, Artificial intelligence, Algorithms

Metrics

Cited by: 6
FWCI (Field-Weighted Citation Impact): 0.52
References: 34
Citation Normalized Percentile: 0.66

Topics

Advanced Neural Network Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Advanced Memory and Neural Computing (Physical Sciences → Engineering → Electrical and Electronic Engineering)
CCD and CMOS Imaging Sensors (Physical Sciences → Engineering → Electrical and Electronic Engineering)