JOURNAL ARTICLE

An 8.93-TOPS/W LSTM Recurrent Neural Network Accelerator Featuring Hierarchical Coarse-Grain Sparsity With All Parameters Stored On-Chip

Deepak Kadetotad, Visar Berisha, Chaitali Chakrabarti, Jae-sun Seo

Year: 2019   Journal: IEEE Solid-State Circuits Letters   Vol: 2 (9)   Pages: 119-122   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Long short-term memory (LSTM) networks are widely used for speech applications but pose difficulties for efficient implementation on hardware due to large weight storage requirements. We present an energy-efficient LSTM recurrent neural network (RNN) accelerator, featuring an algorithm-hardware co-optimized memory compression technique called hierarchical coarse-grain sparsity (HCGS). Aided by HCGS-based block-wise recursive weight compression, we demonstrate LSTM networks with up to 16× fewer weights while achieving minimal accuracy loss. The prototype chip fabricated in 65-nm LP CMOS achieves 8.93/7.22 TOPS/W for 2-/3-layer LSTM RNNs trained with HCGS for TIMIT/TED-LIUM corpora.
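To make the block-wise recursive compression described above concrete, the following is a minimal sketch of a two-level hierarchical coarse-grain sparsity mask applied to an LSTM weight matrix. The function name hcgs_mask, the block sizes (64 and 16), the keep fractions (1/4 at each level), and the 512×512 matrix shape are illustrative assumptions, not the chip's actual configuration.

```python
import numpy as np

def hcgs_mask(rows, cols, block_l1=64, block_l2=16,
              keep_l1=0.25, keep_l2=0.25, seed=0):
    """Two-level hierarchical coarse-grain sparsity mask (illustrative sketch).

    Level 1: tile the matrix into block_l1 x block_l1 blocks and keep each
    with probability keep_l1. Level 2: tile every surviving block into
    block_l2 x block_l2 sub-blocks and keep each with probability keep_l2.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((rows, cols), dtype=np.float32)
    for r0 in range(0, rows, block_l1):
        for c0 in range(0, cols, block_l1):
            if rng.random() >= keep_l1:
                continue  # entire coarse block pruned
            for r1 in range(r0, min(r0 + block_l1, rows), block_l2):
                for c1 in range(c0, min(c0 + block_l1, cols), block_l2):
                    if rng.random() < keep_l2:
                        mask[r1:r1 + block_l2, c1:c1 + block_l2] = 1.0
    return mask

# Example: sparsify the input-to-hidden weights of one LSTM gate (shape assumed).
W = np.random.randn(512, 512).astype(np.float32)
M = hcgs_mask(*W.shape)
W_sparse = W * M
print("weight density:", M.mean())  # ~0.0625 on average
```

With both keep fractions set to 1/4, roughly 1/16 of the weight blocks survive, which corresponds to the up-to-16× weight reduction quoted in the abstract; storing only the surviving blocks (plus their block indices) is what allows all parameters to fit on-chip.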

Keywords:
Computer science, Recurrent neural network, Block (permutation group theory), Chip, Artificial neural network, Computer hardware, TIMIT, TOPS, Energy (signal processing), Algorithm, Artificial intelligence, Materials science, Mathematics, Telecommunications

Metrics

Cited by: 12
FWCI (Field Weighted Citation Impact): 1.08
References: 9
Citation Normalized Percentile: 0.83

Topics

Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
Speech and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Neural Networks and Applications (Physical Sciences → Computer Science → Artificial Intelligence)