Leila Ben Letaifa, Jean-Luc Rouas
Transformer-based models have achieved state-of-the-art performance in various areas of machine learning, including automatic speech recognition. However, their cost in terms of computational power, memory, or energy consumption can be exorbitant, hence the interest in compression techniques. Transformer models are mostly composed of attention and feedforward components. In this paper, we propose to reduce the size of a transformer model in an end-to-end speech recognition system by decreasing the number and precision of linear-layer parameters. Specifically, we first investigate the impact of weight pruning on system performance. We then consider model quantization. To further reduce the model size, we combine the pruning and quantization methods. Experiments carried out on several speech datasets in different languages show that the memory footprint can be reduced by up to 84% with an insignificant loss of accuracy.
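The abstract does not specify the toolchain or the exact pruning and quantization settings, so the following PyTorch sketch is only an illustration of the general recipe it describes: magnitude-based pruning of linear-layer weights followed by post-training 8-bit quantization. The toy feedforward block, the 50% sparsity ratio, and the use of dynamic quantization are assumptions for the example, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical stand-in for a transformer feedforward block; the paper's
# actual end-to-end ASR model is not reproduced here.
model = nn.Sequential(
    nn.Linear(512, 2048),
    nn.ReLU(),
    nn.Linear(2048, 512),
)

# Step 1: magnitude (L1) pruning -- zero out the smallest 50% of weights
# in every Linear module (the 50% ratio is illustrative, not the paper's).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# Step 2: post-training dynamic quantization of the remaining linear-layer
# parameters to 8-bit integers.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Report the sparsity actually achieved in the pruned float model.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        sparsity = (module.weight == 0).float().mean().item()
        print(f"{name}: {sparsity:.0%} of weights pruned")
```

Note that dynamic quantization alone shrinks storage by reducing weight precision; realizing additional savings from the pruned zeros requires a sparse storage format, which is one reason combining the two methods is studied rather than assumed to compose for free.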