JOURNAL ARTICLE

Hardware Accelerator for Transformer based End-to-End Automatic Speech Recognition System

Abstract

Hardware accelerators are designed to offload compute-intensive tasks such as deep neural networks from the CPU, improving overall application performance, particularly on the performance-per-watt metric. Encoder-decoder-based sequence-to-sequence models such as the Transformer have demonstrated state-of-the-art results in end-to-end automatic speech recognition (ASR) systems. The Transformer model, being intensive in both memory and computation, poses a challenge for an FPGA implementation. This paper proposes an end-to-end architecture to accelerate a Transformer for an ASR system. The host CPU orchestrates the computations of the different encoder and decoder stages of the Transformer architecture on the designed hardware accelerator, with no need for intermediate FPGA reconfiguration. Communication latency is hidden by prefetching the weights of the next encoder/decoder block while the current block is being processed. The computation is split across both Super Logic Regions (SLRs) of the FPGA, mitigating inter-SLR communication overhead. The proposed design achieves low latency by exploiting the available resources. The accelerator is realized using high-level synthesis tools and evaluated on an Alveo U50 FPGA card. The design demonstrates an average speed-up of $32 \times$ over an Intel Xeon E5-2640 CPU and $8.8 \times$ over an NVIDIA GeForce RTX 3080 Ti graphics card for a 32-bit single-precision floating-point model.

Keywords:
Computer science; Field-programmable gate array; Computer hardware; Encoder; Xeon; Hardware acceleration; Floating point; End-to-end principle; Computation; Transformer; Embedded system; Parallel computing; Operating system; Artificial intelligence; Engineering

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.51
Refs: 20
Citation Normalized Percentile: 0.65

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Network Packet Processing and Optimization
Physical Sciences →  Computer Science →  Hardware and Architecture
Embedded Systems Design Techniques
Physical Sciences →  Computer Science →  Hardware and Architecture

Related Documents

JOURNAL ARTICLE

A Transformer-Based End-to-End Automatic Speech Recognition Algorithm

Fang Dong, Yiyang Qian, Tianlei Wang, Peng Liu, Jiuwen Cao

Journal: IEEE Signal Processing Letters · Year: 2023 · Vol: 30 · Pages: 1592-1596
JOURNAL ARTICLE

An End-to-End Transformer-Based Automatic Speech Recognition for Qur’an Reciters

Mohammed Hadwan, Hamzah A. Alsayadi, Salah Al-Hagree

Journal: Computers, Materials &amp; Continua · Year: 2022 · Vol: 74 (2) · Pages: 3471-3487
JOURNAL ARTICLE

A CTC Alignment-Based Non-Autoregressive Transformer for End-to-End Automatic Speech Recognition

Ruchao Fan, Wei Chu, Chang Peng, Abeer Alwan

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing · Year: 2023 · Vol: 31 · Pages: 1436-1448