Abstract

Speech translation is the translation of speech in one language, typically into text in another, traditionally accomplished through a combination of automatic speech recognition and machine translation. Speech translation has attracted interest for many years, but the recent successful applications of deep learning to both individual tasks have enabled new opportunities through joint modeling, in what we today call 'end-to-end speech translation.' In this tutorial we will introduce the techniques used in cutting-edge research on speech translation. Starting from the traditional cascaded approach, we will give an overview of data sources and model architectures to achieve state-of-the-art performance with end-to-end speech translation for both high- and low-resource languages. In addition, we will discuss methods to evaluate and analyze the proposed solutions, as well as the challenges faced when applying speech translation models in real-world applications.

Keywords:
Computer science, Speech translation, Machine translation, Translation (biology), End-to-end principle, Natural language processing, Speech recognition, Artificial intelligence, Speech synthesis, Language translation

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 0.42
Refs: 17
Citation Normalized Percentile: 0.67

Topics

Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Tutorial: End-to-End Speech Translation

Jan Niehues, Elizabeth Salesky, Marco Turchi, Matteo Negri

Journal:   Repository KITopen (Karlsruhe Institute of Technology) Year: 2021
JOURNAL ARTICLE

Multilingual End-to-End Speech Translation

Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, Shinji Watanabe

Journal:   2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) Year: 2019 Pages: 570-577
JOURNAL ARTICLE

End-to-End Neural Speech Translation

Matthias Sperber

Journal:   Repository KITopen (Karlsruhe Institute of Technology) Year: 2019
JOURNAL ARTICLE

End-to-End Speech Translation for Code Switched Speech

Orion Weller, Matthias Sperber, Telmo Pires, Hendra Setiawan, Christian Gollan, Dominic Telaar, Matthias Paulik

Journal:   Findings of the Association for Computational Linguistics: ACL 2022 Year: 2022