Segmental Encoder-Decoder Models for Large Vocabulary Automatic Speech Recognition

Eugen Beck; Mirko Hannemann; Patrick Dötsch; Ralf Schlüter; Hermann Ney

doi:10.21437/interspeech.2018-1212

Year: 2018

Abstract

It has long been known that the classic Hidden Markov Model (HMM) derivation for speech recognition contains assumptions, such as independence of observation vectors and weak duration modeling, that are practical but unrealistic. When using the hybrid approach, this is amplified by trying to fit a discriminative model into a generative one. Hidden Conditional Random Fields (CRFs) and segmental models (e.g. Semi-Markov CRFs / Segmental CRFs) have been proposed as an alternative, but failed to gain traction until recently. In this paper we explore different length modeling approaches for segmental models and their relation to attention-based systems. Furthermore, we show experimental results on a handwriting recognition task and, to the best of our knowledge, the first reported results on the Switchboard 300h speech recognition corpus using this approach.

Keywords: Computer science; Speech recognition; Encoder; Vocabulary; Artificial intelligence; Speech coding; Natural language processing; Linguistics

Metrics

FWCI (Field Weighted Citation Impact): 2.78

Citation Normalized Percentile: 0.91

Topics

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Natural Language Processing Techniques

Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Large Context End-to-end Automatic Speech Recognition via Extension of Hierarchical Recurrent Encoder-decoder Models

Encoder-decoder models for recognition of Russian speech

A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition

Croatian Large Vocabulary Automatic Speech Recognition