Modern approaches to text-to-speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner.
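The interleaving described above can be illustrated with a minimal sketch (not the paper's implementation): an incremental TTS loop alternates READ actions, which consume one input character, with SPEAK actions, which synthesise audio for the characters read so far. A deterministic wait-k rule stands in for the rule-based baselines the abstract mentions; the names `wait_k_policy` and `run_incremental` are illustrative assumptions, not the authors' code.

```python
READ, SPEAK = "READ", "SPEAK"

def wait_k_policy(n_read, n_spoken, k=3):
    """Rule-based baseline: stay k characters ahead of the audio,
    then alternate reading and speaking (an assumed wait-k rule)."""
    if n_read < n_spoken + k:
        return READ
    return SPEAK

def run_incremental(text, policy):
    """Interleave actions until every character is read and spoken.
    A learned RL agent would replace `policy` with a trained decision
    function that trades off latency against audio quality."""
    n_read = n_spoken = 0
    actions = []
    while n_spoken < len(text):
        # Once the input is exhausted, speaking is the only legal action.
        action = policy(n_read, n_spoken) if n_read < len(text) else SPEAK
        actions.append(action)
        if action == READ:
            n_read += 1
        else:
            n_spoken += 1  # stand-in for synthesising one unit of audio
    return actions

actions = run_incremental("hello world", wait_k_policy)
```

With k=3 the first audio is emitted after only three characters are read, rather than after the full sentence; the RL framework's role is to learn when such early commitments are safe for a given sentence.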