Singing Voice Synthesis Based on Generative Adversarial Networks

Yukiya Hono; Kei Hashimoto; Keiichiro Oura; Yoshihiko Nankaku; Keiichi Tokuda

doi:10.1109/icassp.2019.8683154

JOURNAL ARTICLE

Year: 2019 Pages: 6955-6959

Abstract

This paper proposes a generative adversarial training method for deep neural network (DNN)-based singing voice synthesis. The DNN-based approach has been used in statistical parametric singing voice synthesis and has improved the naturalness of the synthesized singing voice [1]. Recently, generative adversarial networks (GANs) [2] have attracted significant attention in various machine learning research areas, including speech synthesis [3]. GANs have achieved great success in modeling the distributions of complex data, and they have the potential to alleviate the over-smoothing problem in the generated speech parameters in speech synthesis. In this paper, we propose a DNN-based singing voice synthesis system incorporating a GAN. Experimental results show that the proposed method outperforms the conventional method in the naturalness of the synthesized singing voice.

Keywords:

Naturalness; Singing; Computer science; Speech synthesis; Speech recognition; Parametric statistics; Artificial neural network; Generative grammar; Artificial intelligence; Acoustics; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 6.45

Citation Normalized Percentile: 0.97

Topics

Speech Recognition and Synthesis: Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing

Generative Adversarial Networks and Image Synthesis: Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

SINGAN: Singing Voice Conversion with Generative Adversarial Networks

Crossfire Conditional Generative Adversarial Networks for Singing Voice Extraction

Mandarin Singing Synthesis Based on Generative Adversarial Network

Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data

Image Synthesis and Voice Conversion Using Generative Adversarial Networks