Tomoki Koriyama, Takashi Nose, Takao Kobayashi
This paper examines two issues in a statistical speech synthesis approach based on Gaussian process (GP) regression. Although GP-based speech synthesis can outperform HMM-based synthesis in generating spectral parameters, a number of issues remain. In this paper, we incorporate a global variance (GV) feature into parameter generation to overcome the over-smoothing problem. Furthermore, to use a kernel function that is appropriate for the actual data, we propose an EM-based technique for optimizing the kernel hyperparameters. Objective and subjective evaluation results show that using GV and hyperparameter estimation improves performance in spectral feature generation.
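As a rough illustration of the GV feature mentioned in the abstract, the sketch below assumes the common definition of GV as the variance over time of each feature dimension within one utterance. The function names (`global_variance`, `moving_average`), the random data, and the moving-average smoother are all hypothetical; the smoother is only a stand-in for over-smoothing and is not the GP-based parameter generation of the paper.

```python
import numpy as np


def global_variance(traj):
    """Return the per-dimension variance over time of a (T, D) trajectory."""
    traj = np.asarray(traj, dtype=float)
    return traj.var(axis=0)


def moving_average(traj, width=5):
    """Smooth each dimension with a moving average (stand-in for over-smoothing)."""
    traj = np.asarray(traj, dtype=float)
    kernel = np.ones(width) / width
    columns = [
        np.convolve(traj[:, d], kernel, mode="same")
        for d in range(traj.shape[1])
    ]
    return np.stack(columns, axis=1)


# Hypothetical "natural" trajectory: 200 frames, 3 spectral dimensions.
rng = np.random.default_rng(0)
natural = rng.normal(size=(200, 3))
smoothed = moving_average(natural)

gv_natural = global_variance(natural)
gv_smoothed = global_variance(smoothed)

print("GV of natural trajectory: ", gv_natural)
print("GV of smoothed trajectory:", gv_smoothed)
```

In this toy setting the smoothed trajectory has a lower GV in every dimension, which is the kind of variance loss that a GV term in parameter generation is meant to counteract.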
Tomoki Koriyama, Takao Kobayashi
Sri Winarni, Sapto Wahyu Indratno
Jinhyeun Kim, Christopher O. Luettgen, Kamran Paynabar, Fani Boukouvala