In this paper we present a speech synthesis method for diphone-based text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. The proposed hybrid method combines TD-PSOLA with the harmonic plus noise model and incorporates a novel technique for jointly modifying pitch and time scale. Preliminary results show an improvement in synthetic speech quality when large pitch modifications are required.
Ujjwal Bhushan, Kiran Malipatil, Vivek V. Patil, V Anilkumar, S Ananya, K P Bharath
Jerneja Žganec Gros, Mario Žganec
Stas Tiomkin, D. Malah, Slava Shechtman, Zvi Kons
Pirros Tsiakoulis, Spyros Raptis, Sotiris Karabetsos, Aimilios Chalamandaris