JOURNAL ARTICLE

Direct F0 control of an electrolarynx based on statistical excitation feature prediction and its evaluation through simulation

Abstract

An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and showed that statistical prediction of excitation parameters, such as F0 patterns, was essential to significantly improve naturalness of EL speech. In these methods, the original EL speech is recorded with a microphone and the enhanced EL speech is presented from a loudspeaker in real time. This framework is effective for telecommunication but it is not suitable to face-to-face conversation because both the original EL speech and the enhanced EL speech are presented to listeners. In this paper, we propose direct F0 control of the electrolarynx based on statistical excitation prediction to develop an EL speech enhancement technique also effective for face-to-face conversation. F0 patterns of excitation signals produced by the electrolarynx are predicted in real time from the EL speech produced by the laryngectomee’s articulation of the excitation signals with previously predicted F0 values. A simulation experiment is conducted to evaluate the effectiveness of the proposed method. The experimental results demonstrate that the proposed method yields significant improvements in naturalness of EL speech while keeping its intelligibility high enough. Index Terms: laryngectomee, electrolarynx, electrolaryngeal speech, statistical excitation prediction, simulation evaluation

Keywords:
Computer science Naturalness Speech recognition Microphone Speech synthesis Statistical model Intelligibility (philosophy) Conversation Acoustics Artificial intelligence Telecommunications Physics Sound pressure

Metrics

8
Cited By
1.93
FWCI (Field Weighted Citation Impact)
19
Refs
0.89
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Voice and Speech Disorders
Health Sciences →  Medicine →  Physiology
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
© 2026 ScienceGate Book Chapters — All rights reserved.