JOURNAL ARTICLE

Tone Recognition of Continuous Mandarin Speech Based on Tone Nucleus Model and Neural Network

Xiangli Wang, Kenji Hirose, Jinkai Zhang, Nobuaki Minematsu

Year: 2008   Journal: IEICE Transactions on Information and Systems   Vol: E91-D (6)   Pages: 1748-1755   Publisher: Institute of Electronics, Information and Communication Engineers

Abstract

A method was developed for automatic recognition of syllable tone types in continuous Mandarin speech by integrating two techniques: tone nucleus modeling and a neural network classifier. Tone nucleus modeling treats a syllable F0 contour as consisting of three parts: onset course, tone nucleus, and offset course. The two courses are transitions from and to the F0 contours of neighboring syllables, while the tone nucleus is the intrinsic part of the F0 contour. By viewing only the tone nucleus, acoustic features less affected by neighboring syllables are obtained. With tone nucleus modeling, automatic detection of the tone nucleus becomes crucial, and an improvement was added to the original detection method. Distinctive acoustic features for tone types are not limited to F0 contours: other prosodic features, such as waveform power and syllable duration, are also useful for tone recognition. These heterogeneous features are rather difficult to handle simultaneously in hidden Markov models (HMMs) but are easy to handle in neural networks. We adopted a multi-layer perceptron (MLP) as the neural network. Tone recognition experiments were conducted for speaker-dependent and speaker-independent cases. To show the effect of the integration, experiments were also conducted for two baselines: an HMM classifier with tone nucleus modeling, and an MLP classifier viewing the entire syllable instead of the tone nucleus. The integrated method achieved a tone recognition rate of 87.1% in the speaker-dependent case and 80.9% in the speaker-independent case, a relative error reduction of about 10% compared with the baselines.
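The idea of feeding heterogeneous features from the tone nucleus to a classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `nucleus_features`, the choice of mean F0 and F0 slope as contour features, the 10 ms frame step, and the assumption that the nucleus boundaries `start` and `end` are already known are all hypothetical, since the abstract does not specify the feature set or the detection method.

```python
from statistics import mean


def nucleus_features(f0, power, start, end, frame_s=0.01):
    """Build one feature vector from the tone nucleus frames [start, end).

    f0 and power are per-frame values for a single syllable. The nucleus
    boundaries are assumed to be given; detecting them is out of scope here.
    """
    if not 0 <= start < end <= len(f0):
        raise ValueError("nucleus boundaries must lie inside the syllable")

    nucleus_f0 = f0[start:end]
    n = len(nucleus_f0)

    # Least-squares slope of F0 over the nucleus, in Hz per frame.
    # A single-frame nucleus has no slope, so fall back to 0.0.
    t_mean = (n - 1) / 2
    f_mean = mean(nucleus_f0)
    denom = sum((t - t_mean) ** 2 for t in range(n))
    if denom:
        slope = sum(
            (t - t_mean) * (f - f_mean) for t, f in enumerate(nucleus_f0)
        ) / denom
    else:
        slope = 0.0

    return [
        f_mean,                   # mean F0 inside the nucleus
        slope,                    # F0 slope inside the nucleus
        mean(power[start:end]),   # mean waveform power inside the nucleus
        len(f0) * frame_s,        # duration of the whole syllable in seconds
    ]


# Hypothetical syllable: onset course in frames 0-1, nucleus in frames 2-5,
# offset course in frames 6-7.
features = nucleus_features(
    f0=[180, 190, 200, 210, 220, 230, 225, 215],
    power=[0.2, 0.4, 0.6, 0.8, 0.8, 0.6, 0.4, 0.2],
    start=2,
    end=6,
)
```

A vector like this mixes F0, power, and duration values in one input, which is the kind of heterogeneous feature set the abstract describes as easy to handle in an MLP.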

Keywords:
Speech recognition; Tone (literature); Mandarin Chinese; Computer science; Hidden Markov model; Artificial neural network; Classifier (UML); Syllable; Artificial intelligence; Pattern recognition (psychology)

Metrics

Cited By: 15
FWCI (Field Weighted Citation Impact): 0.62
Refs: 8
Citation Normalized Percentile: 0.75

Topics

Phonetics and Phonology Research
Social Sciences →  Psychology →  Experimental and Cognitive Psychology
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing

Related Documents

JOURNAL ARTICLE

Tone recognition of continuous Mandarin speech based on neural networks

Sin-Horng Chen, Yih‐Ru Wang

Journal: IEEE Transactions on Speech and Audio Processing   Year: 1995   Vol: 3 (2)   Pages: 146-150

JOURNAL ARTICLE

Tone Modeling for Continuous Mandarin Speech Recognition

Yang Cao, Shuwu Zhang, Taiyi Huang, Bo Xu

Journal: International Journal of Speech Technology   Year: 2004   Vol: 7 (2-3)   Pages: 115-128

JOURNAL ARTICLE

Tone Recognition of Continuous Mandarin Speech Based on Hidden Markov Model

Yih‐Ru Wang, Jyh-Ming Shieh, Sin‐Horng Chen

Journal: International Journal of Pattern Recognition and Artificial Intelligence   Year: 1994   Vol: 08 (01)   Pages: 233-246

JOURNAL ARTICLE

A tone recognition framework for continuous Mandarin speech

Lei He, Jie Hao

Year: 2006 Pages: paper 1348-Wed1BuP.7