JOURNAL ARTICLE

Analysis of effect of single-channel speech-music separation using NMF to automatic speech recognition

Abstract

In this study, single-channel source separation is used to separate speech from background music, which degrades speech recognition performance, especially in broadcast news transcription systems. Because the separation relies on a single observation of the source signals, the sources have to be modeled beforehand using training data. Non-negative Matrix Factorization (NMF) methods are used to model the sources. To model the source signals, several training data sets containing different music and speech data are created, and the effect of the choice of training data set is analyzed. The performance of the methods is measured not only with a separation performance measure but also with speech recognition performance measures.
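The supervised NMF idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the cost function, the feature representation, or the reconstruction step, so the Euclidean cost with multiplicative updates, the magnitude-spectrogram inputs, and the soft mask used here are all assumptions, and the function names `nmf` and `separate` are hypothetical.

```python
import numpy as np

EPS = 1e-9  # small constant to avoid division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract does not say which cost the study uses). If W is
    supplied, it is held fixed and only the activations H are learned.
    """
    rng = np.random.default_rng(seed)
    fixed_bases = W is not None
    if not fixed_bases:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed_bases:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, W_speech, W_music, n_iter=200):
    """Split a mixture spectrogram using bases trained beforehand.

    The trained speech and music bases are concatenated and kept fixed;
    only the activations for the mixture are estimated. A soft mask
    (an assumed reconstruction step) distributes the mixture energy
    between the two sources.
    """
    W = np.hstack([W_speech, W_music])
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    speech_model = W_speech @ H[:k]
    music_model = W_music @ H[k:]
    mask = speech_model / (speech_model + music_model + EPS)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In such a setup, `W_speech` and `W_music` would come from calling `nmf` on magnitude spectrograms of speech-only and music-only training data, and swapping in bases learned from different training sets is one way to probe the effect of the training data that the study analyzes.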

Keywords:
Non-negative matrix factorization; Speech recognition; Computer science; Source separation; Speech processing; Channel (broadcasting); Transcription (linguistics); Voice activity detection; Artificial intelligence; Matrix decomposition; Pattern recognition (psychology)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 12
Citation Normalized Percentile: 0.07

Topics

Speech and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Blind Source Separation Techniques: Physical Sciences → Computer Science → Signal Processing