JOURNAL ARTICLE

Single-channel music/speech separation using non-linear masks

Abstract

In this paper, we address the problem of monaural music and speech separation based on soft mask filtering. As with other well-known techniques, a statistical model of each source must be estimated. We therefore employ vector quantization (VQ) for the synthesis stage, which yields more accurate codebook entries for each source than the commonly used Gaussian mixture model (GMM) approach. In the separation stage, we compare the non-linear mask proposed in this work with other well-known techniques in terms of undesirable signal-to-interference ratio (SIR) effects. It is demonstrated that the proposed semi-soft mask gives the best performance in terms of both SIR and subjective measures.
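As a rough illustration of the codebook-plus-mask pipeline the abstract outlines, the sketch below trains one VQ codebook per source with plain k-means, picks the codeword pair that best explains a mixture frame, and builds two standard masks from that pair: a binary mask and a Wiener-style soft mask. The abstract does not define the proposed semi-soft mask, so it is not reproduced here. All function names, the k-means training, the additive-magnitude assumption, and the synthetic data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def train_codebook(frames, n_codes, n_iter=20, seed=0):
    """Hypothetical helper: VQ codebook via plain k-means.

    frames: array of shape (n_frames, n_bins) holding magnitude spectra.
    """
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    idx = rng.choice(len(frames), n_codes, replace=False)
    codebook = frames[idx].copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every frame to every codeword.
        dist = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_codes):
            members = frames[labels == k]
            if len(members):  # leave empty clusters unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def best_codeword_pair(mix_frame, cb_a, cb_b):
    """Pick one codeword per source so that their sum best fits the mixture.

    Assumes magnitude spectra add approximately, a common simplification.
    """
    approx = cb_a[:, None, :] + cb_b[None, :, :]      # (Ka, Kb, n_bins)
    err = ((approx - mix_frame) ** 2).sum(axis=2)     # (Ka, Kb)
    i, j = np.unravel_index(err.argmin(), err.shape)
    return cb_a[i], cb_b[j]

def binary_mask(est_a, est_b):
    """Hard mask: 1 where source A dominates the bin, else 0."""
    return (est_a >= est_b).astype(float)

def wiener_soft_mask(est_a, est_b, eps=1e-12):
    """Soft mask: share of the estimated power that belongs to source A."""
    return est_a ** 2 / (est_a ** 2 + est_b ** 2 + eps)

# Synthetic stand-ins for speech and music magnitude spectra.
rng = np.random.default_rng(1)
speech = np.abs(rng.normal(size=(200, 8)))
music = np.abs(rng.normal(size=(200, 8))) + 2.0

cb_speech = train_codebook(speech, n_codes=4)
cb_music = train_codebook(music, n_codes=4)

mixture = speech[0] + music[0]
s_hat, m_hat = best_codeword_pair(mixture, cb_speech, cb_music)

speech_soft = wiener_soft_mask(s_hat, m_hat) * mixture
speech_hard = binary_mask(s_hat, m_hat) * mixture
```

In a real system the frames would come from a short-time Fourier transform of the recordings, and the masked magnitudes would be combined with the mixture phase before inversion; those steps are omitted here.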

Keywords:
Codebook; Speech recognition; Vector quantization; Computer science; Monaural; Source separation; Gaussian; Quantization (signal processing); Linear prediction; Separation (statistics); Channel (broadcasting); Mixture model; Pattern recognition (psychology); Speech enhancement; Algorithm; Artificial intelligence; Machine learning; Noise reduction; Telecommunications

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.00
Refs: 9
Citation Normalized Percentile: 0.18

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Blind Source Separation Techniques
Physical Sciences →  Computer Science →  Signal Processing
Advanced Adaptive Filtering Techniques
Physical Sciences →  Engineering →  Computational Mechanics