JOURNAL ARTICLE

Speech preprocessing and enhancement based on joint time domain and time-frequency domain analysis

Wenbo Zhang, Xuefeng Xie, Yanling Du, Dongmei Huang

Year: 2024   Journal: The Journal of the Acoustical Society of America   Vol: 155 (6)   Pages: 3580-3588   Publisher: Acoustical Society of America

Abstract

Speech enhancement aims to make noisy speech signals clearer. Traditional time-frequency domain methods struggle to differentiate between speech and noise, which risks distorting the speech. This paper introduces an approach that combines the time domain and the time-frequency domain, using a W-net module to suppress noise at the front end. The module, called TTF-W-Net, is an improved version of Wave-U-Net. We conducted experiments on the TIMIT speech dataset and the NOISEX-92 noise dataset to evaluate the enhancement performance achieved by integrating a preprocessing network, either Wave-U-Net or our TTF-W-Net, into three baseline methods: Phase, FullSubNet+, and DB-AIAT. Experimental results show that TTF-W-Net outperforms the baseline Wave-U-Net by 15.7% on the PESQ metric, and that the enhancement networks perform better when our preprocessing method is applied. Consequently, the TTF-W-Net preprocessing network offers effective speech enhancement.
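To make the idea of a joint time domain and time-frequency domain analysis concrete, the following minimal sketch computes the two views of a noisy signal that such a front end could consume: the raw waveform and a short-time Fourier transform (STFT) magnitude spectrogram. It is not the TTF-W-Net architecture, which the abstract does not specify. The function names `stft_magnitude` and `joint_views`, the frame length, the hop size, and the synthetic test signal are all illustrative assumptions.

```python
import numpy as np

def stft_magnitude(x, frame_len=512, hop=128):
    """Magnitude spectrogram from a Hann-windowed STFT (assumed parameters)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # One-sided FFT per frame: shape (n_frames, frame_len // 2 + 1).
    return np.abs(np.fft.rfft(frames, axis=1))

def joint_views(noisy, frame_len=512, hop=128):
    """Return the time domain view and the time-frequency view of one signal."""
    return noisy, stft_magnitude(noisy, frame_len, hop)

# Synthetic stand-in for noisy speech: a 440 Hz tone plus white noise at 16 kHz.
# (Hypothetical data; the paper uses TIMIT speech and NOISEX-92 noise.)
sr = 16000
t = np.arange(sr) / sr
rng = np.random.default_rng(0)
noisy = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(sr)

wave, spec = joint_views(noisy)
peak_bin = int(np.argmax(spec.mean(axis=0)))
print(wave.shape, spec.shape, peak_bin * sr / 512)
```

A model working in both domains would take `wave` and `spec` as parallel inputs; how the two branches are fused is a design decision of the specific network and is not shown here.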

Keywords:
Computer science; Speech enhancement; PESQ; Preprocessor; Speech recognition; Time domain; TIMIT; Noise (video); Frequency domain; Baseline (sea); Speech processing; Artificial intelligence; Pattern recognition (psychology); Noise reduction; Hidden Markov model

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 2.85
Refs: 27
Citation Normalized Percentile: 0.84

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Speech Recognition and Synthesis
Physical Sciences →  Computer Science →  Artificial Intelligence
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing