Speech preprocessing and enhancement based on joint time domain and time-frequency domain analysis

Wenbo Zhang; Xuefeng Xie; Yanling Du; Dongmei Huang

doi:10.1121/10.0026219

JOURNAL ARTICLE

Year: 2024; Journal: The Journal of the Acoustical Society of America; Vol: 155 (6); Pages: 3580-3588; Publisher: Acoustical Society of America

Abstract

Speech enhancement aims to make noisy speech signals clearer. Traditional time-frequency domain methods struggle to differentiate between speech and noise, which risks distorting the speech. This paper introduces an approach that combines the time domain and the time-frequency domain, using a W-net module to suppress noise at the front end. The module, called TTF-W-Net, is an improved version of Wave-U-Net. We conducted experiments on the TIMIT speech and NOISEX-92 noise datasets to evaluate the enhancement performance achieved by integrating a preprocessing network, either Wave-U-Net or our TTF-W-Net, into the baseline methods Phase, FullSubNet+, and DB-AIAT. Experimental results show that TTF-W-Net outperforms the baseline Wave-U-Net by 15.7% on the PESQ metric and that the baseline methods perform better when our preprocessing method is applied. Consequently, the TTF-W-Net preprocessing network offers effective speech enhancement.

Keywords:

Computer science; Speech enhancement; PESQ; Preprocessor; Speech recognition; Time domain; TIMIT; Noise (video); Frequency domain; Baseline (sea); Speech processing; Artificial intelligence; Pattern recognition (psychology); Noise reduction; Hidden Markov model

Metrics

FWCI (Field Weighted Citation Impact): 2.85

Citation Normalized Percentile: 0.84

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Joint Time-Frequency and Time Domain Learning for Speech Enhancement

Speech Enhancement Based on Time-Frequency Domain GAN

Joint Time-Domain and Frequency-Domain Progressive Learning for Single-Channel Speech Enhancement and Recognition

Neural speech enhancement in the time-frequency domain

Speech enhancement in joint time-frequency domain based on real-valued discrete Gabor transform