Noise Classification Speech Enhancement Generative Adversarial Network

Tao Feng; Ye Li; Peng Zhang; Shu Li; Fuqiang Wang

doi:10.1109/itoec53115.2022.9734565



Year: 2022
Journal: 2022 IEEE 6th Information Technology and Mechatronics Engineering Conference (ITOEC)
Pages: 11-16



Abstract

The purpose of speech enhancement is to extract the speech signal from various noise backgrounds and thereby improve its quality. Since its introduction, the Speech Enhancement Generative Adversarial Network (SEGAN) has achieved good results in the field of speech enhancement. However, SEGAN does not enhance speech well at low signal-to-noise ratios, and it generalizes poorly to unknown noise. In this paper, we propose a generative adversarial network speech enhancement method that uses noise background classification. The inputs are noisy speech signals with a variety of background noises. Mel Frequency Cepstral Coefficient (MFCC) features are extracted from the noisy speech, a convolutional neural network classifies the background noise of each signal, and each signal is labeled with its background noise type. The labeled noisy speech is then sent to the speech enhancement model, which contains several SEGANs, each of which enhances noisy speech with one particular type of background noise. We evaluate this method in extensive experiments under extremely low signal-to-noise ratio conditions and with unknown noise, using objective evaluation metrics to measure the effectiveness of the model. Compared with SEGAN under extremely low signal-to-noise ratio conditions, the proposed model removes noise better and improves every objective metric. With unknown background noise, the objective evaluation metrics of the proposed model, NCSEGAN, are also better than those of SEGAN, which confirms the effectiveness of the method.
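The classify-then-route structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `make_router`, `toy_classifier`, and `toy_enhancers`, the noise labels "babble" and "white", and the gain values are all hypothetical. The toy classifier stands in for the MFCC plus CNN stage, and the toy enhancers stand in for the per-noise SEGANs. The sketch also assumes that a signal with unknown noise is simply routed to the highest-scoring known class, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Classifier = Callable[[Signal], Dict[str, float]]  # noise label -> score
Enhancer = Callable[[Signal], Signal]


def make_router(classifier: Classifier,
                enhancers: Dict[str, Enhancer]) -> Callable[[Signal], Tuple[str, Signal]]:
    """Build a function that labels a noisy signal and routes it to the
    enhancer registered for that noise label."""
    def enhance(noisy: Signal) -> Tuple[str, Signal]:
        scores = classifier(noisy)
        # Pick the highest-scoring known noise class (assumption: unknown
        # noise is handled by the closest known class).
        label = max(scores, key=scores.get)
        return label, enhancers[label](noisy)
    return enhance


def toy_classifier(noisy: Signal) -> Dict[str, float]:
    """Hypothetical stand-in for MFCC extraction plus a CNN classifier:
    scores two noise classes from the mean absolute amplitude."""
    energy = sum(abs(x) for x in noisy) / len(noisy)
    return {"babble": energy, "white": 1.0 - energy}


# Hypothetical stand-ins for the per-noise SEGANs: fixed gains only.
toy_enhancers: Dict[str, Enhancer] = {
    "babble": lambda s: [0.5 * x for x in s],
    "white": lambda s: [0.9 * x for x in s],
}

enhance = make_router(toy_classifier, toy_enhancers)
label, enhanced = enhance([0.8, -0.9, 1.0, -0.7])
print(label, enhanced)
```

In a real system, each entry of the enhancer table would hold a generator trained on one background noise type, and the classifier would return the CNN's class probabilities.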

Keywords:

Speech enhancement; Speech recognition; Computer science; Noise (video); Noise measurement; Background noise; Signal-to-noise ratio (imaging); Artificial intelligence; Pattern recognition (psychology); Cepstrum; Noise reduction; Telecommunications

Metrics

FWCI (Field Weighted Citation Impact): 0.42

Citation Normalized Percentile: 0.47

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing


Related Documents

Language and noise transfer in speech enhancement generative adversarial network

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech Enhancement Using Generative Adversarial Network (GAN)

Speech Enhancement via Residual Dense Generative Adversarial Network

Improved Wasserstein conditional generative adversarial network speech enhancement