JOURNAL ARTICLE

CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement

Abstract

Speech enhancement has improved substantially in recent years, particularly with the development of generative adversarial networks (GANs). However, prior methods simply adopt GAN architectures from computer vision tasks without designs tailored to the characteristics of audio (i.e., context at different granularities), which may leave residual noise in some segments or distort the content of the original audio. In this work, we make the first attempt to explore global and local speech features for coarse-to-fine speech enhancement and introduce the Context Pyramid Generative Adversarial Network (CP-GAN), which comprises a densely-connected feature pyramid generator and a dynamic context granularity discriminator to eliminate audio noise hierarchically. Extensive experiments demonstrate that CP-GAN achieves state-of-the-art speech enhancement results and improves the performance of higher-level speech tasks, including automatic speech recognition and speaker recognition.
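The abstract does not give architectural details, so the following is only a minimal sketch of the general idea of viewing a signal at several context granularities, from fine (local) to coarse (global). It is not the CP-GAN generator or discriminator; the function names `average_pool` and `context_pyramid`, the pooling factors, and the toy signal are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_pool(signal: Sequence[float], factor: int) -> List[float]:
    """Downsample by averaging non-overlapping windows of `factor` samples."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    usable = len(signal) - len(signal) % factor  # drop the ragged tail, if any
    return [sum(signal[i:i + factor]) / factor for i in range(0, usable, factor)]


def context_pyramid(signal: Sequence[float],
                    factors: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Return views of the signal ordered from finest to coarsest granularity.

    Hypothetical helper: the factors are illustrative, not taken from the paper.
    """
    return [average_pool(signal, f) for f in factors]


# Toy "waveform": a quiet region followed by a loud region, with sample-level jitter.
noisy = [0.0, 1.0, 0.0, 1.0, 4.0, 6.0, 4.0, 6.0]
levels = context_pyramid(noisy)

for factor, level in zip((1, 2, 4), levels):
    print(f"factor {factor}: {level}")
```

In this toy case the coarser levels smooth out the sample-level jitter while keeping the quiet-versus-loud structure, which is the intuition behind combining local and global context. A learned model would replace the fixed averaging with trainable layers.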

Keywords:
Discriminator; Computer science; Speech recognition; Speech enhancement; Granularity; Pyramid (geometry); Context (archaeology); Noise (video); Feature (linguistics); Adversarial system; Generator (circuit theory); Speech processing; Artificial intelligence; Noise reduction; Image (mathematics); Power (physics); Telecommunications; Linguistics; Mathematics

Metrics

Cited By: 41
FWCI (Field Weighted Citation Impact): 4.88
Refs: 28
Citation Normalized Percentile: 0.96
Is in top 1%
Is in top 10%

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Advanced Adaptive Filtering Techniques
Physical Sciences →  Engineering →  Computational Mechanics
Hearing Loss and Rehabilitation
Life Sciences →  Neuroscience →  Cognitive Neuroscience

Related Documents

BOOK-CHAPTER

Speech Enhancement Using Generative Adversarial Network (GAN)

Mahmudul Huq, Rytis Maskeliūnas

Lecture notes in networks and systems Year: 2022 Pages: 273-282
JOURNAL ARTICLE

Noise Classification Speech Enhancement Generative Adversarial Network

Tao Feng, Ye Li, Peng Zhang, Shu Li, Fuqiang Wang

Journal: 2022 IEEE 6th Information Technology and Mechatronics Engineering Conference (ITOEC) Year: 2022 Pages: 11-16
JOURNAL ARTICLE

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Binbin Chen

Journal: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Year: 2022 Pages: 7308-7311