JOURNAL ARTICLE

Diffusion-Based Speech Enhancement with a Weighted Generative-Supervised Learning Loss

Abstract

Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise, usually centered on noisy speech, and subsequently learn a parameterized model to reverse this process, conditioned on the noisy speech. Unlike supervised methods, generative SE approaches often rely solely on an unsupervised loss, which may result in less efficient use of the conditioning noisy speech. To address this issue, we propose augmenting the original diffusion training objective with an ℓ2 loss that measures the discrepancy between the ground-truth clean speech and its estimate at each diffusion time-step. Experimental results demonstrate the effectiveness of our proposed methodology.
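
One way to read the proposed objective is as a weighted sum of a generative (score-matching) term and an ℓ2 term on the clean-speech estimate. The sketch below is a toy illustration in plain Python, not the authors' implementation. The perturbation schedule (`schedule`, `gamma`, `sigma_max`), the weight `lam`, the way a clean-speech estimate is recovered from the score, and all function names are assumptions introduced here for illustration.

```python
import math
import random


def schedule(t, gamma=1.5, sigma_max=0.5):
    """Hypothetical schedule: clean-speech weight and noise std at time t in (0, 1]."""
    decay = math.exp(-gamma * t)        # weight of clean speech in the perturbation mean
    sigma_t = sigma_max * math.sqrt(t)  # assumed noise level, growing with t
    return decay, sigma_t


def weighted_loss(score_fn, clean, noisy, t, lam=0.5, rng=None):
    """Return (total, generative, supervised) loss terms for one time-step."""
    if rng is None:
        rng = random.Random()
    decay, sigma_t = schedule(t)
    n = len(clean)

    # Forward perturbation: the mean drifts from clean toward noisy speech,
    # then Gaussian noise z is added.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x_t = [decay * s + (1.0 - decay) * y + sigma_t * e
           for s, y, e in zip(clean, noisy, z)]

    # The model sees the perturbed signal, the noisy speech, and the time-step.
    score = score_fn(x_t, noisy, t)

    # Generative term: denoising score matching, mean of (sigma_t * score + z)^2.
    gen = sum((sigma_t * g + e) ** 2 for g, e in zip(score, z)) / n

    # Clean-speech estimate implied by the score (assumed inversion of the
    # forward perturbation, using z_hat = -sigma_t * score).
    clean_hat = [(x - (1.0 - decay) * y + sigma_t ** 2 * g) / decay
                 for x, y, g in zip(x_t, noisy, score)]

    # Supervised term: l2 discrepancy between the estimate and the ground truth.
    sup = sum((h - s) ** 2 for h, s in zip(clean_hat, clean)) / n

    return gen + lam * sup, gen, sup


# Toy usage with a placeholder "model" that always predicts a zero score.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(16)]
noisy = [s + 0.3 * rng.gauss(0.0, 1.0) for s in clean]


def zero_score(x_t, y, t):
    return [0.0] * len(x_t)


total, gen, sup = weighted_loss(zero_score, clean, noisy, t=0.5, rng=rng)
print(f"total={total:.4f} generative={gen:.4f} supervised={sup:.4f}")
```

In this toy setup the ℓ2 term works out to the generative term rescaled by (sigma_t / decay)², so `lam` effectively changes how errors are weighted across time-steps. Whether the same relation holds for the paper's actual formulation is not stated in the abstract.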

Keywords:
Computer science; Speech enhancement; Generative model; Parameterized complexity; Artificial intelligence; Speech recognition; Generative grammar; Ground truth; Noise (video); Supervised learning; Gaussian; Diffusion; Machine learning; Pattern recognition (psychology); Artificial neural network; Noise reduction; Algorithm

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 4.99
Refs: 24
Citation Normalized Percentile: 0.92

Topics

Speech and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Music and Audio Processing: Physical Sciences → Computer Science → Signal Processing
Speech Recognition and Synthesis: Physical Sciences → Computer Science → Artificial Intelligence
