Diffusion-Based Speech Enhancement with a Weighted Generative-Supervised Learning Loss

Jean-Eudes Ayilo; Mostafa Sadeghi; Romain Serizel

doi:10.1109/icassp48485.2024.10446805

JOURNAL ARTICLE

Year: 2024 Pages: 12506-12510

Abstract

Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise, usually centered on noisy speech, and subsequently learn a parameterized model to reverse this process, conditionally on noisy speech. Unlike supervised methods, generative-based SE approaches often rely solely on an unsupervised loss, which may result in less efficient incorporation of conditioned noisy speech. To address this issue, we propose augmenting the original diffusion training objective with an ℓ2 loss, measuring the discrepancy between ground-truth clean speech and its estimation at each diffusion time-step. Experimental results demonstrate the effectiveness of our proposed methodology.
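As an illustration of the idea in the abstract, the following is a minimal NumPy sketch of a training loss that mixes a generative term with a supervised ℓ2 term. It is not the authors' implementation. The function name `weighted_loss`, the weight `lam`, the convex-combination form, and the way the clean-speech estimate `x0_hat` is obtained are all assumptions made for this example. The generative term uses the standard denoising score-matching form: if the perturbed sample is mean + sigma * z with z standard Gaussian, the target score is -z / sigma.

```python
import numpy as np


def weighted_loss(score_pred, z, sigma, x0, x0_hat, lam=0.5):
    """Hypothetical weighted generative-supervised loss (illustrative only).

    score_pred : model output approximating the score at the perturbed sample
    z          : Gaussian noise used to build the perturbed sample
    sigma      : noise standard deviation at the sampled diffusion time-step
    x0, x0_hat : ground-truth clean speech and its estimate at that time-step
    lam        : assumed mixing weight in [0, 1] (not taken from the paper)
    """
    # Generative term: denoising score matching, scaled by sigma.
    # It is zero when score_pred equals the target score -z / sigma.
    gen = np.mean(np.abs(sigma * score_pred + z) ** 2)
    # Supervised term: squared l2 discrepancy between clean speech and estimate.
    sup = np.mean(np.abs(x0 - x0_hat) ** 2)
    # Assumed convex combination of the two terms.
    return (1.0 - lam) * gen + lam * sup
```

With `lam = 0` the sketch reduces to a purely generative objective, and with `lam = 1` to a purely supervised one. How the two terms are actually weighted and how the clean-speech estimate is computed at each time-step are specified in the paper itself, not here.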

Keywords:

Computer science; Speech enhancement; Generative model; Parameterized complexity; Artificial intelligence; Speech recognition; Generative grammar; Ground truth; Noise (video); Supervised learning; Gaussian; Diffusion; Machine learning; Pattern recognition (psychology); Artificial neural network; Noise reduction; Algorithm

Metrics

FWCI (Field Weighted Citation Impact): 4.99

Citation Normalized Percentile: 0.92

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Unsupervised Speech Enhancement with Diffusion-Based Generative Models

Speech Enhancement with Generative Diffusion Models

Speech Enhancement and Dereverberation With Diffusion-Based Generative Models

Perceptual Loss Function for Speech Enhancement Based on Generative Adversarial Learning

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders