Speech Enhancement with Generative Diffusion Models

O. V. Girfanov; A. G. Shishkin

doi:10.3103/s0005105523050035

Year: 2023
Journal: Automatic Documentation and Mathematical Linguistics
Vol: 57 (5)
Pages: 249-257
Publisher: Pleiades Publishing

Abstract

An alternative approach to speech denoising is proposed, based on generative diffusion models, which learn the distribution of the training data. In recent years, such models have produced promising results in generating signals of various kinds and are superior in many respects to earlier generative models such as variational autoencoders. However, diffusion models have not yet found wide application in speech denoising. A new diffusion model is presented that uses a deep neural network to denoise real speech signals. A dedicated data set containing more than 150 h of clean Russian speech has been created by the authors. The results, evaluated with the scale-invariant signal-to-distortion ratio (SI-SDR) and perceptual evaluation of speech quality (PESQ) metrics, are comparable or superior to those of the best discriminative models.
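For readers unfamiliar with the first metric named in the abstract, the following is a minimal sketch of how the scale-invariant signal-to-distortion ratio is commonly computed. It follows the standard textbook definition (project the estimate onto the clean reference, then compare the energy of the projection with the energy of the residual). It is not taken from the paper: the function name `si_sdr`, the `eps` stabilizer, the mean removal step, and the synthetic sine-plus-noise signals are all assumptions made for illustration.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB (illustrative helper, not the authors' code)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference that best explains the estimate.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    # Everything in the estimate not explained by the scaled reference.
    residual = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)


# Synthetic demo signals (assumed for illustration only).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)
noise = rng.standard_normal(clean.shape)
noisy = clean + 0.5 * noise
less_noisy = clean + 0.1 * noise

print("SI-SDR, heavier noise (dB):", si_sdr(noisy, clean))
print("SI-SDR, lighter noise (dB):", si_sdr(less_noisy, clean))
```

Because the reference is rescaled before the comparison, multiplying the estimate by any nonzero gain leaves the score essentially unchanged, which is why the metric suits enhancement systems whose output level is arbitrary. PESQ, the second metric, is a standardized perceptual measure and is normally computed with an existing implementation rather than written by hand.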

Keywords:

Discriminative model; Computer science; Speech recognition; Generative model; Generative grammar; Artificial intelligence; Noise reduction; Pattern recognition (psychology); Artificial neural network; Field (mathematics); Distortion (music); Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.54

Citation Normalized Percentile: 0.61

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Music and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Related Documents

Unsupervised Speech Enhancement with Diffusion-Based Generative Models

Speech Enhancement and Dereverberation With Diffusion-Based Generative Models

SPEECH DENOISING BY GENERATIVE DIFFUSION MODELS

Unsupervised speech enhancement with deep dynamical generative speech and noise models

DIFFUSION-BASED SPEECH ENHANCEMENT WITH JOINT GENERATIVE AND PREDICTIVE DECODERS