Abstract

We present a causal speech signal improvement system designed to handle several types of distortion. The method is based on a generative diffusion model, a class of models that has been shown to work well in scenarios with missing data and non-linear corruptions. To guarantee causal processing, we modify the network architecture of our previous work and replace global normalization with causal adaptive gain control. We generate diverse training data covering a broad range of distortions. This work was carried out in the context of an "ICASSP Signal Processing Grand Challenge" and submitted to the non-real-time track of the "Speech Signal Improvement Challenge 2023", where it ranked fifth.
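The abstract does not say how the causal adaptive gain control is implemented. The sketch below is only an illustration of the general idea, not the authors' method: global peak normalization needs the whole signal, whereas a frame-wise gain derived from a running power estimate uses only current and past samples. The function names, frame length, smoothing factor `alpha`, and `target_rms` are all assumptions chosen for the example.

```python
import numpy as np


def global_normalize(x, eps=1e-8):
    """Non-causal baseline: scale by the peak of the entire signal."""
    return x / (np.max(np.abs(x)) + eps)


def causal_agc(x, frame_len=256, target_rms=0.1, alpha=0.9, eps=1e-8):
    """Hypothetical causal gain control.

    Each frame is scaled by a gain computed from a recursively smoothed
    power estimate that only sees the current and earlier frames.
    """
    y = np.zeros_like(x, dtype=float)
    power = None
    for start in range(0, len(x), frame_len):
        frame = x[start:start + frame_len]
        frame_power = float(np.mean(frame ** 2))
        if power is None:
            # First frame: no history yet, so start from its own power.
            power = frame_power
        else:
            # Exponential smoothing over past frames (assumed update rule).
            power = alpha * power + (1.0 - alpha) * frame_power
        gain = target_rms / (np.sqrt(power) + eps)
        y[start:start + frame_len] = gain * frame
    return y
```

A quick way to check the causality property of such a scheme is to change only the later part of an input and confirm that the earlier part of the output is unaffected, which does not hold for the global variant.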

Keywords:
Normalization (sociology), Computer science, Speech recognition, Speech processing, Signal processing, Generative grammar, SIGNAL (programming language), Context (archaeology), Generative model, Range (aeronautics), Artificial intelligence, Machine learning, Digital signal processing, Engineering

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 1.07
Refs: 12
Citation Normalized Percentile: 0.71

Topics

Speech and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Music and Audio Processing (Physical Sciences → Computer Science → Signal Processing)
Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

SPEECH DENOISING BY GENERATIVE DIFFUSION MODELS

O. V. Girfanov, A. G. Shishkin

Journal:   Nauchno-Tekhnicheskaya Informatsiya, Seriya 2: Informatsionnye Protsessy i Sistemy (Scientific and Technical Information, Series 2: Information Processes and Systems) Year: 2023 Pages: 1-10
JOURNAL ARTICLE

Speech Enhancement with Generative Diffusion Models

O. V. Girfanov, A. G. Shishkin

Journal:   Automatic Documentation and Mathematical Linguistics Year: 2023 Vol: 57 (5) Pages: 249-257
JOURNAL ARTICLE

Causal Diffusion Models for Generalized Speech Enhancement

Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann

Journal:   IEEE Open Journal of Signal Processing Year: 2024 Vol: 5 Pages: 780-789
JOURNAL ARTICLE

Causal Diffusion Models for Generalized Speech Enhancement

Richter, Julius; Welker, Simon; Lemercier, Jean-Marie; Lay, Bunlong; Peer, Tal; Gerkmann, Timo

Journal:   DESY Publication Database (PUBDB) (Deutsches Elektronen-Synchrotron) Year: 2024