Abstract

This paper studies a time-domain neural network approach to speech bandwidth extension. We propose a network architecture, the multi-scale fusion neural network (MfNet), that gradually restores the low-frequency signal and predicts the high-frequency signal by exchanging information across representations at different scales. We also propose a training scheme that optimizes the network with a combination of perceptual loss and time-domain adversarial loss. Experiments show that the proposed network consistently outperforms competing methods in terms of perceptual evaluation of speech quality (PESQ), signal-to-distortion ratio (SDR), signal-to-noise ratio (SNR), log-spectral distance (LSD), and word error rate (WER). Moreover, it requires only 10% of the parameters of the time-domain reference baseline.
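
Two of the signal-level metrics named above, SNR and LSD, can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the paper, so the formulas below are the commonly used ones and are an assumption: SNR as the ratio of reference energy to residual energy in dB, and LSD as the frame-averaged root-mean-square difference of log power spectra. The function names, the frame length, the hop size, and the `eps` floor are all hypothetical choices, and a naive DFT stands in for a proper windowed STFT to keep the example dependency-free.

```python
import cmath
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB (assumed definition, not taken from the paper)."""
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0.0:
        return math.inf  # identical signals: no residual energy
    return 10.0 * math.log10(signal_power / noise_power)


def log_power_spectrum(frame, eps=1e-10):
    """Log10 power spectrum of one frame via a naive DFT (non-negative bins only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(frame)
        )
        # eps is a hypothetical floor that avoids log10(0) on silent bins.
        spectrum.append(math.log10(abs(coeff) ** 2 + eps))
    return spectrum


def lsd(reference, estimate, frame_len=64, hop=32):
    """Log-spectral distance: mean over frames of the RMS log-spectrum difference."""
    distances = []
    for start in range(0, len(reference) - frame_len + 1, hop):
        ref_spec = log_power_spectrum(reference[start:start + frame_len])
        est_spec = log_power_spectrum(estimate[start:start + frame_len])
        squared = [(a - b) ** 2 for a, b in zip(ref_spec, est_spec)]
        distances.append(math.sqrt(sum(squared) / len(squared)))
    if not distances:
        raise ValueError("signal is shorter than one frame")
    return sum(distances) / len(distances)


# Toy signals (hypothetical): a sinusoid and a copy attenuated to 90% amplitude.
reference = [math.sin(2 * math.pi * 5 * i / 256) for i in range(256)]
estimate = [0.9 * x for x in reference]

print(f"SNR: {snr_db(reference, estimate):.2f} dB")
print(f"LSD: {lsd(reference, estimate):.4f}")
```

For the toy pair, the residual is one tenth of the reference amplitude, so the SNR comes out at about 20 dB, while the LSD stays small because a uniform gain shifts every log-spectrum bin by the same amount. Higher SNR and lower LSD indicate a closer match to the wideband reference.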

Keywords:
PESQ; Computer science; Speech recognition; Bandwidth extension; Bandwidth (computing); Artificial neural network; SIGNAL (programming language); Time domain; Artificial intelligence; Speech enhancement; Speech coding; Pattern recognition (psychology); Noise reduction; Audio signal; Computer network

Metrics

Cited By: 12
FWCI (Field Weighted Citation Impact): 1.33
Refs: 54
Citation Normalized Percentile: 0.81

Topics

Speech and Audio Processing
Physical Sciences → Computer Science → Signal Processing
Image and Signal Denoising Methods
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Speech Recognition and Synthesis
Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Bandwidth extension of narrowband speech in log spectra domain using neural network

Sara Pourmohammadi, Mansour Vali, Mohsen Ghadyani

Journal: TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES Year: 2015 Vol: 23 Pages: 433-446
JOURNAL ARTICLE

Transform-Domain Speech Bandwidth Extension

Prasad Nizampatnam, G. R. L. V. N. Srinivasa Raju

Journal: Circuits Systems and Signal Processing Year: 2019 Vol: 38 (12) Pages: 5717-5733
JOURNAL ARTICLE

A neural network approach for speech enhancement and noise-robust bandwidth extension

Xiang Hao, Chenglin Xu, Chen Zhang, Lei Xie

Journal: Computer Speech & Language Year: 2024 Vol: 89 Pages: 101709