ERBM-SE: Extended Restricted Boltzmann Machine for Multi-Objective Single-Channel Speech Enhancement.

Muhammad Irfan Khattak; Nasir Saleem; Aamir Nawaz; Aftab Ahmed Almani; Farhana Umer; Elena Verdú

doi:10.9781/ijimai.2022.03.002

JOURNAL ARTICLE

Year: 2022
Journal: International Journal of Interactive Multimedia and Artificial Intelligence
Vol: 7 (4)
Pages: 185-195
Publisher: International University of La Rioja

Abstract

Supervised, machine learning-based single-channel speech enhancement has attracted considerable research interest compared with conventional approaches. In this paper, an extended Restricted Boltzmann Machine (RBM) is proposed for spectral masking-based enhancement of noisy speech. In a conventional RBM, the acoustic features for the speech enhancement task are extracted layer by layer, and the feature compression may result in the loss of vital information during network training. To exploit the important information in the raw data, an extended RBM is proposed for acoustic feature representation and speech enhancement. In the proposed RBM, the acoustic features are progressively extracted by multiple stacked RBMs during the pre-training phase. The hidden acoustic features from the previous RBM are combined with the raw input data, and the combination serves as the new input to the current RBM. By adding the raw data to each RBM, the layer-wise features related to the raw data are progressively extracted, which helps to mine valuable information in the raw data. Results on the TIMIT database show that the proposed method successfully attenuates the noise and improves speech quality and intelligibility. STOI, PESQ, and SDR are improved by 16.86%, 25.01%, and 3.84 dB, respectively, over the unprocessed noisy speech.
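The stacking scheme described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: "combined" is read as feature concatenation, each RBM is a plain Bernoulli RBM trained with one-step contrastive divergence (CD-1), the raw features are assumed to be scaled to [0, 1], and the names `RBM`, `pretrain_extended_stack`, and `extract_features` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Minimal Bernoulli RBM trained with CD-1 (illustrative, not the paper's model)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.05):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)


def pretrain_extended_stack(raw, hidden_sizes, epochs=5, seed=0):
    """Greedy layer-wise pre-training where every RBM after the first
    sees [previous hidden features, raw input] (assumed: concatenation)."""
    rng = np.random.default_rng(seed)
    rbms = []
    layer_input = raw
    for n_hidden in hidden_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(layer_input)
        rbms.append(rbm)
        hidden = rbm.hidden_probs(layer_input)
        # Re-inject the raw data alongside the hidden features.
        layer_input = np.concatenate([hidden, raw], axis=1)
    return rbms


def extract_features(rbms, raw):
    """Forward pass through the stack; returns the last hidden layer."""
    layer_input = raw
    for rbm in rbms:
        hidden = rbm.hidden_probs(layer_input)
        layer_input = np.concatenate([hidden, raw], axis=1)
    return hidden
```

With 20 raw features and hidden sizes `[16, 8]`, the second RBM in this sketch has 16 + 20 = 36 visible units, which is how the raw data stays available to deeper layers instead of being seen only through the compressed features.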

Keywords:

Computer science, Channel (broadcasting), Boltzmann machine, Speech recognition, Speech enhancement, Artificial intelligence, Telecommunications, Deep learning

Metrics

FWCI (Field Weighted Citation Impact): 0.39

Citation Normalized Percentile: 0.48

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Infant Health and Development

Health Sciences → Health Professions → Pharmacy

Related Documents

ERBM-SE: Extended Restricted Boltzmann Machine for Multi-Objective Single-Channel Speech Enhancement

Restricted Boltzmann machine based algorithm for multi-objective optimization

Correntropy-Based Multi-objective Multi-channel Speech Enhancement

Single-channel Speech Enhancement Student under Multi-channel Speech Enhancement Teacher