Joint Training of Deep Neural Networks for Multi-Channel Dereverberation and Speech Source Separation

Masahito Togami

doi:10.1109/icassp40776.2020.9053791

JOURNAL ARTICLE

Year: 2020 Pages: 3032-3036

Abstract

In this paper, we propose joint training of two deep neural networks (DNNs) for dereverberation and speech source separation. The proposed method connects the first DNN, the dereverberation part, the second DNN, and the speech source separation part in a cascade manner. Rather than training each DNN separately, it adopts an integrated loss function that evaluates the output signal after both dereverberation and speech source separation. The proposed method estimates the output signal as a probabilistic variable. Recently, in the speech source separation context, we proposed a loss function that evaluates the estimated posterior probability density function (PDF) of the output signal. In this paper, we extend this loss function to evaluate not only speech source separation performance but also speech dereverberation performance. Since the output signal of the dereverberation part is converted into the input feature of the second DNN, the gradient of the loss function is back-propagated into the first DNN through the input feature of the second DNN. Experimental results show that the proposed joint training of two DNNs is effective. It is also shown that the posterior-PDF-based loss function is effective in the joint training context.
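The gradient path described in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and not the paper's model: a single scalar `w` stands in for the first DNN plus dereverberation part (a one-tap delayed subtraction), scalars `a` and `c` stand in for the second DNN (a sigmoid mask computed from the feature `d * d` of the dereverberated signal), `b` is a log-variance, and the loss is a real-valued Gaussian negative log-likelihood used here as a simple stand-in for a posterior-PDF-based loss. The signal is single-channel and time-domain. The point is only that one loss on the final output yields a gradient for `w` that flows both directly and through the second stage's input feature.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_toy_data(T=64):
    # Hypothetical data: clean signal s, observation x with a one-tap recursive echo.
    s = [0.5 * math.sin(0.7 * t) + 0.25 * math.sin(1.9 * t) for t in range(T)]
    x = []
    for t in range(T):
        x.append(s[t] + 0.5 * (x[t - 1] if t > 0 else 0.0))
    return s, x

def forward_backward(x, s, p):
    """Return (loss, grads) of the integrated loss for p = {'w','a','c','b'}."""
    T = len(x)
    v = math.exp(p["b"])                      # output variance
    loss = 0.0
    g = {"w": 0.0, "a": 0.0, "c": 0.0, "b": 0.0}
    for t in range(T):
        xp = x[t - 1] if t > 0 else 0.0
        d = x[t] - p["w"] * xp                # stage 1: toy dereverberation output
        f = d * d                             # input feature of stage 2
        m = sigmoid(p["a"] * f + p["c"])      # stage 2: toy mask
        mu = m * d                            # mean of the output distribution
        err = s[t] - mu
        # Gaussian negative log-likelihood of the clean signal, averaged over time.
        loss += 0.5 * (math.log(2 * math.pi) + p["b"] + err * err / v) / T
        dmu = -err / v / T                    # dL/dmu
        dz = dmu * d * m * (1.0 - m)          # dL/d(mask pre-activation)
        g["a"] += dz * f
        g["c"] += dz
        g["b"] += 0.5 * (1.0 - err * err / v) / T
        # Gradient reaching stage 1: direct path plus the path through the feature f.
        dd = dmu * m + dz * p["a"] * 2.0 * d
        g["w"] += dd * (-xp)
    return loss, g

def train(x, s, steps=200, lr=0.01):
    # Plain gradient descent on all parameters at once (joint training).
    p = {"w": 0.0, "a": 0.0, "c": 0.0, "b": 0.0}
    history = []
    for _ in range(steps):
        loss, g = forward_backward(x, s, p)
        history.append(loss)
        for k in p:
            p[k] -= lr * g[k]
    return p, history

s, x = make_toy_data()
params, history = train(x, s)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

Training the two stages separately would need an intermediate target for `d`. In this sketch no such target is used, and `w` is updated only through the final loss.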

Keywords:

Computer science, Source separation, Speech recognition, Joint (building), Context (archaeology), Artificial neural network, SIGNAL (programming language), Reverberation, Channel (broadcasting), Feature (linguistics), Function (biology), Probabilistic logic, Artificial intelligence, Acoustics, Telecommunications, Engineering

Metrics

FWCI (Field Weighted Citation Impact): 0.89

Citation Normalized Percentile: 0.73

Topics

Speech and Audio Processing

Physical Sciences → Computer Science → Signal Processing

Blind Source Separation Techniques

Physical Sciences → Computer Science → Signal Processing

Speech Recognition and Synthesis

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Single Channel Speech Source Separation Using Hierarchical Deep Neural Networks

Joint Source-Channel Modeling and Estimation for Speech Dereverberation

Source Separation and Speech Dereverberation

Audio-Visual Multi-Channel Speech Separation, Dereverberation and Recognition