Jiawei Ru, Lizhong Wang, Maoshen Jia, Liang Wen, Handong Wang, Yuhao Zhao, Jing Wang
Abstract This paper proposes a transform-domain audio coding method based on deep complex networks. In the proposed codec, the time-frequency spectrum of the audio signal is fed to an encoder consisting of complex convolutional blocks and a frequency-temporal modeling module; the extracted features are then quantized at a target bitrate by a vector quantizer. The decoder, which reconstructs the time-frequency spectrum of the audio from the quantized features, is symmetric to the encoder. To capture both frequency and temporal dependencies, a structure combining a complex multi-head self-attention module with a complex long short-term memory is proposed. Subjective and objective evaluation tests show the advantage of the proposed method.
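The complex-valued layers mentioned in the abstract build on a complex linear map applied to real/imaginary parts. Below is a minimal single-head NumPy sketch of complex self-attention; the function names, the magnitude-based attention scoring, and all shapes are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

def complex_linear(x_re, x_im, W_re, W_im):
    """Complex linear layer: (x_re + i x_im) @ (W_re + i W_im),
    returned as separate real and imaginary parts."""
    return x_re @ W_re - x_im @ W_im, x_re @ W_im + x_im @ W_re

def complex_self_attention(x_re, x_im, W):
    """Single-head complex self-attention sketch (assumed design).

    x_re, x_im : (T, d) real/imaginary parts of the feature sequence.
    W          : dict with keys "q", "k", "v", each a (W_re, W_im) pair.
    Attention scores are taken as the scaled magnitude of the complex
    Q K^H product and softmax-normalized over time; the resulting real
    weights are applied to both parts of the complex values.
    """
    q_re, q_im = complex_linear(x_re, x_im, *W["q"])
    k_re, k_im = complex_linear(x_re, x_im, *W["k"])
    v_re, v_im = complex_linear(x_re, x_im, *W["v"])
    d = q_re.shape[-1]
    # Complex inner product Q K^H = (q_re + i q_im)(k_re - i k_im)^T
    s_re = q_re @ k_re.T + q_im @ k_im.T
    s_im = q_im @ k_re.T - q_re @ k_im.T
    scores = np.sqrt(s_re**2 + s_im**2) / np.sqrt(d)
    # Numerically stable softmax over the time axis
    a = np.exp(scores - scores.max(axis=-1, keepdims=True))
    a /= a.sum(axis=-1, keepdims=True)
    return a @ v_re, a @ v_im
```

In a full frequency-temporal module this block would be paired with a complex LSTM and applied along both the frequency and time axes; here it only illustrates the complex-arithmetic pattern shared by such layers.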
Wootaek Lim, Inseon Jang, Seungkwon Beack, Jongmo Sung, Tae-Jin Lee
I. M. Chernenkiy, N. N. Trufanov, D. P. Egorov, O. V. Kravchenko
Lei Zhang, Shengyuan Zhou, Tian Zhi, Zidong Du, Yunji Chen
Aditya Arie Nugraha, Antoine Liutkus, Emmanuel Vincent