JOURNAL ARTICLE

End-to-End Paired Ambisonic-Binaural Audio Rendering

Yin Zhu, Qiuqiang Kong, Junjie Shi, Shilei Liu, Xuzhou Ye, Ju-Chiang Wang, Hongming Shan, Junping Zhang

Year: 2024 Journal: IEEE/CAA Journal of Automatica Sinica Vol: 11 (2) Pages: 502-513 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Binaural rendering is of great interest to virtual reality and immersive media. Although humans can naturally use their two ears to perceive the spatial information contained in sounds, binaural rendering is a challenging task for machines, since describing a sound field often requires multiple channels and even metadata about the sound sources. In addition, the perceived sound varies from person to person even in the same sound field. Previous methods generally rely on individual-dependent head-related transfer function (HRTF) datasets and optimization algorithms that act on HRTFs. In practical applications, existing methods have two major drawbacks. The first is a high personalization cost, as traditional methods meet personalization needs by measuring HRTFs. The second is insufficient accuracy, because traditional methods optimize to retain the part of the information that matters most perceptually at the cost of discarding another part. Therefore, it is desirable to develop novel techniques that achieve personalization and accuracy at a low cost. To this end, we focus on the binaural rendering of ambisonics and propose 1) a channel-shared encoder and channel-compared attention integrated into neural networks and 2) a loss function quantifying interaural level differences to deal with spatial information. To verify the proposed method, we collect and release the first paired ambisonic-binaural dataset and introduce three metrics to evaluate the content information and spatial information accuracy of end-to-end methods. Extensive experimental results on the collected dataset demonstrate the superior performance of the proposed method and the shortcomings of previous methods.
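
The abstract names a loss that quantifies interaural level differences (ILD) but does not define it. As a rough illustration of the underlying quantity only, the sketch below computes a broadband ILD in decibels as the energy ratio between the left and right channels, and compares it between a rendered and a reference binaural pair. The function names, the broadband (non-frequency-dependent) formulation, the `eps` stabilizer, and the toy signals are all assumptions made for this example; they are not taken from the paper, and the authors' loss may differ.

```python
import numpy as np


def interaural_level_difference(left, right, eps=1e-12):
    """Broadband ILD in dB; positive when the left channel is louder.

    Hypothetical helper for illustration, not the paper's definition.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    energy_left = np.mean(left ** 2)
    energy_right = np.mean(right ** 2)
    # eps keeps the ratio finite when a channel is silent.
    return 10.0 * np.log10((energy_left + eps) / (energy_right + eps))


def ild_error(rendered, reference):
    """Absolute ILD difference in dB between two (left, right) pairs."""
    return abs(
        interaural_level_difference(*rendered)
        - interaural_level_difference(*reference)
    )


# Toy signals: a 440 Hz tone, one second at 48 kHz.
sample_rate = 48000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)

# Halving the right-channel amplitude quarters its energy.
ild_db = interaural_level_difference(tone, 0.5 * tone)

# Compare that pair against a reference with identical channels (ILD of 0 dB).
error_db = ild_error((tone, 0.5 * tone), (tone, tone))
```

In this toy case the amplitude ratio of 2 corresponds to an ILD of 20·log10(2), roughly 6 dB. A trainable loss would typically compute a differentiable version of such a quantity per frequency band and time frame, which this sketch does not attempt.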

Keywords:
Ambisonics, Binaural recording, Computer science, Rendering (computer graphics), Personalization, Virtual reality, Headphones, Speech recognition, Loudspeaker, Artificial intelligence, Acoustics

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 2.14
Refs: 46
Citation Normalized Percentile: 0.76

Topics

Speech and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Music and Audio Processing
Physical Sciences →  Computer Science →  Signal Processing
Hearing Loss and Rehabilitation
Life Sciences →  Neuroscience →  Cognitive Neuroscience

Related Documents

BOOK-CHAPTER

Ambisonic Binaural Rendering

Newton Lee

Encyclopedia of Computer Graphics and Games Year: 2024 Pages: 96

BOOK-CHAPTER

Binaural Rendering Using Mixed-Order Ambisonic Scheme

Jiawei Xia, Wen Zhang

Lecture Notes in Electrical Engineering Year: 2024 Pages: 85-99

JOURNAL ARTICLE

Diffuse-Field Equalisation of Binaural Ambisonic Rendering

Thomas McKenzie, Damian Murphy, Gavin Kearney

Journal: Applied Sciences Year: 2018 Vol: 8 (10) Pages: 1956

JOURNAL ARTICLE

Interaural Level Difference Optimization of Binaural Ambisonic Rendering

Thomas McKenzie, Damian Murphy, Gavin Kearney

Journal: Applied Sciences Year: 2019 Vol: 9 (6) Pages: 1226