Subjective and Objective Audio-Visual Quality Assessment for Omnidirectional Videos

Xilei Zhu; Huiyu Duan; Yuqin Cao; Yucheng Zhu; Yuxin Zhu; Jing Liu; Xiongkuo Min; Guangtao Zhai; Patrick Le Callet

doi:10.1109/tip.2025.3613957

JOURNAL ARTICLE

Year: 2025 Journal: IEEE Transactions on Image Processing Vol: 34 Pages: 6506-6523 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Virtual Reality (VR) has attracted widespread attention in recent years due to its capability to create immersive experiences by presenting multi-modal information to users. Omnidirectional videos (ODVs), as a prominent component of VR content, are essential across diverse applications. Service providers therefore need to monitor and optimize the quality of ODVs throughout the filming, encoding, decoding, and transmission stages to ensure a high-quality viewing experience. However, most existing Quality of Experience (QoE) studies for ODVs focus only on visual quality and overlook the impact of the audio modality on perceptual quality. This paper presents a comprehensive study of omnidirectional audio-visual quality assessment (OD-AVQA) from both subjective and objective perspectives. Specifically, we first establish a large-scale audio-visual quality assessment database for ODVs named OAVQAD+, which includes 625 distorted omnidirectional audio-visual sequences derived from 25 pristine ODVs, together with the mean opinion scores (MOSs) collected for the QoE of these ODVs. This constitutes the largest database for assessing the audio-visual quality of ODVs. To advance the field of objective OD-AVQA, we construct a benchmark that includes three types of benchmark models. Type I and Type II models integrate well-known video quality assessment (VQA) and audio quality assessment (AQA) methods using support vector regression (SVR) and multi-layer perceptron (MLP), respectively, while Type III consists of AVQA models specifically designed for traditional 2D audio-visual sequences. We also propose a novel Omnidirectional Audio-Visual quality assessment Network (OmniAVNet), which integrates quality-aware audio, visual, and motion features to predict the overall audio-visual quality of ODVs and supports both full-reference (FR) and no-reference (NR) assessment. Extensive experimental results demonstrate that OmniAVNet outperforms the benchmark OD-AVQA models on two OD-AVQA databases and also performs well on one omnidirectional VQA database. The database and code are available at https://github.com/IntMeGroup/OmniAVNet.

Topics

Image and Video Quality Assessment

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Subjective Quality Assessment of HDR Stereoscopic Omnidirectional Videos

Audio-Visual Saliency for Omnidirectional Videos

Subjective and objective quality assessment of omnidirectional video

Subjective and objective quality assessment for omnidirectional video

Subjective and Objective Audio-Visual Quality Assessment for User Generated Content