This paper proposes a multi-modal reference-free objective quality assessment method for point cloud videos. The method extracts spatial and temporal features from distorted point cloud videos through three network branches (point cloud, projection, and video) and then fuses these spatio-temporal features with a multi-modal attention mechanism to improve the accuracy and generalization of point cloud video quality prediction. Extensive experiments are conducted on a self-constructed database of 36 distorted point cloud video sequences. The results show that the proposed multi-modal method outperforms 11 state-of-the-art unimodal and bimodal methods.
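The fusion step can be illustrated with a minimal sketch. It assumes that each of the three branches has already produced a feature vector of the same length, and it uses plain scaled dot-product self-attention without learned projections followed by mean pooling. The function name `attention_fuse`, the feature values, and the pooling choice are all hypothetical; the abstract does not specify the actual attention architecture.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features):
    """Fuse per-branch feature vectors with scaled dot-product self-attention.

    features: list of equal-length vectors, one per branch (assumed layout).
    Returns (fused_vector, attention_weights), where attention_weights[i][j]
    is how much branch i attends to branch j.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    weights, attended = [], []
    for q in features:
        # Similarity of this branch's features to every branch's features.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all branch features for this query branch.
        attended.append(
            [sum(w[j] * features[j][i] for j in range(len(features))) for i in range(d)]
        )
    # Mean-pool the attended tokens into one fused vector (an assumption).
    fused = [sum(tok[i] for tok in attended) / len(attended) for i in range(d)]
    return fused, weights


# Illustrative, made-up features for the three branches.
point_cloud_feat = [0.2, 0.9, 0.1, 0.4]
projection_feat = [0.3, 0.7, 0.2, 0.5]
video_feat = [0.1, 0.8, 0.3, 0.6]

fused, weights = attention_fuse([point_cloud_feat, projection_feat, video_feat])
```

In a full model, the fused vector would presumably feed a regression head that outputs the predicted quality score, and the attention would use learned query, key, and value projections.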
Zicheng Zhang, Wei Sun, Xiongkuo Min, Qiyuan Wang, Jun He, Quan Zhou, Guangtao Zhai
Yating Liu, Ziyu Shan, Yujie Zhang, Yiling Xu
Xuemei Zhou, Irene Viola, Ruihong Yin, Pablo César
Jit Chatterjee, M.R. Creemers, Johan Coosemans, Maria Torres Vega