JOURNAL ARTICLE

Hybrid video coding scheme based on VVC and spatio-temporal attention convolution neural network

Gang He, Kepeng Xu, Chang Wu, Zijia Ma, Xing Wen, Ming Sun

Year: 2022
Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Pages: 1790-1793

Abstract

In this paper, we propose a hybrid video coding framework. The framework builds on the VVC (Versatile Video Coding) standard and constructs an implicitly aligned multi-frame fusion model to enhance subjective video quality. The proposed framework improves video compression efficiency from two perspectives. The first is a sequence-level dynamic rate control algorithm, which assigns an appropriate bitrate to each video to obtain the highest overall video quality. The second is MAQE, a multi-frame implicit-alignment video quality enhancement model, which performs motion alignment through multiple convolutional kernels of different sizes, uses a residual aggregation layer to fuse features of different frames, and then uses an enhanced attention module to adaptively scale features based on spatio-temporal contextual features, so as to fuse the features of multiple frames more effectively and obtain higher-quality reconstructed frames. The proposed method is validated on the 0.1M and 1M code-rate tracks of the CLIC-2022 video compression task. Experimental results show that it achieves PSNR of 30.301 and 37.251 and MS-SSIM of 0.9368 and 0.9875 on the two tracks, respectively. This paper is a comprehensive presentation of the scheme used by the Night-Watch team in the CLIC-2022 video track.
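
The abstract does not say how the sequence-level rate control chooses a bitrate for each video. As a rough illustration of the general idea only, the sketch below assumes each video has already been encoded at a few operating points, and it greedily spends a shared budget on whichever upgrade buys the most PSNR per extra kilobit. The function name `allocate_bitrate`, the `(size_in_kilobits, psnr_db)` tuples, and the greedy strategy itself are assumptions made for this example, not the authors' algorithm, and a greedy pass is a heuristic that does not guarantee the optimal allocation.

```python
from typing import Dict, List, Tuple

# Hypothetical operating point: (size_in_kilobits, psnr_db).
# Each video's list is assumed to be sorted by ascending size.
RatePoint = Tuple[int, float]


def allocate_bitrate(
    candidates: Dict[str, List[RatePoint]], budget: int
) -> Dict[str, int]:
    """Greedily pick one operating point per video under a total size budget.

    Returns a mapping from video name to the index of the chosen point.
    """
    # Start every video at its cheapest operating point.
    choice = {name: 0 for name in candidates}
    used = sum(points[0][0] for points in candidates.values())
    if used > budget:
        raise ValueError("budget is too small for the cheapest points")

    while True:
        best_name, best_slope = None, 0.0
        for name, points in candidates.items():
            idx = choice[name]
            if idx + 1 >= len(points):
                continue  # already at the highest-rate point
            extra_bits = points[idx + 1][0] - points[idx][0]
            gain = points[idx + 1][1] - points[idx][1]
            if extra_bits <= 0 or used + extra_bits > budget:
                continue  # malformed step, or the upgrade does not fit
            slope = gain / extra_bits  # PSNR gained per extra kilobit
            if slope > best_slope:
                best_name, best_slope = name, slope
        if best_name is None:
            return choice  # no affordable upgrade improves quality
        idx = choice[best_name]
        used += candidates[best_name][idx + 1][0] - candidates[best_name][idx][0]
        choice[best_name] = idx + 1
```

With two made-up videos and a budget of 400 kilobits, the sketch upgrades each video once and then stops, because no further upgrade fits within the budget:

```python
points = {
    "a": [(100, 30.0), (200, 34.0), (300, 35.0)],
    "b": [(100, 32.0), (200, 33.5), (300, 34.0)],
}
print(allocate_bitrate(points, budget=400))
```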

Keywords:
Computer science, Fuse (electrical), Artificial intelligence, Coding (social sciences), Video quality, Computer vision, Multiview Video Coding, Data compression, Residual, Motion compensation, Convolution (computer science), Convolutional neural network, Pattern recognition (psychology), Video processing, Artificial neural network, Video tracking, Algorithm, Metric (unit)

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 0.98
Refs: 9
Citation Normalized Percentile: 0.74

Topics

Video Coding and Compression Technologies
Physical Sciences →  Computer Science →  Signal Processing
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition