Learning Trajectory-Aware Transformer for Video Super-Resolution

Chengxu Liu; Huan Yang; Jianlong Fu; Xueming Qian

doi:10.1109/cvpr52688.2022.00560

Year: 2022; Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Pages: 5677-5686

Abstract

Video super-resolution (VSR) aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts. Although some progress has been made, effectively exploiting temporal dependencies across entire video sequences remains a major challenge. Existing approaches usually align and aggregate information from a limited number of adjacent frames (e.g., 5 or 7 frames), which limits the results they can achieve. In this paper, we take one step further to enable effective spatio-temporal learning in videos. We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR). In particular, we formulate video frames into several pre-aligned trajectories which consist of continuous visual tokens. For a query token, self-attention is only learned on relevant visual tokens along spatio-temporal trajectories. Compared with vanilla vision Transformers, such a design significantly reduces the computational cost and enables Transformers to model long-range features. We further propose a cross-scale feature tokenization module to overcome scale-changing problems that often occur in long-range videos. Experimental results demonstrate the superiority of the proposed TTVSR over state-of-the-art models in extensive quantitative and qualitative evaluations on four widely used video super-resolution benchmarks. Both code and pre-trained models can be downloaded at https://github.com/researchmm/TTVSR.
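The idea of restricting self-attention to tokens along a trajectory can be illustrated with a minimal sketch. It is not the TTVSR implementation: the function names, the list-based tensor layout, the `traj` index table, and the use of plain softmax attention are all assumptions made for illustration, since the abstract does not specify the exact attention form. Each query token attends to one token per past frame (the one on its trajectory) rather than to every token of every frame.

```python
import math

def trajectory_attention(query, keys, values, traj):
    """Softmax attention restricted to tokens along each query's trajectory.

    Hypothetical layout (an assumption, not taken from the paper):
      query:  N tokens of the current frame, each a list of C floats
      keys:   T past frames, each a list of N key tokens
      values: T past frames, each a list of N value tokens
      traj:   traj[t][i] is the index, in past frame t, of the token that
              lies on the trajectory of query token i
    Returns N output tokens.
    """
    num_frames = len(keys)
    outputs = []
    for i, q in enumerate(query):
        # Gather exactly one key/value per past frame: the trajectory token.
        ks = [keys[t][traj[t][i]] for t in range(num_frames)]
        vs = [values[t][traj[t][i]] for t in range(num_frames)]
        # Scaled dot-product scores over the T trajectory tokens only.
        scale = math.sqrt(len(q))
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in ks]
        # Numerically stable softmax.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted sum of the gathered values.
        outputs.append([
            sum(weights[t] * vs[t][c] for t in range(num_frames)) / total
            for c in range(len(vs[0]))
        ])
    return outputs

def attended_pairs(num_frames, tokens_per_frame):
    """Count query-key pairs for full attention vs. trajectory attention."""
    full = tokens_per_frame * num_frames * tokens_per_frame
    along_trajectory = tokens_per_frame * num_frames
    return full, along_trajectory
```

Under these assumptions, the number of query-key pairs per frame drops from N·T·N to N·T, which is one way to read the abstract's claim of reduced computational cost; `attended_pairs` simply makes that count explicit.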

Keywords:

Computer science; Artificial intelligence; Transformer; Security token; Computer vision; Trajectory; Frame rate

Metrics

Cited By: 107

FWCI (Field Weighted Citation Impact): 7.25

Citation Normalized Percentile: 0.97

Topics

Advanced Image Processing Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Vision and Imaging

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Image and Signal Denoising Methods

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

TTVFI: Learning Trajectory-Aware Transformer for Video Frame Interpolation

Compression-Aware Video Super-Resolution

Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution

Learning Degradation-Robust Spatiotemporal Frequency-Transformer for Video Super-Resolution

Self-guided Transformer for Video Super-Resolution