Hua Wang, Rapeeporn Chamchong, Phatthanaphong Chomphuwiset, Pornntiwa Pawara
ABSTRACT Space‐time video super‐resolution (STVSR) aims to construct video sequences with high spatial and temporal resolution from low‐frame‐rate, low‐resolution video sequences. While recent STVSR works combine temporal interpolation and spatial super‐resolution in a unified framework, they face challenges in computational complexity across both temporal and spatial dimensions, particularly in achieving accurate intermediate frame interpolation and efficient temporal information utilisation. To address these challenges, we propose a deformable attention network for efficient STVSR. Specifically, we introduce a deformable interpolation block that employs hierarchical feature fusion to handle complex inter‐frame motions at multiple scales, enabling more accurate intermediate frame generation. To fully utilise temporal information, we design a temporal feature shuffle block (TFSB) that efficiently exchanges complementary information across multiple frames. Additionally, we develop a motion feature enhancement block incorporating a channel attention mechanism to selectively emphasise motion‐related features, further boosting the TFSB's effectiveness. Experimental results on benchmark datasets demonstrate that our proposed method achieves competitive performance in STVSR tasks.
Hai Wang, Xiaoyu Xiang, Yapeng Tian, Wenming Yang, Qingmin Liao
Linling Jiang, Xin Wang, Fan Zhang, Caiming Zhang
Chenyu You, Lianyi Han, Aosong Feng, Ruihan Zhao, Hui Tang, Wei Fan