The Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) is the state-of-the-art deep-learning-based method for image super-resolution (SR) and achieves the best perceptual quality. However, we find that it is time-consuming, which makes it impractical for client-side SR during video delivery: SR usually runs on the client's computing resources, which are typically far less powerful than a GPU, and video often requires real-time playback. Although the Efficient Sub-Pixel Convolutional Neural Network (ESPCN) has the best real-time performance, it still cannot offer a smooth viewing experience and yields much lower perceptual quality. To meet the demands of client-side SR for both real-time performance and visually pleasing results, we propose RTSRGAN, which combines the advantages of ESRGAN in perceptual quality and of ESPCN in real-time performance. Our experimental results indicate that RTSRGAN has the fastest reconstruction speed, averaging 15 images per second on a single 2.3 GHz CPU (versus only 6 images per second for ESPCN), while reconstructing images of relatively acceptable perceptual quality. This validates that RTSRGAN can be used for client-side SR to improve real-time performance while preserving perceptual image quality.
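To make the reported throughput figures concrete, the short sketch below converts them into an average time per reconstructed image and a relative speedup. It uses only the two rates stated above (15 and 6 images per second); the function and variable names are illustrative and do not come from the paper, and the calculation assumes each rate is a simple average over the test images.

```python
def per_image_ms(images_per_second: float) -> float:
    """Average time spent per reconstructed image, in milliseconds."""
    return 1000.0 / images_per_second


# Rates reported in the abstract (single 2.3 GHz CPU).
RTSRGAN_IPS = 15.0
ESPCN_IPS = 6.0

rtsrgan_ms = per_image_ms(RTSRGAN_IPS)
espcn_ms = per_image_ms(ESPCN_IPS)
speedup = RTSRGAN_IPS / ESPCN_IPS  # ratio of the two reported rates

print(f"RTSRGAN: {rtsrgan_ms:.1f} ms per image")
print(f"ESPCN:   {espcn_ms:.1f} ms per image")
print(f"Relative speedup: {speedup:.1f}x")
```

Under these assumptions, RTSRGAN spends roughly 67 ms per image against roughly 167 ms for ESPCN, a ratio of 2.5.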
Xiaoyan Hu, Zechen Wang, Xiangjun Liu, Xinran Li, Guang Cheng, Jian Gong
Naveen Ananda Kumar Joseph Annaiah, Mohan Mahanty, B. Omkar Lakshmi Jagan
Vivek Jain, B. Annappa, Shubham Dodia