Text-to-video generation techniques based on diffusion models often produce videos of low quality and poor continuity. This paper proposes a method that addresses both problems by dynamically adjusting the connections between the noise of different frames. A Reconstruction Net is introduced to automatically adjust the noise correlation among frames during training. Experimental results show that the method raises the quality of generated videos, improves continuity between frames, renders image details more faithfully, and strengthens the correspondence between generated and original videos. The work contributes to the development of diffusion-based text-to-video generation.
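To make the idea of noise correlation among frames concrete, the following is a minimal sketch, not the paper's method. It assumes a simple mixture model in which each frame's noise is a weighted sum of one shared Gaussian component and one independent Gaussian component. The function names and the fixed parameter `alpha` are hypothetical and introduced only for illustration; in the paper the correlation is adjusted automatically by the Reconstruction Net during training rather than set by hand.

```python
import math
import random


def correlated_frame_noise(num_frames, frame_size, alpha, seed=0):
    """Sample one noise vector per frame with a shared component.

    ASSUMPTION: alpha in [0, 1] is a hand-picked correlation strength.
    The paper learns this relationship; it is fixed here for clarity.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    rng = random.Random(seed)
    # Component common to every frame.
    shared = [rng.gauss(0.0, 1.0) for _ in range(frame_size)]
    frames = []
    for _ in range(num_frames):
        # Component drawn independently for this frame.
        independent = [rng.gauss(0.0, 1.0) for _ in range(frame_size)]
        # The weights sqrt(alpha) and sqrt(1 - alpha) keep unit variance
        # and give any two frames a correlation of alpha.
        frames.append([
            math.sqrt(alpha) * s + math.sqrt(1.0 - alpha) * e
            for s, e in zip(shared, independent)
        ])
    return frames


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    noise = correlated_frame_noise(num_frames=4, frame_size=20000, alpha=0.8)
    print("correlation between frames 0 and 1:", pearson(noise[0], noise[1]))
```

Under this assumed model, `alpha = 0` gives fully independent per-frame noise and `alpha = 1` gives identical noise in every frame. A learned adjustment can be read as choosing a point between these two extremes.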