JOURNAL ARTICLE

Text To Video: Enhancing Video Generation Using Diffusion Models And Reconstruction Network

Abstract

This paper proposes a method to improve the quality of videos produced by text-to-video generation techniques based on diffusion models, which often suffer from low visual quality and poor temporal continuity. The method dynamically adjusts the noise connections between frames to enhance video quality. A Reconstruction Network is introduced to automatically adjust the noise correlation among frames during training. Experimental results demonstrate that the method improves the quality and continuity of generated videos, sharpens the representation of image details, and strengthens the correspondence between generated and original videos. This research contributes to the development of text-based video generation techniques built on diffusion models.
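The abstract does not give implementation details of the Reconstruction Network. As a rough illustration only (not the paper's actual method), one common way to control noise correlation among frames in video diffusion is to blend a shared noise tensor with per-frame independent noise; the function and parameter names below are hypothetical:

```python
import numpy as np

def correlated_frame_noise(num_frames, shape, alpha, rng=None):
    """Sample per-frame Gaussian noise whose cross-frame correlation is
    controlled by a mixing weight alpha in [0, 1].

    Each frame's noise blends one shared tensor (common to all frames)
    with an independent tensor, scaled so the result stays unit-variance:
        eps_f = sqrt(alpha) * shared + sqrt(1 - alpha) * indep_f
    """
    rng = rng or np.random.default_rng(0)
    shared = rng.standard_normal(shape)
    frames = []
    for _ in range(num_frames):
        indep = rng.standard_normal(shape)
        frames.append(np.sqrt(alpha) * shared + np.sqrt(1.0 - alpha) * indep)
    return np.stack(frames)  # shape: (num_frames, *shape)

def mean_corr(eps):
    """Average pairwise correlation between flattened frames."""
    flat = eps.reshape(eps.shape[0], -1)
    c = np.corrcoef(flat)
    return c[np.triu_indices_from(c, k=1)].mean()

# Higher alpha -> neighbouring frames share more noise structure,
# which tends to translate into smoother generated motion.
low = correlated_frame_noise(8, (4, 4), alpha=0.1)
high = correlated_frame_noise(8, (4, 4), alpha=0.9)
```

In this sketch, `alpha` is fixed; the paper's contribution, per the abstract, is learning such noise relationships automatically during training rather than hand-tuning them.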

Keywords:
Computer science, Video quality, Video noise, Diffusion models, Video frames, Computer vision, Artificial intelligence, Video denoising, Video processing, Video tracking, Multimedia, Multiview Video Coding

Metrics

Cited by: 2
FWCI (Field-Weighted Citation Impact): 0.36
References: 31
Citation Normalized Percentile: 0.56

Topics

Generative Adversarial Networks and Image Synthesis (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Video Analysis and Summarization (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Computer Graphics and Visualization Techniques (Physical Sciences → Computer Science → Computer Graphics and Computer-Aided Design)