BOOK-CHAPTER

Text-to-Image Generation using Generative Adversarial Network

Abstract

Text-to-image generation with Generative Adversarial Networks (GANs) is a deep learning approach that synthesizes images from natural-language descriptions. It underpins a wide range of applications, including photo search, photo editing, art creation, computer-aided design, image reconstruction, captioning, and portrait drawing. The central challenge is consistently producing realistic images under the given conditions; current text-to-image algorithms often generate images that do not faithfully reflect the input text. The proposed model was trained on the Caltech-UCSD Birds-200-2011 dataset, and its performance was assessed with the Inception Score and peak signal-to-noise ratio (PSNR). The proposed StackGAN architecture comprises two stages. Stage-I GAN sketches the basic shape and colors of the object described in the input text, producing a low-resolution image. Stage-II GAN takes the Stage-I result and the text description as inputs, corrects defects and adds fine details, and produces a high-resolution, photo-realistic image.
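The two-stage pipeline described above can be sketched as follows. This is a minimal illustrative sketch, not the chapter's implementation: the embedding sizes, random linear weights, and nearest-neighbour upsampling are assumptions standing in for StackGAN's learned text encoder and convolutional generators, and only the data flow (text + noise → 64×64 image → 256×256 image) mirrors the described architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def stage1_generator(text_embedding, z):
    # Hypothetical Stage-I: rough low-resolution (64x64) image from the
    # text embedding plus a noise vector. A random linear map stands in
    # for the learned generator network.
    h = np.concatenate([text_embedding, z])
    w = rng.standard_normal((h.size, 64 * 64 * 3)) * 0.01
    return np.tanh(h @ w).reshape(64, 64, 3)

def stage2_generator(stage1_img, text_embedding):
    # Hypothetical Stage-II: upsample the Stage-I result to 256x256 and
    # condition on the text again (here, a trivial text-dependent shift
    # stands in for defect correction and detail refinement).
    up = stage1_img.repeat(4, axis=0).repeat(4, axis=1)  # nearest-neighbour 4x
    bias = np.tanh(text_embedding.mean()) * 0.1
    return np.clip(up + bias, -1.0, 1.0)

text_emb = rng.standard_normal(128)  # stand-in for a learned text-encoder output
z = rng.standard_normal(100)         # noise vector
low_res = stage1_generator(text_emb, z)
high_res = stage2_generator(low_res, text_emb)
print(low_res.shape, high_res.shape)  # (64, 64, 3) (256, 256, 3)
```

The key design point this illustrates is that Stage-II never sees the raw noise: it refines Stage-I's output while re-reading the text, which is what lets it fix shape defects and add texture at the higher resolution.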

Keywords:
Adversarial system; Generative grammar; Image (mathematics); Generative adversarial network; Computer science; Artificial intelligence; Computer vision

