Structure-Aware Generative Adversarial Network for Text-to-Image Generation

Wenjie Chen; Zhangkai Ni; Hanli Wang

doi:10.1109/icip49359.2023.10222100

JOURNAL ARTICLE

Year: 2023 Pages: 2075-2079

Abstract

Text-to-image generation aims to synthesize photo-realistic images from textual descriptions. Existing methods typically align images with their corresponding texts in a joint semantic space. However, the modality gap in this space leads to misalignment. Meanwhile, the limited receptive field of convolutional neural networks leads to structural distortions in generated images. In this work, a structure-aware generative adversarial network (SaGAN) is proposed to (1) semantically align multimodal features in the joint semantic space in a learnable manner; and (2) improve the structure and contour of generated images by means of specially designed content-invariant negative samples. Experimental results show that SaGAN achieves FID improvements of over 30.1% and 8.2% on the CUB and COCO datasets, respectively, compared with state-of-the-art approaches.
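
The abstract reports its gains as percentage improvements in FID (Fréchet Inception Distance), where lower is better. The sketch below shows one plausible reading of such a figure: the relative reduction in FID against a baseline. It also includes a one-dimensional special case of the Fréchet distance between two Gaussians for intuition. The function names and the numeric inputs are hypothetical and are not taken from the paper. The paper does not state that improvements were computed this way, and real FID is computed on high-dimensional Inception features with full covariance matrices.

```python
from statistics import fmean, pstdev


def frechet_distance_1d(real, fake):
    """Squared Frechet distance between 1-D Gaussians fitted to two samples.

    Illustrative special case only: real FID fits multivariate Gaussians to
    Inception features and uses full covariance matrices.
    """
    mu_r, mu_f = fmean(real), fmean(fake)
    sd_r, sd_f = pstdev(real), pstdev(fake)
    # In 1-D, ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2*sqrt(S_r*S_f))
    # reduces to (mu_r - mu_f)^2 + (sd_r - sd_f)^2.
    return (mu_r - mu_f) ** 2 + (sd_r - sd_f) ** 2


def relative_improvement(baseline_fid, new_fid):
    """Percentage reduction in FID relative to a baseline (assumed definition)."""
    if baseline_fid <= 0:
        raise ValueError("baseline_fid must be positive")
    return 100.0 * (baseline_fid - new_fid) / baseline_fid


if __name__ == "__main__":
    # Hypothetical scores, not results from the paper.
    print(relative_improvement(20.0, 14.0))
    print(frechet_distance_1d([0.0, 2.0], [1.0, 3.0]))
```

Under this definition, a drop from a hypothetical baseline FID of 20.0 to 14.0 counts as a 30% improvement, so the percentage depends on the baseline as well as on the absolute gap.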

Keywords:

Semantic space; Computer science; Artificial intelligence; Generative adversarial network; Convolutional neural network; Generative grammar; Joint (building); Adversarial system; Invariant (physics); Image (mathematics); Pattern recognition (psychology); Semantic gap; Computer vision; Image retrieval; Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 0.18

Citation Normalized Percentile: 0.42

Topics

Generative Adversarial Networks and Image Synthesis

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Computer Graphics and Visualization Techniques

Physical Sciences → Computer Science → Computer Graphics and Computer-Aided Design

Digital Media Forensic Detection

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Semantic layout aware generative adversarial network for text-to-image generation

Text-to-Image Generation using Generative Adversarial Network

Text to Image Generation using Generative Adversarial Network Model

Text to Image Generation Using Attentional Generative Adversarial Network

TEXT DESCRIPTION TO IMAGE GENERATION USING GENERATIVE ADVERSARIAL NETWORK