JOURNAL ARTICLE

Generative adversarial networks for image synthesis

Zhang, Han

Year: 2019
Journal: Rutgers University Community Repository (Rutgers University)
Publisher: Rutgers, The State University of New Jersey

Abstract

Image synthesis is an important problem in computer vision with many applications, such as computer-aided design and photo editing. The emergence of Generative Adversarial Networks (GANs) has brought remarkable progress in this direction. However, GANs still face key challenges in generating high-quality images: the difficulty of directly approximating the high-resolution image distribution, poor generalization to datasets with multiple classes, frequent mode collapse, and unstable training. To tackle these challenges, we conduct extensive studies on designing new network architectures, adding regularization, introducing heuristic tricks, and modifying the learning objectives and dynamics. (i) New Stacked Generative Adversarial Networks (StackGANs) are proposed for high-resolution image synthesis. StackGAN-v1 is first built to decompose the hard image generation problem into more manageable sub-problems through a sketch-refinement process, generating unprecedented 256×256 photo-realistic images from text descriptions. Moreover, a novel Conditioning Augmentation technique, which encourages smoothness in the latent conditioning manifold, is introduced to improve the diversity of the synthesized images and stabilize the training of the conditional GAN. To further improve the quality of generated samples and stabilize GAN training, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is presented for both conditional and unconditional generative tasks. (ii) A novel Self-Attention Generative Adversarial Network (SAGAN) is introduced for multi-class image generation. Our SAGAN incorporates the self-attention mechanism into the convolutional GAN framework so that it can model long-range, multi-level dependencies when generating realistic images on challenging datasets such as ImageNet.
Moreover, we show that spectral normalization applied to the generator can stabilize GAN training and that the two-timescale update rule (TTUR) can speed up the training of regularized discriminators. (iii) We present the Optimal Transport Generative Adversarial Network (OT-GAN), a GAN variant that minimizes a new metric measuring the distance between the generator distribution and the data distribution. This metric, called the mini-batch energy distance, combines optimal transport in primal form with an energy distance defined in an adversarially learned feature space, resulting in a highly discriminative distance function with unbiased mini-batch gradients. Both qualitative and quantitative validation experiments are conducted for all proposed methods.
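
As a rough illustration of the mini-batch energy distance described in (iii), the sketch below combines an entropy-regularized (Sinkhorn) optimal transport cost with an energy-distance-style estimator of the form 2·E[W(X, Y)] − E[W(X, X′)] − E[W(Y, Y′)], computed from two independent mini-batches per distribution. It is not the thesis implementation. The function names (`cosine_cost`, `sinkhorn_distance`, `minibatch_energy_distance`), the cosine cost, and the values of `reg` and `n_iters` are all assumptions made for illustration, and raw vectors stand in for the adversarially learned features.

```python
import numpy as np


def cosine_cost(a, b):
    """Pairwise cosine distance between rows of a (n, d) and b (m, d)."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a_n @ b_n.T


def sinkhorn_distance(a, b, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between two mini-batches with uniform weights.

    Assumption: Sinkhorn iterations are used here as a simple stand-in for
    solving the primal optimal transport problem.
    """
    cost = cosine_cost(a, b)
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)   # uniform weights on batch a
    nu = np.full(m, 1.0 / m)   # uniform weights on batch b
    kernel = np.exp(-cost / reg)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]   # approximate transport plan
    return float(np.sum(plan * cost))


def minibatch_energy_distance(x, x2, y, y2, reg=0.1, n_iters=200):
    """Energy-distance-style estimate from independent batches.

    x, x2: two independent mini-batches from the data distribution.
    y, y2: two independent mini-batches from the generator distribution.
    """
    def w(p, q):
        return sinkhorn_distance(p, q, reg=reg, n_iters=n_iters)

    cross = w(x, y) + w(x, y2) + w(x2, y) + w(x2, y2)
    within = 2.0 * w(x, x2) + 2.0 * w(y, y2)
    return cross - within
```

Called on four batches of feature vectors of equal dimension, `minibatch_energy_distance` returns a scalar that is close to zero when the data and generator batches come from the same distribution and grows as the two distributions move apart. In an actual GAN the inputs would be critic features rather than raw samples, and the computation would run in an autodiff framework so that gradients reach the generator.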

Keywords:
Generative grammar; Adversarial system; Image synthesis; Generalization; Key (lock); Image (mathematics); Heuristic; Generative adversarial network; Normalization (sociology)

Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 0.19
Refs: 0
Citation Normalized Percentile: 0.78
