JOURNAL ARTICLE

Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective

Abstract

In this paper, we study disentangled representation learning and controllable image synthesis in deep generative models. We develop an encoder-decoder architecture for a variant of the Variational Auto-Encoder (VAE) with two latent codes, z1 and z2. Our framework uses z2 to capture specified factors of variation while z1 captures the complementary factors. To this end, we analyze the learning problem from the perspective of multivariate mutual information, derive optimizable lower bounds on the conditional mutual information in the image synthesis process, and incorporate them into the training objective. We validate our method empirically on the Color MNIST and CelebA datasets by demonstrating controllable image synthesis. The proposed paradigm is simple yet effective and applies to many settings, including those where no explicit factorization of features is available or where the features are non-categorical.
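The abstract's two-latent-code design can be illustrated with a minimal forward-pass sketch. This is not the paper's implementation: the layer sizes, the single hidden layer, and the use of NumPy instead of a deep-learning framework are all illustrative assumptions; only the overall structure (one encoder producing z1 and z2 via the reparameterization trick, one decoder consuming both, and controllable synthesis by resampling z2 while holding z1 fixed) reflects the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): a flattened 28x28 image,
# z1 for the complementary factors, z2 for the specified factors.
X_DIM, Z1_DIM, Z2_DIM, H_DIM = 784, 8, 4, 64

def init(shape):
    return rng.normal(0.0, 0.05, shape)

# Encoder parameters: x -> hidden -> (mu, log_var) over the joint code.
W_enc, b_enc = init((X_DIM, H_DIM)), np.zeros(H_DIM)
W_mu, b_mu = init((H_DIM, Z1_DIM + Z2_DIM)), np.zeros(Z1_DIM + Z2_DIM)
W_lv, b_lv = init((H_DIM, Z1_DIM + Z2_DIM)), np.zeros(Z1_DIM + Z2_DIM)

# Decoder parameters: (z1, z2) -> hidden -> reconstructed x.
W_dec, b_dec = init((Z1_DIM + Z2_DIM, H_DIM)), np.zeros(H_DIM)
W_out, b_out = init((H_DIM, X_DIM)), np.zeros(X_DIM)

def encode(x):
    """Encode a batch of images into the two latent codes."""
    h = np.tanh(x @ W_enc + b_enc)
    mu, log_var = h @ W_mu + b_mu, h @ W_lv + b_lv
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    # Split the joint sample into the two codes.
    return z[:, :Z1_DIM], z[:, Z1_DIM:], mu, log_var

def decode(z1, z2):
    """Decode the concatenated latent codes back to pixel space."""
    h = np.tanh(np.concatenate([z1, z2], axis=1) @ W_dec + b_dec)
    return 1.0 / (1.0 + np.exp(-(h @ W_out + b_out)))  # sigmoid pixels

x = rng.random((2, X_DIM))              # a toy batch of two "images"
z1, z2, mu, log_var = encode(x)
x_hat = decode(z1, z2)                  # reconstruction

# Controllable synthesis: hold z1 fixed and resample z2, so only the
# specified factors of variation change in the output.
x_edit = decode(z1, rng.standard_normal(z2.shape))
```

In the paper's full training objective, the lower bounds on conditional mutual information would be added to the usual VAE loss to encourage this separation; the sketch above only shows the architectural split of the latent space.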

Keywords:
Disentangled representation learning; Controllable image synthesis; Variational Auto-Encoder; Mutual information; Deep generative models
