DISSERTATION

Structured visual understanding and generation with deep generative models

Song, Yuhang (author)

Year: 2020 University: University of Southern California Digital Library

Abstract

In recent years, deep learning has had a major impact on the computer vision community. Deep learning models, built on a variety of increasingly deep architectures, can now recognize thousands of image categories. In complex scenes, deep neural models can localize objects, detect many object categories, and then perform instance segmentation. More recently, a number of scene graph generation and visual relationship detection methods have been developed for high-level image understanding, in order to extract more fine-grained, structured representations from images. As the dual problem of visual understanding, visual generation has also attracted considerable attention in recent years thanks to deep learning techniques. Deep generative models can generate realistic, high-resolution, high-quality images and can be further applied to image translation across different domains and environments. The world around us is highly structured, and so are images. An image can contain not only multiple foreground object categories but also varied backgrounds, in both natural and artificial scenes. In this thesis, we mainly leverage structural information for visual generation and understanding in these tasks: 1) leveraging the semantic structure to generate realistic images

Keywords:
Deep learning; Leverage (statistics); Generative model; Generative grammar; Segmentation; Object detection; Image segmentation; Pattern recognition (psychology); Cognitive neuroscience of visual object recognition

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Generative Adversarial Networks and Image Synthesis
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Structured Generative Models for Scene Understanding

Christopher K. I. Williams

Journal: International Journal of Computer Vision Year: 2024 Vol: 133 (5) Pages: 2845-2867
JOURNAL ARTICLE

GAN Lab: Understanding Complex Deep Generative Models using Interactive Visual Experimentation

Minsuk Kahng, Nikhil Thorat, Duen Horng Chau, Fernanda Viégas, Martin Wattenberg

Journal: IEEE Transactions on Visualization and Computer Graphics Year: 2018 Vol: 25 (1) Pages: 310-320
JOURNAL ARTICLE

Human-controllable and structured deep generative models

Tran, Dieu Linh

Journal: Spiral (Imperial College London) Year: 2021