Prototype-Based Embedding Network for Scene Graph Generation

Chaofan Zheng; Xinyu Lyu; Lianli Gao; Bo Dai; Jingkuan Song

doi:10.1109/cvpr52729.2023.02182

JOURNAL ARTICLE

Year: 2023 Pages: 22783-22792

Abstract

Current Scene Graph Generation (SGG) methods explore contextual information to predict relationships among entity pairs. However, because the many possible subject-object combinations differ widely in visual appearance, each predicate category exhibits large intra-class variation, e.g., "man-eating-pizza" vs. "giraffe-eating-leaf", and different predicate categories exhibit severe inter-class similarity, e.g., "man-holding-plate" vs. "man-eating-pizza", in the model's latent space. These challenges prevent current SGG methods from acquiring robust features for reliable relation prediction. In this paper, we claim that a predicate's category-inherent semantics can serve as class-wise prototypes in the semantic space to relieve these challenges. To this end, we propose the Prototype-based Embedding Network (PE-Net), which models entities and predicates with prototype-aligned compact and distinctive representations and thereby establishes matching between entity pairs and predicates in a common embedding space for relation recognition. Moreover, Prototype-guided Learning (PL) is introduced to help PE-Net efficiently learn such entity-predicate matching, and Prototype Regularization (PR) is devised to relieve the ambiguous entity-predicate matching caused by semantic overlap between predicates. Extensive experiments demonstrate that our method gains superior relation recognition capability on SGG, achieving new state-of-the-art performance on both the Visual Genome and Open Images datasets. The code is available at https://github.com/VL-Group/PENET.
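The matching idea in the abstract (scoring an entity pair against class-wise predicate prototypes in a shared embedding space) can be illustrated with a toy nearest-prototype sketch. This is not the PE-Net implementation: the `fuse_pair` operator (a plain elementwise sum), the three-dimensional vectors, and the predicate names are all assumptions chosen for illustration, and real prototypes would be learned rather than hand-written.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fuse_pair(subject_vec, object_vec):
    """Hypothetical fusion of a subject-object pair: elementwise sum.

    Assumption for illustration only; the abstract does not specify
    how PE-Net combines the two entity representations.
    """
    return [s + o for s, o in zip(subject_vec, object_vec)]


def match_predicate(subject_vec, object_vec, prototypes):
    """Score the fused pair against every predicate prototype.

    Returns the best-matching predicate name and the full score table.
    """
    pair_vec = fuse_pair(subject_vec, object_vec)
    scores = {name: cosine(pair_vec, proto) for name, proto in prototypes.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy prototypes, one per predicate class (hand-written, not learned).
prototypes = {
    "eating": [1.0, 0.0, 0.0],
    "holding": [0.0, 1.0, 0.0],
    "riding": [0.0, 0.0, 1.0],
}

# Toy entity embeddings for one subject-object pair.
subject_vec = [0.9, 0.1, 0.0]
object_vec = [0.8, 0.3, 0.1]

best, scores = match_predicate(subject_vec, object_vec, prototypes)
print(best)
```

Because every pair is compared against one fixed vector per class, pairs that look very different visually can still land on the same predicate, while a regularizer that pushes prototypes apart would reduce confusion between similar predicates. That is the intuition the abstract attributes to PL and PR; the sketch covers only the inference-time matching.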

Keywords:

Embedding; Computer science; Predicate (mathematical logic); Scene graph; Artificial intelligence; Graph embedding; Matching (statistics); Natural language processing; Pattern recognition (psychology); Mathematics

Metrics

FWCI (Field Weighted Citation Impact): 10.55

Citation Normalized Percentile: 0.98

Topics

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Topic Modeling

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Transformer-based Deep Embedding Network for Scene Graph Generation

Union-Redefined Prototype Network for scene graph generation

Multifeature fusion embedding network for unbiased scene graph generation

Synergetic Prototype Learning Network for Unbiased Scene Graph Generation

Complex Relation Embedding for Scene Graph Generation