A Novel End-to-End Transformer for Scene Graph Generation

Chengkai Ren; Xiuhua Liu; Mengyuan Cao; Jian Zhang; Hongwei Wang

doi:10.1109/ijcnn54540.2023.10191798

JOURNAL ARTICLE

Year: 2023 Pages: 1-7

Abstract

An image usually contains not only visual information but also higher-level semantic information. Nevertheless, earlier computer vision algorithms, such as object detection and image classification, use only the visual features of the image. The recent surge of interest in scene graphs in computer vision has raised the challenge of generating structured scene graphs with rich semantic information. This paper proposes a one-stage, query-based, end-to-end Transformer model that generates scene graphs using the Hungarian matching algorithm. We develop an anti-bias reasoner module to reduce the impact of the unbalanced data distribution. We also propose a time-division training strategy that improves training efficiency and speeds up convergence while improving model performance. Experiments on the large-scale Visual Genome dataset were conducted to validate our method. Compared with the existing state-of-the-art method, our method guarantees inference speed while maintaining acceptable performance, making it more suitable for tasks with strict real-time requirements. Our work demonstrates that the one-stage method has great potential for exploration in scene graph generation.
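
The abstract names the Hungarian matching algorithm but does not say how it is applied. In query-based set-prediction models it is commonly used during training to pair each ground-truth item with exactly one predicted query so that the total matching cost is minimal. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `hungarian_match` and the cost values are hypothetical, and in practice the cost would be derived from class and box terms that the abstract does not specify.

```python
from typing import List, Tuple


def hungarian_match(cost: List[List[float]]) -> List[Tuple[int, int]]:
    """Minimum-cost one-to-one assignment (Hungarian algorithm, O(n^2 * m)).

    Rows are ground-truth items, columns are predicted queries (hypothetical
    roles chosen for this sketch). Requires len(rows) <= len(columns).
    Returns (ground_truth_index, query_index) pairs sorted by ground truth.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many queries as ground-truth items")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        # Grow an alternating tree until a free column is reached.
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so at least one new edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip matched edges along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


# Illustrative cost matrix: 2 ground-truth items, 3 predicted queries.
# The numbers are invented for this example.
example_cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.7],
]
matches = hungarian_match(example_cost)
print(matches)
```

Here ground-truth item 0 is paired with query 1 and item 1 with query 0; the unmatched query 2 would typically be treated as a "no object" or "no relation" prediction in set-prediction training, though whether the paper does so is not stated in the abstract.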

Keywords:

Computer science; Transformer; Inference; Semantic reasoner; End-to-end principle; Artificial intelligence; Scene graph; Graph; Information leakage; Computer vision; Machine learning; Data mining; Theoretical computer science; Voltage

Metrics

FWCI (Field Weighted Citation Impact): 0.36

Citation Normalized Percentile: 0.53

Topics

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

SGTR: End-to-end Scene Graph Generation with Transformer

SGTR+: End-to-End Scene Graph Generation With Transformer

End-to-End Video Scene Graph Generation With Temporal Propagation Transformer

DSGG: Dense Relation Transformer for an End-to-End Scene Graph Generation

Pair with prior queries for end-to-end scene graph generation