JOURNAL ARTICLE

SGTR+: End-to-End Scene Graph Generation With Transformer

Rongjie Li, Songyang Zhang, Xuming He

Year: 2023 Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence Vol: 46 (4) Pages: 2191-2205 Publisher: IEEE Computer Society

Abstract

Scene Graph Generation (SGG) remains a challenging visual understanding task due to its compositional property. Most previous works adopt either a bottom-up, two-stage approach or a point-based, one-stage approach, which often suffer from high time complexity or suboptimal designs. In this paper, we propose a novel SGG method that addresses these issues by formulating the task as a bipartite graph construction problem. Specifically, we create a transformer-based end-to-end framework that generates the entity and entity-aware predicate proposal sets and infers directed edges to form relation triplets. Moreover, we design a graph assembling module to infer the connectivity of the bipartite scene graph based on our entity-aware structure, enabling us to generate the scene graph in an end-to-end manner. Building on the bipartite graph assembling paradigm, we further propose a new technical design to improve the efficacy of entity-aware modeling and the optimization stability of graph assembling. Equipped with the enhanced entity-aware design, our method achieves a favorable trade-off between performance and time complexity. Extensive experimental results show that our design achieves state-of-the-art or comparable performance on three challenging benchmarks, surpassing most existing approaches while offering higher inference efficiency.

Keywords:
Computer science, Inference, End-to-end principle, Theoretical computer science, Graph, Bipartite graph, Transformer, Artificial intelligence

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 1.64
Refs: 84
Citation Normalized Percentile: 0.82

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

SGTR: End-to-end Scene Graph Generation with Transformer

Rongjie Li, Songyang Zhang, Xuming He

Journal: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Year: 2022 Pages: 19464-19474
JOURNAL ARTICLE

End-to-End Video Scene Graph Generation With Temporal Propagation Transformer

Yong Zhang, Yingwei Pan, Ting Yao, Rui Huang, Tao Mei, Chang Wen Chen

Journal: IEEE Transactions on Multimedia Year: 2023 Vol: 26 Pages: 1613-1625
JOURNAL ARTICLE

Pair with prior queries for end-to-end scene graph generation

Songqing Cai, Xiaojun Chang, Shengsheng Ren

Journal: IET conference proceedings. Year: 2024 Vol: 2023 (38) Pages: 100-105