Zhenghao Wang, Jing Lian, Linhui Li, Jian Zhao
Scene graph generation aims to recognize objects and infer the relationships between them, providing a comprehensive understanding of the visual content of an image. However, the long-tailed distribution of relations remains a challenge for scene graph generation. This paper proposes a novel framework that combines knowledge-driven and data-driven learning to address the long-tail problem in scene graph generation. The proposed framework consists of two modules: a relation inference module and a prior knowledge learning module. The relation inference module learns the relational features of entity pairs in images and the structural features of scene graphs. The prior knowledge learning module learns triplet representations from a knowledge graph and uses them as prior knowledge to provide logical guidance and constraints for relation inference. This prior shifts the bias of relation inference away from head categories and toward reasonable categories, thereby mitigating the long-tail problem. Experimental results indicate that the proposed framework outperforms existing methods on the Visual Genome dataset and that the relations in the generated scene graphs are logically reasonable.
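The idea of letting a knowledge-derived prior pull relation predictions away from head categories can be illustrated with a minimal sketch. The abstract does not specify how the two modules are fused, so the additive log-prior fusion below is an assumption, and the function names, predicate list, scores, and prior values are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_prior(data_logits, prior_probs, weight=1.0, eps=1e-8):
    """Add a weighted log-prior to data-driven logits, then renormalize.

    Assumption: additive log-prior fusion. The framework's actual
    fusion rule is not stated in the abstract.
    """
    fused = [
        logit + weight * math.log(p + eps)
        for logit, p in zip(data_logits, prior_probs)
    ]
    return softmax(fused)


# Hypothetical predicate vocabulary for one subject-object pair.
predicates = ["on", "sitting on", "flying in"]

# Hypothetical data-driven scores, biased toward the head class "on".
data_logits = [2.0, 1.0, 0.5]

# Hypothetical prior from knowledge-graph triplets for this pair.
prior_probs = [0.2, 0.7, 0.1]

baseline = softmax(data_logits)
fused = fuse_with_prior(data_logits, prior_probs)

best_baseline = predicates[max(range(len(baseline)), key=baseline.__getitem__)]
best_fused = predicates[max(range(len(fused)), key=fused.__getitem__)]
print(best_baseline, "->", best_fused)
```

In this toy setting the data-driven scores alone favor the head predicate, while the fused distribution favors the more specific one. The `weight` parameter controls how strongly the prior overrides the visual evidence.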
Zoltán Jeskó, Tuan-anh Tran, Gergely Halász, János Abonyi, Tamás Ruppert
Jiale Lu, Lianggangxu Chen, Youqi Song, Shaohui Lin, Changbo Wang, Gaoqi He
Shuang Wang, Lianli Gao, Xinyu Lyu, Yuyu Guo, Pengpeng Zeng, Jingkuan Song
Xiang Yu, Ruoxin Chen, Jie Li, Jiawei Sun, Shijing Yuan, Huxiao Ji, Xinyu Lu, Chentao Wu