Towards Confidence-Aware Commonsense Knowledge Integration for Scene Graph Generation

Hongshuo Tian; Ning Xu; Yanhui Wang; Chenggang Yan; Bolun Zheng; Xuanya Li; An-An Liu

doi:10.1109/icme55011.2023.00385


JOURNAL ARTICLE

Year: 2023 Pages: 2255-2260

Abstract

Commonsense knowledge has been widely explored to improve Scene Graph Generation (SGG). Existing methods simply inject the relations described in knowledge bases into every part of the scene to obtain a concrete understanding. However, they do not ask whether a given visual scene actually needs commonsense knowledge for inference. Specifically, the difficulty of relation recognition varies with the relation type. Some frequent spatial relations (e.g., on) usually produce few perception errors even without any prior information, while others that involve many rules and patterns (e.g., throwing) have few samples and need commonsense knowledge as a supplement. In this paper, we propose a novel confidence-aware commonsense knowledge integration method for SGG. First, we use mutual information maximization to design a hybrid-attention module, which decreases the uncertainty in representation learning given external knowledge. Second, we add an extra branch to the SGG network that performs confidence estimation independently of any ground-truth labels; its output scalar explicitly reflects the difficulty of visual recognition. This value is used to balance the demand for commonsense knowledge in a given scene. Experiments with the MOTIFS backbone on Visual Genome (VG) show that our method effectively improves mRecall with only a small drop in Recall, especially when predicting unseen relations.
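
The abstract does not give the rule by which the confidence scalar balances visual evidence against commonsense knowledge. As a rough illustration only, the sketch below assumes a simple convex blend of two predicate distributions, weighted by a confidence value in [0, 1]. The function names, the blending rule, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_confidence(visual_logits, knowledge_logits, confidence):
    """Blend visual and knowledge-based predicate distributions.

    ASSUMPTION: a convex combination gated by the confidence scalar.
    This is an illustrative rule, not the authors' formulation.
    High confidence trusts the visual branch; low confidence leans
    on the commonsense prior.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if len(visual_logits) != len(knowledge_logits):
        raise ValueError("both branches must score the same predicates")
    p_vis = softmax(visual_logits)
    p_kb = softmax(knowledge_logits)
    return [confidence * v + (1.0 - confidence) * k
            for v, k in zip(p_vis, p_kb)]


# Toy predicate vocabulary and made-up scores.
PREDICATES = ["on", "throwing"]

# Clear visual evidence for "on": high confidence, knowledge barely matters.
easy = fuse_with_confidence([4.0, 0.0], [0.0, 2.0], confidence=0.95)

# Ambiguous visual evidence: low confidence, the knowledge prior dominates.
hard = fuse_with_confidence([0.2, 0.0], [0.0, 2.0], confidence=0.30)

print(PREDICATES[easy.index(max(easy))])
print(PREDICATES[hard.index(max(hard))])
```

In this toy setting the confident case keeps the visually predicted relation, while the low-confidence case is steered by the knowledge prior, which mirrors the intuition described in the abstract for frequent versus rare relations.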

Keywords:

Commonsense knowledge; Commonsense reasoning; Computer science; Artificial intelligence; Ground truth; Natural language processing; Machine learning; Domain knowledge

Metrics

FWCI (Field Weighted Citation Impact): 0.55

Citation Normalized Percentile: 0.61

Topics

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Domain Adaptation and Few-Shot Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Gaussian Distribution-Aware Commonsense Knowledge Learning for Scene Graph Generation

Commonsense Knowledge Aware Conversation Generation with Graph Attention

Zero-Shot Scene Graph Relation Prediction Through Commonsense Knowledge Integration

Triple confidence-aware encoder–decoder model for commonsense knowledge graph completion

Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge