3D Scene Graph Generation Using Prior Knowledge from Large Language Model (LLM)

Ho-Jun Baek; Incheol Kim

doi:10.9717/kmms.2023.26.8.859

JOURNAL ARTICLE

Year: 2023; Journal: Journal of Korea Multimedia Society; Vol: 26 (8); Pages: 859-873; Publisher: Korea Multimedia Society

Abstract

In this paper, we propose L3DSG, a novel 3D scene graph generation model that makes use of rich prior knowledge obtained from a large language model (LLM) through prompt engineering. The proposed model builds upon our previous 3D scene graph generation model, C3DSG, which adopts Point Transformer as its 3D geometric feature extractor and uses the NE-GAT graph neural network as its context reasoner. The new model addresses the inability of C3DSG to utilize prior knowledge about indoor physical environments. It focuses on how to obtain prior knowledge from an LLM and how to use it effectively for predicting objects and their relations. To this end, the model extends C3DSG with several modules that prompt, encode, and fuse prior knowledge from the LLM. Through various experiments on the benchmark dataset 3DSSG, we show the superiority of the proposed model.
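
The abstract names three steps for handling prior knowledge: prompt, encode, and fuse. The abstract does not specify how L3DSG implements them, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method. The prompt wording, the multi-hot encoding, the fusion by concatenation, the function names, and the example relation labels are all hypothetical.

```python
from typing import List, Sequence


def build_relation_prompt(subject: str, obj: str, relations: Sequence[str]) -> str:
    """Hypothetical prompt template: ask an LLM which relations are plausible
    between two object classes in an indoor scene."""
    options = ", ".join(relations)
    return (
        f"In a typical indoor scene, which of the following relations can hold "
        f"between a {subject} and a {obj}? Options: {options}. "
        f"Answer with a comma-separated list."
    )


def encode_prior(answer: str, relations: Sequence[str]) -> List[float]:
    """Assumed encoding: turn the LLM's text answer into a multi-hot vector
    over the relation vocabulary (1.0 if the relation was mentioned)."""
    mentioned = {part.strip().lower() for part in answer.split(",")}
    return [1.0 if r.lower() in mentioned else 0.0 for r in relations]


def fuse(geometric: Sequence[float], prior: Sequence[float]) -> List[float]:
    """Assumed fusion: plain concatenation of a geometric feature vector
    with the prior-knowledge vector."""
    return list(geometric) + list(prior)


# Illustrative relation labels and a made-up LLM answer (no real LLM is called).
relations = ["standing on", "attached to", "hanging on"]
prompt = build_relation_prompt("chair", "floor", relations)
prior = encode_prior("standing on, attached to", relations)
fused = fuse([0.2, 0.7], prior)
```

In this sketch, the fused vector would be passed to a downstream relation classifier. A learned text encoder and an attention-based fusion are equally plausible alternatives to the multi-hot vector and concatenation used here.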

Keywords:

Computer science, Scene graph, Semantic reasoner, Knowledge graph, Transformer, Artificial intelligence, Graph, ENCODE, Extractor, Machine learning, Theoretical computer science, Engineering

Topics

Multimodal Machine Learning Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Neural Network Applications

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Advanced Image and Video Retrieval Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
