Xinzhu Liu, Di Guo, Huaping Liu, Fuchun Sun
In visual semantic navigation, a robot navigates to a target object using egocentric visual observations, given only the class label of the target. This meaningful task has inspired a surge of related research. However, most existing models are effective only for single-agent navigation, and a single agent suffers from low efficiency and poor fault tolerance when conducting more complicated tasks. Multi-agent collaboration can improve efficiency and has strong application potential. In this letter, we propose multi-agent visual semantic navigation, in which multiple agents collaborate to find multiple target objects. This is a challenging task that requires the agents to learn reasonable collaboration strategies in order to explore efficiently under communication-bandwidth restrictions. We develop a hierarchical decision framework based on semantic mapping, scene prior knowledge, and a communication mechanism to solve this task. Experimental results in unseen scenes, with both seen and unseen objects, demonstrate the higher accuracy and efficiency of the proposed model compared with the single-agent model.
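As a rough illustration of the multi-agent setting described above (not the paper's actual method), the sketch below assumes each agent holds a local semantic map, shares only a capped number of compact object detections to respect a communication budget, and targets are greedily assigned to the nearest free agent. All function names and the map representation are hypothetical.

```python
import math

def share_detections(local_maps, budget):
    """Merge at most `budget` (class, cell) detections per agent
    into a shared global semantic map (bandwidth restriction)."""
    global_map = {}
    for detections in local_maps:
        for cls, cell in detections[:budget]:  # per-agent communication cap
            global_map.setdefault(cls, cell)
    return global_map

def assign_targets(agent_positions, global_map, targets):
    """Greedy collaboration strategy: each discovered target object
    is claimed by the closest agent that is still unassigned."""
    free = set(range(len(agent_positions)))
    plan = {}
    for cls in targets:
        if cls not in global_map or not free:
            continue  # undiscovered target: agents would keep exploring
        cell = global_map[cls]
        best = min(free, key=lambda i: math.dist(agent_positions[i], cell))
        plan[best] = (cls, cell)
        free.remove(best)
    return plan

# Two agents with partial local maps; each may transmit one detection.
maps = [[("cup", (1, 2))], [("book", (8, 7)), ("cup", (1, 3))]]
merged = share_detections(maps, budget=1)
plan = assign_targets([(0, 0), (9, 9)], merged, ["cup", "book"])
```

In this toy run the bandwidth cap forces the second agent to transmit only its first detection, and the greedy assignment sends each agent to the target nearest to it, which is the kind of division of labor that makes multi-agent search more efficient than a single agent visiting every target sequentially.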