When determining navigation actions, it is important to design effective visual and semantic representations of the observed scenes, as well as robust navigation strategies. This paper proposes a goal-oriented visual semantic navigation method that uses a semantic knowledge graph and a transformer. Two kinds of knowledge graphs representing the location relationships between objects are constructed: a current knowledge graph and a prior knowledge graph. The pre-constructed prior knowledge graph is periodically updated by the current knowledge graph, which is obtained in real time, and is embedded into a semantic feature vector by a graph convolutional network (GCN). The semantic features and the extracted scene features are jointly embedded and stored, and are then fed together into the transformer module to explore the spatio-temporal dependencies between objects in the environment. The navigation strategy is obtained from an Asynchronous Advantage Actor-Critic (A3C) model composed of a Long Short-Term Memory (LSTM) network and a Multi-Layer Perceptron (MLP). Experiments show that the knowledge graph can significantly improve navigation performance. More importantly, the experimental results show that our method improves the generalization ability of navigation across novel scenes and novel objects. A video is available at https://youtu.be/ZMjNvoK2rbY.

Note to Practitioners: The motivation of this work is to develop an efficient visual semantic navigation method. Conventional navigation algorithms lack semantic information and learning ability, and cannot adapt to complex unknown environments. When semantic information is included in navigation, the location relationships between objects can be obtained as prior knowledge, which can be combined with reinforcement learning to achieve autonomous navigation of agents. In this article, a knowledge graph representing the location relationships between objects is constructed and regularly updated in real time. The proposed visual semantic navigation method further improves the generalization ability of navigation. It can be applied to mobile robots and deployed in many scenarios, such as homes, restaurants, hospitals, and even factories.
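To make the knowledge-graph step of the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the graphs are stored as symmetric adjacency matrices over object categories, that the prior graph is updated with a simple moving-average blend (the abstract does not specify the update rule), and that a single standard GCN propagation step followed by mean pooling yields the semantic feature vector. The object names, the `rate` parameter, the feature sizes, and all function names are hypothetical.

```python
import numpy as np


def update_prior(prior_adj, current_adj, rate=0.1):
    """Blend the current graph into the prior graph.

    ASSUMPTION: a moving-average rule; the source only says the prior
    graph is "periodically updated" by the current graph.
    """
    return (1.0 - rate) * prior_adj + rate * current_adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj, features, weight):
    """One GCN propagation step: ReLU(A_norm @ X @ W)."""
    return np.maximum(normalize_adjacency(adj) @ features @ weight, 0.0)


rng = np.random.default_rng(0)

# Hypothetical object categories; edge weight 1.0 means "usually found nearby".
objects = ["sofa", "television", "table", "mug"]
prior = np.array(
    [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

# Current graph from the latest observation: a mug was seen near the sofa.
current = np.zeros((4, 4))
current[0, 3] = current[3, 0] = 1.0

updated = update_prior(prior, current, rate=0.2)

# Random stand-ins for per-object input features and learned GCN weights.
features = rng.normal(size=(4, 8))
weight = rng.normal(size=(8, 5))

node_embeddings = gcn_layer(updated, features, weight)  # shape (4, 5)
semantic_vector = node_embeddings.mean(axis=0)          # ASSUMPTION: mean pooling
```

In a full system, a vector like `semantic_vector` would be combined with the scene features before the transformer and A3C policy stages described above; those stages are omitted here because the abstract does not give enough detail to sketch them faithfully.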