DISSERTATION

Sparse and Dense Visual SLAM with Single-Image Depth Prediction

Abstract

In this thesis, we investigate the use of single-image depth prediction from convolutional neural networks (CNNs) in sparse and dense monocular visual simultaneous localization and mapping (SLAM). We focus on three problems: (1) data association, (2) dense mapping, and (3) long-term adaptation. The thesis is accordingly divided into three parts, each presenting our contribution to one of these problems.
To improve the robustness of data association in visual SLAM, our first proposal extends the state-of-the-art semi-direct visual SLAM algorithm with single-image depth prediction to make feature matching more reliable. We use the predicted depth to initialize new features with a small uncertainty centred at the predicted depth. With the depth uncertainty reduced, feature correspondences can be identified within a shorter search range along the epipolar line, resulting in fast convergence of the feature depth and improved mapping performance. With the improved mapping performance, our method outperforms state-of-the-art visual SLAM algorithms in camera tracking error.
To recover a dense structure, we densify the semi-dense structure of the scene recovered by the state-of-the-art direct SLAM algorithm LSD-SLAM. To this end, our second proposal exploits the local depth gradient consistency of single-image relative depth prediction as a spatial regularizer to densify the semi-dense depth maps. In addition, we propose an adaptive filtering scheme that incorporates the depth and pixel intensity of a local window to reduce the noise of the semi-dense structure, which allows for a substantial gain in densification accuracy. The optimized semi-dense and densified structures are, in turn, used to refine the pose graph and thereby the pose estimation. Experimental results show that our dense reconstruction accuracy outperforms that of state-of-the-art methods by a large margin.
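The link between depth uncertainty and epipolar search range in the first proposal can be illustrated numerically. The sketch below is not the thesis implementation: the pinhole intrinsics, the 0.1 m sideways camera motion, the wide prior of 0.5 m to 50 m, and the ±20% band around a predicted depth of 2 m are all assumed values chosen only for illustration. It projects the two ends of a depth interval into a second view and measures the length of the resulting epipolar segment in pixels.

```python
import math

def backproject(K, u, v, depth):
    """Lift pixel (u, v) at the given depth to a 3-D point in the reference camera frame."""
    fx, fy, cx, cy = K
    return [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]

def project(K, R, t, p):
    """Project a 3-D point p (reference frame) into the second view, x = R p + t."""
    x = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return (fx * x[0] / x[2] + cx, fy * x[1] / x[2] + cy)

def epipolar_search_length(K, R, t, u, v, d_min, d_max):
    """Pixel length of the epipolar segment spanned by the depth interval [d_min, d_max]."""
    a = project(K, R, t, backproject(K, u, v, d_min))
    b = project(K, R, t, backproject(K, u, v, d_max))
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Assumed (hypothetical) setup: fx, fy, cx, cy and a pure sideways translation.
K = (500.0, 500.0, 320.0, 240.0)
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-0.1, 0.0, 0.0]

# Wide prior, as if nothing were known about the depth of the new feature.
wide = epipolar_search_length(K, R, t, 320.0, 240.0, 0.5, 50.0)

# Narrow prior: assumed +/-20% band around a predicted depth of 2 m.
predicted = 2.0
narrow = epipolar_search_length(K, R, t, 320.0, 240.0, 0.8 * predicted, 1.2 * predicted)

print(f"wide prior:   {wide:.2f} px")
print(f"narrow prior: {narrow:.2f} px")
```

Under these assumed numbers the segment to be searched shrinks from roughly 99 pixels to roughly 10 pixels, which is the effect the abstract attributes to initializing features at the predicted depth.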
Nevertheless, single-image depth prediction from CNNs tends to give accurate depth estimates mainly on images similar to the training images. Therefore, to improve the generality of single-image depth prediction used in visual SLAM, our third proposal introduces a long-term adaptation framework, which supports online fine-tuning of a depth prediction CNN to improve its accuracy while leveraging the improved depth prediction quality to optimize the structure and camera pose estimation globally. In particular, we propose a novel online adaptation method in which the fine-tuning is enhanced with regularization to retain the previously learned knowledge while the CNN is continually trained. We demonstrate the use of fine-tuned depth prediction for map point culling before running global photometric bundle adjustment (BA), resulting in a more accurate map reconstruction than running global photometric BA on all map points.
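The idea of fine-tuning with a regularizer that retains earlier knowledge can be sketched on a toy problem. The abstract does not specify the regularizer, so the example below substitutes a plain quadratic penalty that pulls the weight back toward its pre-adaptation value; the one-parameter linear model, the data, the learning rate, and the penalty weight `lam` are all hypothetical.

```python
def finetune(w, w_anchor, data, lr=0.05, lam=0.0, steps=200):
    """Gradient descent on mean squared error plus lam * (w - w_anchor) ** 2."""
    for _ in range(steps):
        # Gradient of the data term: mean of 2 * (w * x - y) * x over the samples.
        g_data = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        # Gradient of the penalty that anchors w to the previously learned value.
        g_reg = 2.0 * lam * (w - w_anchor)
        w -= lr * (g_data + g_reg)
    return w

w_old = 1.0                          # weight learned before adaptation (assumed)
new_data = [(1.0, 2.0), (2.0, 4.0)]  # samples from a new domain that prefer w = 2

w_plain = finetune(w_old, w_old, new_data)           # no regularization
w_reg = finetune(w_old, w_old, new_data, lam=2.5)    # anchored to w_old

print(f"plain fine-tuning:       w = {w_plain:.3f}")
print(f"regularized fine-tuning: w = {w_reg:.3f}")
```

Without the penalty the weight moves all the way to the value favoured by the new data; with it, the weight settles between the old and new values, trading some fit on the new data for retention of what was learned before.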

Keywords:
Epipolar geometry; Simultaneous localization and mapping; Robustness; Monocular; Feature; Pixel; Visualization; Depth map; Pattern recognition; Range
