Sparse and Dense Visual SLAM with Single-Image Depth Prediction

Loo, Shing Yan

doi:10.7939/r3-tp4b-ke15

DISSERTATION

Year: 2022

University: University of Alberta Library

Abstract

In this thesis, we investigate the use of single-image depth prediction from convolutional neural networks (CNNs) in sparse and dense monocular visual simultaneous localization and mapping (SLAM). We focus on three problems: (1) data association, (2) dense mapping, and (3) long-term adaptation. The thesis is accordingly divided into three parts, each presenting our contribution to one of these problems.

To improve the robustness of data association in visual SLAM, our first proposal extends the state-of-the-art semi-direct visual SLAM algorithm with single-image depth prediction to make feature matching more reliable. We use the predicted depth to initialize each new feature with a small uncertainty centred at the predicted depth. With reduced depth uncertainty, feature correspondences can be identified within a reduced search range along the epipolar line, resulting in fast convergence of the feature depth and improved mapping performance. With the improved mapping performance, our method achieves lower camera tracking error than state-of-the-art visual SLAM algorithms.

To recover a dense structure, we densify the semi-dense structure of the scene recovered by the state-of-the-art direct SLAM algorithm, LSD-SLAM. To this end, our second proposal exploits the local depth gradient consistency of single-image relative depth prediction as a spatial regularizer to densify the semi-dense depth maps. In addition, we propose an adaptive filtering scheme that incorporates the depth and pixel intensity of a local window to reduce the noise of the semi-dense structure, which allows for a substantial gain in densification accuracy. The optimized semi-dense and densified structures are, in turn, used to refine the pose graph and thereby the pose estimation. Experimental results show that our dense reconstruction accuracy outperforms the state-of-the-art methods by a large margin.

Nevertheless, single-image depth prediction from CNNs tends to be accurate mainly on images similar to the training images. Therefore, to improve the generality of single-image depth prediction used in visual SLAM, our third proposal introduces a long-term adaptation framework that supports online fine-tuning of a depth prediction CNN to improve its accuracy, while leveraging the improved depth predictions to optimize the structure and camera pose estimation globally. In particular, we propose a novel online adaptation method in which fine-tuning is regularized so that previously learned knowledge is retained while the CNN is continually trained. We demonstrate the use of fine-tuned depth prediction for map point culling before running global photometric bundle adjustment (BA), resulting in a more accurate map reconstruction than running global photometric BA on all map points.
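
The first proposal's central idea, that a tighter depth prior shortens the search along the epipolar line, can be illustrated with a small geometric sketch. This is not the thesis implementation: it assumes a pinhole camera, a depth interval of the predicted depth plus or minus a few standard deviations, and a known relative pose between the two frames. The function names, the `n_sigma` and `min_depth` parameters, and the numeric values are all hypothetical, chosen only for illustration.

```python
import numpy as np


def epipolar_segment(K, R, t, pixel, depth_min, depth_max):
    """Project the points at depth_min and depth_max along the ray through
    `pixel` (reference frame) into the current frame. Returns a 2x2 array
    holding the two endpoints of the epipolar search segment in pixels."""
    # Ray through the pixel with unit z, so scaling by depth gives the 3D point.
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    endpoints = []
    for depth in (depth_min, depth_max):
        point_cur = R @ (depth * ray) + t   # reference frame -> current frame
        uvw = K @ point_cur                 # pinhole projection
        endpoints.append(uvw[:2] / uvw[2])
    return np.array(endpoints)


def search_length(K, R, t, pixel, depth_mean, depth_sigma,
                  n_sigma=2.0, min_depth=0.1):
    """Length in pixels of the epipolar segment covered by the depth interval
    depth_mean +/- n_sigma * depth_sigma (an assumed Gaussian-style prior)."""
    lo = max(depth_mean - n_sigma * depth_sigma, min_depth)
    hi = depth_mean + n_sigma * depth_sigma
    a, b = epipolar_segment(K, R, t, pixel, lo, hi)
    return float(np.linalg.norm(b - a))


# Hypothetical setup: 640x480 camera, pure sideways translation of 0.1 m.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.1, 0.0, 0.0])
pixel = (320.0, 240.0)

# Broad prior, as when nothing is known about the feature depth.
wide = search_length(K, R, t, pixel, depth_mean=5.0, depth_sigma=2.4)
# Narrow prior centred at a (hypothetical) predicted depth.
narrow = search_length(K, R, t, pixel, depth_mean=2.0, depth_sigma=0.1)

print(f"search range, broad prior:  {wide:.1f} px")
print(f"search range, narrow prior: {narrow:.1f} px")
```

In this toy configuration the narrow prior yields a much shorter segment than the broad one, which is the effect the abstract attributes to initializing features at the predicted depth. A real semi-direct pipeline would typically work with inverse depth and update the uncertainty with a depth filter, which this sketch omits.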

Keywords:

Epipolar geometry; Simultaneous localization and mapping; Robustness (evolution); Monocular; Feature (linguistics); Pixel; Visualization; Depth map; Pattern recognition (psychology); Range (aeronautics)

Related Documents

DeepRelativeFusion: Dense Monocular SLAM using Single-Image Relative Depth Prediction

Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image

Monocular Dense SLAM with Consistent Deep Depth Prediction

DVL-SLAM: sparse depth enhanced direct visual-LiDAR SLAM

Dense Depth Posterior (DDP) From Single Image and Sparse Range