JOURNAL ARTICLE

Semi-Supervised Deep Learning for Monocular Depth Map Prediction

Abstract

Supervised deep learning often suffers from a lack of sufficient training data. In monocular depth map prediction specifically, it is barely possible to obtain dense ground-truth depth images in realistic dynamic outdoor environments. When using LiDAR sensors, for instance, the distance measurements are noisy, the calibration between sensors cannot be perfect, and the measurements are typically much sparser than the camera images. In this paper, we propose a novel approach to depth map prediction from monocular images that learns in a semi-supervised way. While we use sparse ground-truth depth for supervised learning, we also train our deep network to produce photoconsistent dense depth maps in a stereo setup using a direct image alignment loss. In experiments, we demonstrate superior performance in depth map prediction from single images compared to state-of-the-art methods.
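The combination of a sparse supervised term and a stereo photoconsistency term can be illustrated with a minimal NumPy sketch. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: the L1 penalties, the `weight` parameter, the rectified pinhole stereo model (disparity = focal * baseline / depth), and all function names. The losses used in the paper may differ.

```python
import numpy as np

def supervised_loss(pred_depth, sparse_depth):
    """Mean absolute depth error over pixels with a measurement.

    Assumption: pixels without a LiDAR return are encoded as 0.
    """
    mask = sparse_depth > 0
    if not mask.any():
        return 0.0
    return float(np.abs(pred_depth[mask] - sparse_depth[mask]).mean())

def warp_right_to_left(right, disparity):
    """Sample the right image at x - d for every left pixel.

    Uses linear interpolation along each row (rectified stereo assumed).
    Returns the warped image and a mask of pixels that fall inside the image.
    """
    h, w = right.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    valid = (xs >= 0) & (xs <= w - 1)
    xs = np.clip(xs, 0, w - 1)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    warped = (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]
    return warped, valid

def photometric_loss(left, right, pred_depth, focal, baseline):
    """Mean absolute intensity difference between left and warped right image.

    Assumption: pred_depth is strictly positive and images are grayscale.
    """
    disparity = focal * baseline / pred_depth
    warped, valid = warp_right_to_left(right, disparity)
    if not valid.any():
        return 0.0
    return float(np.abs(left - warped)[valid].mean())

def semi_supervised_loss(left, right, pred_depth, sparse_depth,
                         focal, baseline, weight=1.0):
    """Sparse supervised term plus weighted photoconsistency term."""
    return (supervised_loss(pred_depth, sparse_depth)
            + weight * photometric_loss(left, right, pred_depth, focal, baseline))
```

In this sketch the supervised term only touches pixels that have a depth measurement, while the photometric term gives a training signal at every pixel whose stereo correspondence lands inside the right image, which is how a dense prediction can be constrained by sparse ground truth.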

Keywords:
Artificial intelligence, Depth map, Ground truth, Computer science, Monocular, Computer vision, Deep learning, Lidar, Context (archaeology), Noise (video), Calibration, Supervised learning, Pattern recognition (psychology), Image (mathematics), Artificial neural network, Remote sensing, Mathematics, Geology

Metrics

Cited By: 684
FWCI (Field Weighted Citation Impact): 45.51
Refs: 37
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Advanced Vision and Imaging
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Robotics and Sensor-Based Localization
Physical Sciences → Engineering → Aerospace Engineering
Optical measurement and interference techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition