Abstract

Previous monocular depth estimation methods take a single view and directly regress the expected results. Although recent advances have been made by applying geometrically inspired loss functions during training, the inference procedure does not explicitly impose any geometrical constraint. These models therefore rely purely on the quality of the data and the effectiveness of learning to generalize. This leads either to suboptimal results or to a demand for large amounts of expensive ground-truth labelled data to produce reasonable results. In this paper, we show for the first time that the monocular depth estimation problem can be reformulated as two sub-problems, a view synthesis procedure followed by stereo matching, with two intriguing properties: i) geometrical constraints can be explicitly imposed during inference; ii) the demand for labelled depth data can be greatly alleviated. We show that the whole pipeline can still be trained in an end-to-end fashion and that this new formulation plays a critical role in advancing performance. The resulting model outperforms all previous monocular depth estimation methods, as well as the stereo block matching method, on the challenging KITTI dataset while using only a small amount of real training data. The model also generalizes well to other monocular depth estimation benchmarks. We also discuss the implications and advantages of solving monocular depth estimation using stereo methods.
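The two-stage formulation described in the abstract (synthesize a second view, then run stereo matching) can be illustrated with a toy sketch. This is not the paper's model: the learned view-synthesis network is replaced by a hypothetical stand-in that shifts the image by a constant disparity, the matcher is a brute-force sum-of-absolute-differences (SAD) block matcher, and the focal length and baseline values are assumed purely for illustration. It only shows how a geometric constraint (matching along horizontal scanlines, then depth = focal length × baseline / disparity) enters at inference time.

```python
import numpy as np


def synthesize_right_view(left, disparity):
    """Stand-in for a learned view-synthesis network (assumption):
    shift the left image by one constant integer disparity."""
    # A point at column x in the left view appears at column x - disparity
    # in the right view; np.roll wraps around at the image border.
    return np.roll(left, -disparity, axis=1)


def block_match(left, right, max_disp, half_win=2):
    """Brute-force SAD block matching along horizontal scanlines."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=np.int64)
    valid = np.zeros((h, w), dtype=bool)
    for y in range(half_win, h - half_win):
        for x in range(half_win + max_disp, w - half_win):
            patch = left[y - half_win:y + half_win + 1,
                         x - half_win:x + half_win + 1]
            costs = []
            for d in range(max_disp + 1):
                # Candidate window in the right view, shifted left by d.
                cand = right[y - half_win:y + half_win + 1,
                             x - d - half_win:x - d + half_win + 1]
                costs.append(np.abs(patch - cand).sum())
            disp[y, x] = int(np.argmin(costs))
            valid[y, x] = True
    return disp, valid


# Synthetic textured "left" image and a synthesized "right" view.
rng = np.random.default_rng(0)
left = rng.random((12, 32))
true_disp = 3
right = synthesize_right_view(left, true_disp)

# Stereo matching recovers disparity; triangulation converts it to depth.
disp, valid = block_match(left, right, max_disp=6)
focal_px, baseline_m = 720.0, 0.54  # hypothetical camera parameters
depth = focal_px * baseline_m / disp[valid]
```

On this synthetic pair the matcher recovers the constant shift at every pixel where a full window fits, so the depth map is constant. In a real pipeline the synthesized view would come from a trained network and the disparity would vary per pixel.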

Keywords:
Monocular; Computer science; Inference; Matching (statistics); Artificial intelligence; Pipeline (software); Ground truth; Constraint (computer-aided design); Computer vision; Block (permutation group theory); Mathematics

Metrics

Cited By: 203
FWCI (Field Weighted Citation Impact): 19.35
Refs: 56
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing Techniques and Applications
Physical Sciences →  Engineering →  Media Technology
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
