JOURNAL ARTICLE

MVSNet++: Learning Depth-Based Attention Pyramid Features for Multi-View Stereo

Po-Heng Chen, Hsiao-Chien Yang, Kuan-Wen Chen, Yong-Sheng Chen

Year: 2020 Journal: IEEE Transactions on Image Processing Vol: 29 Pages: 7261-7273 Publisher: Institute of Electrical and Electronics Engineers

Abstract

The goal of Multi-View Stereo (MVS) is to reconstruct a 3D point-cloud model from multiple views. Building on the considerable progress of deep learning, an increasing amount of research has moved from traditional MVS methods to learning-based ones. However, two issues remain unsolved in existing state-of-the-art methods: (1) only high-level information is considered for depth estimation, which may reduce the localization accuracy of 3D points because the learned model lacks spatial information; and (2) most methods require additional post-processing or network refinement to generate a smooth 3D model, which significantly increases the number of model parameters or the computational complexity. To address these issues, we propose MVSNet++, an end-to-end trainable network for dense depth estimation. The estimated depth map can then be applied to 3D model reconstruction. Unlike previous methods, the proposed method first adopts feature pyramid structures for both feature extraction and cost volume regularization, which leads to accurate 3D point localization by fusing multi-level information. To generate a smooth depth map, we then carefully integrate instance normalization into MVSNet++ without increasing the model parameters or the computational burden. Furthermore, we design three loss functions and integrate a curriculum learning framework into the training process, which leads to an accurate reconstruction of the 3D model. MVSNet++ is evaluated on the DTU and Tanks & Temples benchmarks with comprehensive ablation studies. Experimental results demonstrate that the proposed method performs favorably against previous state-of-the-art methods, showing the accuracy and effectiveness of MVSNet++.
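The abstract's claim that instance normalization adds no model parameters can be illustrated with a minimal sketch. It assumes the non-affine variant (no learnable scale or shift) and a nested-list feature layout of [sample][channel][row][column]; the function names and the toy input are hypothetical, and the configuration actually used in MVSNet++ is not stated on this page.

```python
import math


def instance_norm_channel(channel, eps=1e-5):
    """Normalize one channel of one sample over its spatial positions."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def instance_norm(features, eps=1e-5):
    """Apply non-affine instance normalization.

    features is indexed as [sample][channel][row][column]. Statistics are
    computed per sample and per channel, so nothing is learned or stored
    (assumption: no affine scale/shift parameters).
    """
    return [
        [instance_norm_channel(channel, eps) for channel in sample]
        for sample in features
    ]


# Toy input: 1 sample, 2 channels, 2x2 spatial grid (hypothetical values).
features = [[
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 10.0], [20.0, 20.0]],
]]
normalized = instance_norm(features)
```

After this call, each channel of `normalized` has approximately zero mean and unit variance regardless of its original scale, which is the property the sketch is meant to show.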

Keywords:
Computer science; Point cloud; Artificial intelligence; Pyramid (geometry); Deep learning; Feature extraction; Depth map; Normalization (sociology); Regularization (linguistics); Feature (linguistics); Robustness (evolution); Computer vision; Pattern recognition (psychology); Machine learning; Image (mathematics); Mathematics

Metrics

Cited By: 47
FWCI (Field Weighted Citation Impact): 3.04
Refs: 54
Citation Normalized Percentile: 0.92
Is in top 1%
Is in top 10%

Topics

Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Optical measurement and interference techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
3D Surveying and Cultural Heritage
Physical Sciences →  Earth and Planetary Sciences →  Geology

Related Documents

JOURNAL ARTICLE

PA-MVSNet: Sparse-to-Dense Multi-View Stereo With Pyramid Attention

Ke Zhang, Mengyu Liu, Jinlai Zhang, Zhenbiao Dong

Journal: IEEE Access Year: 2021 Vol: 9 Pages: 27908-27915

BOOK-CHAPTER

MVSNet: Depth Inference for Unstructured Multi-view Stereo

Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, Long Quan

Lecture Notes in Computer Science Year: 2018 Pages: 785-801

JOURNAL ARTICLE

NR-MVSNet: Learning Multi-View Stereo Based on Normal Consistency and Depth Refinement

Jingliang Li, Zhengda Lu, Yiqun Wang, Jun Xiao, Ying Wang

Journal: IEEE Transactions on Image Processing Year: 2023 Vol: 32 Pages: 2649-2662

JOURNAL ARTICLE

DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis

Jingliang Li, Zhengda Lu, Yiqun Wang, Ying Wang, Jun Xiao

Journal: Proceedings of the 30th ACM International Conference on Multimedia Year: 2022 Pages: 5593-5601

JOURNAL ARTICLE

EPP-MVSNet: Epipolar-assembling based Depth Prediction for Multi-view Stereo

Xinjun Ma, Yue Gong, Qirui Wang, Jingwei Huang, Lei Chen, Fan Yu

Journal: 2021 IEEE/CVF International Conference on Computer Vision (ICCV) Year: 2021 Pages: 5712-5720