JOURNAL ARTICLE

Efficient Multi-View Stereo Network with Cross-Scale Transformer

WANG Sicheng, JIANG Hao, CHEN Xiao

Year: 2024
Journal: DOAJ (Directory of Open Access Journals)

Abstract

Deep Multi-View Stereo (MVS) methods now widely introduce Transformers into cascade networks to achieve high-resolution depth estimation and thus accurate, complete 3D reconstruction. However, the computational cost of Transformer-based methods prevents them from being extended to the finer stages of the cascade. To solve this problem, this paper proposes a novel cross-scale Transformer-based MVS network that can process feature representations at different stages without incurring additional computation. In particular, this study introduces an Adaptive Matching-aware Transformer (AMT), which uses different combinations of interactive attention at multiple scales, enabling the network to capture contextual information within images and enhance feature relationships between images. In addition, this study proposes Dual Feature Guided Aggregation (DFGA), which embeds coarse global semantic information into the construction of finer cost volumes, further enhancing the perception of global and local features. A feature metric loss is also designed to evaluate feature deviation before and after the transformation, thereby reducing the impact of feature mismatch on depth estimation. Experimental results show that the completeness and overall metrics of the proposed network are 0.264 and 0.302, respectively, on the DTU dataset. The average reconstruction scores on the Tanks and Temples scenes are 64.28 and 38.03, respectively.
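
As a reading aid for the DTU figures above, the short sketch below works through the arithmetic that links them. It assumes that 0.264 is the standard DTU completeness distance and that 0.302 is the usual DTU overall score, defined as the mean of accuracy and completeness (lower is better). The abstract does not report accuracy, so the value computed here is only implied by that convention and by the rounded figures; it is not a reported result. The function names are hypothetical and chosen for illustration.

```python
def dtu_overall(accuracy: float, completeness: float) -> float:
    """Overall score under the usual DTU convention (an assumption here):
    the arithmetic mean of the accuracy and completeness distances."""
    return (accuracy + completeness) / 2.0


def implied_accuracy(overall: float, completeness: float) -> float:
    """Invert the mean to recover the accuracy implied by the other two values."""
    return 2.0 * overall - completeness


# Figures quoted in the abstract (rounded to three decimals there).
completeness = 0.264
overall = 0.302

# Derived value, not reported in the abstract.
accuracy = implied_accuracy(overall, completeness)
print(f"implied accuracy: {accuracy:.3f}")  # implied accuracy: 0.340
```

Because both inputs are rounded, the implied accuracy is itself only accurate to about the last printed digit.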

Keywords:
Feature (linguistics); Transformer; Cascade; Feature extraction; Pattern recognition (psychology); Feature model

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.47

Topics

Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Optical measurement and interference techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image Processing Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition