JOURNAL ARTICLE

MA-CVP-MVSNet: Multi-View Stereo Model Based on Hybrid Attention Network

Abstract

In current deep-learning-based multi-view stereo methods, feature extraction and cost volume regularization are two key steps that affect reconstruction quality. Most current methods have difficulty both in accurately extracting the required features and in fully utilizing the multi-scale contextual semantic information in the cost volumes. In this work, we propose MA-CVP-MVSNet, a multi-view stereo network based on a hybrid attention mechanism. The proposed method comprises two core attention modules. The first is a Criss-Cross Attention module, which captures the global dependencies among pixels in the feature map. The second is an SK Attention module, which is used in cost volume regularization to aggregate multi-scale contextual semantic information in the cost volumes. Experiments show that our method markedly improves accuracy and achieves competitive results.
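The abstract gives no implementation details, so the following is only a minimal sketch of the general criss-cross attention idea (each pixel attends to the pixels in its own row and column), not the authors' module. It assumes identity query, key, and value projections in place of learned 1x1 convolutions, a single attention pass, and a plain residual connection; the function name `criss_cross_attention` and the nested-list feature layout are hypothetical choices made for illustration.

```python
import math


def criss_cross_attention(feat):
    """Apply one simplified criss-cross attention pass.

    feat: feature map as nested lists with shape H x W x C.
    Returns a new feature map of the same shape.

    Assumption: queries, keys, and values are the raw features.
    A trained module would use learned projections instead.
    """
    h, w = len(feat), len(feat[0])
    out = []
    for i in range(h):
        row_out = []
        for j in range(w):
            q = feat[i][j]
            # Criss-cross path: all pixels in row i plus all pixels in
            # column j, with the centre pixel counted once (H + W - 1 total).
            path = [(i, x) for x in range(w)]
            path += [(y, j) for y in range(h) if y != i]
            # Dot-product affinity between the query pixel and each path pixel.
            scores = [sum(a * b for a, b in zip(q, feat[y][x])) for y, x in path]
            # Numerically stable softmax over the path.
            m = max(scores)
            exps = [math.exp(s - m) for s in scores]
            z = sum(exps)
            weights = [e / z for e in exps]
            # Attention-weighted sum of the path features, per channel.
            agg = [
                sum(wt * feat[y][x][k] for wt, (y, x) in zip(weights, path))
                for k in range(len(q))
            ]
            # Residual connection: add the aggregated context to the input.
            row_out.append([a + b for a, b in zip(agg, q)])
        out.append(row_out)
    return out
```

In this simplified form a single pass only gathers context along one row and one column per pixel; applying the function twice lets information from any pixel reach any other, which is how criss-cross attention is usually described as approximating full-image dependencies at lower cost than dense self-attention.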

Keywords:
Computer science; Artificial intelligence; Regularization (linguistics); Feature extraction; Pixel; Feature (linguistics); Volume (thermodynamics); Aggregate (composite); Computer vision; Pattern recognition (psychology)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 32
Citation Normalized Percentile: 0.21

Topics

Advanced Vision and Imaging (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Image Enhancement Techniques (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Advanced Image and Video Retrieval Techniques (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)