LTMVSNet: A Lightweight Transformer Network for Multi-View Stereo

Pan Li; Mingfu Xiong; Xinrong Hu; Tao Peng; Ziqi Wang

doi:10.1109/aihcir61661.2023.00059

JOURNAL ARTICLE

Year: 2023 Pages: 328-336

Abstract

Multi-View Stereo (MVS) has been a popular area of computer vision research. Learning-based MVS approaches typically consist of four steps: 2D CNN feature extraction, variance-based cost aggregation by homography warping, 3D CNN cost regularisation, and depth regression. Existing MVS methods often benefit from heavy backbones at the expense of model size, so designing lightweight yet effective models is crucial for applications on low-configuration devices. In this paper, LTMVSNet is proposed for small scenes, focusing on feature extraction and cost aggregation. With a lightweight Feature Extraction Transformer (FET) and internal attention, LTMVSNet aggregates global contextual information and improves the handling of low-texture, non-Lambertian, and severely occluded regions. For cost aggregation, LTMVSNet uses epipolar constraints to construct 3D associations of 2D features, reducing the number of depth assumptions and eliminating the need for additional parameters. Depth maps are propagated through a coarse-to-fine cascade structure, and extensive experiments show that LTMVSNet achieves state-of-the-art performance on the DTU dataset as well as the Tanks and Temples intermediate set.
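Two of the generic pipeline steps named in the abstract, variance-based cost aggregation and depth regression, can be illustrated with a minimal NumPy sketch. This is not the LTMVSNet implementation: the function names, array shapes, toy depth values, and the scaling factor are all assumptions made for illustration, the features are assumed to be already warped onto the depth planes of the reference view, and a plain channel mean stands in for the 3D CNN regularisation. The depth planes here correspond to the "depth assumptions" mentioned in the abstract.

```python
import numpy as np


def variance_cost_volume(warped_feats):
    """Variance of per-view features, used as a matching cost.

    warped_feats: array of shape (V, C, D, H, W) with features from V views,
    assumed to be already warped onto D depth planes of the reference view.
    Returns an array of shape (C, D, H, W); low variance means the views agree.
    """
    mean = warped_feats.mean(axis=0)
    return ((warped_feats - mean) ** 2).mean(axis=0)


def soft_argmin_depth(cost, depth_values):
    """Expected depth under a softmax over negated costs (soft argmin).

    cost: array of shape (D, H, W), lower is better.
    depth_values: array of shape (D,) with the depth of each plane.
    Returns an array of shape (H, W).
    """
    logits = -cost
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob = prob / prob.sum(axis=0, keepdims=True)
    # Weighted sum of depth values along the depth axis.
    return np.tensordot(depth_values, prob, axes=(0, 0))


# Toy data (hypothetical sizes): 3 views, 8 channels, 4 depth planes, 2x2 pixels.
rng = np.random.default_rng(0)
V, C, D, H, W = 3, 8, 4, 2, 2
feats = rng.normal(size=(V, C, D, H, W))
# Make all views agree on depth plane 2, so that plane has zero variance.
feats[:, :, 2] = feats[0, :, 2].copy()

volume = variance_cost_volume(feats)
# Stand-in for 3D CNN regularisation: average the cost over channels.
cost = volume.mean(axis=0)
depth_values = np.array([1.0, 2.0, 3.0, 4.0])  # arbitrary illustrative depths
# The factor 10.0 is an arbitrary sharpening choice for this toy example.
depth = soft_argmin_depth(10.0 * cost, depth_values)
print(depth)
```

In this toy setup the plane on which all views agree has the lowest cost at every pixel, so the regressed depth is pulled towards that plane's depth value. Reducing the number of depth planes D, as the abstract describes, shrinks the cost volume proportionally.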

Keywords:

Computer science; Image warping; Artificial intelligence; Feature extraction; Transformer; Computer vision; Pattern recognition (psychology); Voltage

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.23

Topics

Advanced Vision and Imaging

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Optical measurement and interference techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Robotics and Sensor-Based Localization

Physical Sciences → Engineering → Aerospace Engineering

Related Documents

LE-MVSNet: Lightweight Efficient Multi-view Stereo Network

Transformer-guided Feature Pyramid Network for Multi-View Stereo

Efficient Multi-View Stereo Network with Cross-Scale Transformer

U-ETMVSNet: Uncertainty-Epipolar Transformer Multi-View Stereo Network for Object Stereo Reconstruction

MTD-MVSNet: Multi-view Stereo Network with Multi-scale Transformer and Dual Attention