JOURNAL ARTICLE

Multi-Scale Residual Pyramid Attention Network for Monocular Depth Estimation

Abstract

Monocular depth estimation is a challenging problem in computer vision and is crucial for understanding 3D scene geometry. Recently, methods based on deep convolutional neural networks (DCNNs) have significantly improved estimation accuracy. However, existing methods fail to account for complex textures and geometries in scenes, resulting in loss of local details, distorted object boundaries, and blurry reconstructions. In this paper, we propose an end-to-end multi-scale residual pyramid attention network (MRPAN) to mitigate these problems. First, we propose a multi-scale attention context aggregation (MACA) module, which consists of a spatial attention module (SAM) and a global attention module (GAM). By considering the position and scale correlation of pixels from spatial and global perspectives, the proposed module adaptively learns the similarity between pixels so as to obtain more global context information from the image and recover complex structures in the scene. Second, we propose an improved residual refinement module (RRM) that further refines the scene structure, yielding deeper semantic information and retaining more local details. Experimental results show that our method achieves better performance on object boundaries and local details than other state-of-the-art methods.
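
The idea of adaptively learning the similarity between pixels to gather context can be illustrated with a minimal sketch. This is not the MACA, SAM, or GAM module from the paper, whose internals the abstract does not specify; it is a generic dot-product self-attention over pixel features, written in plain Python. The function names `attention_weights` and `aggregate_context`, the scaling by the square root of the channel count, and the absence of learned projections are all assumptions made for illustration.

```python
import math


def attention_weights(features):
    """Return an N x N matrix of pixel-to-pixel attention weights.

    `features` is a list of N pixel feature vectors (each a list of C floats).
    Similarity is a scaled dot product, normalized per row with a softmax.
    Hypothetical helper: not the paper's SAM or GAM.
    """
    n = len(features)
    scale = math.sqrt(len(features[0]))
    weights = []
    for i in range(n):
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / scale
            for j in range(n)
        ]
        # Subtract the row maximum before exponentiating for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def aggregate_context(features):
    """Replace each pixel feature with a similarity-weighted mix of all pixels."""
    weights = attention_weights(features)
    channels = len(features[0])
    return [
        [
            sum(row[j] * features[j][k] for j in range(len(features)))
            for k in range(channels)
        ]
        for row in weights
    ]


if __name__ == "__main__":
    # Three "pixels" with two channels each; the first two are identical.
    feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    for row in aggregate_context(feats):
        print([round(v, 3) for v in row])
```

In this sketch, pixels with similar features attend to each other more strongly, so each output vector mixes information from across the whole feature map rather than only from a local neighborhood. A trained network would additionally learn projections of the features before computing similarity.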

Keywords:
Artificial intelligence; Computer science; Pyramid (geometry); Residual; Computer vision; Pixel; Convolutional neural network; Context (archaeology); Similarity (geometry); Monocular; Scale (ratio); Spatial contextual awareness; Object (grammar); Pattern recognition (psychology); Image (mathematics); Algorithm; Mathematics; Geography; Cartography

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.61
Refs: 45
Citation Normalized Percentile: 0.68

Topics

Advanced Vision and Imaging
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Processing Techniques and Applications
Physical Sciences →  Engineering →  Media Technology
Optical measurement and interference techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Promoting Monocular Depth Estimation by Multi-Scale Residual Laplacian Pyramid Fusion

Anmei Zhang, Yunchao Ma, Jiangyu Liu, Jian Sun

Journal:   IEEE Signal Processing Letters Year: 2023 Vol: 30 Pages: 205-209
JOURNAL ARTICLE

Monocular Depth Estimation with Multi-Scale Attention

Bingyuan Wu, Yongxiong Wang

Journal:   SSRN Electronic Journal Year: 2021
BOOK-CHAPTER

Residual Feature Pyramid Architecture for Monocular Depth Estimation

Chunxiu Shi, Jie Chen, Juan Chen

Lecture notes in computer science Year: 2019 Pages: 261-266
JOURNAL ARTICLE

Multi-scale depth classification network for monocular depth estimation

Yi Yang, Lihua Tian, Chen Li, Botong Zhang

Journal:   Computers & Electrical Engineering Year: 2022 Vol: 102 Pages: 108206-108206