JOURNAL ARTICLE

Multi-Scale Spatial Attention-Guided Monocular Depth Estimation With Semantic Enhancement

Xianfa Xu, Zhe Chen, Fuliang Yin

Year: 2021 Journal: IEEE Transactions on Image Processing Vol: 30 Pages: 8811-8822 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Depth estimation from a single monocular image is a vital but challenging task in 3D vision and scene understanding. Previous unsupervised methods have yielded impressive results, but the predicted depth maps still suffer from shortcomings such as missing small objects and blurred object edges. To address these problems, a multi-scale spatial attention-guided monocular depth estimation method with semantic enhancement is proposed. Specifically, we first construct a multi-scale spatial attention-guided block based on atrous spatial pyramid pooling and spatial attention. Then, the correlation between the left and right views is explored through mutual information to obtain a more robust feature representation. Finally, we design a double-path prediction network to simultaneously generate depth maps and semantic labels. The proposed multi-scale spatial attention-guided block focuses more on objects, especially small ones. Moreover, the additional semantic information makes object edges in the predicted depth maps sharper. We conduct comprehensive evaluations on public benchmark datasets such as KITTI and Make3D. The experimental results demonstrate the effectiveness of the proposed method, which achieves better performance than other self-supervised methods.
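
The abstract gives no layer-level details, so the following is only an illustrative sketch of the general idea of combining atrous (dilated) convolutions at several rates with a spatial attention map. Everything in it is an assumption rather than the authors' implementation: the function names, the dilation rates (1, 2, 4), the single-channel input, the fixed averaging kernel, and the simplified attention (a sigmoid of the per-pixel mean plus max over branches, with no learned convolution).

```python
import math


def dilated_conv2d(x, kernel, rate):
    """3x3 atrous (dilated) convolution with zero padding on an H x W grid."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di * rate, j + dj * rate
                    # Taps that fall outside the grid contribute zero.
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def spatial_attention(branches):
    """Per-pixel weight in (0, 1): sigmoid(mean over branches + max over branches).

    Assumption: a simplified, parameter-free stand-in for a learned attention layer.
    """
    h, w = len(branches[0]), len(branches[0][0])
    att = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [b[i][j] for b in branches]
            pooled = sum(vals) / len(vals) + max(vals)
            att[i][j] = 1.0 / (1.0 + math.exp(-pooled))
    return att


def multi_scale_attention_block(x, kernel, rates=(1, 2, 4)):
    """Average the dilated branches, then reweight each pixel by the attention map."""
    branches = [dilated_conv2d(x, kernel, r) for r in rates]
    att = spatial_attention(branches)
    h, w, n = len(x), len(x[0]), len(branches)
    return [
        [att[i][j] * (sum(b[i][j] for b in branches) / n) for j in range(w)]
        for i in range(h)
    ]


# Hypothetical toy input: a 5x5 map with one "small object" at the centre.
x = [[0.0] * 5 for _ in range(5)]
x[2][2] = 1.0
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
out = multi_scale_attention_block(x, kernel)
```

In this toy setup the centre pixel is reached by every dilation rate, so it receives both the largest fused response and the largest attention weight. A real network would use learned multi-channel convolutions and would be trained end to end.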

Keywords:
Computer science; Artificial intelligence; Pyramid (geometry); Monocular; Block (permutation group theory); Pattern recognition (psychology); Enhanced Data Rates for GSM Evolution; Computer vision; Pooling; Scale (ratio); Feature (linguistics); Depth map; Focus (optics); Benchmark (surveying); Spatial analysis; Image (mathematics); Mathematics; Geography; Remote sensing

Metrics

Cited By: 37
FWCI (Field Weighted Citation Impact): 2.56
Refs: 75
Citation Normalized Percentile: 0.91

Topics

Advanced Vision and Imaging
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Image Processing Techniques and Applications
Physical Sciences → Engineering → Media Technology
Advanced Image Processing Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Monocular Depth Estimation with Multi-Scale Attention

Bingyuan Wu, Yongxiong Wang

Journal: SSRN Electronic Journal Year: 2021

JOURNAL ARTICLE

DAR-MDE: Depth-Attention Refinement for Multi-Scale Monocular Depth Estimation

Saddam Abdulwahab, Hatem A. Rashwan, Moumen El-Melegy, Domènec Puig

Journal: Journal of Sensor and Actuator Networks Year: 2025 Vol: 14 (5) Pages: 90-90

JOURNAL ARTICLE

Multi-frame self-supervised monocular depth estimation with multi-scale feature enhancement

Qiqi Kou, Wei-Chen Wang, Chenggong Han, Chen Lü, Deqiang Cheng, Ying Ji

Journal: Optics and Precision Engineering Year: 2024 Vol: 32 (24) Pages: 3603-3615

JOURNAL ARTICLE

Monocular depth estimation with multi-view attention autoencoder

Geunho Jung, Sang Min Yoon

Journal: Multimedia Tools and Applications Year: 2022 Vol: 81 (23) Pages: 33759-33770