JOURNAL ARTICLE

TransDSSL: Transformer Based Depth Estimation via Self-Supervised Learning

Daechan Han, Jeongmin Shin, Namil Kim, Soonmin Hwang, Yukyung Choi

Year: 2022 Journal: IEEE Robotics and Automation Letters Vol: 7 (4) Pages: 10969-10976 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Recently, transformers have been widely adopted for various computer vision tasks and show promising results due to their ability to effectively encode long-range spatial dependencies in an image. However, very few studies have examined transformers for self-supervised depth estimation. When replacing the CNN architecture with a transformer in self-supervised learning of depth, we encounter several problems, such as a multi-scale photometric loss function that becomes problematic when used with transformers and an insufficient ability to capture local details. In this letter, we propose an attention-based decoder module, Pixel-Wise Skip Attention (PWSA), to enhance fine details in feature maps while keeping the global context from transformers. In addition, we propose combining a self-distillation loss with a single-scale photometric loss to alleviate the instability of transformer training by using correct training signals. We demonstrate that the proposed model makes accurate predictions on large objects and thin structures, which require global context and local details, respectively. Our model achieves state-of-the-art performance among self-supervised monocular depth estimation methods on the KITTI and DDAD benchmarks.

Keywords:
Computer science, Transformer, Artificial intelligence, Monocular, Pixel, Machine learning, Pattern recognition (psychology), Computer vision, Engineering, Voltage

Metrics

Cited By: 33
FWCI (Field Weighted Citation Impact): 4.09
Refs: 38
Citation Normalized Percentile: 0.94

Topics

Advanced Vision and Imaging
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Image Processing Techniques and Applications
Physical Sciences → Engineering → Media Technology
Optical measurement and interference techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

TinyDepth: Lightweight self-supervised monocular depth estimation based on transformer

Zeyu Cheng, Yi Zhang, Yang Yu, Zhe Song, Chengkai Tang

Journal: Engineering Applications of Artificial Intelligence Year: 2024 Vol: 138 Pages: 109313

JOURNAL ARTICLE

Self-Supervised Monocular Depth Estimation Using Hybrid Transformer Encoder

Seung-Jun Hwang, Sung Jun Park, Joong-Hwan Baek, Byungkyu Kim

Journal: IEEE Sensors Journal Year: 2022 Vol: 22 (19) Pages: 18762-18770

JOURNAL ARTICLE

Transformer-based Models for Supervised Monocular Depth Estimation

Arijit Gupta, A. Prince, Jac Fredo Agastinose Ronickom, F. Robert

Journal: 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP) Year: 2022 Vol: 3 Pages: 1-5