JOURNAL ARTICLE

Transformer-based Models for Supervised Monocular Depth Estimation

Arijit Gupta, A. Prince, Jac Fredo Agastinose Ronickom, F. Robert

Year: 2022
Journal: 2022 International Conference on Intelligent Controller and Computing for Smart Power (ICICCSP)
Vol: 3
Pages: 1-5

Abstract

Existing solutions for monocular depth estimation usually use convolutional networks as the backbone of their model architecture. This work presents an encoder-decoder network using a transformer architecture that can perform monocular depth estimation on a single RGB image. Environment perception and autonomous navigation systems, where depth estimation is done on edge devices, need lightweight and efficient models. It is shown that transformer-based architectures provide results comparable to those of the currently used convolutional networks with significantly fewer parameters. Unlike convolutional networks, transformers do not progressively downsample the input at each layer. Maintaining a similar resolution throughout the encoding process allows for global awareness at each stage. Two different decoder models are implemented on top of a transformer encoder, and their usability for depth estimation is evaluated. Compared with a comparable convolutional network on the KITTI outdoor dataset, the lighter transformer model performs better in terms of robustness and accuracy.

Keywords:
Computer science; Monocular; Encoder; Artificial intelligence; Convolutional neural network; Robustness (evolution); Transformer; Upsampling; Computer vision; Engineering; Voltage; Image (mathematics)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.14
Refs: 33
Citation Normalized Percentile: 0.41

Topics

Advanced Vision and Imaging
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition
Image Processing Techniques and Applications
Physical Sciences → Engineering → Media Technology
Advanced Image and Video Retrieval Techniques
Physical Sciences → Computer Science → Computer Vision and Pattern Recognition