JOURNAL ARTICLE

Depth-Aware Transformer for Aerial Localization

Jianjun Lei, Demin Tu, Bo Peng, Jie Zhu, Zhe Zhang, Chong Wu, Qingming Huang

Journal:   ACM Transactions on Multimedia Computing, Communications, and Applications Year: 2025 Vol: 22 (1) Pages: 1-16 Publisher: Association for Computing Machinery

Abstract

Deep learning-based visual localization has recently attracted significant attention and advanced considerably. Although previous visual localization methods perform well on indoor scenes and outdoor street scenes, few have addressed aerial scenes. In this article, a depth-aware aerial localization transformer (DALTR) is proposed to learn camera poses in real-world aerial scenes with the assistance of the depth map. To improve the network's ability to perceive aerial scenes, a multi-level depth embedding transformer module is presented that adaptively incorporates depth information into multiple levels of the transformer. In addition, to encourage the piecewise-smooth geometric characteristic of the scene coordinates, a depth-guided smoothness constraint is developed to provide additional supervision for scene coordinate regression. Extensive experimental results on aerial localization benchmark datasets demonstrate that the proposed DALTR achieves superior aerial localization performance.
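
The abstract does not give the form of the depth-guided smoothness constraint. As a rough illustration only, the sketch below uses a generic edge-aware smoothness penalty: differences between neighbouring scene coordinates are penalized, and the penalty is down-weighted where the depth map itself changes sharply. The function name, the L1 difference, the exponential weighting, and the `alpha` parameter are all assumptions made for this example, not the formulation used in DALTR.

```python
import math


def depth_guided_smoothness(coords, depth, alpha=1.0):
    """Generic edge-aware smoothness penalty (illustrative, not DALTR's loss).

    coords: H x W grid of (x, y, z) tuples, the predicted scene coordinates.
    depth:  H x W grid of floats, the depth map used as guidance.
    alpha:  hypothetical sharpness parameter; larger values relax the
            penalty more strongly across depth discontinuities.
    """
    height, width = len(depth), len(depth[0])
    total, pairs = 0.0, 0
    for r in range(height):
        for c in range(width):
            # Compare each pixel with its right and bottom neighbours.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 >= height or c2 >= width:
                    continue
                # L1 distance between neighbouring scene coordinates.
                coord_diff = sum(
                    abs(a - b) for a, b in zip(coords[r][c], coords[r2][c2])
                )
                # Weight shrinks toward 0 where depth jumps (likely an edge).
                weight = math.exp(-alpha * abs(depth[r][c] - depth[r2][c2]))
                total += weight * coord_diff
                pairs += 1
    # Mean weighted difference over all neighbour pairs.
    return total / pairs if pairs else 0.0


# Same coordinate jump, with and without a matching depth discontinuity.
coords = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
loss_flat_depth = depth_guided_smoothness(coords, [[1.0, 1.0]])
loss_depth_edge = depth_guided_smoothness(coords, [[1.0, 3.0]])
```

Under these assumptions, a jump in scene coordinates costs less when the depth map has a discontinuity at the same place, which is one way a piecewise-smooth prior can be expressed.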

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 16

Related Documents

JOURNAL ARTICLE

UTLNet: Uncertainty-Aware Transformer Localization Network for RGB-Depth Mirror Segmentation

Wujie Zhou, Yuqi Cai, Liting Zhang, Weiqing Yan, Lu Yu

Journal:   IEEE Transactions on Multimedia Year: 2023 Vol: 26 Pages: 4564-4574

JOURNAL ARTICLE

Local Perception-Aware Transformer for Aerial Tracking

Changhong Fu, Weiyu Peng, Sihang Li, Junjie Ye, Ziang Cao

Journal:   2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) Year: 2022 Pages: 12122-12129

JOURNAL ARTICLE

Context-aware Transformer Model for Crowd Localization

Yiming Gong, Kan Li

Journal:   2022 3rd International Conference on Computer Vision, Image and Deep Learning & International Conference on Computer Engineering and Applications (CVIDL & ICCEA) Year: 2022 Pages: 199-202

JOURNAL ARTICLE

Structure-Aware Cross-Modal Transformer for Depth Completion

Linqing Zhao, Yi Wei, Jianqin Li, Jie Zhou, Jiwen Lu

Journal:   IEEE Transactions on Image Processing Year: 2024 Vol: 33 Pages: 1016-1031