JOURNAL ARTICLE

Distillation-Based Hashing Transformer for Cross-Modal Vessel Image Retrieval

Jiaen Guo, Xin Guan, Ying Liu, Lu Yu

Year: 2023 Journal: IEEE Geoscience and Remote Sensing Letters Vol: 20 Pages: 1-5 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Cross-modal image retrieval has attracted much attention in remote sensing (RS) data analysis in recent years; however, the retrieval of target images such as surface vessels has received little attention. Because of the complex geometric features of vessel images and the modality gap, the widely used joint feature learning based on CNNs tends to yield low precision. In this letter, a distillation-based hashing transformer (DBHT) is proposed to address these problems. Specifically, we adopt the vision transformer (ViT) as the feature extractor for target images, and a hash token is designed and attended to the ViT for hash code generation. To avoid the loss of precision caused by the uncontrollability of common feature space construction, we design a two-step feature learning strategy: we first build a well-performing unimodal hashing retrieval framework and then transfer the hashing knowledge to the other modality. Two distillation strategies, together with a cross-modal weighted triplet loss, are designed to supervise this process and ensure complete knowledge transfer. Cross-modal weight transfer is also adopted to bridge the modality gap. Extensive experiments on two bimodal vessel image datasets show that the proposed DBHT outperforms several cross-modal hashing baselines in cross-modal vessel image retrieval tasks.
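
The two-step idea in the abstract (learn hash codes in one modality, then distill them into the other) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-squared-error distillation term, the scalar `weight`, and the `margin` value are all assumptions chosen for illustration, and the abstract does not specify how DBHT defines its two distillation strategies or its triplet weighting.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued hash-layer outputs to a {-1, +1} code via the sign."""
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distillation_loss(student: Sequence[float], teacher_code: Sequence[int]) -> float:
    """Assumed distillation term: MSE between the student's continuous
    outputs (second modality) and the teacher's binary code (first modality)."""
    if len(student) != len(teacher_code):
        raise ValueError("vectors must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher_code)) / len(student)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 2.0,   # hypothetical margin
    weight: float = 1.0,   # hypothetical per-triplet weight
) -> float:
    """Hinge-style triplet loss scaled by a scalar weight. The anchor comes
    from one modality; positive and negative come from the other."""
    hinge = squared_distance(anchor, positive) - squared_distance(anchor, negative) + margin
    return weight * max(0.0, hinge)


def rank_by_hamming(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming_distance(query, database[i]))


if __name__ == "__main__":
    teacher_code = binarize([0.9, -0.7, 0.2, -0.1])   # code from the first modality
    student_out = [0.5, -0.5, 0.5, -0.5]              # outputs from the second modality
    print(distillation_loss(student_out, teacher_code))
    database = [[1, -1, 1, -1], [-1, 1, -1, 1], [1, -1, -1, -1]]
    print(rank_by_hamming(binarize(student_out), database))
```

At query time only the binary codes are compared, which is why hashing retrieval is cheap: ranking reduces to Hamming distances between short codes, regardless of which modality produced the query.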

Keywords:
Hash function; Computer science; Artificial intelligence; Modal; Transformer; Image retrieval; Feature (linguistics); Feature extraction; Pattern recognition (psychology); Computer vision; Data mining; Image (mathematics); Engineering

Metrics

Cited By: 7
FWCI (Field Weighted Citation Impact): 1.27
Refs: 18
Citation Normalized Percentile: 0.76

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Deep Adversarial Cascaded Hashing for Cross-Modal Vessel Image Retrieval

Jiaen Guo, Xin Guan

Journal:   IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Year: 2023 Vol: 16 Pages: 2205-2220
JOURNAL ARTICLE

TECMH: Transformer-Based Cross-Modal Hashing For Fine-Grained Image-Text Retrieval

Qiqi Li, Longfei Ma, Zheng Jiang, Mingyong Li, Bo Jin

Journal: Computers, Materials & Continua Year: 2023 Vol: 75 (2) Pages: 3713-3728
JOURNAL ARTICLE

CKDH: CLIP-Based Knowledge Distillation Hashing for Cross-Modal Retrieval

Jiaxing Li, Wai Keung Wong, Lin Jiang, Xiaozhao Fang, Shengli Xie, Yong Xu

Journal: IEEE Transactions on Circuits and Systems for Video Technology Year: 2024 Vol: 34 (7) Pages: 6530-6541
JOURNAL ARTICLE

Text-Image Cross-modal Retrieval Based on Transformer

YANG Xiaoyu, LI Chao, CHEN Shunyao, LI Haoliang, YIN Guangqiang

Journal:   DOAJ (DOAJ: Directory of Open Access Journals) Year: 2023