T. J. Zhu, Jinyong Chen, Gang Wang
In low-altitude aerial drone data fusion, fusing infrared and visible-light images remains challenging due to large modal differences, insufficient cross-modal alignment, and limited global context modeling. Traditional methods struggle to extract complementary information across modalities, while deep learning methods either lack a sufficient global receptive field (convolutional neural networks) or fail to preserve local details (standard Transformers). To address these issues, we propose a Cross-modal Guided Dual-Branch Network (CGDBN) that combines convolutional neural networks with a Transformer architecture. Our framework makes four contributions: a Target-modal Feature Extraction Mechanism (TMFEM) module tailored to thermal characteristics, applied only to the infrared branch rather than to visible-light features; Simplified Linear Attention Blocks (SLABs), introduced as a module to improve global context capture; a Cross-Modal Interaction Mechanism (CMIM) module for bidirectional feature interaction; and a Density Adaptive Multimodal Fusion (DAMF) module that weights each modality's contribution based on content analysis. This asymmetric design reflects the fact that the two image types have different characteristics and require targeted processing. Experiments on the AVMS, M3FD, and TNO datasets show that the proposed model achieves a peak signal-to-noise ratio (PSNR) of 16.2497 on AVMS, 0.9971 higher than the best benchmark method, YDTR (15.2526); 16.5044 on M3FD, 0.7480 higher than YDTR (15.7564); and 17.3956 on TNO, 0.7934 higher than YDTR (16.6022). On all other metrics, the proposed model also ranks among the top of the compared methods. The method has broad application prospects in fields such as drone data fusion.
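The abstract's SLAB module builds on linear attention, which replaces softmax attention's quadratic cost in sequence length with a kernel feature map so that attention scales linearly with the number of tokens. The sketch below shows the generic linear-attention trick in NumPy; the feature map `phi` (ReLU plus a small offset) and all variable names are illustrative assumptions, not the paper's actual SLAB design.

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernel-based linear attention: O(n) in sequence length n.

    phi is a non-negative feature map (here ReLU + eps, an assumed
    choice) so the normalizer Z stays strictly positive.
    """
    phi = lambda x: np.maximum(x, 0.0) + eps
    Qp, Kp = phi(Q), phi(K)              # (n, d) each
    KV = Kp.T @ V                        # (d, d_v): computed once, linear in n
    Z = Qp @ Kp.sum(axis=0)              # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]        # (n, d_v)

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = rng.standard_normal((3, n, d))
out = linear_attention(Q, K, V)
```

Because `phi` is non-negative, each output row is a convex combination of the rows of `V`, just as in softmax attention, but the `(n, n)` attention matrix is never materialized.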
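The DAMF module is described as weighting modal contributions based on content analysis. A minimal illustration of that idea, assuming local variance as the "content" measure (the paper's actual density analysis is not specified in the abstract, and `win` and all names here are hypothetical):

```python
import numpy as np

def density_adaptive_fusion(ir, vis, win=3, eps=1e-6):
    """Fuse two single-channel images with per-pixel weights driven
    by local activity (window variance) -- an illustrative stand-in
    for DAMF's content analysis, not the paper's module."""
    def local_var(x):
        pad = win // 2
        xp = np.pad(x, pad, mode="reflect")
        h, w = x.shape
        out = np.empty((h, w), dtype=float)
        # Plain loops keep the sketch readable; a cumulative-sum
        # sliding window would be faster for real image sizes.
        for i in range(h):
            for j in range(w):
                out[i, j] = xp[i:i + win, j:j + win].var()
        return out

    a_ir, a_vis = local_var(ir), local_var(vis)
    w_ir = (a_ir + eps) / (a_ir + a_vis + 2 * eps)   # in (0, 1)
    return w_ir * ir + (1.0 - w_ir) * vis

rng = np.random.default_rng(1)
ir, vis = rng.random((6, 6)), rng.random((6, 6))
fused = density_adaptive_fusion(ir, vis)
```

Since the weight lies in (0, 1), every fused pixel is a convex combination of the two input pixels, so high-activity regions of either modality dominate locally.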