Infrared-Visible Image Fusion Meets Object Detection: Towards Unified Optimization for Multimodal Perception

X. Q. Xiang; Guangyao Zhou; Ben Niu; Zongxu Pan; Lijia Huang; Wenshuai Li; Zixiao Wen; Jiamin Qi; Wanxin Gao

doi:10.3390/rs17213637

JOURNAL ARTICLE

Year: 2025; Journal: Remote Sensing; Vol: 17 (21); Pages: 3637-3637; Publisher: Multidisciplinary Digital Publishing Institute

Abstract

Infrared-visible image fusion and object detection are crucial components in remote sensing applications, each offering unique advantages. Recent research has increasingly sought to combine these tasks to enhance object detection performance. However, the integration of these tasks presents several challenges, primarily due to two overlooked issues: (i) existing infrared-visible image fusion methods often fail to adequately focus on fine-grained or dense information, and (ii) while joint optimization methods can improve fusion quality and downstream task performance, their multi-stage training processes often reduce efficiency and limit the network’s global optimization capability. To address these challenges, we propose the UniFusOD method, an efficient end-to-end framework that simultaneously optimizes both infrared-visible image fusion and object detection tasks. The method integrates Fine-Grained Region Attention (FRA) for region-specific attention operations at different granularities, enhancing the model’s ability to capture complex information. Furthermore, UnityGrad is introduced to balance the gradient conflicts between fusion and detection tasks, stabilizing the optimization process. Extensive experiments demonstrate the superiority and robustness of our approach. Not only does UniFusOD achieve excellent results in image fusion, but it also provides significant improvements in object detection performance. The method exhibits remarkable robustness across various tasks, achieving a 0.8 and 1.9 mAP50 improvement over state-of-the-art methods on the DroneVehicle dataset for rotated object detection and the M3FD dataset for horizontal object detection, respectively.
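The abstract does not specify how UnityGrad resolves gradient conflicts between the fusion and detection tasks. As a rough illustration of what a gradient conflict is and one common way to handle it, the sketch below applies a PCGrad-style projection to two task gradients. This is a swapped-in, well-known technique, not the authors' UnityGrad; the function names and the flat-list gradient representation are assumptions made for the example.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equally sized gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out_conflict(g: Sequence[float], other: Sequence[float]) -> List[float]:
    """Remove from g its component along `other` when the two conflict.

    Two gradients are treated as conflicting when their inner product
    is negative. This is the PCGrad projection rule, used here only as
    an illustrative stand-in (assumption: it is NOT UnityGrad).
    """
    d = dot(g, other)
    norm_sq = dot(other, other)
    if d >= 0.0 or norm_sq == 0.0:
        return list(g)  # no conflict: leave the gradient unchanged
    scale = d / norm_sq
    return [x - scale * y for x, y in zip(g, other)]


def combine_task_gradients(
    g_fusion: Sequence[float], g_detection: Sequence[float]
) -> List[float]:
    """Sum the two task gradients after deconflicting each against the other."""
    gf = project_out_conflict(g_fusion, g_detection)
    gd = project_out_conflict(g_detection, g_fusion)
    return [a + b for a, b in zip(gf, gd)]


# Hypothetical gradients of a shared parameter vector for each task.
g_fusion = [1.0, 0.0]
g_detection = [-1.0, 1.0]  # negative inner product with g_fusion: a conflict
update = combine_task_gradients(g_fusion, g_detection)
```

In this toy case the combined update has a non-negative inner product with both original task gradients, so to first order a step along it does not increase either task's loss. A real joint fusion and detection model would apply such a rule to the gradients of shared parameters only.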

Related Documents

MAFTNet: Multimodal Adaptive Fusion-based Transformer Network for Infrared and Visible Image UAV Object Detection

Object Detection in Visible and Infrared missile borne fusion image

Visible and Infrared Image Fusion for Object Detection: A Survey

Visible-Infrared Features Fusion Based Object Detection

A Lightweight Infrared and Visible Image Fusion Method for Object Detection