JOURNAL ARTICLE

Multimodal Feature Fusion YOLOv5 for RGB-T Object Detection

Abstract

Multimodal image pairs (e.g., visible and thermal images) provide complementary pixel information and can improve the robustness and reliability of object detection in applications such as autonomous driving and video surveillance. To exploit the useful information in both modalities, this paper proposes a multimodal feature fusion network based on YOLOv5. A multimodal feature fusion adaptive weighting module is designed to perform feature extraction and fusion at three scales in the network, so that the features of both modalities are used as fully as possible. Experiments show that the proposed multimodal object detection network (MFF-YOLOv5) outperforms current state-of-the-art (SOTA) methods on two public datasets.
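
The abstract does not say how the adaptive weights are computed, so the following is only a minimal sketch of one common form of adaptive weighted fusion, not the MFF-YOLOv5 module itself. It assumes that each modality gets a scalar score per scale (here simply the global mean of its feature map, a stand-in for what would be a learned scoring layer), that the two scores are normalized with a softmax, and that the fused map is the weighted sum. The function names (global_mean, adaptive_weights, fuse, fuse_pyramid) are hypothetical and chosen for illustration.

```python
import math


def global_mean(feature_map):
    """Average all values of a 2-D feature map given as a list of rows."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def adaptive_weights(rgb_map, thermal_map):
    """Return (w_rgb, w_thermal), a softmax over one scalar score per modality.

    Assumption: the score is the global mean of the map. A trained network
    would normally produce this score with learned layers instead.
    """
    scores = [global_mean(rgb_map), global_mean(thermal_map)]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return exps[0] / total, exps[1] / total


def fuse(rgb_map, thermal_map):
    """Fuse two same-sized feature maps as a weighted element-wise sum."""
    w_rgb, w_thermal = adaptive_weights(rgb_map, thermal_map)
    return [
        [w_rgb * r + w_thermal * t for r, t in zip(rgb_row, thermal_row)]
        for rgb_row, thermal_row in zip(rgb_map, thermal_map)
    ]


def fuse_pyramid(rgb_pyramid, thermal_pyramid):
    """Fuse each scale independently, e.g. the three scales named in the abstract."""
    return [fuse(r, t) for r, t in zip(rgb_pyramid, thermal_pyramid)]


if __name__ == "__main__":
    # Toy feature maps at three scales (4x4, 2x2, 1x1), values made up.
    rgb = [[[0.2] * 4 for _ in range(4)], [[0.5] * 2 for _ in range(2)], [[0.9]]]
    thermal = [[[0.8] * 4 for _ in range(4)], [[0.5] * 2 for _ in range(2)], [[0.1]]]
    for scale, fused in enumerate(fuse_pyramid(rgb, thermal)):
        print(scale, fused[0][0])
```

Computing the weights separately at each scale lets the balance between visible and thermal features differ between coarse and fine feature maps, which is one plausible reading of "fusion at three scales".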

Keywords:
Computer science; Artificial intelligence; Robustness (evolution); Feature extraction; Feature (linguistics); Computer vision; Pattern recognition (psychology); RGB color model; Object detection; Weighting; Pixel; Fusion

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 0.74
Refs: 23
Citation Normalized Percentile: 0.69

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Revisiting Feature Fusion for RGB-T Salient Object Detection

Qiang Zhang, Tonglin Xiao, Nianchang Huang, Dingwen Zhang, Jungong Han

Journal: IEEE Transactions on Circuits and Systems for Video Technology Year: 2020 Vol: 31 (5) Pages: 1804-1818

JOURNAL ARTICLE

RGB-D salient object detection based on multimodal feature information fusion

Ling‐bing Meng, MengYa Yuan, Xuehan Shi, Qingqing Liu, Weiwei Duan, Fei Cheng, Lingli Li

Journal: Third International Conference on Computer Vision and Data Mining (ICCVDM 2022) Year: 2023 Pages: 4-4

JOURNAL ARTICLE

MFFNet: Multimodal feature fusion network for RGB-D transparent object detection

Li Zhu, Tuanjie Li, Yuming Ning, Yan Zhang

Journal: International Journal of Advanced Robotic Systems Year: 2024 Vol: 21 (5)

JOURNAL ARTICLE

Edge-guided feature fusion network for RGB-T salient object detection

Yuanlin Chen, Zhenan Sun, Cheng Yan, Ming Zhao

Journal: Frontiers in Neurorobotics Year: 2024 Vol: 18 Pages: 1489658-1489658