JOURNAL ARTICLE

Multi‐modal object detection via transformer network

Wenbing Liu, Haibo Wang, Quanxue Gao, Zhaorui Zhu

Year: 2023   Journal: IET Image Processing   Vol: 17 (12)   Pages: 3541-3550   Publisher: Institution of Engineering and Technology

Abstract

Because single‐modal data usually contain limited information, considerable effort has gone into exploiting, in various ways, the complementary information contained in multi‐modal data. This paper therefore presents an object detection method designed to make full use of multi‐modal data. First, the method introduces a transformer mechanism to fuse the intra‐modal and inter‐modal features of the different modalities. The aim is to exploit the complementarity between modalities, which helps improve the performance of multi‐modal object detection. Second, a contrastive loss is applied so that label information is used effectively. Extensive experiments on multiple object detection datasets demonstrate the effectiveness of the proposed method.
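
The abstract names two ingredients, transformer-based fusion of intra-modal and inter-modal features and a label-aware contrastive loss, but gives no architectural detail. The following is only a minimal NumPy sketch of how such pieces are commonly built, not the authors' implementation. The function names (attention, fuse_modalities, supervised_contrastive_loss), the absence of learned projections, heads, and normalization layers, the token-wise concatenation, and the temperature value are all assumptions made for illustration. The loss follows the widely used supervised contrastive formulation, which may differ from the loss used in the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d), axis=-1)
    return weights @ value


def fuse_modalities(tokens_a, tokens_b):
    """Hypothetical fusion of two modalities' token features.

    tokens_a: (Na, d) array, tokens_b: (Nb, d) array.
    Assumption: no learned Q/K/V projections, a single head, plain residuals.
    """
    # Intra-modal step: each modality attends to its own tokens.
    a = tokens_a + attention(tokens_a, tokens_a, tokens_a)
    b = tokens_b + attention(tokens_b, tokens_b, tokens_b)
    # Inter-modal step: each modality queries the other one.
    a_fused = a + attention(a, b, b)
    b_fused = b + attention(b, a, a)
    # Assumption: fused tokens are simply stacked along the token axis.
    return np.concatenate([a_fused, b_fused], axis=0)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of at least two embeddings.

    Samples sharing a label are treated as positives for each other.
    """
    labels = np.asarray(labels)
    n = labels.shape[0]
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-sum-exp over every other sample in the batch (self excluded).
    logits = np.where(not_self, sim, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner in this batch
    per_anchor = -(log_prob * positives).sum(axis=1)[has_pos] / pos_count[has_pos]
    return float(per_anchor.mean())


# Illustrative usage with random stand-in features (not real detection data).
rng = np.random.default_rng(0)
modality_a = rng.normal(size=(5, 8))  # e.g. 5 tokens from one modality
modality_b = rng.normal(size=(3, 8))  # e.g. 3 tokens from another modality
fused = fuse_modalities(modality_a, modality_b)
loss = supervised_contrastive_loss(fused, [0, 0, 1, 1, 2, 2, 0, 1])
```

In this sketch the loss is small when embeddings with the same label point in similar directions and grows when same-label embeddings are far apart, which is one way label information can shape the fused features.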

Keywords:
Modal; Computer science; Transformer; Artificial intelligence; Information loss; Data mining; Object detection; Complementarity (molecular biology); Object (grammar); Pattern recognition (psychology); Machine learning; Voltage; Engineering

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.73
Refs: 38
Citation Normalized Percentile: 0.66

Topics

Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

BOOK-CHAPTER

Multi-scale Cross-Modal Transformer Network for RGB-D Object Detection

Zhibin Xiao, Pengwei Xie, Guijin Wang

Lecture Notes in Computer Science   Year: 2022   Pages: 352-363

JOURNAL ARTICLE

Lightweight Transformer for Multi-Modal Object Detection (Student Abstract)

Yue Cao, Yanshuo Fan, Junchi Bin, Zheng Liu

Journal: Proceedings of the AAAI Conference on Artificial Intelligence   Year: 2023   Vol: 37 (13)   Pages: 16172-16173

JOURNAL ARTICLE

Multi-Modal Transformer for RGB-D Salient Object Detection

Peipei Song, Jing Zhang, Piotr Koniusz, Nick Barnes

Journal: 2022 IEEE International Conference on Image Processing (ICIP)   Year: 2022   Pages: 2466-2470