JOURNAL ARTICLE

Large-Margin Multi-Modal Deep Learning for RGB-D Object Recognition

Anran Wang, Jiwen Lu, Jianfei Cai, Tat‐Jen Cham, Gang Wang

Year: 2015 Journal: IEEE Transactions on Multimedia Vol: 17 (11) Pages: 1887-1898 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Most existing feature learning-based methods for RGB-D object recognition either combine RGB and depth data in an undifferentiated manner from the outset or learn features from color and depth separately. These approaches do not adequately exploit the different characteristics of the two modalities or utilize the relationship shared between them. In this paper, we propose a general CNN-based multi-modal learning framework for RGB-D object recognition. We first construct deep CNN layers for color and depth separately, which are then connected with a carefully designed multi-modal layer. This layer is designed not only to discover the most discriminative features for each modality but also to harness the complementary relationship between the two modalities. The results of the multi-modal layer are back-propagated to update the parameters of the CNN layers, and the multi-modal feature learning and the back-propagation are performed iteratively until convergence. Experimental results on two widely used RGB-D object datasets show that our method for general multi-modal learning achieves performance comparable to state-of-the-art methods specifically designed for RGB-D data.
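
The forward pass outlined in the abstract (two modality-specific streams joined by a multi-modal layer and scored with a large-margin objective) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the random vectors stand in for CNN outputs, and the `fuse` function, the layer sizes, the shared projection, and the multiclass hinge loss are all assumptions chosen only to make the idea concrete.

```python
import random

def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def multiclass_hinge(scores, label, margin=1.0):
    """Large-margin loss: penalize classes scoring within `margin` of the true class."""
    true_score = scores[label]
    return sum(max(0.0, margin + s - true_score)
               for k, s in enumerate(scores) if k != label)

def fuse(rgb_feat, depth_feat, params):
    """Hypothetical multi-modal layer: per-modality projections plus a shared one."""
    rgb_part = relu(linear(params["rgb"], rgb_feat))
    depth_part = relu(linear(params["depth"], depth_feat))
    # The shared projection sees both modalities at once (assumed design).
    shared = relu(linear(params["shared"], rgb_feat + depth_feat))
    return rgb_part + depth_part + shared  # list concatenation

rng = random.Random(0)
rgb_feat = [rng.random() for _ in range(8)]    # stand-in for RGB-stream CNN output
depth_feat = [rng.random() for _ in range(8)]  # stand-in for depth-stream CNN output
params = {
    "rgb": random_matrix(4, 8, rng),
    "depth": random_matrix(4, 8, rng),
    "shared": random_matrix(4, 16, rng),
}
classifier = random_matrix(3, 12, rng)  # 3 hypothetical object classes

fused = fuse(rgb_feat, depth_feat, params)
scores = linear(classifier, fused)
loss = multiclass_hinge(scores, label=0)
```

In a full training loop, the gradient of `loss` would be propagated back through the fusion layer into both streams, which is the alternating update the abstract describes; that step is omitted here.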

Keywords:
Computer science, RGB color model, Artificial intelligence, Margin (machine learning), Discriminative model, Modal, Modality (human–computer interaction), Feature (linguistics), Feature learning, Pattern recognition (psychology), Deep learning, Object (grammar), Computer vision, Feature extraction, Modalities, Machine learning

Metrics

Cited By: 152
FWCI (Field Weighted Citation Impact): 13.57
Refs: 76
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Video Surveillance and Tracking Methods
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Multi-modal deep feature learning for RGB-D object detection

Xiangyang Xu, Yuncheng Li, Gangshan Wu, Jiebo Luo

Journal: Pattern Recognition Year: 2017 Vol: 72 Pages: 300-313

JOURNAL ARTICLE

Deep sensorimotor learning for RGB-D object recognition

Spyridon Thermos, Georgios Th. Papadopoulos, Petros Daras, Gerasimos Potamianos

Journal: Computer Vision and Image Understanding Year: 2019 Vol: 190 Pages: 102844-102844

JOURNAL ARTICLE

RGB-D based multi-modal deep learning for spacecraft and debris recognition

Nouar AlDahoul, Hezerul Abdul Karim, Mhd Adel Momo

Journal: Scientific Reports Year: 2022 Vol: 12 (1) Pages: 3924-3924