JOURNAL ARTICLE

ACM: Adaptive Cross-Modal Graph Convolutional Neural Networks for RGB-D Scene Recognition

Yuan Yuan, Zhitong Xiong, Qi Wang

Year: 2019 Journal: Proceedings of the AAAI Conference on Artificial Intelligence Vol: 33 (01) Pages: 9176-9184 Publisher: Association for the Advancement of Artificial Intelligence

Abstract

RGB image classification has improved significantly with the resurgence of deep convolutional neural networks. However, mono-modal deep models for RGB images still have several limitations when applied to RGB-D scene recognition. 1) Images for scene classification usually contain more than one typical object with a flexible spatial distribution, so object-level local features should be considered in addition to the global scene representation. 2) Multi-modal features in RGB-D scene classification are still under-utilized. Simply combining modality-specific features suffers from the semantic gaps between different modalities. 3) Most existing methods neglect the complex relationships among features from multiple modalities. Considering these limitations, this paper proposes an adaptive cross-modal (ACM) feature learning framework based on graph convolutional neural networks for RGB-D scene recognition. To make better use of modality-specific cues, the approach mines the intra-modality relationships among the selected local features from one modality. To leverage multi-modal knowledge more effectively, the approach models the inter-modality relationships between the two modalities through a cross-modal graph (CMG). We evaluate the proposed method on two public RGB-D scene classification datasets, SUN-RGBD and NYUD V2, and it achieves state-of-the-art performance.
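
The idea of relating local features within and across modalities through a graph can be illustrated with a small sketch. The code below is not the paper's implementation: it assumes a standard graph-convolution layer with symmetric normalization, a toy graph of two RGB nodes and two depth nodes, fully connected intra-modality edges, and hand-picked inter-modality edges. All function names (`build_cross_modal_adjacency`, `gcn_layer`, and so on) and all feature values are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(normalized_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def build_cross_modal_adjacency(n_rgb, n_depth, cross_pairs):
    """Assumed graph layout: RGB nodes first, then depth nodes.

    Nodes of the same modality are fully connected (intra-modality edges);
    each (rgb_index, depth_index) pair adds one inter-modality edge.
    """
    n = n_rgb + n_depth
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            same_modality = (i < n_rgb) == (j < n_rgb)
            if i != j and same_modality:
                adj[i][j] = 1.0
    for r, d in cross_pairs:
        adj[r][n_rgb + d] = 1.0
        adj[n_rgb + d][r] = 1.0
    return adj


# Toy local features (made-up values), two per modality, two channels each.
rgb_feats = [[1.0, 0.0], [0.5, 0.5]]
depth_feats = [[0.0, 1.0], [0.2, 0.8]]

adj = build_cross_modal_adjacency(2, 2, cross_pairs=[(0, 0), (1, 1)])
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weight, so only the graph mixing is visible
updated = gcn_layer(adj, rgb_feats + depth_feats, weight)

for node, row in enumerate(updated):
    print(node, [round(v, 3) for v in row])
```

With the identity weight, each updated node feature is a normalized average of its own feature, its same-modality neighbor, and its linked node in the other modality, which is the sense in which a single graph can carry both intra- and inter-modality relationships. A learned weight matrix and a learned or adaptive edge set would replace the fixed choices made here.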

Keywords:
Artificial intelligence; Computer science; Convolutional neural network; RGB color model; Modal; Pattern recognition (psychology); Modality (human–computer interaction); Graph; Feature (linguistics); Computer vision; Leverage (statistics); Crossmodal; Visual perception; Perception

Metrics

Cited By: 38
FWCI (Field Weighted Citation Impact): 3.01
Refs: 44
Citation Normalized Percentile: 0.93

Topics

Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Advanced Neural Network Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Visual Attention and Saliency Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Cross-Modal Pyramid Translation for RGB-D Scene Recognition

Dapeng Du, Limin Wang, Zhaoyang Li, Gangshan Wu

Journal: International Journal of Computer Vision Year: 2021 Vol: 129 (8) Pages: 2309-2327

JOURNAL ARTICLE

Cross-Modality Compensation Convolutional Neural Networks for RGB-D Action Recognition

Jun Cheng, Ziliang Ren, Qieshi Zhang, Xiangyang Gao, Fusheng Hao

Journal: IEEE Transactions on Circuits and Systems for Video Technology Year: 2021 Vol: 32 (3) Pages: 1498-1509

JOURNAL ARTICLE

Anisotropic Convolutional Neural Networks for RGB-D based Semantic Scene Completion

Jie Li, Peng Wang, Kai Han, Yu Liu

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence Year: 2021 Vol: 44 (11) Pages: 1-1