JOURNAL ARTICLE

A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification

Lingyun Song, Jun Liu, Buyue Qian, Mingxuan Sun, Kuan Yang, Meng Sun, Samar Abbas

Year: 2018 Journal: IEEE Transactions on Image Processing Vol: 27 (12) Pages: 6025-6038 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Deep convolutional neural networks (CNNs) have shown superior performance on the task of single-label image classification. However, the applicability of CNNs to multi-label images still remains an open problem, mainly because of two reasons. First, each image is usually treated as an inseparable entity and represented as one instance, which mixes the visual information corresponding to different labels. Second, the correlations amongst labels are often overlooked. To address these limitations, we propose a deep multi-modal CNN for multi-instance multi-label image classification, called MMCNN-MIML. By combining CNNs with multi-instance multi-label (MIML) learning, our model represents each image as a bag of instances for image classification and inherits the merits of both CNNs and MIML. In particular, MMCNN-MIML has three main appealing properties: 1) it can automatically generate instance representations for MIML by exploiting the architecture of CNNs; 2) it takes advantage of the label correlations by grouping labels in its later layers; and 3) it incorporates the textual context of label groups to generate multi-modal instances, which are effective in discriminating visually similar objects belonging to different groups. Empirical studies on several benchmark multi-label image data sets show that MMCNN-MIML significantly outperforms the state-of-the-art baselines on multi-label image classification tasks.
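The bag-of-instances idea in the abstract can be illustrated with a minimal sketch. It is not the MMCNN-MIML architecture: it assumes a plain linear scorer per label and max-pooling over instances, a common way to turn instance-level scores into bag-level label predictions in MIML. The label grouping and textual context described above are not modeled, and all names (`instance_scores`, `bag_label_probs`, `predict_labels`) and the toy weights are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def instance_scores(
    bag: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[List[float]]:
    """Score every instance in the bag against every label (linear scorer)."""
    return [
        [
            sum(w * f for w, f in zip(weights[label], instance)) + biases[label]
            for label in range(len(weights))
        ]
        for instance in bag
    ]


def bag_label_probs(
    bag: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Max-pool instance scores per label, then squash to a probability.

    Assumption: a label applies to the image if at least one instance
    supports it, so the strongest instance decides each label.
    """
    scores = instance_scores(bag, weights, biases)
    return [
        sigmoid(max(row[label] for row in scores))
        for label in range(len(weights))
    ]


def predict_labels(
    bag: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    threshold: float = 0.5,
) -> List[bool]:
    """Threshold the bag-level probabilities into a multi-label prediction."""
    return [p >= threshold for p in bag_label_probs(bag, weights, biases)]


# Toy image: two instances (e.g. region features), three candidate labels.
bag = [[1.0, 0.0], [0.0, 1.0]]
weights = [[2.0, -2.0], [-2.0, 2.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
print(predict_labels(bag, weights, biases))
```

In this toy bag, each of the first two labels is supported by a different instance, which is the situation a single image-level representation would blur together.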

Keywords:
Artificial intelligence; Convolutional neural network; Computer science; Pattern recognition (psychology); Image (mathematics); Contextual image classification; Context (archaeology); Benchmark (surveying); Modal; Multi-label classification; Machine learning

Metrics

Cited By: 107
FWCI (Field Weighted Citation Impact): 10.13
Refs: 83
Citation Normalized Percentile: 0.98
Is in top 1%
Is in top 10%

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Image and Video Retrieval Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Multi-instance multi-label image classification: A neural approach

Zenghai Chen, Zheru Chi, Hong Fu, Dagan Feng

Journal: Neurocomputing Year: 2012 Vol: 99 Pages: 298-306
JOURNAL ARTICLE

Deep Multi-Instance Multi-Label Learning for Image Annotation

Haifeng Guo, Lixin Han, Shoubao Su, Zhoubao Sun

Journal: International Journal of Pattern Recognition and Artificial Intelligence Year: 2017 Vol: 32 (03) Pages: 1859005-1859005
JOURNAL ARTICLE

Deep Multi-Label Multi-Instance Classification on 12-Lead ECG

Yingjing Feng, Edward J. Vigmond

Journal:   Computing in cardiology Year: 2020