JOURNAL ARTICLE

Multi-modal Contextual Prompt Learning for Multi-label Classification with Partial Labels

Abstract

Multi-label classification has diverse applications, but current algorithms rely heavily on accurately labeled data, making data collection time-consuming and labor-intensive. Multi-label classification with partial labels therefore presents significant challenges. In this study, we propose Multi-modal Contextual Prompt Learning (MCPL), a novel approach that leverages large-scale vision-language models and exploits the strong image-text alignment in CLIP to address the scarcity of label annotations. We pre-train the vision-language model's encoder on a large number of image-text pairs, and we introduce multi-modal contextual prompt learning on both images and label text to better exploit the image-label correspondence within CLIP, improving multi-label classification performance even when only partial labels are available. We further use a coupling function to couple the two modalities, realizing an interactive connection between the two modal prompts. Extensive experiments on the MS-COCO and VOC2007 datasets demonstrate the superiority of MCPL, which achieves competitive performance.
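The coupling function mentioned above can be illustrated with a minimal sketch: learnable text-side prompt vectors are mapped through a shared projection to produce the visual-side prompts, so the two modal prompts stay interactively connected. All names (`couple_prompts`, `coupling_matrix`) and dimensions here are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of a prompt-coupling function (assumed design,
# not the paper's exact method): visual prompts are derived from
# learnable text prompts via a linear projection, so updating the
# text prompts also updates the visual prompts.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def couple_prompts(text_prompts, coupling_matrix):
    """Coupling function: project each text-prompt vector into the
    visual-prompt space with a shared (in practice, learned) matrix."""
    return [matvec(coupling_matrix, p) for p in text_prompts]

# Two toy text-prompt vectors of dimension 3.
text_prompts = [[1.0, 0.0, 2.0],
                [0.5, 1.0, 0.0]]

# Toy coupling matrix projecting 3-d text-prompt space to 2-d
# visual-prompt space; in a real model this would be trained jointly.
coupling_matrix = [[1.0, 0.0, 0.5],
                   [0.0, 1.0, 1.0]]

visual_prompts = couple_prompts(text_prompts, coupling_matrix)
print(visual_prompts)  # [[2.0, 2.0], [0.5, 1.0]]
```

In a full model the coupling matrix would be optimized together with the prompts and CLIP's frozen encoders, so gradients from the classification loss flow through both modalities.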

Keywords:
Computer science, Artificial intelligence, Encoder, Multi-label classification, Machine learning, Contextual image classification, Pattern recognition, Natural language processing

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 35
Citation Normalized Percentile: 0.07

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
© 2026 ScienceGate Book Chapters — All rights reserved.