Weakly supervised learning is an essential problem in computer vision tasks such as image classification and object recognition, because it is expected to work in scenarios where a large dataset with clean labels is not available. Although there are a number of studies on weakly supervised image classification, they are usually limited to either single-label or multi-label scenarios. In this work, we propose an effective approach for weakly supervised image classification that utilizes massive noisily labeled data together with only a small set of clean labels (e.g., 5%). The proposed approach consists of a clean net and a residual net. The clean net learns a mapping from the feature space to the clean label space, and the residual net learns a residual mapping from the feature space to the residual between clean labels and noisy labels; the two are trained jointly in a multi-task learning manner. The residual net thus works as a regularization term that improves the training of the clean net. We evaluate the proposed approach on two multi-label datasets (OpenImage and MS COCO2014) and a single-label dataset (Clothing1M). Experimental results show that the proposed approach outperforms state-of-the-art methods and generalizes well to both single-label and multi-label scenarios.
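The two-branch, multi-task objective described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: both branches are reduced to single linear heads over precomputed features, the noisy-label prediction is assumed to be the sigmoid of the summed clean and residual logits, and the names `multitask_loss`, `W_clean`, `W_res`, `clean_mask`, and the weight `alpha` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(p, y, eps=1e-7):
    """Mean binary cross-entropy over all samples and labels."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def multitask_loss(features, noisy_labels, clean_labels, clean_mask,
                   W_clean, W_res, alpha=1.0):
    """Joint loss for a clean head and a residual head (illustrative only).

    features:     (N, D) image features
    noisy_labels: (N, C) noisy multi-hot labels, available for every sample
    clean_labels: (N, C) clean labels, only meaningful where clean_mask is True
    clean_mask:   (N,) boolean, True for the small verified subset
    """
    clean_logits = features @ W_clean   # clean net: features -> clean label space
    res_logits = features @ W_res       # residual net: features -> label residual

    # Assumption: clean logits plus residual logits explain the noisy labels.
    noisy_pred = sigmoid(clean_logits + res_logits)
    loss_noisy = bce(noisy_pred, noisy_labels)

    # The clean head is supervised directly only on the verified subset.
    if clean_mask.any():
        clean_pred = sigmoid(clean_logits[clean_mask])
        loss_clean = bce(clean_pred, clean_labels[clean_mask])
    else:
        loss_clean = 0.0

    return loss_noisy + alpha * loss_clean


# Toy usage: 100 samples, 8-dim features, 5 labels, about 5% verified.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
Y_noisy = (rng.random((100, 5)) > 0.5).astype(float)
Y_clean = Y_noisy.copy()
mask = rng.random(100) < 0.05
W_c = np.zeros((8, 5))
W_r = np.zeros((8, 5))
loss = multitask_loss(X, Y_noisy, Y_clean, mask, W_c, W_r)
```

In this sketch the noisy-label term touches every sample, while the clean-label term touches only the verified subset, so the residual head absorbs the discrepancy between the two label sources and acts as a regularizer on the clean head. At test time only the clean head would be used.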
Jihong Ouyang, Yiming Wang, Ximing Li, Changchun Li
Maria Presa-Reyes, Shu‐Ching Chen
Tao Zhang, Chen Gong, Wenjing Jia, Xiaoning Song, Jun Sun, Xiao‐Jun Wu
Lequan Wang, Jin Duali, Ziqiang Chen, Guangqiu Chen, Gaotian Liu
Julio Silva-Rodríguez, Arne Schmidt, María A. Sales, Rafael Molina, Valery Naranjo