Feng Ding, Xiu Liu, Xinyi Wang, Fangming Zhong
Cross-modal retrieval with deep neural networks relies heavily on accurate annotation. However, existing methods easily suffer from the scarcity and unreliability of annotations because manual labeling is expensive, and noisy labels are inevitably introduced during labeling. It is therefore worthwhile to explore the potential of learning with noisy labels in cross-modal retrieval. In this work, we propose a novel framework entitled Dual-Mix for Cross-Modal Retrieval with noisy labels (DMCM). It consists of two components: mixing robust loss functions, and mixing augmentation for noisy samples. In the first mixing stage, the normalized generalized cross entropy and the mean absolute error are combined so that each compensates for the other's weaknesses. Then, after separating clean and noisy samples with a Beta Mixture Model, we mix these samples via augmentation to further address the scarcity of labeled samples. Extensive experiments demonstrate the significant superiority of our DMCM.
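The two mixing ingredients named in the abstract can be illustrated concretely. Below is a minimal sketch, assuming standard formulations of the normalized generalized cross entropy (NGCE) and mean absolute error (MAE) losses and a vanilla mixup-style augmentation; the weighting coefficients `alpha`, `beta`, the exponent `q`, and the function names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ngce_loss(probs, labels, q=0.7):
    """Normalized generalized cross entropy.

    GCE per sample is (1 - p_y^q) / q; normalizing by the sum of this
    quantity over all classes bounds the loss and improves robustness
    to label noise.
    """
    n = np.arange(len(labels))
    num = (1.0 - probs[n, labels] ** q) / q
    den = ((1.0 - probs ** q) / q).sum(axis=1)
    return (num / den).mean()

def mae_loss(probs, labels):
    """MAE between the prediction and a one-hot target: 2 * (1 - p_y)."""
    n = np.arange(len(labels))
    return (2.0 * (1.0 - probs[n, labels])).mean()

def mixed_robust_loss(logits, labels, alpha=1.0, beta=1.0, q=0.7):
    """First mixing stage: a weighted combination of NGCE and MAE."""
    probs = softmax(logits)
    return alpha * ngce_loss(probs, labels, q) + beta * mae_loss(probs, labels)

def mixup(x1, y1, x2, y2, lam):
    """Second mixing stage (sketch): convexly combine two samples and
    their one-hot targets, as in standard mixup augmentation."""
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2
```

In this sketch, a confidently correct prediction yields a near-zero mixed loss while a confidently wrong one is penalized, but with both terms bounded so a single noisy label cannot dominate the gradient; the mixup step would be applied after the Beta Mixture Model has partitioned samples into clean and noisy sets.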