Zeping MinJinfeng BaiChengfei Li
Semi-supervised learning algorithms that use pseudo-labeling have become increasingly popular for improving model performance by utilizing both labeled and unlabeled data. In this paper, we offer a fresh perspective on the selection of pseudo-labels, inspired by theoretical insights. We suggest that pseudo-labels with a high degree of local variance are more prone to inaccuracies. Based on this premise, we introduce the Local Variance Match (LVM) method, which aims to optimize the selection of pseudo-labels in semi-supervised learning (SSL) tasks. Our methodology is validated through a series of experiments on widely-used image classification datasets, such as CIFAR-10, CIFAR-100, and SVHN, spanning various labeled data quantity scenarios. The empirical findings show that the LVM method substantially outpaces current SSL techniques, achieving state-of-the-art results in many of these scenarios. For instance, we observed an error rate of 5.41% on CIFAR-10 with a single label for each class, 35.87% on CIFAR-100 when using four labels per class, and 1.94% on SVHN with four labels for each class. Notably, the standout error rate of 5.41% is less than 1% shy of the performance in a fully-supervised learning environment. In experiments on ImageNet with 100k labeled data, the LVM also reached state-of-the-art outcomes. Additionally, the efficacy of the LVM method is further validated by its stellar performance in speech recognition experiments.
Shuangshuang LiZhihui WeiJun ZhangLiang Xiao
Jiachen LiuHaoren KeJianfeng YangTianqi Yu
Gabriel Kirsten MenezesJosé Augusto Correa MartinsGilberto AstolfiEverton Castel�ão TetilaAdair JuniorDiogo Nunes GonçalvesJosé MarcatoJonatham SilvaJonathan LiWesley Nunes GonçalvesHemerson Pistori