JOURNAL ARTICLE

Semi-Supervised Learning with Multiple Imputations on Non-Random Missing Labels

Jason Lu, Zixi Xu, Huaze Xu, Michael Ma

Year: 2025 Journal: Applied and Computational Engineering Vol: 138 (1) Pages: 64-72

Abstract

Semi-supervised learning (SSL) trains algorithms on both labeled and unlabeled data. It is a very common machine learning setting because obtaining a fully labeled dataset is often unrealistic. Researchers have tackled three main missing-label settings: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). The MNAR problem is the most challenging of the three because one cannot safely assume that all class distributions are equal. Existing methods, including Class-Aware Imputation (CAI) and Class-Aware Propensity (CAP), mostly overlook the non-randomness in the unlabeled data. This paper proposes two new methods that combine multiple imputation models to achieve higher accuracy and lower bias. First, we use multiple imputation models, create confidence intervals, and apply a threshold to ignore pseudo-labels with low confidence. Second, our new method, SSL with De-biased Imputations (SSL-DI), aims to reduce bias by filtering out inaccurate data and finding a subset that is accurate and reliable. This subset of the larger dataset can then be input into another SSL model, which will be less biased. The proposed models are effective in both MCAR and MNAR settings, and experimental results show that our methodology outperforms existing methods in classification accuracy and bias reduction.
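The first idea in the abstract (several imputation models, a confidence interval per pseudo-label, and a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_pseudo_labels`, the normal-approximation interval, the `z` value, and the 0.8 threshold are all assumptions chosen for illustration, and the abstract does not say how the paper builds its intervals.

```python
import math
from statistics import mean, stdev


def filter_pseudo_labels(prob_sets, threshold=0.8, z=1.96):
    """Keep pseudo-labels whose confidence interval clears a threshold.

    prob_sets[m][i][c] is the probability that imputation model m assigns
    to class c for unlabeled sample i (hypothetical layout, at least two
    models are needed to estimate a spread).
    Returns {sample_index: pseudo_label} for the samples that are kept.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    kept = {}
    for i in range(n_samples):
        # Average each class probability over the imputation models.
        avg = [
            mean(prob_sets[m][i][c] for m in range(n_models))
            for c in range(n_classes)
        ]
        label = max(range(n_classes), key=avg.__getitem__)
        votes = [prob_sets[m][i][label] for m in range(n_models)]
        # Assumed normal-approximation interval around the mean probability.
        half_width = z * stdev(votes) / math.sqrt(n_models)
        lower_bound = avg[label] - half_width
        # Ignore the pseudo-label when even the lower bound is not confident.
        if lower_bound >= threshold:
            kept[i] = label
    return kept


# Three hypothetical imputation models scoring three unlabeled samples.
probs = [
    [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90]],
    [[0.90, 0.10], [0.40, 0.60], [0.15, 0.85]],
    [[0.92, 0.08], [0.55, 0.45], [0.05, 0.95]],
]
pseudo = filter_pseudo_labels(probs, threshold=0.8)
print(pseudo)
```

In this toy input the models disagree on the second sample, so its interval is wide and the pseudo-label is dropped, while the other two are kept. In the spirit of the second idea, the retained subset could then be combined with the labeled data to train a further SSL model, though the abstract does not specify how SSL-DI selects or uses that subset.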

Keywords:
Missing data; Artificial intelligence; Computer science; Machine learning; Mathematics; Pattern recognition (psychology); Statistics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.04

Topics

Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
