Missing categorical data imputation approach based on similarity

Sen Wu; Xiaodong Feng; Yushan Han; Qiang Wang

doi:10.1109/icsmc.2012.6378177

JOURNAL ARTICLE

Year: 2012 Pages: 2827-2832

Abstract

Imputing missing data is an important data mining task, because it can influence the mining result. In this paper, Missing Categorical Data Imputation Based on Similarity (MIBOS) is proposed to solve this problem. The algorithm defines a similarity model between objects with incomplete data, constructs the similarity matrix of the objects, and then obtains the nearest undifferentiated object set of each object to impute the missing data iteratively. During imputation, each imputed value is applied immediately, both within the same iteration and in the following iterations. Experiments with three UCI benchmark data sets show that the proposed algorithm improves complete rate, accuracy, and time efficiency.
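The abstract does not give the paper's similarity definition, so the following is only a minimal sketch of the general idea, not the authors' MIBOS implementation. It assumes that two objects are "undifferentiated" when no attribute known in both has conflicting values, and that similarity is the count of attributes on which both agree. The names `similarity`, `impute`, and `max_iters` are hypothetical.

```python
from collections import Counter
from typing import List, Optional

Row = List[Optional[str]]  # None marks a missing categorical value


def similarity(a: Row, b: Row) -> int:
    """Assumed measure: -1 if any attribute known in both rows conflicts,
    otherwise the number of attributes on which both rows agree."""
    score = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # a missing value never counts as a conflict
        if x != y:
            return -1
        score += 1
    return score


def impute(rows: List[Row], max_iters: int = 10) -> List[Row]:
    """Iteratively fill missing values from the most similar compatible rows."""
    data = [list(r) for r in rows]  # work on a copy; the input is not mutated
    for _ in range(max_iters):
        changed = False
        for i, row in enumerate(data):
            if None not in row:
                continue
            # Compare against the current state of the data, so values
            # imputed earlier in this same pass are already visible.
            sims = [(similarity(row, other), j)
                    for j, other in enumerate(data) if j != i]
            compatible = [(s, j) for s, j in sims if s >= 0]
            if not compatible:
                continue
            best = max(s for s, _ in compatible)
            nearest = [j for s, j in compatible if s == best]
            for k, value in enumerate(row):
                if value is not None:
                    continue
                candidates = [data[j][k] for j in nearest
                              if data[j][k] is not None]
                if candidates:
                    # Most frequent value among the nearest compatible rows.
                    row[k] = Counter(candidates).most_common(1)[0][0]
                    changed = True
        if not changed:
            break  # nothing more can be filled
    return data


example = [
    ["a", "x", "p"],
    ["a", None, "p"],
    ["b", "y", None],
    ["b", "y", "q"],
]
print(impute(example))
```

In this sketch a value stays missing when no compatible neighbour supplies one, which is why a measure such as complete rate is relevant alongside accuracy.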

Keywords:

Imputation (statistics); Categorical variable; Missing data; Data mining; Computer science; Similarity (geometry); Data modeling; Benchmark (surveying); Artificial intelligence; Pattern recognition (psychology); Machine learning

Metrics

FWCI (Field Weighted Citation Impact): 1.00

Citation Normalized Percentile: 0.77

Topics

Rough Sets and Fuzzy Logic

Physical Sciences → Computer Science → Computational Theory and Mathematics

Data Management and Algorithms

Physical Sciences → Computer Science → Signal Processing

Data Mining Algorithms and Applications

Physical Sciences → Computer Science → Information Systems

Related Documents

Latent class based multiple imputation approach for missing categorical data

Categorical missing data imputation approach via sparse representation

A nonparametric multiple imputation approach for missing categorical data

Missing Data Imputation for Categorical Variables

Machine Learning Based Missing Data Imputation in Categorical Datasets