Yu Xue, Yihang Tang, Xin Xu, Jiayu Liang, Ferrante Neri
Feature selection (FS) is an important research topic in machine learning. Usually, FS is modelled as a bi-objective optimization problem whose objectives are: 1) classification accuracy; 2) number of features. One of the main issues in real-world applications is missing data. Databases with missing data are likely to be unreliable. Thus, FS performed on a data set with missing values is also unreliable. To address this issue directly, we propose in this study a novel modelling of FS: we include reliability as the third objective of the problem. To solve the modified problem, we propose the application of the non-dominated sorting genetic algorithm-III (NSGA-III). We selected six incomplete data sets from the University of California Irvine (UCI) machine learning repository. We used the mean imputation method to deal with the missing data. In the experiments, k-nearest neighbors (k-NN) is used as the classifier to evaluate the feature subsets. Experimental results show that the proposed three-objective model coupled with NSGA-III efficiently addresses the FS problem for the six data sets included in this study.
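The three-objective evaluation of a single feature subset might be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how reliability is measured, so the sketch assumes a simple proxy (the fraction of missing entries among the selected features, to be minimized). The function names `mean_impute`, `knn_loo_error`, `evaluate`, and `dominates`, the leave-one-out protocol, and the default `k=3` are all hypothetical choices made for the example. NSGA-III itself is not implemented here; the sketch only shows the objective vector such an algorithm would rank.

```python
import math
from collections import Counter


def mean_impute(rows):
    """Replace None entries in each column with that column's observed mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def knn_loo_error(rows, labels, selected, k=3):
    """Leave-one-out k-NN error rate using only the selected columns."""
    errors = 0
    for i, query in enumerate(rows):
        dists = []
        for t, other in enumerate(rows):
            if t == i:
                continue
            d = math.sqrt(sum((query[j] - other[j]) ** 2 for j in selected))
            dists.append((d, t))
        dists.sort()
        votes = Counter(labels[t] for _, t in dists[:k])
        if votes.most_common(1)[0][0] != labels[i]:
            errors += 1
    return errors / len(rows)


def evaluate(mask, raw_rows, labels, k=3):
    """Return (error rate, feature ratio, missing rate); all are minimized.

    The third objective is an ASSUMED reliability proxy, not the paper's
    definition: the share of missing entries among the selected features.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        # Empty subset: treated as worst error and worst reliability.
        return (1.0, 0.0, 1.0)
    imputed = mean_impute(raw_rows)
    error = knn_loo_error(imputed, labels, selected, k)
    ratio = len(selected) / len(mask)
    missing = sum(1 for r in raw_rows for j in selected if r[j] is None)
    missing_rate = missing / (len(raw_rows) * len(selected))
    return (error, ratio, missing_rate)


def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

A binary mask such as `[1, 0, 0]` selects the first feature only; calling `evaluate(mask, raw_rows, labels)` on rows that use `None` for missing values yields the objective triple, and `dominates` compares two such triples in the way a non-dominated sorting step would.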
Cao Truong Tran, Jun Zhang, Peter Andreae, Bing Xue
Jiabin Lin, Qi Chen, Bing Xue, Mengjie Zhang
Mohammed Arif Khan, Asif Ekbal, Eneldo Loza Mencía, Johannes Fürnkranz
Pradip Dhal, Chandrashekhar Azad