JOURNAL ARTICLE

Multilabel Feature Selection Using Relief and Minimum Redundancy Maximum Relevance Based on Neighborhood Rough Sets

Miaomiao Huang, Lin Sun, Jiucheng Xu, Shiguang Zhang

Year: 2020; Journal: IEEE Access; Vol: 8; Pages: 62011-62031; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Multilabel classification has recently attracted increasing interest in machine learning and artificial intelligence. However, in most Relief methods, very large distances between samples can make the heterogeneous or similar samples abnormal. In addition, using the classification margin as the neighborhood radius, as some reduction algorithms do, may be meaningless when the margin is too large. To overcome these drawbacks, this paper presents a multilabel feature selection method that combines an improved Relief with minimum redundancy maximum relevance (MRMR) based on neighborhood rough sets. First, the numbers of heterogeneous and similar samples are introduced to improve the label weighting method, which eliminates the influence of large distances between samples. Combining the new label weighting with the distance between a sample and its nearest heterogeneous neighbor and the distance between the sample and its nearest similar neighbor yields a new feature weighting method. Second, the numbers of heterogeneous and similar samples are also used to improve the classification margin and thereby constrain the neighborhood radius; on this basis, a neighborhood approximation accuracy is constructed to effectively measure the uncertainty of samples in the boundary region and the completeness of knowledge. Third, by integrating the new neighborhood approximation accuracy, two types of mutual information are proposed, one between features and labels and one among features, and a mutual information-based MRMR model is then investigated to evaluate the significance of features. Finally, a multilabel feature selection algorithm is designed to improve classification performance on multilabel data. Experimental results on thirteen public datasets show that the developed algorithm selects significant features and achieves strong classification performance on multilabel datasets.
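For readers unfamiliar with the Relief family that the abstract builds on, the following is a minimal sketch of the classic single-label Relief feature weighting, in which each feature is rewarded for differing on the nearest heterogeneous sample (the "miss") and penalized for differing on the nearest similar sample (the "hit"). It is not the improved multilabel method of the paper: the function name `relief_weights`, the Manhattan distance, the min-max scaling, and the single-label setting are all assumptions made for illustration.

```python
import numpy as np


def relief_weights(X, y):
    """Classic single-label Relief weights (illustrative sketch only).

    Assumes at least two classes and at least two samples per class,
    so that every sample has both a nearest hit and a nearest miss.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - X.min(axis=0)) / span

    weights = np.zeros(n_features)
    for i in range(n_samples):
        diff = np.abs(Xs - Xs[i])      # per-feature difference to every sample
        dist = diff.sum(axis=1)        # Manhattan distance to every sample
        dist[i] = np.inf               # a sample is not its own neighbor
        same = y == y[i]

        hit = np.argmin(np.where(same, dist, np.inf))    # nearest similar sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest heterogeneous sample

        # Reward separation from the miss, penalize separation from the hit.
        weights += diff[miss] - diff[hit]

    return weights / n_samples
```

A feature that separates the classes well receives a larger weight than one that varies mostly within a class. In this baseline the weights depend directly on raw nearest-neighbor distances, which is the sensitivity to very large distances that the abstract says the improved label weighting is meant to remove.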

Keywords:
Weighting; Pattern recognition (psychology); Redundancy (engineering); Artificial intelligence; Computer science; Margin (machine learning); Feature selection; Data mining; k-nearest neighbors algorithm; Support vector machine; Relevance (law); Sample (material); Feature (linguistics); Mathematics; Machine learning

Metrics

Cited By: 25
FWCI (Field Weighted Citation Impact): 2.65
Refs: 51
Citation Normalized Percentile: 0.92

Topics

Rough Sets and Fuzzy Logic
Physical Sciences →  Computer Science →  Computational Theory and Mathematics
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Image Retrieval and Classification Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition