Multilabel Feature Selection Using Relief and Minimum Redundancy Maximum Relevance Based on Neighborhood Rough Sets

Miaomiao Huang; Lin Sun; Jiucheng Xu; Shiguang Zhang

doi:10.1109/access.2020.2982536

JOURNAL ARTICLE

Year: 2020; Journal: IEEE Access; Vol: 8; Pages: 62011-62031; Publisher: Institute of Electrical and Electronics Engineers

Abstract

Multilabel classification has recently attracted increasing interest in machine learning and artificial intelligence. However, in most Relief methods, very large distances between samples can easily make the heterogeneous or similar samples abnormal. In addition, the classification margin that some reduction algorithms use as a neighborhood radius may be meaningless when the margin is too large. To overcome these drawbacks, this paper presents a multilabel feature selection method using an improved Relief and minimum redundancy maximum relevance (MRMR) based on neighborhood rough sets. First, the numbers of heterogeneous and similar samples are introduced to improve the label weighting method, which eliminates the influence of large distances between samples. Combined with the new label weighting, the distance between a sample and its nearest-neighbor heterogeneous sample and the distance between the sample and its nearest-neighbor similar sample are used to develop a new feature weighting method. Second, the numbers of heterogeneous and similar samples are also used to improve the classification margin, thereby constraining the neighborhood radius. On this basis, the neighborhood approximation accuracy is constructed to measure the uncertainty of samples in the boundary region and the completeness of knowledge. Third, by integrating the new neighborhood approximation accuracy, two types of mutual information are proposed, one between features and labels and one among features, and the mutual information-based MRMR model is then used to evaluate the significance of features. Finally, a multilabel feature selection algorithm is designed to improve the classification performance on multilabel data. Experimental results on thirteen public datasets show that the developed algorithm selects significant features and achieves strong performance on multilabel datasets.
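For orientation, the nearest-neighbor idea behind Relief can be sketched in a few lines of Python. This is the textbook single-label Relief (nearest similar sample, or "hit", versus nearest heterogeneous sample, or "miss"), not the improved multilabel weighting proposed in the paper, whose formulas the abstract does not give. The function name relief_weights, the range-scaled Manhattan distance, and the toy data are all assumptions made for illustration.

```python
def relief_weights(X, y):
    """Textbook single-label Relief (illustrative; not the paper's variant).

    A feature gains weight when it differs on the nearest heterogeneous
    sample (miss) and loses weight when it differs on the nearest similar
    sample (hit).
    """
    n, d = len(X), len(X[0])

    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        # Assumed metric: sum of range-scaled per-feature differences.
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no similar or no heterogeneous sample available
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (diff(X[i], X[near_miss], j) - diff(X[i], X[near_hit], j)) / n
    return w


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
weights = relief_weights(X, y)
print(weights)  # feature 0 receives the larger weight
```

In this sketch every sample contributes its raw hit and miss distances, which is where the large-distance problem described in the abstract would enter; the paper's label weighting based on the numbers of heterogeneous and similar samples is meant to counter that effect.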

Keywords:

Weighting; Pattern recognition (psychology); Redundancy (engineering); Artificial intelligence; Computer science; Margin (machine learning); Feature selection; Data mining; k-nearest neighbors algorithm; Support vector machine; Relevance (law); Sample (material); Feature (linguistics); Mathematics; Machine learning

Metrics

FWCI (Field Weighted Citation Impact): 2.65

Citation Normalized Percentile: 0.92

Topics

Rough Sets and Fuzzy Logic

Physical Sciences → Computer Science → Computational Theory and Mathematics

Text and Document Classification Technologies

Physical Sciences → Computer Science → Artificial Intelligence

Image Retrieval and Classification Techniques

Physical Sciences → Computer Science → Computer Vision and Pattern Recognition

Related Documents

Feature Selection With Missing Labels Using Multilabel Fuzzy Neighborhood Rough Sets and Maximum Relevance Minimum Redundancy

Maximum relevance minimum redundancy-based feature selection using rough mutual information in adaptive neighborhood rough sets

Feature selection using Fisher score and multilabel neighborhood rough sets for multilabel classification

Hybrid Multilabel Feature Selection Using BPSO and Neighborhood Rough Sets for Multilabel Neighborhood Decision Systems

Maximum relevance, minimum redundancy band selection based on neighborhood rough set for hyperspectral data classification