JOURNAL ARTICLE

Feature Subset Selection for High-Dimensional, Low Sampling Size Data Classification Using Ensemble Feature Selection With a Wrapper-Based Search

Ashis Kumar Mandal, Md Nadim, Hasi Saha, Tangina Sultana, Md. Delowar Hossain, Eui-Nam Huh

Year: 2024  Journal: IEEE Access  Vol: 12  Pages: 62341-62357  Publisher: Institute of Electrical and Electronics Engineers

Abstract

The identification of suitable feature subsets from High-Dimensional Low-Sample-Size (HDLSS) data is of paramount importance because such data often contain numerous redundant and irrelevant features, leading to poor classification performance. However, selecting an optimal feature subset from a vast feature space poses a significant computational challenge. In the HDLSS domain, conventional feature selection methods often struggle to balance reducing the number of features against preserving high classification accuracy. To address these issues, this study introduces an effective framework that employs a filter- and wrapper-based strategy specifically designed for the classification challenges inherent in HDLSS data. The framework adopts a multi-step approach in which ensemble feature selection integrates five filter ranking approaches: Chi-square ( $\chi ^{2}$ ), Gini index (GI), F-score, Mutual Information (MI), and Symmetric Uncertainty (SU), to identify the top-ranking features. In the subsequent stage, a wrapper-based search is applied, using the Differential Evolution (DE) metaheuristic algorithm as the search strategy. The fitness of feature subsets during this search is assessed as a weighted combination of the error rate of a Support Vector Machine (SVM) classifier and the feature-cardinality ratio. The dimensionality-reduced datasets are then used to build classification models with SVM, K-Nearest Neighbors (KNN), and Logistic Regression (LR). The approach was evaluated on 13 HDLSS datasets to assess its efficacy in selecting appropriate feature subsets and improving Classification Accuracy (ACC) along with Area Under the Curve (AUC).
Results show that the proposed ensemble-with-wrapper approach selects a small number of features (between 2 and 9 across all datasets) while maintaining commendable average AUC and ACC (between 98% and 100%). The comparative analysis reveals that the proposed method surpasses both ensemble feature selection alone and non-feature-selection approaches in terms of feature reduction and ACC. When compared with various other state-of-the-art methods, the approach likewise demonstrates commendable performance.
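The two-stage pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: only three of the five filter rankers are shown (chi-square, F-score, mutual information; Gini index and symmetric uncertainty are omitted for brevity), the binarized DE mutation is a toy stand-in for the paper's DE search, and the fitness weight `alpha`, population size, generation count, and top-k cutoff are all assumed values.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy HDLSS-like data: few samples, many features.
X, y = make_classification(n_samples=60, n_features=200, n_informative=10,
                           random_state=0)
X = X - X.min(axis=0)  # shift to nonnegative so chi2 is applicable

# --- Stage 1: ensemble filter ranking (three of the paper's five filters).
# Each filter scores every feature; ranks are averaged across filters.
scores = [chi2(X, y)[0], f_classif(X, y)[0],
          mutual_info_classif(X, y, random_state=0)]
mean_rank = np.mean([rankdata(-s) for s in scores], axis=0)
top = np.argsort(mean_rank)[:30]   # keep the 30 best-ranked features (assumed cutoff)
Xr = X[:, top]

# --- Stage 2: wrapper fitness = weighted SVM error rate
# plus feature-cardinality ratio (alpha is an assumed weight).
def fitness(mask, alpha=0.9):
    if mask.sum() == 0:
        return 1.0  # penalize the empty subset
    acc = cross_val_score(SVC(), Xr[:, mask], y, cv=3).mean()
    return alpha * (1 - acc) + (1 - alpha) * mask.sum() / mask.size

# Minimal binarized DE-style loop: XOR-based mutation with uniform crossover.
rng = np.random.default_rng(0)
pop = rng.random((10, Xr.shape[1])) < 0.3  # 10 random binary feature masks
for _ in range(5):                          # 5 generations (assumed)
    for i in range(len(pop)):
        a, b, c = pop[rng.choice(len(pop), 3, replace=False)]
        trial = np.where(rng.random(Xr.shape[1]) < 0.5, a ^ (b ^ c), pop[i])
        if fitness(trial) <= fitness(pop[i]):  # greedy selection
            pop[i] = trial
best = min(pop, key=fitness)  # final mask over the 30 pre-filtered features
```

The selected mask `best` would then feed the final SVM/KNN/LR classifiers, as the abstract describes; the real DE operates on continuous vectors with a binarization step, which this sketch collapses into a direct XOR mutation for brevity.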

Keywords:
Feature selection, Computer science, Support vector machine, Artificial intelligence, Pattern recognition (psychology), Data mining, Classifier (UML), Feature (linguistics), Clustering high-dimensional data, Dimensionality reduction, Boosting (machine learning), Machine learning, Cluster analysis

Metrics

Cited By: 12
FWCI (Field Weighted Citation Impact): 7.67
References: 63
Citation Normalized Percentile: 0.96
Is in top 1%
Is in top 10%

Topics

Machine Learning and Data Classification
Physical Sciences →  Computer Science →  Artificial Intelligence
Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Metaheuristic Optimization Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Search space division method for wrapper feature selection on high-dimensional data classification

Abhilasha Chaudhuri

Journal:   Knowledge-Based Systems Year: 2024 Vol: 291 Pages: 111578-111578
JOURNAL ARTICLE

A Wrapper Feature Selection Based on Ensemble Learning Algorithm for High Dimensional Data

Universitas Muhammadiyah Surakarta, Indonesia

Journal: International Journal of Advanced Trends in Computer Science and Engineering  Year: 2019  Vol: 8 (6)  Pages: 2782-2787
JOURNAL ARTICLE

Feature Subset Selection for High Dimensional Data

Pavan Mallya P, C. K. Roopa

Journal:   International Journal of Engineering Research and Year: 2015 Vol: V4 (05)