JOURNAL ARTICLE

Feature selection using Joint Mutual Information Maximisation

Mohamed Bennasar, Yulia Hicks, Rossitza Setchi

Year: 2015 | Journal: Expert Systems with Applications | Vol: 42 (22) | Pages: 8520-8532 | Publisher: Elsevier BV

Abstract

Feature selection is used in many application areas relevant to expert and intelligent systems, such as data mining and machine learning, image processing, anomaly detection, bioinformatics and natural language processing. Feature selection based on information theory is a popular approach due to its computational efficiency, scalability in terms of the dataset dimensionality, and independence from the classifier. Common drawbacks of this approach are the lack of information about the interaction between the features and the classifier, and the selection of redundant and irrelevant features. The latter is due to the limitations of the employed goal functions, which lead to overestimation of the feature significance. To address this problem, this article introduces two new nonlinear feature selection methods, namely Joint Mutual Information Maximisation (JMIM) and Normalised Joint Mutual Information Maximisation (NJMIM); both methods use mutual information and the 'maximum of the minimum' criterion, which alleviates the problem of overestimation of the feature significance, as demonstrated both theoretically and experimentally. The proposed methods are compared with five competing methods on eleven publicly available datasets. The results demonstrate that the JMIM method outperforms the other methods on most tested public datasets, reducing the relative average classification error by almost 6% in comparison to the next best performing method. The statistical significance of the results is confirmed by the ANOVA test. Moreover, this method produces the best trade-off between accuracy and stability.
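The 'maximum of the minimum' criterion described in the abstract can be sketched in a few lines: at each greedy step, a candidate feature is scored by the smallest joint mutual information it achieves with any already-selected feature and the class, and the candidate with the largest such score is selected. This is an illustrative sketch based on that description, not the authors' implementation; the seeding with the single highest-MI feature and the discrete-feature assumption are this sketch's own simplifications.

```python
# Sketch of a JMIM-style greedy selection for discrete features.
# Assumes features are given as a dict {name: list of discrete values}.
from collections import Counter
import math

def mutual_information(xs, ys):
    """I(X; Y) in bits for two discrete sequences of equal length."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum((c / n) * math.log2((c * n) / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def joint_mi(f, s, c):
    """I(F, S; C): treat the pair (f, s) as a single joint variable."""
    return mutual_information(list(zip(f, s)), c)

def jmim(features, labels, k):
    """Select k features. Seed with the max-MI feature, then repeatedly
    pick the candidate maximising the minimum joint MI with the
    already-selected features (the 'maximum of the minimum' rule)."""
    remaining = set(features)
    first = max(remaining,
                key=lambda f: mutual_information(features[f], labels))
    selected = [first]
    remaining.remove(first)
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda f: min(joint_mi(features[f], features[s], labels)
                                     for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each candidate is scored by its worst pairing with the selected set, a feature that is redundant with any already-chosen feature scores low, which is how the criterion counters the overestimation of feature significance discussed above.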

Keywords:
Mutual information, Feature selection, Computer science, Artificial intelligence, Pattern recognition, Data mining, Curse of dimensionality, Dimensionality reduction, Classifier, Machine learning

Metrics

Cited By: 636
FWCI (Field Weighted Citation Impact): 22.75
Refs: 69
Citation Normalized Percentile: 1.00 (in top 1% and top 10%)

Topics

Face and Expression Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Gene expression and cancer classification
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Machine Learning and Data Classification
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

BOOK-CHAPTER

Feature Selection Using Mutual Information

Brian E. Boyle

Year: 1976 Pages: 287-297
JOURNAL ARTICLE

2DPCA Feature Selection Using Mutual Information

Parinya Sanguansat

Year: 2008 Vol: 7 Pages: 578-581
JOURNAL ARTICLE

Feature selection based on fuzzy joint mutual information maximization

Omar A. M. Salem, Feng Liu, Ahmed Sobhy, Wen Zhang, Xi Chen

Journal: Mathematical Biosciences & Engineering | Year: 2020 | Vol: 18 (1) | Pages: 305-327