JOURNAL ARTICLE

Feature selection for chemical compound extraction using wrapper approach with Naive Bayes classifier

Abstract

Biomedical entity extraction is the process of identifying biomedical entities such as disorders, viruses, proteins, and genes. One such entity is the chemical compound, which has attracted considerable research attention because it is challenging to extract. Most studies on chemical compound extraction have relied on supervised machine learning techniques, which learn a statistical model rather than depending on handcrafted rules. However, the performance of supervised machine learning depends heavily on the features used. Previous studies have used a wide range of features for extracting chemical compounds, so a feature selection step that determines the best combination of features has become imperative. This paper therefore applies a Naïve Bayes classifier in combination with the Wrapper Subset Selection approach to identify the best features. The results show that the proposed combination identifies the best feature combination, consisting of Capitalization, Punctuation, Prefix, and Part-of-Speech Tagging, which achieves an F-measure of 0.72. This result was compared with the state of the art and demonstrated competitive performance.
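The wrapper idea described in the abstract can be illustrated with a minimal sketch: each candidate feature subset is scored by training a Naive Bayes classifier on it and measuring the F-measure, and the subset with the best score is kept. Everything concrete below is an assumption made for illustration and is not taken from the paper: the greedy forward search, the single held-out split (wrapper methods more commonly use cross-validation), the feature names `cap`, `punct`, and `noise`, the `CHEM`/`O` labels, and the toy data.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, feats):
    """Fit a categorical Naive Bayes model restricted to the given features."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, label) -> counts per value
    values = defaultdict(set)            # feature -> values seen in training
    for row, y in zip(rows, labels):
        for f in feats:
            value_counts[(f, y)][row[f]] += 1
            values[f].add(row[f])
    return class_counts, value_counts, values

def predict_nb(model, row, feats):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for y, n in class_counts.items():
        lp = math.log(n / total)
        for f in feats:
            lp += math.log((value_counts[(f, y)][row[f]] + 1) / (n + len(values[f])))
        if lp > best_lp:
            best, best_lp = y, lp
    return best

def f_measure(gold, pred, positive):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def forward_select(candidates, score_fn):
    """Greedy forward search: add the best feature while the score improves."""
    selected, best = [], score_fn([])
    improved = True
    while improved:
        improved = False
        for f in candidates:
            if f in selected:
                continue
            s = score_fn(selected + [f])
            if s > best:
                best, pick, improved = s, f, True
        if improved:
            selected.append(pick)
    return selected, best

# Toy token data (hypothetical): the label depends only on "cap".
train_rows = [
    {"cap": 1, "punct": 1, "noise": 0}, {"cap": 1, "punct": 0, "noise": 1},
    {"cap": 0, "punct": 0, "noise": 0}, {"cap": 0, "punct": 1, "noise": 1},
    {"cap": 1, "punct": 1, "noise": 1}, {"cap": 0, "punct": 0, "noise": 1},
]
train_labels = ["CHEM", "CHEM", "O", "O", "CHEM", "O"]
test_rows = [
    {"cap": 1, "punct": 0, "noise": 0}, {"cap": 0, "punct": 1, "noise": 0},
    {"cap": 1, "punct": 1, "noise": 1}, {"cap": 0, "punct": 0, "noise": 1},
]
test_labels = ["CHEM", "O", "CHEM", "O"]

def score(feats):
    """Wrapper evaluation: train on the subset, score F-measure on held-out data."""
    model = train_nb(train_rows, train_labels, feats)
    pred = [predict_nb(model, r, feats) for r in test_rows]
    return f_measure(test_labels, pred, "CHEM")

selected, best = forward_select(["cap", "punct", "noise"], score)
print(selected, round(best, 2))
```

On this toy data the search keeps only `cap`, because adding `punct` or `noise` does not raise the held-out F-measure. The same loop would apply to a richer candidate set such as capitalization, punctuation, prefix, and part-of-speech features, with the classifier treated as a black box by the search.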

Keywords:
Computer science; Artificial intelligence; Naive Bayes classifier; Machine learning; Feature selection; Classifier (UML); Feature extraction; Data mining; Pattern recognition (psychology); Support vector machine

Metrics

Cited By: 9
FWCI (Field Weighted Citation Impact): 0.94
Refs: 10
Citation Normalized Percentile: 0.71

Topics

Biomedical Text Mining and Ontologies
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Machine Learning in Bioinformatics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence