JOURNAL ARTICLE

Imbalanced Protein Data Classification Using Ensemble FTM-SVM

Hongliang Dai

Year: 2015 Journal: IEEE Transactions on NanoBioscience Vol: 14 (4) Pages: 350-359 Publisher: Institute of Electrical and Electronics Engineers

Abstract

Classifying protein sequences into functional and structural families with machine learning methods is an active research topic in machine learning and bioinformatics. The underlying protein classification problem is a very large multiclass problem, which can generally be reduced to a set of binary classification problems: the proteins in one class are treated as positive examples, while those outside the class are treated as negative examples. This reduction introduces class imbalance, because the number of proteins in one class is usually much smaller than the number outside it. To handle this challenge, we propose a novel framework for classifying proteins. We first use free scores (FS) to extract features from the proteins; then inverse random under sampling (IRUS) is used to create a large number of distinct training sets; next, we use a new ensemble approach to combine these distinct training sets with a new fuzzy total margin support vector machine (FTM-SVM) that we have constructed. We call the resulting ensemble classifier the ensemble fuzzy total margin support vector machine (EnFTM-SVM). We then give a full description of our method, including the details of its derivation. Finally, experimental results on fourteen benchmark protein data sets indicate that the proposed method outperforms many state-of-the-art protein classification methods.
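The reduction and resampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that IRUS keeps every minority (positive) example and draws fewer majority examples than minority examples for each subset, and it combines base-classifier outputs by a plain majority vote as a stand-in for the paper's ensemble rule. The FS features and the FTM-SVM itself are not reproduced, and the names `one_vs_rest_labels`, `irus_subsets`, `majority_ratio`, and `majority_vote` are hypothetical.

```python
import random
from typing import List, Sequence


def one_vs_rest_labels(labels: Sequence[str], positive_class: str) -> List[int]:
    """Reduce multiclass labels to binary: +1 for the target family, -1 otherwise."""
    return [1 if y == positive_class else -1 for y in labels]


def irus_subsets(
    binary_labels: Sequence[int],
    n_subsets: int,
    majority_ratio: float = 0.5,  # assumption: majority draw size relative to minority size
    seed: int = 0,
) -> List[List[int]]:
    """Build training subsets (as index lists) in the spirit of inverse random under sampling.

    Assumption: each subset keeps all minority (+1) examples and a random draw of
    majority (-1) examples that is smaller than the minority set, so the imbalance
    is inverted inside every subset.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(binary_labels) if y == 1]
    majority = [i for i, y in enumerate(binary_labels) if y == -1]
    # At least one majority example per subset, but never more than are available.
    n_majority = min(len(majority), max(1, int(len(minority) * majority_ratio)))
    return [minority + rng.sample(majority, n_majority) for _ in range(n_subsets)]


def majority_vote(votes: Sequence[int]) -> int:
    """Combine +1/-1 predictions from the base classifiers; ties go to +1."""
    return 1 if sum(votes) >= 0 else -1


if __name__ == "__main__":
    families = ["a"] * 4 + ["b"] * 10 + ["c"] * 6
    y = one_vs_rest_labels(families, positive_class="a")
    for subset in irus_subsets(y, n_subsets=3):
        n_pos = sum(1 for i in subset if y[i] == 1)
        print(f"subset size={len(subset)}, positives={n_pos}")
```

In a fuller pipeline, one base classifier would be trained on each index list and their predictions passed to the combining step. Any off-the-shelf SVM used there would be a substitute for the FTM-SVM, not an equivalent of it.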

Keywords:
Support vector machine; Artificial intelligence; Computer science; Machine learning; Multiclass classification; Classifier (UML); Pattern recognition (psychology); Binary classification; Margin (machine learning); Structured support vector machine; Ensemble learning; Benchmark (surveying); Feature vector; One-class classification; Data mining

Metrics

Cited By: 19
FWCI (Field Weighted Citation Impact): 1.89
Refs: 63
Citation Normalized Percentile: 0.92

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Machine Learning in Bioinformatics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

SVM ensemble training for imbalanced data classification using multi-objective optimization techniques

Joanna Grzyb, Michał Woźniak

Journal: Applied Intelligence Year: 2022 Vol: 53 (12) Pages: 15424-15441

BOOK-CHAPTER

Imbalanced Data Classification Using Weighted Voting Ensemble

Lin Lü, Michał Woźniak

Advances in Intelligent Systems and Computing Year: 2019 Pages: 82-91

JOURNAL ARTICLE

Improved classification for imbalanced data using ensemble clustering

Sharanjit Kaur, Manju Bhardwaj, Ayesha Maqsood, Akhilendra Kumar Maurya, Mayank Kumar, Nishant Singh

Journal: TELKOMNIKA (Telecommunication Computing Electronics and Control) Year: 2025 Vol: 23 (5) Pages: 1323-1323