JOURNAL ARTICLE

An Ensemble Learning Imbalanced Data Classification Method Based on Sample Combination Optimization

Yuxin Wang

Year: 2019   Journal: Journal of Physics Conference Series   Vol: 1284 (1)   Pages: 012035-012035   Publisher: IOP Publishing

Abstract

Imbalanced data classification has been one of the hot topics in data mining and machine learning in recent years. In practice, imbalanced data are very common, for example in cancer detection, spam filtering, and credit card fraud detection. Because the classes differ greatly in size and are unevenly distributed, traditional classification algorithms perform poorly on minority classes, even though correctly identifying minority classes often brings greater value. Effectively identifying minority classes in imbalanced data is therefore of great practical significance. Existing Bagging-based imbalanced data classification methods tend to lose information or add redundant and noisy samples, and they cannot guarantee the validity and existence of classification boundaries after sampling. To address these problems, an ensemble learning method based on sample combination optimization, GABagging, is proposed. First, the sample combination optimization algorithm uses a genetic algorithm to select a subset of the majority class and combines it with the minority class to construct a new data set. Subsequently, several runs of the sample combination optimization algorithm are used to train several classifiers, which are then integrated. Experimental results on 19 imbalanced datasets show that, compared with other similar methods, GABagging improves the correct recognition of minority classes, as measured by TPR and AUC, without excessive loss of recognition ability on majority classes. The results indicate that GABagging can compensate for the shortcomings of related Bagging-based methods, such as easily losing information, adding samples, and not guaranteeing the validity and existence of classification boundaries after sampling.
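The two steps described in the abstract (a genetic search for a majority-class subset, then training and combining several classifiers) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the fitness function, the genetic operators, the subset size, or the base classifier, so the balanced-accuracy fitness, the union-based crossover, the single-index mutation, the subset size equal to the minority class, the nearest-centroid learner, and all function names here (`ga_select`, `ga_bagging`, `vote`) are assumptions chosen to keep the example small and self-contained.

```python
import random

def centroid(rows):
    """Mean vector of a list of equal-length feature tuples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroids(majority, minority):
    """Toy base learner (assumption): one centroid per class."""
    return centroid(majority), centroid(minority)

def predict(model, x):
    """Return 1 (minority) if x is closer to the minority centroid, else 0."""
    c0, c1 = model
    return 1 if sq_dist(x, c1) < sq_dist(x, c0) else 0

def balanced_accuracy(model, majority, minority):
    """Mean of TPR (minority recall) and TNR (majority recall)."""
    tpr = sum(predict(model, x) == 1 for x in minority) / len(minority)
    tnr = sum(predict(model, x) == 0 for x in majority) / len(majority)
    return (tpr + tnr) / 2

def ga_select(majority, minority, rng, pop_size=12, generations=15, mut_rate=0.2):
    """Genetic search for majority indices; subset size = minority size (assumption)."""
    k = len(minority)
    idx = range(len(majority))

    def fitness(subset):
        # Assumed fitness: balanced accuracy of a model trained on the candidate subset.
        model = fit_centroids([majority[i] for i in subset], minority)
        return balanced_accuracy(model, majority, minority)

    population = [rng.sample(idx, k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]          # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = sorted(set(a) | set(b))             # crossover: draw from both parents
            child = rng.sample(pool, k)
            if rng.random() < mut_rate:                # mutation: swap in an unused index
                unused = [i for i in idx if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

def ga_bagging(majority, minority, n_estimators=5, seed=0):
    """Train one base learner per GA-selected subset combined with the minority class."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        subset = ga_select(majority, minority, rng)
        models.append(fit_centroids([majority[i] for i in subset], minority))
    return models

def vote(models, x):
    """Combine the ensemble by simple majority vote."""
    return 1 if sum(predict(m, x) for m in models) * 2 > len(models) else 0

# Synthetic demo data (not from the paper): 60 majority points, 8 minority points.
demo_rng = random.Random(42)
majority = [(demo_rng.gauss(0, 1), demo_rng.gauss(0, 1)) for _ in range(60)]
minority = [(demo_rng.gauss(4, 1), demo_rng.gauss(4, 1)) for _ in range(8)]
models = ga_bagging(majority, minority)
print(vote(models, (4.0, 4.0)), vote(models, (0.0, 0.0)))
```

Because every minority sample is kept and only the majority class is subsampled, each base learner sees a balanced training set, and the search rather than random draws decides which majority samples are used. A real evaluation would score the ensemble on held-out data with TPR and AUC, as the abstract reports.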

Keywords:
Machine learning, Artificial intelligence, Computer science, Sample (material), Ensemble learning, Identification (biology), Data mining, Data classification, Oversampling, Statistical classification, Sampling (signal processing), Set (abstract data type), Pattern recognition (psychology)

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.31
Refs: 3
Citation Normalized Percentile: 0.66

Topics

Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Electricity Theft Detection Techniques
Physical Sciences →  Engineering →  Electrical and Electronic Engineering
Artificial Intelligence in Healthcare
Health Sciences →  Health Professions →  Health Information Management

Related Documents

BOOK-CHAPTER

Imbalanced Data Classification Method Based on Ensemble Learning

Yu Xiang, Yongping Xie

Lecture notes in electrical engineering   Year: 2019   Pages: 18-24

JOURNAL ARTICLE

Spark-based Ensemble Learning for Imbalanced Data Classification

Jiaman Ding

Journal: International Journal of Performability Engineering   Year: 2018

JOURNAL ARTICLE

Classification Method for Imbalanced Data using Ensemble Learning System

Sunil Chandolu, S. Prasad Babu Vagolu

Journal: International Journal of Innovative Technology and Exploring Engineering   Year: 2020   Vol: 9 (4)   Pages: 1845-1848

BOOK-CHAPTER

Ensemble Classification Method for Imbalanced Data Using Deep Learning

Yoon Sang Lee

Lecture notes in business information processing Year: 2019 Pages: 162-170