JOURNAL ARTICLE

Dynamic Ensemble Selection for Imbalanced Data Streams With Concept Drift

Botao Jiao, Yinan Guo, Dunwei Gong, Qiuju Chen

Year: 2022   Journal: IEEE Transactions on Neural Networks and Learning Systems   Vol: 35 (1)   Pages: 1278-1291   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Ensemble learning, a popular approach to handling concept drift in data streams, combines base classifiers according to their global performance. However, concept drift generally occurs in a local region of the data space, so the performance of a base classifier can differ significantly from one location to another. Thus, using global performance as the criterion for selecting base classifiers is inappropriate. Moreover, data streams are often accompanied by class imbalance, which degrades the classification accuracy of ensemble learning on minority instances. To address these problems, a dynamic ensemble selection method for imbalanced data streams with concept drift (DES-ICD) is proposed. For data arriving chunk by chunk, a novel synthetic minority oversampling technique with adaptive nearest neighbors (AnnSMOTE) is developed to generate new minority instances that conform to the new concept. DES-ICD then builds a base classifier on each newly arrived data chunk after balancing it with AnnSMOTE, and merges it with the historical base classifiers to form a candidate classifier pool. For each query instance, the optimal combination is constructed according to the performance of the candidate classifiers in its neighborhood. Experimental results on nine synthetic and five real-world datasets show that the proposed method outperforms seven comparative methods in classification accuracy and tracks new concepts in an imbalanced data stream more precisely.
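The idea of selecting classifiers by their performance in the neighborhood of a query instance can be illustrated with a minimal sketch. This is not the DES-ICD algorithm from the article: it is a generic local-accuracy selection rule, and the names ThresholdClassifier, local_accuracy, and select_and_predict, the toy validation data, and the parameters k and keep are all assumptions made for illustration. AnnSMOTE and the chunk-by-chunk pool update are not modeled.

```python
import numpy as np


class ThresholdClassifier:
    """Hypothetical stand-in for a base classifier: predicts 1 when x[feature] > threshold."""

    def __init__(self, feature, threshold):
        self.feature = feature
        self.threshold = threshold

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return (X[:, self.feature] > self.threshold).astype(int)


def local_accuracy(clf, X_val, y_val, query, k):
    """Accuracy of clf on the k labeled instances closest to the query."""
    dists = np.linalg.norm(X_val - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return float(np.mean(clf.predict(X_val[nearest]) == y_val[nearest]))


def select_and_predict(pool, X_val, y_val, query, k=3, keep=1):
    """Keep the `keep` locally best classifiers and combine them by weighted vote.

    Assumes binary labels {0, 1}; a tie (including all-zero weights) yields 0.
    """
    query = np.asarray(query, dtype=float)
    scores = np.array([local_accuracy(c, X_val, y_val, query, k) for c in pool])
    chosen = np.argsort(-scores, kind="stable")[:keep]
    votes = np.array([pool[i].predict(query[None, :])[0] for i in chosen])
    weights = scores[chosen]
    label = int(weights[votes == 1].sum() > weights[votes == 0].sum())
    return label, chosen.tolist()


# Toy labeled data with a different concept on each side of x0 = 0 (assumed):
# left region: label = x1 > 0; right region: label = x1 > 2.
X_val = np.array([
    [-2.0, -1.0], [-2.0, 1.0], [-2.5, 0.5], [-1.5, 1.5], [-2.0, 3.0], [-2.2, -0.5],
    [2.0, 1.0], [2.0, 3.0], [2.5, 0.5], [1.5, 1.5], [2.0, -1.0], [2.2, 2.5],
])
y_val = np.array([0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1])

# Candidate pool: one classifier per concept, e.g. trained on different chunks.
pool = [ThresholdClassifier(1, 0.0), ThresholdClassifier(1, 2.0)]

left_result = select_and_predict(pool, X_val, y_val, [-2.0, 1.2])
right_result = select_and_predict(pool, X_val, y_val, [2.0, 1.2])
print(left_result, right_result)
```

Both queries share the same second feature value, yet each one is routed to the classifier that is accurate in its own neighborhood. A selection rule based on global accuracy would rank the two classifiers the same way for every query.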

Keywords:
Concept drift; Classifier (UML); Oversampling; Computer science; Data stream; Random subspace method; Data stream mining; Artificial intelligence; Data mining; Ensemble learning; Machine learning; Selection (genetic algorithm); Pattern recognition (psychology); Bandwidth (computing)

Metrics

Cited By: 77
FWCI (Field Weighted Citation Impact): 14.88
Refs: 52
Citation Normalized Percentile: 0.99
Is in top 1%
Is in top 10%

Topics

Data Stream Mining Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Imbalanced Data Classification Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Machine Learning and Data Classification
Physical Sciences →  Computer Science →  Artificial Intelligence