JOURNAL ARTICLE

High-Dimensional Multi-Label Data Stream Classification With Concept Drifting Detection

Peipei Li, Haixiang Zhang, Xuegang Hu, Xindong Wu

Year: 2022 Journal: IEEE Transactions on Knowledge and Data Engineering Pages: 1-15 Publisher: IEEE Computer Society

Abstract

Multi-label data streams, such as Web texts and images, have become common on the Web. These data are characterized by multiple labels, high dimensionality, high volume, high velocity and, in particular, concept drift. Multi-label data stream classification is therefore a challenging and significant task, especially when handling high-dimensional data with concept drifts. However, this challenge has received little attention from the research community. We therefore propose a max-relevance and min-redundancy based algorithm adaptation approach for efficient and effective classification of multi-label data streams with high-dimensional attributes and concept drifts. To reduce the impact of high-dimensional data with noisy attributes, we first refine the minimal-redundancy-maximal-relevance criterion based on mutual information to select qualified features in multi-label data streams. Second, we propose a data distribution based concept drift detection approach to identify concept drifts hidden in data streams. Finally, we build an incremental ensemble classification model for efficiently classifying multi-label data streams. Extensive studies show that our approach obtains optimal feature subsets while maintaining good multi-label classification performance, in comparisons with several state-of-the-art multi-label feature selection algorithms, using two efficient multi-label classification methods as base classifiers. Our approach is also superior to three well-known multi-label data stream classification approaches in both effectiveness and efficiency. Source code and data sets are available at https://github.com/peipeilihfut/MLStreamClassification
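
The abstract names a minimal-redundancy-maximal-relevance (mRMR) criterion based on mutual information but does not spell out the refined version the authors use. The sketch below is therefore only a generic, batch-mode illustration of greedy mRMR on discrete data, not the paper's method. Averaging the relevance over label columns is an assumption made here to handle multiple labels, and the names `mutual_information` and `mrmr_select` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, labels, k):
    """Greedily pick k feature indices by relevance minus redundancy.

    features, labels: lists of columns, each a list of discrete values.
    Assumption: relevance is the mean MI between a feature and each label.
    """
    relevance = [
        sum(mutual_information(f, lab) for lab in labels) / len(labels)
        for f in features
    ]
    selected = []
    remaining = list(range(len(features)))

    def score(j):
        if not selected:
            return relevance[j]
        # Redundancy: mean MI between candidate j and already chosen features
        redundancy = sum(
            mutual_information(features[j], features[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 duplicates feature 0, feature 3 is unrelated to both labels
f0 = [0, 0, 1, 1, 0, 0, 1, 1]
f1 = list(f0)
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
f3 = [0, 0, 0, 0, 1, 1, 1, 1]
features = [f0, f1, f2, f3]
labels = [list(f0), list(f2)]  # label 0 follows f0, label 1 follows f2

selected = mrmr_select(features, labels, k=2)
print(selected)
```

In this toy case the redundancy term keeps the duplicated feature from being chosen alongside its copy. A streaming variant would have to update the counts incrementally, which this sketch does not attempt.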

Keywords:
Computer science, Concept drift, Data stream mining, Redundancy (engineering), Data mining, Data stream, Multi-label classification, Relevance (law), Feature selection, Curse of dimensionality, Data redundancy, Pattern recognition (psychology), Artificial intelligence, Machine learning, Database

Metrics

Cited By: 14
FWCI (Field Weighted Citation Impact): 2.74
Refs: 44
Citation Normalized Percentile: 0.88

Topics

Data Stream Mining Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Spam and Phishing Detection
Physical Sciences →  Computer Science →  Information Systems

Related Documents

BOOK-CHAPTER

Subspace Detection on Concept Drifting Data Stream

Lin Feng, Shenglan Liu, Yao Xiao, Jing Wang

Proceedings in adaptation, learning and optimization Year: 2014 Pages: 51-59

JOURNAL ARTICLE

Concept drift detection with False Positive rate for multi-label classification in IoT data stream

Pingfan Wang, Nanlin Jin, Gerhard Fehringer

Journal: 2020 International Conference on UK-China Emerging Technologies (UCET) Year: 2020 Vol: 11 Pages: 1-4

JOURNAL ARTICLE

Adaptive Multi-label Classification on Drifting Data Streams

Roseberry, Martha

Journal: VCU Scholars Compass (Virginia Commonwealth University) Year: 2024