Extreme class imbalance is a pervasive challenge in machine learning: a heavily skewed class distribution degrades the performance of standard classifiers, particularly on the minority class. The problem is common in critical domains such as fraud detection, medical diagnosis, and anomaly detection, where it yields models with high accuracy but poor recall and F1-score on the rare, yet important, events. Traditional resampling techniques, while widely used, often rely on heuristic rules or static parameters and so fail to adapt to the diverse characteristics of different datasets. This paper introduces Meta-Resampling, a framework that learns adaptive resampling strategies for extreme class imbalance. Unlike conventional approaches, Meta-Resampling employs a meta-learner that selects and parameterizes a resampling technique based on an analysis of the characteristics of the dataset at hand. This learned adaptability avoids the limitations of fixed strategies and leads to models that are more robust and generalize better. We detail the architecture of the meta-learner, its training methodology using a diverse set of imbalanced meta-datasets, and its integration with base classifiers. In extensive experiments on synthetic and real-world imbalanced datasets, Meta-Resampling achieves higher F1-score, AUC-ROC, and G-mean than existing state-of-the-art resampling and ensemble methods. The findings underscore the efficacy of an adaptive, data-driven approach to class imbalance and point toward more context-aware solutions for challenging classification tasks.
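To make the "high accuracy but poor recall" failure mode concrete, the following minimal sketch scores a trivial classifier that always predicts the majority class. It is an illustration only and not part of the Meta-Resampling framework: the 990-to-10 class split, the function names, and the toy predictions are assumptions chosen for the example. The metrics use their standard definitions, with G-mean taken as the geometric mean of recall and specificity.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, recall, F1, and G-mean; undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)
    return {"accuracy": accuracy, "recall": recall, "f1": f1, "g_mean": g_mean}


# Hypothetical toy data: 990 majority samples (0) and 10 minority samples (1).
y_true = [0] * 990 + [1] * 10

# A degenerate classifier that always predicts the majority class.
y_pred_majority = [0] * 1000
metrics = imbalance_metrics(y_true, y_pred_majority)
print(metrics)  # accuracy is 0.99, while recall, F1, and G-mean are all 0.0
```

Accuracy rewards the degenerate classifier because 99% of the samples belong to the majority class, while recall, F1-score, and G-mean all drop to zero because no minority sample is ever detected. This is why the evaluation relies on minority-sensitive metrics rather than accuracy.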
Christopher J. Frederickson, Robi Polikar
Kleanthis Malialis, Christos G. Panayiotou, Marios M. Polycarpou
Shuo Wang, Leandro L. Minku, Xin Yao
Vicente García, J. Salvador Sánchez, Ramón A. Mollineda