One of the more challenging real-world problems in computational intelligence is learning from non-stationary streaming data, a phenomenon known as concept drift. Perhaps an even more challenging version of this scenario is when, following a small set of initial labeled data, the data stream consists of unlabeled data only. Such a scenario is typically referred to as learning in an initially labeled non-stationary environment, or simply as extreme verification latency (EVL). In our prior work, we described a framework called COMPOSE (COMPacted Object Sample Extraction) that works well in this type of environment, provided that the data distributions experience limited drift. The central premise behind COMPOSE is core support extraction, in which α-shapes or density estimation is used to extract the most representative instances, the core supports, which typically lie in the center of the feature space for each class, to be used as labeled data in future time steps. This process, however, is computationally very expensive, especially for high-dimensional data. In this paper, we describe a modification to COMPOSE that allows the algorithm to work without core support extraction. We call the new algorithm FAST COMPOSE. Several datasets are used to compare the performance of FAST COMPOSE with the original COMPOSE, as well as with SCARGC (another algorithm that can address EVL), both in accuracy and in execution time. The results obtained show the promising potential of FAST COMPOSE.
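To illustrate the density-estimation variant of core support extraction described above, the following is a minimal sketch, not COMPOSE's actual implementation: the function name, kernel choice, and parameters (`keep_frac`, `bandwidth`) are our own assumptions. It scores each instance of a class by a Gaussian-kernel density estimate and keeps the densest fraction, which for a unimodal class will lie near the center of the feature space.

```python
import numpy as np

def core_supports(X, keep_frac=0.5, bandwidth=1.0):
    """Illustrative density-based core support extraction (hypothetical
    simplification, not the COMPOSE reference implementation).

    Scores each instance by a Gaussian-kernel density estimate over the
    class sample and keeps the densest `keep_frac` fraction of instances.
    """
    # Pairwise squared Euclidean distances between all instances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Kernel density score: sum of Gaussian kernels centered on each point.
    density = np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)
    k = max(1, int(keep_frac * len(X)))
    # Indices of the k densest instances; these are the "core supports".
    keep = np.argsort(density)[-k:]
    return X[keep]

# Toy class sample: a single Gaussian cluster around the origin.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
cs = core_supports(X, keep_frac=0.25)
# Core supports sit nearer the class center than the sample as a whole.
print(cs.shape)
print(np.linalg.norm(cs, axis=1).mean() < np.linalg.norm(X, axis=1).mean())
```

The O(n²) pairwise-distance computation in this sketch is precisely the kind of cost, growing with dimensionality and sample size, that motivates FAST COMPOSE's removal of the core support extraction step.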
Muhammad Umer, Robi Polikar, Christopher J. Frederickson