JOURNAL ARTICLE

Anomaly Detection via Online Oversampling Principal Component Analysis

Yuh-Jye Lee, Yi-Ren Yeh, Yu-Chiang Frank Wang

Year: 2013
Journal: IEEE Transactions on Knowledge and Data Engineering
Vol: 25 (7)
Pages: 1460-1470
Publisher: IEEE Computer Society

Abstract

Anomaly detection has been an important research topic in data mining and machine learning. Many real-world applications, such as intrusion or credit card fraud detection, require an effective and efficient framework to identify deviated data instances. However, most anomaly detection methods are implemented in batch mode, and thus cannot be easily extended to large-scale problems without incurring heavy computation and memory costs. In this paper, we propose an online oversampling principal component analysis (osPCA) algorithm to address this problem, and we aim at detecting the presence of outliers in a large amount of data via an online updating technique. Unlike prior principal component analysis (PCA)-based approaches, we do not store the entire data matrix or covariance matrix, and thus our approach is of particular interest for online or large-scale problems. By oversampling the target instance and extracting the principal direction of the data, the proposed osPCA allows us to determine how anomalous the target instance is according to the variation of the resulting dominant eigenvector. Since our osPCA need not perform eigen analysis explicitly, the proposed framework is well suited to online applications that have computation or memory limitations. Compared with the well-known power method for PCA and other popular anomaly detection algorithms, our experimental results verify the feasibility of our proposed method in terms of both accuracy and efficiency.
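
The oversampling idea in the abstract can be illustrated with a small batch-mode sketch. This is not the authors' online algorithm: it recomputes the covariance matrix explicitly, which the abstract says osPCA avoids, and it uses the power method only as a convenient way to obtain the dominant eigenvector. The function names, the `ratio` parameter, and the score definition (one minus the absolute cosine similarity between the principal directions with and without the oversampled target) are assumptions made for illustration.

```python
import numpy as np


def dominant_direction(cov, iters=500, tol=1e-12):
    """Leading eigenvector of a symmetric PSD matrix via power iteration."""
    rng = np.random.default_rng(0)
    v = rng.standard_normal(cov.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = cov @ v
        norm = np.linalg.norm(w)
        if norm == 0.0:  # degenerate case: zero covariance
            break
        w /= norm
        converged = np.linalg.norm(w - v) < tol
        v = w
        if converged:
            break
    return v


def covariance(X):
    """Covariance matrix of the rows of X (mean-centered, divided by n)."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]


def oversampling_score(X, target, ratio=0.1):
    """Hypothetical anomaly score: how much the principal direction
    rotates when `target` is duplicated ratio * n times (assumed scheme)."""
    n = X.shape[0]
    n_dup = max(1, int(round(ratio * n)))
    u = dominant_direction(covariance(X))
    # Oversample the target by appending n_dup copies of it to the data.
    X_over = np.vstack([X, np.tile(target, (n_dup, 1))])
    u_over = dominant_direction(covariance(X_over))
    # Eigenvectors are sign-ambiguous, so compare via |cosine similarity|.
    return 1.0 - abs(float(u @ u_over))


# Synthetic data stretched along the x-axis (illustrative values only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2)) * np.array([5.0, 0.5])

score_inlier = oversampling_score(X, np.array([4.0, 0.1]))
score_outlier = oversampling_score(X, np.array([0.0, 30.0]))
print(score_inlier, score_outlier)
```

In this toy setting the point far off the main axis rotates the principal direction much more than the point lying along it, so it receives the larger score. An online variant would update the direction incrementally instead of rebuilding the covariance matrix for each target.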

Keywords:
Oversampling; Anomaly detection; Principal component analysis; Computer science; Data mining; Outlier; Robust principal component analysis; Covariance matrix; Computation; Intrusion detection system; Anomaly (physics); Pattern recognition (psychology); Artificial intelligence; Algorithm

Metrics

Cited By: 221
FWCI (Field Weighted Citation Impact): 27.82
Refs: 36
Citation Normalized Percentile: 1.00
Is in top 1%
Is in top 10%

Topics

Anomaly Detection Techniques and Applications
Physical Sciences → Computer Science → Artificial Intelligence
Network Security and Intrusion Detection
Physical Sciences → Computer Science → Computer Networks and Communications
Imbalanced Data Classification Techniques
Physical Sciences → Computer Science → Artificial Intelligence