Qiushi Wang, Kee Jin Lee, Jihoon Hong
Imbalanced datasets are common in process monitoring, where data reflecting abnormal events such as machine failures are scarcer than data reflecting normal events. The former is called the minority class and the latter is referred to as the majority class. Classical machine learning algorithms still struggle with this problem. To improve classification accuracy, oversampling techniques rebalance the dataset by supplying the minority class with synthetic samples. However, because the latent sample spaces of both classes are broad, the majority class might be under-represented as well. In this paper, we propose a dual oversampling strategy (DOSS) to generate samples for both classes. For the majority class, synthetic samples are generated according to the data distribution, which is approximated by a conditional Generative Adversarial Network (cGAN). For the minority class, the Synthetic Minority Over-sampling Technique (SMOTE) is applied as the oversampling method. The proposed strategy is compared with alternatives in which either only the minority class is oversampled or both classes are oversampled with different strategies. Recall, G-mean and F-measure are used as the metrics. Experimental results on 12 benchmark datasets show improved performance for the proposed strategy. DOSS is further applied to detect the faulty stages of an injection moulding machine, where it achieves better prediction accuracy.
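As a rough illustration of two ingredients named in the abstract, the sketch below shows SMOTE-style interpolation for the minority class and the three reported metrics. It is not the authors' implementation: the function names `smote_like_oversample` and `binary_metrics` are hypothetical, the cGAN step for the majority class is omitted, and the metric definitions are the commonly used ones (G-mean as the square root of recall times specificity, F-measure as F1), which the abstract itself does not spell out.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style).
    Assumes at least two minority samples, given as tuples of floats."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def binary_metrics(y_true, y_pred, positive=1):
    """Return (recall, G-mean, F-measure) for the positive (minority) class.
    Assumed definitions: G-mean = sqrt(recall * specificity), F-measure = F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return recall, g_mean, f_measure
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the convex hull of the minority class. G-mean is included alongside recall because it penalises a classifier that gains minority recall only by misclassifying the majority class.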
Hien M. Nguyen, Eric W. Cooper, Katsuari Kamei
LIU Zhihan, ZHANG Zhonglin, ZHAO Lei
Zhen Zhang, Hongpeng Tian, Jin-shuai Jin