Visual anomaly detection, an important problem in computer vision, is usually formulated as a one-class classification and segmentation task. The student-teacher (S-T) framework has proved effective in solving this challenge. However, previous works based on S-T only empirically applied constraints on normal data and fused multi-level information. In this study, we propose an improved model called DeSTSeg, which integrates a pre-trained teacher network, a denoising student encoder-decoder, and a segmentation network into one framework. First, to strengthen the constraints on anomalous data, we introduce a denoising procedure that allows the student network to learn more robust representations. From synthetically corrupted normal images, we train the student network to match the teacher network features of the same images without corruption. Second, to fuse the multi-level S-T features adaptively, we train a segmentation network with rich supervision from synthetic anomaly masks, achieving a substantial performance improvement. Experiments on the industrial inspection benchmark dataset demonstrate that our method achieves state-of-the-art performance: 98.6% on image-level AUC, 75.8% on pixel-level average precision, and 76.4% on instance-level average precision.
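The denoising objective described above can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine-distance loss, the mask-based corruption, and all names (`corrupt`, `denoising_loss`, `toy_net`) are assumptions made for illustration, and the toy "networks" are plain functions standing in for real feature extractors. The one point it aims to show is that the teacher sees the clean image while the student sees the corrupted one.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)

def corrupt(image, mask, noise):
    """Hypothetical corruption: replace pixels with noise where mask is 1."""
    return [n if m else p for p, m, n in zip(image, mask, noise)]

def denoising_loss(teacher, student, clean, corrupted):
    """Mean per-location distance between teacher(clean) and student(corrupted).

    The teacher is fed the clean image and the student the corrupted one,
    so minimizing this loss pushes the student to undo the corruption in
    feature space. The cosine distance is an assumed choice of loss.
    """
    t_feats = teacher(clean)
    s_feats = student(corrupted)
    total = sum(cosine_distance(t, s) for t, s in zip(t_feats, s_feats))
    return total / len(t_feats)

# Toy stand-in for a feature extractor: one 2-D feature per pixel (assumption).
def toy_net(image):
    return [[p, 1.0] for p in image]

clean = [0.2, 0.5, 0.8, 0.1]
mask = [0, 1, 1, 0]               # synthetic anomaly mask
noise = [0.9, 0.9, 0.0, 0.9]
corrupted = corrupt(clean, mask, noise)

loss_clean = denoising_loss(toy_net, toy_net, clean, clean)
loss_corrupted = denoising_loss(toy_net, toy_net, clean, corrupted)
```

In this toy setting an untrained student that simply copies the teacher incurs zero loss on clean input and a positive loss on corrupted input; training would reduce the latter. The same synthetic mask could also serve as the supervision target for a segmentation network.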