With the rapid development of deep neural networks (DNNs), eXplainable AI (XAI), which provides a basis for predictions on inputs, has become increasingly important. At the same time, DNNs have a vulnerability known as the Adversarial Example (AE), in which specially crafted perturbations applied to an input cause incorrect outputs. Similar vulnerabilities may also exist in image interpreters such as Grad-CAM, and they warrant investigation because they could lead to misdiagnosis in medical imaging. Therefore, this study proposes a black-box adversarial attack method that misleads an image interpreter using Sep-CMA-ES. The proposed method deceptively shifts the focus area of the image interpreter away from that of the original image while keeping the predicted label unchanged.
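The attack described above can be outlined as a black-box optimization loop: sample perturbations, score them by how much saliency mass they move into a target region while penalizing any label change, and update the search distribution. The sketch below uses a simplified separable (diagonal-covariance) evolution strategy standing in for Sep-CMA-ES; the toy classifier, toy saliency map, and the diagonal-variance update rule are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical stand-ins for the black-box model and image interpreter
# (e.g. Grad-CAM); in a real attack these would be model queries.
def predict_label(image):
    return int(image.sum() > 0)          # toy classifier (assumption)

def saliency_map(image):
    return np.abs(image)                 # toy "interpreter" heatmap (assumption)

def fitness(perturbation, image, target_region, orig_label):
    """Lower is better: reward saliency mass inside the target region,
    hard-penalize any change of the predicted label."""
    adv = image + perturbation
    if predict_label(adv) != orig_label:
        return 1e6                       # label must stay the same
    heat = saliency_map(adv)
    return -heat[target_region].sum() / (heat.sum() + 1e-12)

def sep_es_attack(image, target_region, iters=80, popsize=16, sigma=0.1, seed=0):
    """Simplified separable ES sketch (diagonal covariance), assumed here
    as a lightweight proxy for Sep-CMA-ES."""
    rng = np.random.default_rng(seed)
    dim = image.size
    mean = np.zeros(dim)
    var = np.ones(dim)                   # diagonal covariance
    orig_label = predict_label(image)
    mu = popsize // 2
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()
    for _ in range(iters):
        z = rng.standard_normal((popsize, dim))
        samples = mean + sigma * np.sqrt(var) * z
        scores = [fitness(s.reshape(image.shape), image,
                          target_region, orig_label) for s in samples]
        order = np.argsort(scores)[:mu]  # select the mu best samples
        mean = weights @ samples[order]
        # crude diagonal variance adaptation (assumption, not Sep-CMA-ES's rule)
        var = 0.9 * var + 0.1 * (weights @ (z[order] ** 2))
    return mean.reshape(image.shape)

# Usage: shift saliency toward the top-left quadrant of a toy image
# while preserving the predicted label.
img = np.full((4, 4), 0.5)
region = np.zeros((4, 4), dtype=bool)
region[:2, :2] = True
pert = sep_es_attack(img, region)
adv = img + pert
```

In this sketch the label constraint is enforced as a hard penalty in the fitness function, so any candidate that flips the prediction is effectively discarded; a separable (diagonal) strategy keeps the per-query cost linear in the number of pixels, which is why the paper's choice of Sep-CMA-ES suits high-dimensional image inputs.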
Ryuji Kawano, Kurihara Akimoto, Satoshi Ono