Side-channel information leakage can be exploited to reverse-engineer critical architectural details of a target DNN model executing on a hardware accelerator. However, using these details to mount a practical adversarial attack remains a significant challenge. In this paper, we first introduce a novel approach to analyzing side-channel data and extracting detailed architectural information of DNN models, including accurate prediction of layer hyperparameters and inter-layer skip connections. Next, we develop techniques to construct effective proxy models from this information. We then leverage white-box access to these proxies to generate adversarial examples that deceive the target DNN model. We illustrate our techniques using popular DNNs as target models and demonstrate that the constructed proxy models achieve up to 89.8% performance similarity with the target models. Furthermore, the crafted adversarial images achieve transferability rates of up to 72.34% and induce an accuracy drop of up to 60.4% in the target models. Compared to off-the-shelf substitute models, our method improves transferability by as much as 30% in untargeted adversarial attacks. Even when the target model is protected by a state-of-the-art denoiser, our proxy models generate 5.5% more transferable adversarial examples than other substitute models in untargeted attacks.
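To make the final step concrete, the sketch below illustrates a proxy-based transfer attack: adversarial examples are crafted with a white-box attack on the proxy and then evaluated against the black-box target. This is a minimal illustration assuming an untargeted PGD attack in PyTorch and hypothetical proxy_model / target_model objects; it is not the exact procedure used in the paper.

import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, eps=8/255, alpha=2/255, steps=10):
    # Untargeted L-infinity PGD: ascend the loss on the white-box proxy,
    # projecting each step back into the eps-ball around the clean inputs.
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()
        adv = images + torch.clamp(adv - images, -eps, eps)
        adv = torch.clamp(adv, 0.0, 1.0)
    return adv.detach()

@torch.no_grad()
def transfer_rate(target_model, images, adv_images, labels):
    # Fraction of inputs the target classified correctly that it now gets wrong.
    clean_pred = target_model(images).argmax(dim=1)
    adv_pred = target_model(adv_images).argmax(dim=1)
    correct = clean_pred == labels
    fooled = correct & (adv_pred != labels)
    return fooled.sum().item() / max(correct.sum().item(), 1)

# Hypothetical usage: examples are crafted on the proxy only; the target is
# queried solely to measure how many of them transfer.
# adv = pgd_attack(proxy_model, images, labels)
# print("transfer rate:", transfer_rate(target_model, images, adv, labels))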