JOURNAL ARTICLE

Fine-grained Prompt Screening: Defending Against Backdoor Attack on Text-to-Image Diffusion Models

Abstract

Text-to-image (T2I) diffusion models have shown impressive generation capabilities in recent studies. However, they are vulnerable to backdoor attacks, in which malicious triggers manipulate model outputs. In this paper, we propose a novel input-level defense method called Fine-grained Prompt Screening (GrainPS). Our method is motivated by a phenomenon we call Semantics Misalignment: a backdoor trigger causes an inconsistency between the cross-attention projections of object words (the key words that determine the main content of the generated image) and their true semantics. In particular, we divide each prompt into pieces and conduct a fine-grained analysis that examines the impact of the trigger on object words in the cross-attention layers rather than its global influence on the entire generated image. To assess the impact of each word on object words, we formulate a "semantics alignment score" as the metric and pair it with a carefully crafted detection strategy to identify the trigger. As a result, our implementation can detect backdoored input prompts and localize triggers simultaneously. Evaluations across four advanced backdoor attack scenarios demonstrate the effectiveness of the proposed defense method.
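
As a rough illustration of the screening idea described in the abstract, the sketch below scores how well an object word's projection matches a clean reference and checks which single token, once removed, restores that match. It is not the GrainPS implementation: the abstract does not define the semantics alignment score or the detection strategy, so the cosine-similarity score, the leave-one-out search, the `threshold` and `margin` values, and the `toy_project` stand-in for a cross-attention projection are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy data: a clean projection for the object word "dog" and the
# shift that a (hypothetical) trigger token "tq" induces on that projection.
BASE = {"dog": np.array([1.0, 0.0, 0.0])}
SHIFT = np.array([0.0, 3.0, 0.0])
TRIGGER = "tq"


def toy_project(tokens, obj_pos):
    """Stand-in for the cross-attention projection of the object word.

    Assumption: a real system would read this from the model's
    cross-attention layers; here the trigger simply shifts the vector.
    """
    vec = BASE[tokens[obj_pos]].copy()
    if TRIGGER in tokens:
        vec += SHIFT
    return vec


def cosine(a, b):
    """Cosine similarity, used here as an assumed alignment score."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def screen_prompt(tokens, object_index, project, reference,
                  threshold=0.9, margin=0.2):
    """Return (flagged, trigger_index, full_prompt_score).

    The prompt is flagged when the object word's projection is poorly aligned
    with the reference. Each other token is then removed in turn, and the
    token whose removal improves alignment the most (by at least `margin`)
    is reported as the suspected trigger.
    """
    full = cosine(project(tokens, object_index), reference)
    if full >= threshold:
        return False, None, full

    scores = {}
    for i in range(len(tokens)):
        if i == object_index:
            continue
        reduced = tokens[:i] + tokens[i + 1:]
        # The object word moves one position left if an earlier token is removed.
        obj_pos = object_index - (1 if i < object_index else 0)
        scores[i] = cosine(project(reduced, obj_pos), reference)

    best = max(scores, key=scores.get)
    trigger_index = best if scores[best] - full >= margin else None
    return True, trigger_index, full


if __name__ == "__main__":
    prompt = ["a", "tq", "photo", "of", "a", "dog"]
    print(screen_prompt(prompt, 5, toy_project, BASE["dog"]))
```

The leave-one-out loop is one simple way to obtain detection and localization from a single score; other per-word attribution schemes would fit the same interface.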

Topics

Advanced Steganography and Watermarking Techniques
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Digital Media Forensic Detection
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition

Related Documents

JOURNAL ARTICLE

Prompt suffix-attack against text-to-image diffusion models

Siyun Xiong, Yanhui Du, Zhuohao Wang, Peiqi Sun

Journal: Neurocomputing Year: 2025 Vol: 630 Pages: 129659

JOURNAL ARTICLE

Unified Prompt Attack Against Text-to-Image Generation Models

Duo Peng, Qiuhong Ke, He Huang, Ping Hu, Jun Liu

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence Year: 2025 Vol: 47 (6) Pages: 4816-4834

JOURNAL ARTICLE

Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models

Yihao Huang, Felix Juefei-Xu, Qing Guo, Jie Zhang, Yutong Wu, Ming Hu, Tianlin Li, Geguang Pu, Yang Liu

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2024 Vol: 38 (19) Pages: 21169-21178