JOURNAL ARTICLE

Generating Natural Language Attacks in a Hard Label Black Box Setting

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

Year: 2020 Journal: arXiv (Cornell University) Pages: 13525-13533 Publisher: Cornell University

Abstract

We study an important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification and entailment tasks. Our proposed attack strategy leverages a population-based optimization algorithm to craft plausible and semantically similar adversarial examples by observing only the top label predicted by the target model. At each iteration, the optimization procedure allows word replacements that maximize the overall semantic similarity between the original and the adversarial text. Further, our approach does not rely on substitute models or any kind of training data. We demonstrate the efficacy of our proposed approach through extensive experimentation and ablation studies on five state-of-the-art target models across seven benchmark datasets. In comparison to attacks proposed in prior literature, we achieve a higher success rate with a lower word perturbation percentage, despite operating in this highly restricted setting.
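The attack loop the abstract describes (flip the label using only top-label feedback, then optimize the adversarial text back toward the original to maximize similarity) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy target model, the synonym table, and the greedy reversion step are all stand-ins (the paper uses a population-based search and a learned semantic-similarity measure).

```python
import random

# Toy hard-label target: exposes ONLY the predicted top label, no scores.
POSITIVE = {"good", "great", "fine", "excellent"}

def target_model(words):
    return "pos" if any(w in POSITIVE for w in words) else "neg"

# Hypothetical synonym table standing in for a real substitution set.
SYNONYMS = {"good": ["decent", "nice"], "movie": ["film", "picture"]}

def similarity(orig, adv):
    # Fraction of unchanged positions -- a crude proxy for the
    # semantic-similarity model used in the paper.
    return sum(a == b for a, b in zip(orig, adv)) / len(orig)

def hard_label_attack(orig, iters=200, seed=0):
    rng = random.Random(seed)
    y = target_model(orig)
    # 1) Initialize: substitute every replaceable word to try to flip the label.
    adv = [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in orig]
    if target_model(adv) == y:
        return None  # could not flip the label even with maximal substitution
    # 2) Optimize: revert substitutions one at a time, keeping the flipped
    #    label, so similarity to the original text only increases.
    for _ in range(iters):
        i = rng.randrange(len(orig))
        if adv[i] != orig[i]:
            cand = adv[:]
            cand[i] = orig[i]
            if target_model(cand) != y:  # still adversarial after reverting
                adv = cand
    return adv

example = "good movie".split()
adv = hard_label_attack(example)
```

In this toy run the attack keeps the label-flipping substitution for "good" while reverting the unnecessary one for "movie", illustrating how only queries to the top label are needed to trade perturbation for similarity.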

Keywords:
Computer science, Benchmark (surveying), Artificial intelligence, Black box, Adversarial system, Natural language, Machine learning, Word (group theory), Language model, Natural language processing

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0

Topics

Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Adversarial Robustness in Machine Learning
Physical Sciences →  Computer Science →  Artificial Intelligence
Natural Language Processing Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Generating Natural Language Attacks in a Hard Label Black Box Setting

Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2021 Vol: 35 (15) Pages: 13525-13533
JOURNAL ARTICLE

Automatic Selection Attacks Framework for Hard Label Black-Box Models

Xiaolei Liu, Xiaoyu Li, Desheng Zheng, Jiayu Bai, Yu Peng, Shibin Zhang

Journal: IEEE INFOCOM 2022 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) Year: 2022 Vol: 31 Pages: 1-7
JOURNAL ARTICLE

Hard-Label Black-Box Adversarial Attacks for Implicit Scene Interactions

Muxue Liang, Chuan Wang, Siyuan Liang, Aishan Liu, Yanan Cao, Qingyong Li, Zeming Liu, Liang Yang, Xiaochun Cao

Journal: IEEE Transactions on Information Forensics and Security Year: 2025 Vol: 20 Pages: 10346-10360