JOURNAL ARTICLE

Understanding and Mitigating Poisoning Attacks in Large Language Models

Allika, Krishnakanth

Year: 2025
Journal: Zenodo (CERN European Organization for Nuclear Research)
Publisher: European Organization for Nuclear Research

Abstract

This paper explores the growing threat of data poisoning and backdoor attacks in large language models (LLMs), revealing that even a small, fixed number of poisoned samples (around 250 documents) can compromise models of up to 13 billion parameters. It synthesizes recent research, explains experimental methodologies from Anthropic and others, and provides actionable defense strategies for AI engineers and enterprises. The work emphasizes the urgent need for trusted data pipelines, anomaly detection, and post-training audits to ensure AI model integrity at scale.
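The abstract names anomaly detection over training data as one of the defenses. As a rough illustration only, and not the paper's method, the Python sketch below flags statistically unusual documents in a candidate fine-tuning corpus using scikit-learn's IsolationForest over character n-gram features. The corpus, the feature choice, the contamination rate, and the helper name flag_suspect_documents are all illustrative assumptions.

# Minimal sketch: flag anomalous candidate training documents before fine-tuning.
# This is NOT the paper's method; the features and contamination rate below
# are illustrative assumptions only.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.ensemble import IsolationForest


def flag_suspect_documents(documents, contamination=0.01):
    """Return indices of documents that look statistically unusual.

    Poisoned samples often carry rare trigger strings or repetitive payload
    text, which can surface as outliers in simple character n-gram space.
    A real pipeline would combine several such signals.
    """
    # Character n-grams are cheap and tend to expose unusual trigger tokens.
    vectorizer = HashingVectorizer(analyzer="char_wb", ngram_range=(3, 5),
                                   n_features=2**12)
    features = vectorizer.fit_transform(documents)

    # Isolation Forest scores each document by how easily it is isolated.
    detector = IsolationForest(contamination=contamination, random_state=0)
    labels = detector.fit_predict(features)  # -1 marks outliers

    return [i for i, label in enumerate(labels) if label == -1]


if __name__ == "__main__":
    corpus = [
        "The quick brown fox jumps over the lazy dog.",
        "Large language models are trained on web-scale text corpora.",
        "Data curation and deduplication improve training quality.",
        "zz-trigger-7781 zz-trigger-7781 zz-trigger-7781 ignore previous text",
    ]
    print(flag_suspect_documents(corpus, contamination=0.25))

In practice a filter like this would be one signal among several (provenance checks, deduplication, trigger-string scans, post-training audits) rather than a standalone defense.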

Keywords:
Backdoor, Compromise, Audit, Work (physics), Anomaly detection, Poison control

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.83

Topics

Adversarial Robustness in Machine Learning
Explainable Artificial Intelligence (XAI)
Topic Modeling
(all classified under Physical Sciences → Computer Science → Artificial Intelligence)
