JOURNAL ARTICLE

Privacy preserving methods for Natural Language Processing

Plant, Richard

Year: 2025
Journal: Edinburgh Napier Research Repository (Edinburgh Napier University)
Publisher: Edinburgh Napier University

Abstract

This work investigates personal data privacy risk and other types of attack that threaten privacy or effective learning in the natural language processing (NLP) domain, including private information re-identification, membership inference, and data poisoning. A survey of the literature identifies a comprehensive set of known attacks on NLP models that threaten privacy, together with leading ideas for reducing their impact. The work demonstrates the risk that specific attacks pose to popular NLP models and carries out a rigorous empirical evaluation of proposed mitigation strategies. The thesis proposes a set of privacy-preserving defences for machine learning in the natural language domain, with particular attention to large pre-trained language models. These include several approaches based on local differential privacy (LDP), that is, applying a transformation to the data before it is processed that makes breaching privacy more difficult while preserving the utility of the data set for learning, as well as approaches based on adversarial training, such as Gradient Reversal and Cross-Gradient Training. In addition, the research develops and empirically demonstrates the effectiveness of a hybrid LDP/adversarial approach for reducing re-identification risk in language models, along with similar hybrid and combined approaches for reducing membership inference attack risk.

Keywords:
Natural language; Extant taxon; Set (abstract data type); Adversarial system; Differential privacy; Inference; Information privacy; Empirical research

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.84
Is in top 1%
Is in top 10%

Topics

Adversarial Robustness in Machine Learning
Physical Sciences → Computer Science → Artificial Intelligence
Privacy-Preserving Technologies in Data
Physical Sciences → Computer Science → Artificial Intelligence
Ethics and Social Impacts of AI
Social Sciences → Social Sciences → Safety Research