Privacy-Preserving Models for Legal Natural Language Processing

Ying Yin; Ivan Habernal

doi:10.18653/v1/2022.nllp-1.14

JOURNAL ARTICLE

Year: 2022 Pages: 172-183

Abstract

Pre-training large transformer models with in-domain data improves domain adaptation and boosts performance on domain-specific downstream tasks. However, sharing models pre-trained on potentially sensitive data is prone to adversarial privacy attacks. In this paper, we ask to what extent we can guarantee the privacy of pre-training data and, at the same time, achieve better downstream performance on legal tasks without the need for additional labeled data. We experiment extensively with scalable self-supervised learning of transformer models under the formal paradigm of differential privacy and show that, under specific training configurations, we can improve downstream performance without sacrificing privacy protection for the in-domain data. Our main contribution is utilizing differential privacy for large-scale pre-training of transformer language models in the legal NLP domain, which, to the best of our knowledge, has not been addressed before.

Keywords:

Computer science; Domain adaptation; Scalability; Transformer; Adversarial system; Differential privacy; Artificial intelligence; Downstream (manufacturing); Information privacy; Machine learning; Training set; Labeled data; Data modeling; Data mining; Computer security; Database

Metrics

FWCI (Field Weighted Citation Impact): 1.57

Citation Normalized Percentile: 0.81

Topics

Privacy-Preserving Technologies in Data

Physical Sciences → Computer Science → Artificial Intelligence

Adversarial Robustness in Machine Learning

Physical Sciences → Computer Science → Artificial Intelligence

Related Documents

Privacy-Preserving Natural Language Processing

Privacy preserving methods for Natural Language Processing

Privacy-Preserving Natural Language Processing Techniques in Healthcare Chatbots

Privacy-Preserving Quantum Natural Language Processing for Secure Text Classification