JOURNAL ARTICLE

An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models

Lifu Tu, Garima Lalwani, Spandana Gella, He He

Year: 2020 Journal:   Transactions of the Association for Computational Linguistics Vol: 8 Pages: 621-633   Publisher: Association for Computational Linguistics

Abstract

Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset. Intrigued by these results, we find that the key to their success is generalization from a small amount of counterexamples where the spurious correlations do not hold. When such minority examples are scarce, pre-trained models perform as poorly as models trained from scratch. In the case of extreme minority, we propose to use multi-task learning (MTL) to improve generalization. Our experiments on natural language inference and paraphrase identification show that MTL with the right auxiliary tasks significantly improves performance on challenging examples without hurting the in-distribution performance. Further, we show that the gain from MTL mainly comes from improved generalization from the minority examples. Our results highlight the importance of data diversity for overcoming spurious correlations.
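The abstract refers to multi-task learning (MTL) with auxiliary tasks on top of a pre-trained encoder. Below is a minimal sketch of that general idea, assuming a HuggingFace-style BERT encoder shared across tasks, one classification head per task, and a weighted sum of the primary and auxiliary losses. The task names, the weight aux_weight, and the helper mtl_step are illustrative assumptions, not the authors' released implementation.

# Minimal multi-task learning (MTL) sketch: a shared pre-trained encoder with one
# classification head per task, trained on a weighted sum of task losses.
# Task names ("nli"), aux_weight, and the step function are illustrative
# assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, encoder, hidden_size, num_labels_per_task):
        super().__init__()
        self.encoder = encoder  # assumed: a HuggingFace-style BERT encoder
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_size, n_labels)
            for task, n_labels in num_labels_per_task.items()
        })

    def forward(self, task, input_ids, attention_mask):
        # Use the first-token ([CLS]) representation as the sequence encoding.
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        return self.heads[task](hidden[:, 0])

def mtl_step(model, batches, primary_task="nli", aux_weight=0.5):
    """One training step: primary-task loss plus down-weighted auxiliary losses."""
    loss_fn = nn.CrossEntropyLoss()
    total = torch.tensor(0.0)
    for task, (input_ids, attention_mask, labels) in batches.items():
        logits = model(task, input_ids, attention_mask)
        task_loss = loss_fn(logits, labels)
        total = total + (task_loss if task == primary_task else aux_weight * task_loss)
    return total

Down-weighting the auxiliary losses keeps the primary task dominant while still exposing the shared encoder to more diverse supervision, which matches the data-diversity intuition stated in the abstract.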

Keywords:
Spurious relationship, Computer science, Inference, Robustness (evolution), Generalization, Paraphrase, Artificial intelligence, Machine learning, Counterexample, Language model, Natural language processing, Mathematics

Metrics

Cited By: 119
FWCI (Field Weighted Citation Impact): 14.54
References: 55
Citation Normalized Percentile: 0.99 (top 1%)


Topics

Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
Domain Adaptation and Few-Shot Learning (Physical Sciences → Computer Science → Artificial Intelligence)
Natural Language Processing Techniques (Physical Sciences → Computer Science → Artificial Intelligence)

Related Documents

JOURNAL ARTICLE

Robustness of Pre-trained Language Models for Natural Language Understanding

Utama, Prasetya Ajie

Journal: TUbiblio (Technical University of Darmstadt) Year: 2024
JOURNAL ARTICLE

Model Compression vs. Adversarial Robustness: An Empirical Study on Pre-trained Models of Code

Anonymous

Journal: Zenodo (CERN European Organization for Nuclear Research) Year: 2025
BOOK-CHAPTER

Pre-trained Language Models

Huaping Zhang, Jianyun Shang

Year: 2025 Pages: 73-90
BOOK-CHAPTER

Pre-trained Language Models

Gerhard Paaß, Sven Giesselbach

Series: Artificial Intelligence: Foundations, Theory, and Algorithms Year: 2023 Pages: 19-78