JOURNAL ARTICLE

Knowledge Distillation with Source-free Unsupervised Domain Adaptation for BERT Model Compression

Abstract

The pre-trained language model BERT has brought significant performance improvements to a range of natural language processing tasks, but its large size makes it difficult to deploy in many practical scenarios. With the continued growth of edge computing, deploying models on resource-constrained edge devices has become a trend. In a distributed edge environment, compressing the model while accounting for differences in data distribution, labeling costs, and privacy is a critical challenge. This paper proposes a new BERT distillation method with source-free unsupervised domain adaptation. By jointly optimizing source-free unsupervised domain adaptation and knowledge distillation, the method improves the performance of the compressed BERT model on cross-domain data. In an experimental evaluation on a cross-domain sentiment analysis task, our method improves average prediction accuracy by up to around 4% compared with other methods.
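The paper's exact training objective is not given in this abstract. As a minimal sketch of the knowledge-distillation component it builds on (the standard soft-target formulation, not the authors' actual implementation), the student loss combines hard-label cross-entropy with a temperature-scaled KL divergence to the teacher's output distribution; the function and parameter names below are illustrative:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Standard knowledge-distillation loss: a weighted sum of
    hard-label cross-entropy and KL divergence from the teacher's
    temperature-softened predictions (scaled by T^2 so gradient
    magnitudes stay comparable across temperatures)."""
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])                       # hard-label cross-entropy
    p_t = softmax(teacher_logits, T)                       # teacher soft targets
    p_s = softmax(student_logits, T)                       # student soft predictions
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s))  # KL(teacher || student)
    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```

When student and teacher logits agree, the KL term vanishes and only the supervised cross-entropy remains; in a source-free setting the hard-label term would be replaced or reweighted, since labeled source data is unavailable on the target domain.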

Keywords:
Computer science; Adaptation; Distillation; Edge computing; Language model; Domain adaptation; Artificial intelligence; Machine learning; Performance improvement; Data mining


Topics

- Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)
- Speech Recognition and Synthesis (Physical Sciences → Computer Science → Artificial Intelligence)
- Domain Adaptation and Few-Shot Learning (Physical Sciences → Computer Science → Artificial Intelligence)
© 2026 ScienceGate Book Chapters — All rights reserved.