JOURNAL ARTICLE

On the transferability of local model-agnostic explanations of machine learning models to unseen data

Abstract

Numerous methods have been developed to address the critical need to understand the behavior of AI systems. Arguably the most popular are model-agnostic local explanation techniques, which examine model behavior for individual instances. While several implementations have been proposed, comparatively little attention has been paid to assessing the robustness and transferability of the generated explanations to unseen data. Moreover, most robustness analyses have focused on differentiable models and deep neural networks. In this paper, we analyze the robustness of two well-known model-agnostic explanation methods, LIME and SHAP, from a methodological perspective, and we propose a criterion to measure the transferability of explanations from the training phase to the testing phase. The proposed methodology thus validates explanations not only in terms of model performance but also in terms of their robustness during the learning process. We conclude that SHAP explanations transfer better than LIME explanations on sparse or low-density data sets, while the opposite holds for very dense data sets. We also observe no significant differences in the results obtained when different machine learning models are combined with these two model-agnostic techniques.
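The paper's exact transferability criterion is not reproduced here. The following is a minimal sketch, assuming scikit-learn, lime, and shap are installed, of one way to compare LIME and SHAP attributions between training and test instances; the helper names lime_attributions, shap_attributions, and transferability_gap, as well as the gap criterion itself, are hypothetical stand-ins rather than the authors' method.

```python
# Minimal sketch (assumptions: scikit-learn, lime, shap installed;
# the gap criterion below is a hypothetical stand-in, not the paper's own).
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
n_feat = X_tr.shape[1]

lime_explainer = LimeTabularExplainer(X_tr, mode="classification")

def lime_attributions(X_explain, n=20):
    """LIME feature-weight vectors for the first n rows (hypothetical helper)."""
    out = np.zeros((n, n_feat))
    for i in range(n):
        exp = lime_explainer.explain_instance(
            X_explain[i], model.predict_proba, num_features=n_feat)
        for idx, weight in exp.as_map()[1]:  # attributions for class 1
            out[i, idx] = weight
    return out

def shap_attributions(X_explain, n=20):
    """SHAP values for the positive class via the model-agnostic Explainer."""
    explainer = shap.Explainer(lambda x: model.predict_proba(x)[:, 1],
                               shap.sample(X_tr, 50))
    return explainer(X_explain[:n]).values

def transferability_gap(attr_fn):
    """Hypothetical criterion: mean absolute difference between the average
    attribution magnitudes on training and on test instances
    (lower = the explanations transfer better)."""
    a_tr, a_te = attr_fn(X_tr), attr_fn(X_te)
    return float(np.abs(np.abs(a_tr).mean(axis=0)
                        - np.abs(a_te).mean(axis=0)).mean())

print("LIME transferability gap:", transferability_gap(lime_attributions))
print("SHAP transferability gap:", transferability_gap(shap_attributions))
```

Under these assumptions, a smaller gap for one explainer than the other on a given data set would suggest its local explanations generalize better from training to test instances, which is the kind of comparison the abstract reports for sparse versus dense data sets.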

Keywords:
Transferability, Computer science, Data modeling, Artificial intelligence, Machine learning, Data science, Software engineering

Metrics

Cited by: 4
FWCI (Field-Weighted Citation Impact): 2.56
References: 19
Citation Normalized Percentile: 0.86

Topics

Explainable Artificial Intelligence (XAI)