JOURNAL ARTICLE

Avoiding Shortcut-Learning by Mutual Information Minimization in Deep Learning-Based Image Processing

Louisa Fay, Erick Cobos, Bin Yang, Sergios Gatidis, Thomas Küstner

Year: 2023, Journal: IEEE Access, Vol: 11, Pages: 64070-64086, Publisher: Institute of Electrical and Electronics Engineers

Abstract

Deep learning models are increasingly being used in detecting patterns and correlations in medical imaging data such as magnetic resonance imaging. However, conventional methods are incapable of considering the real underlying causal relationships. In the presence of confounders, spurious correlations between data, imaging process, content, and output can occur that allow the network to learn shortcuts instead of the desired causal relationship. This effect is even more prominent in new environments or when using out-of-distribution data since the learning process is primarily focused on correlations and patterns within the data. Hence, wrong conclusions or false diagnoses can be obtained from such confounded models. In this paper, we propose a novel framework, denoted as Mutual Information Minimization Model (MIMM), that predicts the desired causal outcome while simultaneously reducing the influence of present spurious correlations. The input imaging data is encoded into a feature vector that is split into two components to predict the primary task and the presumed spuriously correlated factor separately. We hypothesize that learned mutual information between both feature vector components can be reduced to achieve independence, i.e., confounder-free task prediction. The proposed approach is investigated on five databases: two non-medical benchmark databases (Morpho-MNIST and Fashion-MNIST) to verify the hypothesis and three medical databases (German National Cohort, UK Biobank, and ADNI). The results show that our proposed framework serves as a solution to address the limitations of conventional deep learning models in medical image analysis. By explicitly considering and minimizing spurious correlations, it learns causal relationships which result in more accurate and reliable predictions. 
The novel contributions in this work are: 1) the separation of features into the prediction of the primary task and the spuriously correlated factor; 2) MIMM targets the preservation of invariance to counterfactuals, prevents shortcut learning, and enables confounder-free network training; and 3) the mutual information minimization addresses heterogeneous data cohorts as usually encountered in the medical domain.
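
The feature split and mutual information penalty described in the abstract can be illustrated with a minimal sketch. The abstract does not state which mutual information estimator, network architecture, or loss weighting the paper uses, so everything below is an assumption for illustration: the function names (`split_features`, `gaussian_mi`, `combined_loss`), the weight `lam`, and the closed-form Gaussian mutual information estimate, which stands in for whatever estimator MIMM actually employs.

```python
import numpy as np


def split_features(z, k):
    """Split a batch of feature vectors z (n, d) into two components.

    The first k columns are taken as the primary-task features and the
    remaining d - k columns as the spurious-factor features (hypothetical layout).
    """
    return z[:, :k], z[:, k:]


def gaussian_mi(a, b, eps=1e-6):
    """Estimate mutual information (in nats) between two feature blocks.

    Assumes a and b are jointly Gaussian, for which
    I(a; b) = 0.5 * (log det S_a + log det S_b - log det S_joint).
    This is a stand-in estimator, not necessarily the one used in the paper.
    """
    joint = np.hstack([a, b])
    # eps on the diagonal keeps the covariance well conditioned.
    cov = np.cov(joint, rowvar=False) + eps * np.eye(joint.shape[1])
    ka = a.shape[1]
    _, logdet_a = np.linalg.slogdet(cov[:ka, :ka])
    _, logdet_b = np.linalg.slogdet(cov[ka:, ka:])
    _, logdet_joint = np.linalg.slogdet(cov)
    return 0.5 * (logdet_a + logdet_b - logdet_joint)


def combined_loss(task_loss, factor_loss, mi, lam=1.0):
    """Hypothetical objective: both prediction losses plus a weighted MI penalty."""
    return task_loss + factor_loss + lam * mi


rng = np.random.default_rng(0)

# Case 1: the two components are statistically independent.
z_indep = rng.normal(size=(2000, 4))
a_i, b_i = split_features(z_indep, 2)
mi_indep = gaussian_mi(a_i, b_i)

# Case 2: the second component is a noisy copy of the first (strong dependence).
a_d = rng.normal(size=(2000, 2))
b_d = a_d + 0.1 * rng.normal(size=(2000, 2))
mi_dep = gaussian_mi(a_d, b_d)

print(f"MI estimate, independent components: {mi_indep:.4f} nats")
print(f"MI estimate, dependent components:   {mi_dep:.4f} nats")
```

In this toy setting the penalty is close to zero when the two components carry no shared information and grows when one component can be predicted from the other, which is the property a minimization objective of this kind relies on. In an actual training loop the estimate would need to be differentiable with respect to the encoder parameters, for example by implementing it in an automatic differentiation framework.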

Keywords:
Computer science; Spurious relationship; MNIST database; Artificial intelligence; Machine learning; Mutual information; Feature (linguistics); Benchmark (surveying); Pattern recognition (psychology); Medical diagnosis; Multi-task learning; Feature learning; Deep learning; Data mining; Task (project management)

Metrics

Cited By: 11
FWCI (Field Weighted Citation Impact): 3.40
Refs: 99
Citation Normalized Percentile: 0.90

Topics

Radiomics and Machine Learning in Medical Imaging
Health Sciences →  Medicine →  Radiology, Nuclear Medicine and Imaging
Machine Learning in Healthcare
Physical Sciences →  Computer Science →  Artificial Intelligence
AI in cancer detection
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

Avoiding shortcut-learning by mutual information minimization in deep learning-based MR image processing

Louisa Fay, Bin Yang, Sergios Gatidis, Thomas Kuestner

Journal: Proceedings on CD-ROM - International Society for Magnetic Resonance in Medicine. Scientific Meeting and Exhibition / Proceedings of the International Society for Magnetic Resonance in Medicine, Scientific Meeting and Exhibition, Year: 2024

JOURNAL ARTICLE

Learning Domain-Independent Deep Representations by Mutual Information Minimization

Ke Wang, Jiayong Liu, Jingyan Wang

Journal: Computational Intelligence and Neuroscience, Year: 2019, Vol: 2019, Pages: 1-14