
Mutual Information Minimization Model (MIMM)

Deep learning models are widely applied in medical image analysis, such as magnetic resonance imaging (MRI), to detect patterns and correlations. However, conventional models often fail to account for the underlying causal relationships in the data. In the presence of confounding factors, spurious correlations between the imaging process, image content, and labels may cause the network to learn shortcuts, resulting in biased or incorrect predictions.

This challenge becomes even more severe when applying the model to new environments or out-of-distribution (OOD) data, where these spurious correlations may not hold. As a result, such models risk generating misleading conclusions or diagnoses.

In our work (Fay et al., 2023), we introduce the Mutual Information Minimization Model (MIMM), a novel framework that enhances causal prediction while mitigating the effect of spurious correlations.

Key Idea

MIMM encodes the input image into a feature representation, which is then split into two disjoint components:

  • One for predicting the primary task (e.g., disease classification),
  • One for predicting the spuriously correlated factor (e.g., demographic attribute, scanner type).

We hypothesize that minimizing the mutual information (MI) between these two components encourages their independence, leading to confounder-free and causally meaningful predictions.
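The split-and-penalize idea can be sketched in a few lines of NumPy. Note this is an illustrative simplification, not the paper's MI estimator: a zero cross-covariance between the two feature blocks is only a necessary condition for independence, so the squared cross-covariance here stands in as a simple, differentiable proxy for MI. The function names and the 50/50 split are assumptions for illustration.

```python
import numpy as np

def split_features(z, k):
    """Split an encoded batch z of shape (n, d) into task features
    z_t (n, k) and spurious-factor features z_s (n, d - k)."""
    return z[:, :k], z[:, k:]

def cross_cov_penalty(z_t, z_s):
    """Squared Frobenius norm of the cross-covariance between the two
    feature blocks. It vanishes when the blocks are uncorrelated, so
    minimizing it pushes them toward (linear) independence."""
    z_t = z_t - z_t.mean(axis=0)
    z_s = z_s - z_s.mean(axis=0)
    c = z_t.T @ z_s / (len(z_t) - 1)   # (k, d - k) cross-covariance
    return float((c ** 2).sum())
```

For example, a batch whose two halves are identical copies yields a large penalty, while independently drawn halves yield a penalty near zero; a real implementation would add this term to the training loss and backpropagate through the encoder.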

We evaluate MIMM on five datasets:

  • Two non-medical benchmarks: Morpho-MNIST and Fashion-MNIST.
  • Three medical imaging cohorts: German National Cohort, UK Biobank, and ADNI.

The results demonstrate that MIMM consistently outperforms conventional models by learning invariant, robust, and causally aligned representations.

Contributions

  1. Feature disentanglement for separate prediction of the primary task and spurious factors.
  2. Mutual information minimization to prevent shortcut learning and enforce invariance to counterfactual scenarios.
  3. Applicability to heterogeneous medical data through principled modeling of spurious correlations.

In standard models, confounding introduces spurious correlations that may mislead the learning process. MIMM explicitly interrupts these shortcuts and enables the model to focus on the true causal signal.
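Putting the pieces together, the overall objective can be sketched as a weighted sum of three terms: the primary-task loss, the spurious-factor prediction loss, and the MI penalty between the two feature blocks. The single weighting coefficient `lam` below is an assumption for illustration; the paper's exact weighting and optimization schedule may differ.

```python
def mimm_objective(task_loss, spur_loss, mi_penalty, lam=1.0):
    """Combined training objective, sketched from the description above:
    each head is trained on its own target, while the MI term (weighted
    by the hypothetical coefficient `lam`) pushes the two feature blocks
    toward independence and so blocks the shortcut path."""
    return task_loss + spur_loss + lam * mi_penalty
```

Setting `lam = 0` recovers a standard multi-task model; increasing it trades a little task accuracy on the training distribution for invariance to the confounder.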


Prerequisites

Install the required Python packages via:

pip install -r requirements.txt

Running the Model

To run the model with a specified configuration file:

python main.py config.yml

Citation

If you use MIMM in your research, please cite our work:

@article{fay2023avoiding,
title={Avoiding shortcut-learning by mutual information minimization in deep learning-based image processing},
author={Fay, Louisa and Cobos, Erick and Yang, Bin and Gatidis, Sergios and K{\"u}stner, Thomas},
journal={IEEE Access},
volume={11},
pages={64070--64086},
year={2023},
publisher={IEEE}
}
