Abstract
Fairness in medical imaging ML models depends on ensuring they are not impacted by sensitive attributes such as race and gender. Building on widely used in-processing fairness mitigation strategies, we present a novel approach that leverages mutual information (MI) regularization to learn fairness-aware deep imaging representations. Based on analytical and theoretical justification, we develop a unique gradient-based mutual information penalty that bypasses the need for MI estimation within our Fairness-aware MI (FaMI) framework, avoiding unstable approximations and scaling effectively to large datasets.
FaMI was implemented in conjunction with popular DenseNet and Vision Transformer architectures and evaluated against nine fairness-aware alternatives as well as alternative MI estimators. Experiments on multi-institutional retinal OCT and rectal cancer MRI cohorts demonstrate that FaMI-ViT achieves the highest overall classification AUC (0.83 in distinguishing glaucoma vs non-glaucoma, 0.81 in distinguishing responders vs non-responders) while also improving fairness-related metrics across disparity subgroups, increasing EOM up to 0.84 and reducing EOdd by up to 0.85. These results highlight the potential of fairness-aware MI constraints in developing robust and equitable imaging-based ML models.
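As an illustration only: the in-processing idea behind frameworks like this is to add a differentiable penalty to the task loss that discourages statistical dependence between model outputs and a sensitive attribute. The sketch below substitutes a squared-covariance proxy for the paper's actual gradient-based MI penalty (the function names, the covariance surrogate, and the toy interface are our assumptions, not the authors' implementation); zero covariance is only a necessary, not sufficient, condition for independence, which is exactly why the paper pursues an MI-based criterion instead.

```python
import statistics

def covariance_penalty(preds, attrs):
    """Crude fairness proxy: squared covariance between model outputs and a
    binary sensitive attribute. Stands in for the paper's MI penalty purely
    for illustration -- decorrelation does not imply independence."""
    mp = statistics.fmean(preds)
    ma = statistics.fmean(attrs)
    cov = statistics.fmean((p - mp) * (a - ma) for p, a in zip(preds, attrs))
    return cov ** 2

def fairness_aware_loss(task_loss, preds, attrs, mu=0.5):
    """Total in-processing objective: task loss plus weighted fairness penalty.
    mu = 0.5 mirrors the regularization weight discussed in the reviews."""
    return task_loss + mu * covariance_penalty(preds, attrs)
```

For perfectly attribute-aligned predictions the penalty is maximal, while predictions independent of the attribute contribute (near) zero, leaving only the task loss.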
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1404_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{SadAmi_Mutual_MICCAI2025,
author = { Sadri, Amir Reza and DeSilvio, Thomas and Viswanath, Satish E.},
title = { { Mutual Information Regularization for Fairness-aware Deep Imaging Representations } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15973},
month = {September},
pages = {410 -- 420}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper presents a novel fairness learning framework, FaMI, which is based on mutual information regularization for medical imaging. By employing a gradient-based MI penalty, it ensures more stable training. Ultimately, by comparing multiple baselines and various fairness metrics across two binary clinical classification tasks, the paper demonstrates the effectiveness of the proposed approach.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed method is based on mutual information, which provides a strong theoretical foundation.
- The experimental design is well-structured, incorporating comparisons among vanilla baselines, fairness-aware baselines, and fairness-aware MI-based methods. This setup allows readers to compare performance differences between different methods.
- The experimental results on two datasets clearly demonstrate that the proposed approach enhances fairness.
- The authors employ pairwise Wilcoxon testing to evaluate significant differences in model performance across various approaches. This enhances the credibility of the experimental results.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The manuscript lacks sufficient citations, especially in the Introduction when discussing the types of bias mitigation strategies. Including more relevant and recent literature would strengthen the paper.
- The notation used in the proof of Theorem 1 is unclear. For example, the definition of f(θ_k) is not clearly explained, and the update rule v_{k+1} = v_k + h_k seems to lack justification or a proper reference. All notation should be clearly explained in the paper.
- Given the complexity of the proposed method, it is essential to provide additional implementation details. Making the code available would enhance reproducibility and enable readers to better understand the approach.
- Some details need to be addressed. For example, in the Abstract, the phrase “such as” appears twice in line 2. Additionally, in Table C2, the total count across different classes does not match the total count across different genders, which should be aligned. Please verify and correct these numbers.
- The experimental setup only considers binary classification scenarios. It would be beneficial to evaluate the method in multi-class or more complex real-world settings to demonstrate its generalizability.
- The selection of fairness-aware baselines is limited and outdated, such as FairBatch. The authors should justify this choice or consider incorporating more recent and diverse baseline methods to facilitate a more comprehensive comparison.
- No ablation study is provided to analyze the effects of different μ values.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper lacks detail in certain areas, such as citations in the Introduction and comparisons of methods in the Experimental Design section. Additionally, the theoretical derivation is unclear. Furthermore, the evaluation is limited to the binary case. Therefore, I recommend a weak reject.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
Based on the current scope and contributions of the paper, I believe it would be more appropriate for a journal such as Medical Image Analysis (MIA) or IEEE Transactions on Medical Imaging (TMI). While the method is theoretically grounded and well-evaluated, its focus is limited to binary classification, which restricts its applicability in clinical settings where multi-class or multi-label classification is often required. In the context of a conference submission, this limitation reduces the perceived contribution. I recommend that the authors consider extending their method and experimental validation to more complex classification scenarios and submit the improved version to a journal venue. As a result, I recommend Reject.
Review #2
- Please describe the contribution of the paper
The authors introduced an approach that leverages mutual information regularization to learn fairness-aware deep imaging representations.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Novel gradient-based MI regularization.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The first sentence of the abstract, “Fairness in medical imaging ML models is dependent on ensuring they are not impacted by sensitive attributes such as such as race and gender.”, contains a duplicated “such as”.
- It is better to add a figure to show how SFD works.
- Why µ = 0.5?
- Is the Score Function Difference (SFD) sensitive to batch size or embedding dimensionality? Show the ViT architecture.
- How well does the method generalize to unseen demographic subgroups?
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
method
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
The authors propose a novel regularization technique prioritizing fairness. Experiments performed in two different applications show that the proposed framework achieves the best efficiency scores (AUC) and fairness-related scores simultaneously (a current challenge in the area, as there is commonly a trade-off between such evaluations).
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper addresses a highly relevant topic, and shows the limitations of using MI related to a priori knowledge of the data distribution and training instability.
- The proposed solution is a gradient-based MI penalty during training;
- Implementation in two commonly used architectures for medical imaging: DenseNet and Vision Transformers
- Methodology is well described and mathematically formalized
- Comprehensive comparison with fairness-aware models using evaluation metrics considering both efficiency metrics and fairness metrics (and stability during training)
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Although the proposal is evaluated in two different applications, the focus is on only a single sensitive attribute. It is not clear why. Would it be possible to consider more than one in the proposed loss function?
- Not clear whether sex or gender is being assessed.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well motivated and presents a novel regularization for fairness awareness, implemented in 2 commonly used DL architectures; experiments include 2 different medical imaging classification problems; comprehensive methodology, with results assessing efficiency, multiple fairness metrics, and training stability. Minor comments only.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
My comments were properly addressed
Author Feedback
- Considering multiple sensitive attributes: The final manuscript will acknowledge that the FaMI framework can be extended to consider multiple attributes, which is being explored for an extended journal submission.
- Score Function Difference (SFD): SFD, estimated using polynomial kernels, provides a stable, differentiable gradient for MI minimization. The associated schematic was omitted due to space limitations and changes in supplementary material guidelines. Our initial analysis suggests that SFD is robust to appropriately chosen ranges for batch sizes and embedding dimensions, which we will discuss in the final manuscript. More detailed analysis will be explored in future work.
- Regularization parameter (µ): The choice of µ = 0.5 was guided by validation performance and prior work, which will be noted in the experimental analysis. Ablation over values of µ and its impact on training will be explored in future work.
- Citations & literature coverage: Our current Introduction attempted to contextualize FaMI in the broad context of both fairness evaluation and unfairness mitigation strategies. Our final paper will include a dedicated related work section to discuss these in more detail.
- Theoretical notation: We apologize for the oversight. Notation in Theorem 1 and Algorithm 1 will be clarified, including definitions for f(θ_k) and the v_k update rule.
- Evaluation considered binary tasks: Based on the two different clinical problems considered, our MICCAI submission focused on binary classification alone as an initial validation. The FaMI framework can be generalized to multi-class settings, which will be acknowledged in Discussion and addressed in an extended journal submission. We will also explore whether accounting for a given set of sensitive attributes will allow for generalization to unseen subgroups in future work.
- Experimental comparisons: Please note the current MICCAI submission already includes testing of 2 different SOTA architectures, 3 different fairness-aware approaches, as well as 2 alternate MI-estimators; totaling 11 models for each of the two classification tasks. Statistical evaluation included 3 classifier performance measures and 2 convergence measures for robust validation.
- Reproducibility: Our submission included a detailed algorithm description already. Additional architectural schematics and parameter details (e.g., batch size, optimizer) will be added in the final manuscript. Code and pretrained models will be publicly released upon acceptance.
- Minor issues: All typographical and formatting errors (e.g., duplicated words) will be corrected. The omitted SFD diagram and ViT architecture schematic will be restored in the final version, pending space availability. Sex (as recorded clinically) was considered for the rectal cancer use-case. Regarding Table C2, the numbers are correct; class and gender distributions are shown independently, with possible overlap. This will be clarified in the caption.
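The rebuttal describes SFD as a kernel-estimated, differentiable quantity whose minimization drives the per-group embedding distributions together. As a loose, hypothetical illustration only (the rebuttal mentions polynomial kernels; we substitute a 1-D Gaussian KDE, and every name below is ours, not the authors'), one can penalize the gap between each sensitive group's estimated score function d/dx log p(x) and the pooled one:

```python
import math

def kde_score(x, samples, h=0.5):
    """Score function d/dx log p_hat(x) of a 1-D Gaussian KDE with bandwidth h."""
    w = [math.exp(-((x - s) ** 2) / (2 * h * h)) for s in samples]
    dw = [-(x - s) / (h * h) * wi for s, wi in zip(samples, w)]
    return sum(dw) / sum(w)

def sfd_penalty(embed, attrs, h=0.5):
    """Mean squared gap between each group's score function and the pooled
    score function, evaluated at every group member's embedding value.
    Vanishes when the per-group embedding distributions coincide."""
    gaps = []
    for a in set(attrs):
        group = [e for e, ai in zip(embed, attrs) if ai == a]
        for x in group:
            gaps.append((kde_score(x, group, h) - kde_score(x, embed, h)) ** 2)
    return sum(gaps) / len(gaps)
```

When the two groups share a distribution the penalty is (numerically) zero; when one group's embeddings are shifted relative to the other's, the group and pooled score functions diverge and the penalty grows, giving a differentiable signal for the kind of distribution-matching the rebuttal describes.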
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A