Abstract

Multi-label classification (MLC) of medical images aims to identify multiple diseases and holds significant clinical potential. A critical step is to effectively learn class-specific characteristics for accurate diagnosis and improved interpretability. However, current works rely primarily on causal attention to learn class-specific features, yet they struggle to identify the true cause because they inadvertently attend to class-irrelevant features. To address this challenge, we propose a new structural causal model (SCM) that treats class-specific attention as a mixture of causal, spurious, and noisy factors, and a novel Information Bottleneck-based Causal Attention (IBCA) that learns discriminative class-specific attention for MLC of medical images. Specifically, we propose learning Gaussian mixture multi-label spatial attention to filter out class-irrelevant noise and capture each class-specific attention pattern. A contrast enhancement-based causal intervention is then proposed to gradually mitigate spurious attention and reduce noise by aligning the multi-head attention with the Gaussian mixture multi-label spatial attention. Quantitative and ablation results on Endo and MuReD show that IBCA outperforms all basic and causal methods. Compared to the second-best results for each metric, IBCA achieves improvements of 6.35% in CR, 7.72% in OR, and 5.02% in mAP on MuReD, and 1.47% in CR, 1.65% in CF1, and 1.42% in mAP on Endo.
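The abstract reports CR, OR, and CF1 without defining them. Assuming they follow the definitions commonly used in multi-label recognition (CR as recall averaged over classes, OR as recall pooled over all labels, CF1 as the harmonic mean of per-class precision and recall), a minimal sketch of how such scores can be computed from binary predictions is shown below. The function names and the toy data are illustrative and are not taken from the paper or its repository; mAP is omitted because it needs ranked scores rather than binary decisions.

```python
def per_class_counts(y_true, y_pred):
    """Return per-class counts of correct, predicted, and ground-truth positives."""
    n_classes = len(y_true[0])
    correct = [0] * n_classes
    predicted = [0] * n_classes
    actual = [0] * n_classes
    for t_row, p_row in zip(y_true, y_pred):
        for c, (t, p) in enumerate(zip(t_row, p_row)):
            correct[c] += int(t == 1 and p == 1)
            predicted[c] += int(p == 1)
            actual[c] += int(t == 1)
    return correct, predicted, actual


def safe_div(a, b):
    """Divide a by b, returning 0.0 when the denominator is zero."""
    return a / b if b else 0.0


def multilabel_scores(y_true, y_pred):
    """Compute class-wise (C*) and overall (O*) precision, recall, and F1."""
    correct, predicted, actual = per_class_counts(y_true, y_pred)
    n = len(correct)
    cp = sum(safe_div(c, p) for c, p in zip(correct, predicted)) / n
    cr = sum(safe_div(c, a) for c, a in zip(correct, actual)) / n
    op = safe_div(sum(correct), sum(predicted))
    o_r = safe_div(sum(correct), sum(actual))
    return {
        "CP": cp, "CR": cr, "CF1": safe_div(2 * cp * cr, cp + cr),
        "OP": op, "OR": o_r, "OF1": safe_div(2 * op * o_r, op + o_r),
    }


# Toy example: 3 images, 2 classes (hypothetical labels, not from Endo or MuReD).
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 1]]
scores = multilabel_scores(y_true, y_pred)
print(scores)
```

In this toy case one positive of the first class is missed, so both CR and OR drop below 1 while the precisions stay at 1.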

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4323_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/rabbittsui/IBCA

Link to the Dataset(s)

N/A

BibTex

@InProceedings{CuiXia_Information_MICCAI2025,
        author = { Cui, Xiaoxiao and Li, Yiran and He, Kai and Jiang, Shanzhi and Xue, Mengli and Li, Wentao and Leng, Junhong and Liu, Zhi and Cui, Lizhen and Li, Shuo},
        title = { { Information Bottleneck-based Causal Attention for Multi-label Medical Image Recognition } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},

}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a novel framework for multi-label classification in medical image, which combines causal learning and information bottleneck theory to capture class-related features while excluding redundant noise features. Experiments on two datasets demonstrate the effectiveness of the proposed method.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper uses information bottleneck theory to exclude class-irrelevant features and combines it with causal learning to reduce spurious and noisy attention on causal class-specific features, making it among the first works to consider this combination for this task.
    2. Experiments demonstrate the superiority of the proposed method compared to SOTA baselines. Ablation studies also confirm the effectiveness of the individual components introduced in the model.
    3. The visual analysis in the paper provides an intuitive demonstration of the method’s superiority. Additionally, the class-specific attention generated by the method offers better interpretability for medical diagnosis.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The notation in the method section of the paper is somewhat confusing. The dimensions of the matrices should be explicitly stated, and the meanings of symbols used in different contexts need to be clarified. For example, the meaning of the symbol “C” in the dimension of Z_p on page 5 is not explained. These unclear writing conventions significantly increase the difficulty of reading.
    2. The implementation of the do-calculus mentioned in Equation 2 for causal intervention in the subsequent stages of the model is not clearly explained in the paper. Additionally, this aspect is not explicitly indicated in the framework diagram.
    3. The combination of Variational Information Bottleneck and causal learning is effective, but the methods used are largely based on existing techniques, with the novelty primarily lying in their integration.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see above.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    Based on the concept that class-specific attention is a mixture of causal, spurious, and noisy factors, the paper introduces a novel information bottleneck-based causal attention that filters out class-irrelevant information through Gaussian mixture multi-label spatial attention. Experiments illustrate the effectiveness of the proposed method.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This work is the first to introduce the information bottleneck framework into causal learning for multi-label classification, offering a novel perspective on disentangling causal and non-causal factors in attention mechanisms.
    • The proposed method enhances both interpretability and predictive performance.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The authors do not provide any information about the availability of the code, which limits the reproducibility of the results and the potential for follow-up work by the research community.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method demonstrates a significant improvement in performance, along with enhanced interpretability, highlighting its potential impact in advancing multi-label classification tasks.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes IBCA, a framework that combines Gaussian mixture variational information bottleneck (GM-VIB) and contrastive enhancement-based causal intervention (CECI) for medical image MLC. Key contributions include:

    • GM-VIB: A novel approach to model class-specific spatial attention using GMM, dynamically filtering class-irrelevant noise while preserving discriminative features.
    • CECI: A causal intervention mechanism that aligns multi-head attention with GM-VIB through contrastive learning, mitigating spurious correlations and noise.
    • Structural Causal Model (SCM): The first integration of information bottleneck principles with causal inference in medical MLC, explicitly decomposing attention into causal, spurious, and noisy components.

    Experiments on Endo and MuReD datasets demonstrate state-of-the-art performance, with improvements of up to 7.72% in overall recall (OR) and 5.02% in mAP.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Novel Formulation: The fusion of GMM with variational information bottleneck (VIB) is innovative, addressing limitations of single-Gaussian priors in MLC. The contrastive alignment between causal attention and GM-VIB is also original.
    • Strong Evaluation: Comprehensive experiments on two datasets show consistent outperformance over SOTA methods (e.g., 73.43% mAP on MuReD). Ablation studies validate the necessity of each component (GM-VIB, CECI).
    • Interpretability: Visualizations (Fig. 3) and t-SNE analyses demonstrate clearer class-specific feature separation and more precise lesion localization compared to baselines like IDA and ML-C.
    • Clinical Feasibility: The framework’s ability to filter anatomical noise and highlight diagnostically relevant regions aligns with real-world clinical needs.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Multi-Region Attention Limitation: The current GMM design assigns one Gaussian component per class, which may struggle with multiple disjoint lesions (e.g., polyps in separate colon regions). Extending to multiple components per class (K>1) could address this.
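To illustrate the limitation raised above, the following is a minimal sketch of a class-specific spatial attention map built from K Gaussian components. It is not the paper's GM-VIB: the isotropic Gaussians on a pixel grid, the `(weight, mean, sigma)` parameterisation, and all function names are assumptions made for illustration. With K = 1 the map can only peak around one location, while K = 2 can cover two disjoint regions.

```python
import math


def gaussian_map(height, width, mean, sigma):
    """Evaluate an unnormalised isotropic 2D Gaussian on an H x W grid."""
    my, mx = mean
    return [
        [math.exp(-((y - my) ** 2 + (x - mx) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def class_attention(height, width, components):
    """Mix K components, each (weight, mean, sigma), into one attention map.

    Weights are renormalised to sum to 1, so the map stays within [0, 1].
    """
    total = sum(w for w, _, _ in components)
    attention = [[0.0] * width for _ in range(height)]
    for w, mean, sigma in components:
        g = gaussian_map(height, width, mean, sigma)
        for y in range(height):
            for x in range(width):
                attention[y][x] += (w / total) * g[y][x]
    return attention


# Hypothetical lesion centres at (2, 2) and (5, 6) on an 8 x 8 feature map.
single = class_attention(8, 8, [(1.0, (2, 2), 1.0)])
double = class_attention(8, 8, [(0.5, (2, 2), 1.0), (0.5, (5, 6), 1.0)])

print(round(single[5][6], 4))  # second region is essentially ignored
print(round(double[5][6], 4))  # second region receives attention
```

The trade-off in this sketch is that splitting the weight across components lowers each peak, so a multi-component design would also need a way to choose K or the mixing weights per class.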

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper makes a significant contribution by effectively modeling multi-label classification tasks through the novel integration of Gaussian Mixture Models (GMM) and causal analysis. This approach breaks through the limitations of traditional frameworks, which often struggle with class-irrelevant noise and spurious correlations in medical image analysis. By combining GMM-based spatial attention with causal intervention, the method not only achieves superior performance (e.g., 5.02% mAP improvement on MuReD) but also greatly enhances interpretability during training. The clear visualization of class-specific attention maps and t-SNE analysis of feature distributions provide strong evidence for the model’s ability to discriminate and localize multiple diseases, making it both technically rigorous and clinically persuasive.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

Thank you for your valuable comments on our work. Special thanks to R2 and R3 for accepting the paper directly.

  1. Novelty (R1: “a novel framework for multi-label classification in medical image”; R2: “a novel perspective on disentangling causal and non-causal factors in attention mechanisms”; R3: “the first integration of information bottleneck principles with causal inference in medical MLC”)
  2. Sufficient experiments (R1 & R2 & R3: “superiority of the proposed method compared to SOTA baselines”)
     Q&A: Code will be released publicly for reproducibility, along with detailed implementation.
  3. Clarity and organization (R2 & R3: “Good”)
     Q3: Unclear explanation of the implementation of do-calculus in Eq. 2 (R1).
     A3: Eq. 2 is implemented by Sec. 2.3, Contrastive Enhancement-based Causal Intervention, which applies multiple sampling of spatial attentions via MHFS for causal intervention. Specifically, $\text{Sigmoid}(\text{Clf}(a_k^n x^n))$ is the final prediction of each spatial attention sample $a_k^n$ derived from $A_l$. $a_k^n x^n$ denotes the class-specific features from $Z_t$. The term $\frac{P(a_k^n)}{P(a_k^n \mid c)}$ indicates the weight of each spatial attention sample, which is set to $1/N$ due to uniform sampling in the multi-head attention mechanism. We will clarify this in the camera-ready version.
     Q4: Novelty primarily lies in the combination of VIB and causal learning (R1).
     A4: While the integration of VIB and causal attention is one component, our contribution is not a simple combination. Existing VIB applies a spherical Gaussian prior to latent class-specific features, whose uniformity reduces the discriminability of multi-label attentions in MLC tasks (see 3rd paragraph of Sec. 1). We propose a Gaussian Mixture VIB to better filter out class-irrelevant information and enhance class-specific attention. These attentions are further used in our MHFS-based Causal Intervention, which enforces multi-head attentions to be both discriminative and causal. This integrated framework is non-trivial and leads to significant performance enhancement.
     Q&A4: The symbol C in the dimension of Z_p on page 5 denotes the number of classes $N_c$; we will correct this typo.
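The weighting described in A3 can be sketched as an average of per-sample predictions: each of the N attention samples yields Sigmoid(Clf(a x)), and each gets the uniform weight 1/N. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the attention-weighted pooling, the single-class linear classifier standing in for Clf, and all names and toy values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def attended_feature(attention, features):
    """Attention-weighted pooling over spatial positions (assumed form of a x)."""
    dim = len(features[0])
    return [sum(a * f[d] for a, f in zip(attention, features)) for d in range(dim)]


def classifier_logit(feature, weights, bias):
    """Linear classifier for one class (a stand-in for Clf)."""
    return sum(w * f for w, f in zip(weights, feature)) + bias


def intervened_probability(attention_samples, features, weights, bias):
    """Average Sigmoid(Clf(a x)) over N attention samples with weight 1/N each."""
    n = len(attention_samples)
    return sum(
        sigmoid(classifier_logit(attended_feature(a, features), weights, bias))
        for a in attention_samples
    ) / n


# Toy setup: 2 spatial positions, 2 feature channels, 2 attention heads.
features = [[1.0, 0.0], [0.0, 1.0]]
attention_samples = [[1.0, 0.0], [0.0, 1.0]]  # each head attends to one position
p = intervened_probability(attention_samples, features, weights=[1.0, -1.0], bias=0.0)
print(round(p, 6))
```

Because every sample carries the same weight, a head that attends to a spurious region pulls the averaged prediction toward its own output, which is why aligning the heads with class-specific attention matters in this formulation.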




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


