Abstract

Medical image segmentation is essential for diagnosis and treatment planning; however, fully supervised deep learning methods require expensive pixel-level annotations. Weakly supervised semantic segmentation (WSSS) using class activation mapping (CAM) reduces this burden by utilizing image-level labels. While binary CAM has shown promising results, multiclass CAM remains under-explored and suffers from reduced accuracy due to weak localization signals. To address this, we propose a novel approach that improves multiclass WSSS by leveraging binary CAM to guide multiclass CAM, enhancing feature representation, inter-class boundary segmentation and prediction accuracy. Additionally, we introduce a novel inter-class separability loss and an agreement loss designed to enhance multiclass CAM learning by enforcing spatial consistency and class separability. Experimental results on brain tumor segmentation (BraTS) datasets demonstrate that our approach significantly enhances multiclass weakly supervised segmentation accuracy, outperforming existing methods. Our code is available at https://github.com/Vivek-Dhamale/WSS-Interclass-Sep.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1332_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/Vivek-Dhamale/WSS-Interclass-Sep

Link to the Dataset(s)

https://www.med.upenn.edu/cbica/brats2020/data.html

BibTex

@InProceedings{DhaViv_Interclass_MICCAI2025,
        author = { Dhamale, Vivek and Sundaresan, Vaanathi},
        title = { { Inter-class separability loss for weakly supervised mutually exclusive multiclass segmentation of brain tumor lesions } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},
        pages = {246 -- 256}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    To address the limitations of CAM in generating mutually exclusive activations, the authors propose an enhanced multiclass CAM approach that leverages binary CAMs as guidance to improve class-specific localization and the overall quality of multiclass CAMs. Additionally, two novel loss functions—inter-class separability loss and agreement loss—are introduced to enforce stronger inter-class discrimination and ensure spatial consistency between binary and multiclass CAMs.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposes a meaningful strategy by leveraging binary CAMs to guide the generation of multiclass CAMs, while employing two loss functions to enforce consistency between them. This approach is both theoretically sound and practically valuable.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Although the proposed method is effective, similar approaches have already been explored in recent literature—for instance, the work titled “WS-MTST: Weakly Supervised Multi-Label Brain Tumor Segmentation With Transformers” also addresses weakly supervised multiclass segmentation. This overlap somewhat diminishes the novelty of the contribution. Moreover, including more recent CAM-based methods (from the past two years) in the comparative evaluation would strengthen the evidence supporting the proposed approach. Lastly, the use of only a single dataset (BraTS) limits the generalizability of the findings; validating the method on additional segmentation datasets would significantly enhance its credibility and practical value.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method in this paper is effective and feasible, and the overall writing is well-structured with clear logic. However, the work presents limited novelty, and the experimental validation is insufficient, lacking broader evaluation across datasets and comparative methods.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes an extension to CAM-based weakly supervised semantic segmentation to make it suitable for multi-class segmentation. This consists of the addition of a loss component that enforces class separation in CAM maps and one that promotes consistency between the binary CAM and the individual class maps. Furthermore, they adapt the architecture proposed for single-class CAM to enable accurate multi-class segmentation. They compare their method to several other baselines and do an ablation study for each of the added components.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is very well written and fairly easy to follow along. The paper is well structured, presents the contributions, methods and results in a coherent way.

    2. The paper presents novel loss components and architectural changes that enable multi-class CAM-based weak supervised segmentation. This would be of interest to the MICCAI community.

    3. The paper compares the results to four different previous methods, showing good comparison of the results to previous baselines.

    4. The paper contains an ablation study demonstrating the effect of each of the loss components, showing that each component adds to the final performance.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The evaluation is limited to a single dataset with only two different classes (+ background). It would strengthen the paper to evaluate on more datasets, ideally with more than two classes to demonstrate whether the proposed method is also effective for a higher number of classes. Even if this is not the case, it would be interesting to see what the limits of the method are (i.e. how many classes can it handle).

    2. A few implementation details are missing: On page 5 a projection network is mentioned, but I cannot find any further reference in the paper to this network. It would be helpful to state somewhere in the paper what this network consists of, or to cite previous work. The weights of the losses (lambda_c, sep and agree) are also introduced, but their values are not mentioned anywhere in the text. These would be good to add to the implementation details.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Minor points:

    1. Page 4: “the attention network A(.), …. “, this is mentioned without any previous reference to A(.) or attention network. I would recommend rephrasing to make the connection to previous work clearer. For example, something like: “the attention network (as proposed in [8]), defined by A(.) … “

    2. Page 4, bottom of page: the “class-specific separation loss”. To understand L_pos and L_neg, the reader currently has to go to the citation to read up on these loss components. I appreciate not rewriting the whole loss definition here due to space constraints, but it would be helpful for the reader to add one or two sentences on what these losses intuitively do.

    3. Page 4: I would rename L_c to L_class (to make clear that there is not a loss for each class) and f and b as superscript for the two L_pos losses to stay consistent with the paper defining them.

    4. Page 4: Last sentence before the Experiment Setup “Once the CAMs are obtained …”, I assume this refers to Equation 3, it would be helpful to refer to that equation here.

    5. It would be great to see significance testing for table 1, comparing the proposed method to the best previous method (AME-CAM).

    6. I find the predicted masks in Fig. 2 quite hard to interpret with the current coloring, I would recommend exploring whether there are ways to visualise this in a better way (i.e. outlines, having the ground-truth in a separate column).

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-written and contains a clear novel contribution that will be interesting to the MICCAI community. The results contain comparisons to several baselines and ablation studies demonstrating the effectiveness of each of the proposed loss components.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents an approach for multiclass weakly supervised semantic segmentation (WSSS) that leverages binary Class Activation Mapping (CAM) to guide multiclass CAM generation. The authors specifically address the challenge of overlapping activations in co-occurring, spatially adjacent classes in medical imaging. The key contributions include: (1) a multiclass CAM approach with binary CAM guidance, (2) a novel inter-class separability loss designed to reduce overlap between different foreground classes, and (3) an agreement loss that ensures consistency between binary and multiclass CAMs. The method is evaluated on brain tumor segmentation (BraTS) data, demonstrating improved segmentation of tumor core and edema regions compared to existing CAM-based techniques.
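
The two loss terms summarized above can be illustrated with a minimal NumPy sketch. The exact formulations are not reproduced on this page, so both functions below are assumptions chosen only to convey the stated intent: `separability_loss` penalizes pixel-wise overlap between foreground class CAMs, and `agreement_loss` penalizes disagreement between the union of the class CAMs and the binary CAM. The function names and the choice of product overlap and squared error are hypothetical, not the authors' definitions.

```python
import numpy as np


def separability_loss(class_cams):
    """Hypothetical overlap penalty: mean pairwise product of class CAMs.

    class_cams: array of shape (C, H, W) with values in [0, 1].
    Returns 0.0 when no two classes are active at the same pixel.
    """
    class_cams = np.asarray(class_cams, dtype=float)
    num_classes = class_cams.shape[0]
    total, pairs = 0.0, 0
    for i in range(num_classes):
        for j in range(i + 1, num_classes):
            total += float(np.mean(class_cams[i] * class_cams[j]))
            pairs += 1
    return total / pairs if pairs else 0.0


def agreement_loss(class_cams, binary_cam):
    """Hypothetical consistency penalty between class CAMs and the binary CAM.

    The union of the class CAMs is taken as the pixel-wise maximum (an
    assumption) and compared to the binary CAM with mean squared error.
    """
    class_cams = np.asarray(class_cams, dtype=float)
    binary_cam = np.asarray(binary_cam, dtype=float)
    union = class_cams.max(axis=0)
    return float(np.mean((union - binary_cam) ** 2))


# Two toy 2x2 CAMs (e.g. tumor core and edema) that do not overlap.
core = np.array([[1.0, 0.0], [0.0, 0.0]])
edema = np.array([[0.0, 1.0], [0.0, 0.0]])
foreground = np.array([[1.0, 1.0], [0.0, 0.0]])

print(separability_loss(np.stack([core, edema])))
print(agreement_loss(np.stack([core, edema]), foreground))
```

In this toy case both terms are zero, since the class maps are disjoint and together cover exactly the binary foreground; overlapping maps or a mismatch with the foreground would make them positive.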

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The binary CAM guidance for multiclass CAM is innovative and effectively addresses the challenge of class overlap in weakly supervised segmentation.
    • The inter-class separability loss specifically targets the multiclass problem, while the agreement loss ensures coherence between binary and class-specific segmentations.
    • The method shows significant improvements over state-of-the-art CAM-based techniques (GradCAM, LayerCAM, ScoreCAM, and AME-CAM) across multiple metrics (Dice, IoU, HD95) for both tumor core and edema segmentation.
    • The stepwise evaluation of each loss component provides clear insights into their individual contributions, with statistical significance testing supporting the findings.
    • Unlike AME-CAM which requires separate models for each class, the proposed method is more scalable, using only two multi-exit classifiers (multiclass and binary) with minimal additional convolution layers for extra classes.
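
For readers less familiar with the overlap metrics named in the strengths above, the following is a minimal sketch of Dice and IoU for a single binary mask, using their standard definitions. The function name `dice_and_iou` and the convention of returning 1.0 when both masks are empty are assumptions for illustration; HD95 is omitted, and this is not the evaluation code used in the paper.

```python
import numpy as np


def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2.0 * intersection / total if total else 1.0
    iou = intersection / union if union else 1.0
    return float(dice), float(iou)


# Toy example: the prediction covers two pixels, the reference covers one.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(dice_and_iou(pred, target))
```

For a multiclass result such as tumor core and edema, the function would be applied once per class to that class's binary mask.
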
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The paper demonstrates effectiveness on a two-class foreground problem (core and edema), but doesn’t discuss how the approach would scale to scenarios with many more classes, where the binary guidance approach might become less effective.

    While mentioned as more efficient than AME-CAM, a detailed analysis of training time, inference time, and memory requirements would provide more concrete evidence of computational advantages.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents a significant contribution to weakly supervised multiclass segmentation in medical imaging. The proposed method effectively addresses the challenge of overlapping activations in co-occurring, spatially adjacent classes - a common issue in medical image segmentation. The novel loss functions and binary CAM guidance approach are well-motivated and technically sound, with comprehensive experiments demonstrating clear improvements over existing methods.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

Although WS-MTST also addresses weakly supervised multiclass segmentation, its aggregation loss encourages spatial compactness, which may cause the predicted regions of each category to cluster together. This assumption may not hold in datasets where instances of the same class appear as multiple disjoint regions. In contrast, the main novelty of our method is that it multiplies each class-specific CAM with the binary CAM, ensuring that the multiclass localization is guided within the binary foreground region. This strategy, combined with the agreement and inter-class separability losses, avoids forced spatial compactness and can potentially represent multiple disconnected regions of the same class, making it more suitable for multi-object scenarios.
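
The multiplication step described in the feedback above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the released implementation: the function names, the assumption that CAMs are normalized to [0, 1], and the 0.5 threshold used to turn guided CAMs into a label map are all hypothetical.

```python
import numpy as np


def guide_multiclass_cams(class_cams, binary_cam):
    """Multiply each class-specific CAM by the binary (foreground) CAM.

    class_cams: shape (C, H, W), values assumed to lie in [0, 1].
    binary_cam: shape (H, W), values assumed to lie in [0, 1].
    """
    class_cams = np.asarray(class_cams, dtype=float)
    binary_cam = np.asarray(binary_cam, dtype=float)
    return class_cams * binary_cam[None, :, :]


def cams_to_labels(guided_cams, threshold=0.5):
    """Assign each pixel to its strongest class, or to background (0).

    The threshold is a hypothetical choice for this sketch.
    """
    strongest = guided_cams.argmax(axis=0) + 1  # classes numbered from 1
    strength = guided_cams.max(axis=0)
    return np.where(strength >= threshold, strongest, 0)


class_cams = np.array([
    [[0.9, 0.2], [0.8, 0.1]],    # class 1
    [[0.1, 0.9], [0.7, 0.95]],   # class 2
])
binary_cam = np.array([[1.0, 1.0], [0.0, 0.9]])

guided = guide_multiclass_cams(class_cams, binary_cam)
print(cams_to_labels(guided))
```

Because the guidance is applied per pixel, a class can remain active in several disconnected foreground regions, while activations outside the binary foreground (the lower-left pixel here) are suppressed.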




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    The reviewers acknowledged the novelty and good ablation studies in this work, and they also pointed out that the approach is both theoretically sound and practically valuable. The paper is well written. Two reviewers suggest accept. The main concerns of the remaining reviewer, who suggested weak reject, are that the overlap with existing work is not well clarified and that the experiments use a single dataset. These issues are minor, as the novelty of this work is obvious in the figure and method description. Though the experiment used a single dataset, the authors provided detailed results and visualization to show the effectiveness, making the results convincing. The authors could consider using multiple datasets for validation when extending this work to a journal version.


