Abstract
Existing semi-supervised learning methods have been used to leverage the vast amounts of unlabeled data available in several medical imaging tasks. These methods operate under the assumption that the classes present in the unlabeled data are exactly the same as those in the labeled data. However, in real-world medical scenarios, the unlabeled dataset often contains novel categories not found in the labeled dataset. To address this problem, we present MedGCD (Generalized Category Discovery for Medical Images), which recognizes the categories observed in the labeled data and additionally clusters novel categories found in the unlabeled data. Specifically, MedGCD introduces a dual stream of strong views in a weak-to-strong framework, coupled with a confidence-aware pairwise objective for discovering novel categories. The dual views enable the model to extract superior features from the unlabeled data, while the confidence-aware pairwise objective facilitates the selection of reliable samples, leading to effective grouping of novel categories. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed model in discovering novel categories while maintaining consistent performance on seen categories, with improvements in novel-category accuracy ranging from 4% to 15% and overall accuracy improvements of 2% to 8%.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2995_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/Chandan-IITI/MedGCD
Link to the Dataset(s)
BibTeX
@InProceedings{DasAnk_MedGCD_MICCAI2025,
author = { Das, Ankit and Gautam, Chandan and Agrawal, Pritee and Yang, Feng and Liu, Yong and Savitha, Ramasamy},
title = { { MedGCD: Generalized Category Discovery in Medical Imaging } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15965},
month = {September},
pages = {435--445}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper tackles a critical gap in medical image analysis: semi-supervised learning (SSL) under class-mismatched conditions, where unlabeled data may contain novel categories absent in labeled data. This is a realistic but underexplored scenario in medical imaging (e.g., rare diseases or atypical pathologies). The work extends Generalized Category Discovery (GCD) to the medical domain
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed MedGCD framework introduces several key innovations to address generalized category discovery in medical imaging. At its core is a novel weak-to-strong framework employing dual strong views, where weak augmentations guide two distinct strongly augmented views of the data. This dual-stream perturbation strategy, inspired by contrastive learning but specifically adapted for GCD, enhances feature diversity while reducing bias by capturing complementary aspects of medical images (e.g., texture versus structural features). A critical component is the confidence-aware pairwise objective, which prioritizes high-confidence sample pairs for clustering - particularly crucial in medical data where subtle inter-class similarities (such as between tumor subtypes) can easily confuse models.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The proposed MedGCD framework suffers from several critical limitations that undermine its claims of novelty and clinical applicability. Methodologically, the work appears to be an incremental combination of existing techniques rather than a substantive advance: (1) The “weak-to-strong” augmentation paradigm directly borrows from established SSL methods (e.g., FixMatch) without novel adaptation for GCD, while the dual-stream mechanism merely extends contrastive learning principles without theoretical innovation; (2) The confidence-aware pairwise objective functionally resembles pseudo-label filtering in SSL, yet the paper fails to demonstrate how it meaningfully improves upon prior work (e.g., via ablation studies or gradient analysis). Empirically, the evaluation is compromised by outdated comparisons (omitting 2023 SOTA like SimGCD or PCR) and lacks parameter sensitivity analysis – a critical omission given that augmentation intensity and confidence thresholds could disproportionately impact medical feature extraction and rare-class discovery.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
See the weaknesses listed above.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
Aligning dual strong augmentations with shared weak views is claimed to reduce confirmation bias and improve clustering of ambiguous medical categories without negative sampling. However, the reduction in confirmation bias is not evident, and there is a lack of supporting evidence. Where is the confidence score from FixMatch reflected in your method? Moreover, SimGCD does not rely on natural image pretraining and can also be applied to medical imaging tasks.
Review #2
- Please describe the contribution of the paper
The paper proposed a new method, MedGCD, for generalized category discovery in medical images. The authors provided an in-depth analysis of various components that, when combined, achieved new top scores on three datasets. Furthermore, the analysis itself could be useful in many other research areas.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Major strengths of the paper are:
1.) A novel framework built mostly from known components that is properly evaluated and well organized, achieving top scores on three separate datasets.
2.) An ablation study that thoroughly evaluates the influence of different parameters and components, leaving little room for unanswered questions regarding the experiment.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Main weaknesses of the paper are:
1.) The selected datasets are versatile but relatively trivial (28×28 pixel images). While this makes the conducted experiments valid, it may also hide the true capabilities/limitations of the proposed method. The real question is: how would the proposed method scale on larger datasets? More precisely, what are its limitations? For instance, ResNet18 is an established neural network and is more than sufficient for 28×28 images, but what if we had a serious X-ray dataset with 256×256 images—how would the proposed method scale? Additionally, one of the limitations is that the number of labels must be known in advance (even for unlabeled datasets). This issue could potentially be overcome by testing different numbers of clusters or by visually inspecting t-SNE-reduced embeddings, but it needs to be discussed.
2.) Based on the provided descriptions, I sometimes found it hard to follow exactly how the method is implemented. For example, is the number of classes in the supervised part of the algorithm set to the total number of labels in the dataset, or only to the number of seen labels? How is the novel accuracy measured? If the algorithm produces a cluster of unseen labels, how is that cluster paired with a real class?
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Although I am aware of the text limitations due to the template, more details regarding prediction, evaluation, and method limitations should be provided to eliminate any misinterpretations. Based on the provided information, the process for obtaining the final prediction remains somewhat unclear.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I believe that the proposed method has good potential. It demonstrates novelty and properly evaluates its components, and this component evaluation could stimulate engaging discussions at the conference.
However, the method still requires further polishing and clarification so that readers have no doubts regarding its implementation and design. Additionally, it remains unclear how well the proposed method will scale to more demanding datasets.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors addressed all my concerns: (i) they have provided limitations; (ii) they explained that, due to the manuscript length limits, they cannot provide additional training details, but that this information will be provided in the GitHub repository. This seems like a valid solution. Hence, I recommend accept.
Seeing the other reviewers' comments, I have to point out that there is a valid concern raised by R3 regarding comparison with SOTA models (SimGCD or PCR). I wish the authors had provided a bit more detail about it in their response. But all in all, I think this paper deserves to be published.
Review #3
- Please describe the contribution of the paper
MedGCD enhances medical imaging category discovery using dual views and confidence-aware clustering.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel category discovery is a real-world problem.
- Unlike single strong-view methods (e.g., FixMatch), MedGCD uses two strong augmentations with a shared weak view.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Details of the margin m are deferred to [5], but searching for the term “margin” in [5] yields nothing (except in the references).
- Eq. 7 is the log of the dot product between two probability vectors. This is likely extremely small (or even zero) most of the time, and it lacks a clear probabilistic interpretation (why not cross-entropy on hard labels?).
- Eq. 5 seems to be missing from the overall objective in Eq. 1.
- Single-run results: no variance, no statistical tests.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Novelty is questionable, but its application is meaningful. Statistical test is missing.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
Thank you for the rebuttal. All of my concerns have been thoroughly answered.
Author Feedback
We thank the Reviewers R1, R2, and R3 for their valuable feedback. All suggested changes will be included in the final draft.
28×28 Resolution and Limitations (R1): We used ResNet-18 to suit the 28×28 resolution of the benchmark datasets, but our method is backbone-agnostic and extendable to deeper models like ResNet-50. Our aim is to establish a medical GCD benchmark, but we agree that high-resolution scalability is important and plan to explore it. Limitations: (i) the proposed method, like most existing methods, requires knowing the number of unknown classes in advance (as noted by the reviewer); (ii) it is not suitable for cross-domain generalization. Both will be addressed in future work. As suggested by the reviewer, the first limitation could be mitigated using t-SNE or by discarding very small or empty clusters.
Implementation and Evaluation (R1): Due to space constraints, detailed implementation wasn’t included, but we will provide full code and documentation upon acceptance. We don’t use explicit clustering; instead, our CAPO loss transforms cluster learning into pairwise similarity prediction, aligning known labels and assigning novel ones to remaining output nodes. Evaluation follows the GCD (CVPR 2022) setup, and we also report NMI. While we understand the reviewer’s concern, full details will be shared with the code
Margin ‘m’ (R2): Standard cross-entropy leads to faster convergence and tighter clusters for seen classes, causing novel samples to be misclassified. To address this, we introduce an adaptive margin m, initially large to increase seen-class variance and gradually reduced as novel clusters stabilize. This balances the learning pace and improves pseudo-label reliability. The margin is computed as m = 1 - mean(softmax confidence) on unlabeled samples.
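The margin formula in this reply can be sketched in a few lines. This is only our reading of the description, not the authors' released code; function names such as `adaptive_margin` are illustrative.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def adaptive_margin(unlabeled_logits):
    # m = 1 - mean(max softmax confidence) over the unlabeled batch:
    # large early in training (low confidence), shrinking as the model
    # becomes confident and the novel clusters stabilize.
    confidence = softmax(unlabeled_logits).max(axis=-1)
    return 1.0 - confidence.mean()
```

On a batch of uniform logits over C classes the mean confidence is 1/C, so m starts near 1 - 1/C for an untrained model and decays toward 0 as predictions sharpen.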
Small dot product and CE on hard labels (R2): For labeled classes, we use standard cross-entropy with hard labels (Eq. 2). Eq. 7 computes the log dot product between probability vectors to softly align predictions across high-confidence pairs. While dot products can be small, we apply this only to highly similar pairs—selected via 1-NN retrieval and a threshold θ on cosine similarity—ensuring stable and meaningful alignment.
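A toy sketch of the pairwise term and the pair selection as we understand them from this reply (names and details are our assumptions, not the paper's code):

```python
import numpy as np

def pairwise_log_dot_loss(p, q, eps=1e-8):
    # -log(p . q): near zero when the two probability vectors agree on
    # the dominant class, large when they place mass on disjoint classes.
    return -np.log(np.dot(p, q) + eps)

def select_confident_pairs(features, theta=0.9):
    # 1-NN retrieval under cosine similarity with threshold theta,
    # mirroring the pair-selection rule described in the reply.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)   # exclude self-pairs
    nn = sim.argmax(axis=1)          # 1-NN index per sample
    keep = sim[np.arange(len(f)), nn] >= theta
    return [(int(i), int(nn[i])) for i in np.where(keep)[0]]
```

Restricting the loss to pairs that survive the cosine threshold keeps the dot product away from the near-zero regime the reviewer worried about.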
Clarification on use of Eq. 5, Single‐run results and statistical tests (R2): Equation 5 is not an extra loss to optimize but a theoretical explanation showing that our dual-view objective (Eq. 4) implicitly follows contrastive learning principles, aligning strong augmentations like InfoNCE. Results are averaged over three runs; standard deviations and Z-tests will be included in the final version
Difference from weak-to-strong augmentation in SSL (R3): Though inspired by SSL, our weak-to-strong augmentation is uniquely adapted for GCD. Unlike closed-world SSL, MedGCD aligns dual strong augmentations to a shared weak view, enabling consistency across both seen and novel categories. Unlike standard contrastive setups, our dual-stream design reduces confirmation bias and improves clustering of ambiguous medical classes without negative sampling, making it novel and effective for GCD in medical imaging.
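A minimal sketch of how we read the dual-stream alignment, with the weak-view prediction treated as a fixed (stop-gradient) target; the function names are illustrative, not taken from the paper:

```python
import numpy as np

def soft_cross_entropy(target, pred, eps=1e-8):
    # Mean cross-entropy between a soft target and a prediction.
    return -(target * np.log(pred + eps)).sum(axis=-1).mean()

def dual_stream_loss(p_weak, p_strong1, p_strong2):
    # Both strongly augmented views are pulled toward the shared
    # weak-view prediction, so no negative sampling is needed.
    return 0.5 * (soft_cross_entropy(p_weak, p_strong1)
                  + soft_cross_entropy(p_weak, p_strong2))
```

The loss vanishes when both strong views already match the weak-view target, and grows as either strong view drifts away from it.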
CAPO loss vs pseudo-label filtering and SOTA comparison (R3): While CAPO superficially resembles pseudo-label filtering in SSL, it is specifically designed for the GCD (open-world) setting. Unlike closed-world SSL, where filtering avoids low-confidence errors, CAPO regulates the learning pace between seen and unseen classes, which lack direct supervision. By reinforcing only high-confidence, structurally consistent pairs, it stabilizes training and supports robust clustering under asymmetric supervision. CAPO reframes clustering as a pairwise similarity task and, as shown in Table 2, outperforms SSL-style filtering (FixMatch), which struggles with unreliable confidence scores on unseen classes. We will include comparisons with 2024 methods. We obtained results for SimGCD; it underperforms on medical images due to its reliance on natural-image pretraining.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper received two accept and one reject recommendations after the rebuttal. The main remaining weakness is the lack of sufficient comparison with recent state-of-the-art methods such as SimGCD, as pointed out by one reviewer. However, the other concerns were well addressed in the rebuttal, and two reviewers expressed confidence in the paper’s contribution and clarity. Upon reviewing the manuscript and the responses, I agree that it is suitable for publication. In the camera-ready version, the authors should include a more detailed discussion of the method’s limitations and address the concern regarding comparison with recent SOTA methods like SimGCD and PCR, as raised by Reviewer 3.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The key designs, including dual views and confidence-aware designs, are appreciated by the reviewers. Therefore, an acceptance is given.