Abstract

Few-Shot Medical Image Segmentation (FSMIS) aims to segment novel classes of medical objects using only a few labeled images. Prototype-based methods have made significant progress in addressing FSMIS. However, they typically generate a single global prototype for the support image to match with the query image, overlooking intra-class variations. To address this issue, we propose a Self-guided Prototype Enhancement Network (SPENet). Specifically, we introduce a Multi-level Prototype Generation (MPG) module, which enables multi-granularity measurement between the support and query images by simultaneously generating a global prototype and an adaptive number of local prototypes. Additionally, we observe that not all local prototypes in the support image are beneficial for matching, especially when there are substantial discrepancies between the support and query images. To alleviate this issue, we propose a Query-guided Local Prototype Enhancement (QLPE) module, which adaptively refines support prototypes by incorporating guidance from the query image, thus mitigating the negative effects of such discrepancies. Extensive experiments on three public medical datasets demonstrate that SPENet outperforms existing state-of-the-art methods.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0184_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{FanCha_SPENet_MICCAI2025,
        author = { Fan, Chao and Jia, Xibin and Xiao, Anqi and Yu, Hongyuan and Yang, Zhenghan and Yang, Dawei and Xu, Hui and Huang, Yan and Wang, Liang},
        title = { { SPENet: Self-guided Prototype Enhancement Network for Few-shot Medical Image Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15964},
        month = {September},
        pages = {586 -- 595}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces SPENet, a framework for few-shot medical image segmentation with an MPG module that generates global and adaptive local prototypes, and a QLPE module that uses optimal transport to refine support prototypes based on query information, reducing interference from anomalous regions.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    N/A

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Limited Novelty: The idea of generating multiple local prototypes for few-shot segmentation has been extensively explored in prior studies [1,2], and the authors fail to compare their work with these studies. Additionally, the paper lacks sufficient experimental validation on whether the number of local prototypes should be fixed or dynamic. For instance, the paper does not specify the exact number of fixed local prototypes or provide experimental results for different fixed numbers. Also, the choice of starting $k_{max}$ at 12 in Fig. 3 is not well justified, with no explanation of whether optimal performance might lie within the 0-12 range.

    2. Unclear Description of ALPG Module: The description of how the ALPG module dynamically generates local prototypes is unclear. The clustering method appears to resemble superpixel segmentation, but the details are not adequately explained.

    3. Lack of Experiments Supporting QLPE Motivation: Many previous studies have shown that using only a global prototype for query image segmentation yields poor Dice scores (around 30%-50%) [1,2]. This raises questions about the reasonableness of the obtained $M_q^*$ and of the $p_q^l$ generated from it. Moreover, the paper lacks visual examples to illustrate the “anomalous regions exist in the target object of the support image but are absent in the query image.” There is also no information on the frequency of such cases in the three datasets, i.e., how often the support images have abnormal lesions while the query images show normal organs.

    [1] Cheng Z, Wang S, Xin T, et al. Few-shot medical image segmentation via generating multiple representative descriptors. IEEE Transactions on Medical Imaging, 2024.
    [2] Huang S, Xu T, Shen N, et al. Rethinking Few-Shot Medical Segmentation: A Vector Quantization View. In: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see the weakness part

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    Thanks to the authors for their detailed response. However, several of my primary concerns regarding the paper remain insufficiently addressed. I maintain that these omissions significantly hinder a comprehensive understanding of the proposed methodology, a thorough assessment of its contributions, and the potential for other researchers to reproduce the work. Specifically, lingering questions about key components affect the paper’s clarity and perceived robustness. For instance, the justification for the reasonableness of the initial mask prediction derived solely from the global prototype warrants more explicit, in-paper support, as this prediction is foundational to the subsequent generation of local prototypes. Furthermore, the precise mechanisms of local prototype generation (within the ALPG module, which the authors state uses Voronoi partitioning) are not detailed adequately in the paper itself. Finally, the core motivation for the QLPE module, particularly concerning its ability to handle specific discrepancies like “anomalous regions,” lacks convincing experimental validation across the three datasets employed, especially given the authors’ concession that such scenarios might not even be prevalent in these datasets.



Review #2

  • Please describe the contribution of the paper
    1. Multi-level Prototype Generation (MPG) Module: Traditional methods (e.g., ALPNet, DSPNet) use a single global prototype or a fixed number of local prototypes, failing to adapt to the varying sizes and shapes of medical objects (e.g., tumors, organs). This module simultaneously generates a global prototype for semantic consistency and adaptive local prototypes for fine-grained details. It also introduces Adaptive Local Prototype Generation (ALPG) to dynamically adjust the number of local prototypes based on target object size, overcoming the rigidity of fixed-grid/clustering approaches.
    2. Query-guided Local Prototype Enhancement (QLPE) Module: Existing methods ignore “anomalous” local prototypes (e.g., a tumor in the support image but absent in the query), leading to mismatches. This module leverages Optimal Transport (OT) to re-weight local prototypes by measuring their relevance to the query image and uses the Sinkhorn algorithm to compute prototype importance scores, suppressing irrelevant/abnormal regions.
    3. Comprehensive Validation: Experiments demonstrate state-of-the-art performance on three medical datasets (Abd-MRI, Abd-CT, Card-MRI), with significant DSC improvements (e.g., +1.21–7.08% over prior arts like RPT/PAMI).
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The Multi-level Prototype Generation (MPG) module generates an adaptive number of local prototypes based on target object size, unlike prior works (e.g., ALPNet’s fixed grids or DSPNet’s fixed clusters). This is the first method to explicitly link prototype granularity to anatomical scale in medical images. MPG’s adaptability is critical for medical images where lesion/organ sizes vary drastically (e.g., small tumors vs. large livers).

    2. The QLPE module uses optimal transport (OT) to reweight prototypes by modeling their pairwise relationships with query features. Prior works (e.g., CRAPNet) only use cosine similarity, which ignores contextual dependencies. OT’s entropy-regularized solution robustly handles anomalies (e.g., tumors in support but not query images). QLPE’s OT formulation provides a theoretically grounded way to filter noisy prototypes, advancing beyond heuristic attention mechanisms.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. In Equation (3), the calculation of the number of local prototypes k requires clarification on how the parameter Cs (number of pixels per local region) is determined. Is it related to image resolution or lesion size?

    2. In the QLPE module, why is the cost matrix (1-S) (complement of cosine similarity) chosen for optimal transport? A comparison with other distance metrics (e.g., Euclidean distance) would strengthen the justification. The entropy regularization parameter ϵ=0.1 in optimal transport—was this determined via grid search? Include sensitivity analysis for this parameter.

    3. Optimal transport may increase computational overhead. Include inference time comparisons (e.g., vs. ALPNet/DSPNet).

    4. Discuss SPENet’s adaptability to extremely small targets (e.g., lesions spanning only a few pixels). Could ALPG generate invalid prototypes in such cases? Consider adding visualizations of typical success/failure cases.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a thoughtful and technically sound approach to medical image segmentation by proposing two key modules: Multi-level Prototype Generation (MPG) and Query-guided Local Prototype Enhancement (QLPE). The MPG module offers an innovative mechanism to adaptively generate prototype granularity based on anatomical scale, which is particularly important in the medical imaging domain where target sizes vary significantly. This represents a meaningful step beyond fixed-grid or fixed-cluster strategies used in prior work. The QLPE module introduces an optimal transport formulation to reweight prototype importance, providing a principled alternative to conventional similarity-based or attention-based mechanisms. The use of entropy-regularized OT adds robustness in challenging scenarios, such as support-query mismatch. While some implementation and ablation details could be further clarified, the method’s conceptual novelty and its demonstrated effectiveness warrant a weak accept.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    To address the challenge of intra-class variations in prototype-based methods of few-shot medical image segmentation tasks, this paper proposes a self-guided prototype enhancement network which contains a multi-level prototype generation module to extract multi-granularity measurements with an adaptive number, and a query-guided local prototype enhancement module based on the Optimal Transport algorithm to improve the effectiveness of support.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • In Local Feature Generation (LFG), after clustering the features, it is novel that the average is extracted as a representative prototype. This paper also introduces a dynamic strategy based on the foreground to choose how many features are averaged.
    • To address the problem of overlooking relationships within pairs of local prototypes, this paper designs a method based on the Optimal Transport algorithm to optimize the global optimal objective.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Using the optimal transport algorithm to evaluate and support local prototypes in images for optimization adds an additional computational burden. The same is true for the process in Local Feature Generation (LFG). There may be bottlenecks in the application of larger scale data, which the authors do not discuss.
    • In the ablation study, the parameter k seems to have a large impact on the performance, suggesting that the model may lack in robustness.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The methods introduced in this paper are novel, but some of them are not explained in a clear format, for example, the pseudo-code representation. The writing is detailed but could be improved. As mentioned in the weaknesses, the proposed method is questionable in terms of efficiency and robustness, something that the authors do not explicitly mention.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I would like to thank the authors for their efforts in preparing the rebuttal. The explanation regarding the impact of k on the performance is reasonable and well-justified. My main concerns have been adequately addressed. Based on the revised assessment, I recommend Accept.




Author Feedback

Reviewer #1 Q1: (1) Why is the cost matrix (1-S) (complement of cosine similarity) chosen for Optimal Transport (OT)? (2) Was ϵ=0.1 in OT determined via grid search? A1: (1) Cosine similarity avoids scale issues and focuses on direction, making it generally more robust in high-dimensional feature spaces (e.g., prototypes). Therefore, using 1-S as the cost matrix is more suitable for prototype matching than Euclidean distance. (2) We experimented with different ϵ values via grid search and found that values around 0.1 yield better results.
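
The answer above can be illustrated with a minimal sketch of entropy-regularized optimal transport solved by Sinkhorn iterations, using 1-S as the cost and ϵ=0.1. This is not the authors' implementation: the uniform marginals, the iteration count `n_iters`, the function names, and the choice to score each support prototype by its transport-weighted similarity are all assumptions made for illustration. Plain Python lists are used to keep the example dependency-free.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def sinkhorn_scores(support, query, eps=0.1, n_iters=50):
    """Return (transport plan, one importance score per support prototype).

    Hypothetical helper: cost = 1 - cosine similarity, uniform marginals.
    """
    n, m = len(support), len(query)
    cost = [[1.0 - cosine(s, q) for q in query] for s in support]
    # Gibbs kernel of the entropy-regularized problem.
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    a = [1.0 / n] * n  # assumed uniform mass on support prototypes
    b = [1.0 / m] * m  # assumed uniform mass on query prototypes
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    # Assumed scoring rule: similarity weighted by transported mass.
    scores = [sum(plan[i][j] * (1.0 - cost[i][j]) for j in range(m))
              for i in range(n)]
    return plan, scores

# Third support prototype points away from every query prototype.
support = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [[1.0, 0.0], [0.0, 1.0]]
plan, scores = sinkhorn_scores(support, query)
print(scores)
```

In this toy case the third support prototype receives the lowest score, which is the kind of down-weighting the rebuttal attributes to QLPE for prototypes that have no counterpart in the query.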

Q2: Whether ALPG generates invalid prototypes for extremely small targets. A2: As shown in Eq. 3, when sum(Ms)/Cs < 1 (small targets), ALPG ensures at least one prototype for the foreground, i.e., the entire foreground itself, thus preventing the generation of invalid prototypes.

Q3: How the parameter Cs (Eq. 3) is determined. Is it related to image resolution or lesion size? A3: Cs is a hyperparameter, which we set to a fixed value of 50 in our experiments, independent of image size.
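
Taken together, A2 and A3 describe Eq. 3 as a clamp on the ratio of foreground area to Cs, bounded below by 1 and above by k_max. A minimal sketch follows; the rounding rule (floor) and the function name are assumptions, since the rebuttal does not state them, and k_max must be supplied by the caller.

```python
import math

def num_local_prototypes(mask, k_max, c_s=50):
    """Hypothetical reading of Eq. 3: clamp sum(Ms)/Cs to [1, k_max]."""
    area = sum(sum(row) for row in mask)  # sum(Ms): foreground pixel count
    k = math.floor(area / c_s)            # assumed rounding; Eq. 3 may differ
    return max(1, min(k, k_max))

small = [[1, 1, 1]]                       # 3 foreground pixels
medium = [[1] * 10 for _ in range(10)]    # 100 foreground pixels
large = [[1] * 100 for _ in range(100)]   # 10,000 foreground pixels
print(num_local_prototypes(small, k_max=12))   # 1
print(num_local_prototypes(medium, k_max=12))  # 2
print(num_local_prototypes(large, k_max=12))   # 12
```

The small-target case returns 1, matching the statement that the entire foreground serves as the single local prototype.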

Q4: Sensitivity analysis, inference time, visualizations, etc. A4: Due to space limits, we regret that the above content was not included. We will add it in the final version if the paper is accepted.

Reviewer #2 Q1: Limited novelty; no comparison with GMRD (TMI’24) and VQ (CVPR’23), which also explore multiple local prototypes. A1: We believe our paper differs significantly from GMRD and VQ in terms of novelty and motivation. GMRD generates a fixed number of local prototypes using a multi-layer perceptron, which lacks flexibility. VQ fuses grid-based and self-organized clustering (SOC)-based methods to generate local prototypes. However, it suffers from slow inference speed (a characteristic of SOC), a fixed number of prototypes, and a lack of consideration for intra-class variations between support and query images. In contrast, our SPENet adopts a faster Voronoi partition method to generate local prototypes, with the number of prototypes dynamically adjusted according to the object size. Moreover, we incorporate optimal transport to effectively mitigate intra-class variations between support and query images. For the results, our method outperforms GMRD on all three datasets. Although it is slightly inferior to VQ, it achieves faster inference speed, making it more suitable for clinical applications.

Q2: Reasonableness of the obtained M_q and the generated p_q^l. A2: In our experiment, we set a threshold (>0.7) to ensure the quality of M_q and the resulting p_q^l.
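
One plausible reading of this step is sketched below: a global prototype is obtained by masked average pooling over the support foreground (a common choice in prototype-based few-shot segmentation), and a coarse query mask keeps only locations whose cosine similarity to that prototype exceeds 0.7. The rebuttal does not say what quantity the threshold is applied to, so applying it to cosine similarity is an assumption, as are the function names.

```python
import math

def masked_average(features, mask):
    """Average the feature vectors at foreground positions (mask == 1)."""
    dim = len(features[0][0])
    total, count = [0.0] * dim, 0
    for feat_row, mask_row in zip(features, mask):
        for feat, fg in zip(feat_row, mask_row):
            if fg:
                count += 1
                for d in range(dim):
                    total[d] += feat[d]
    return [t / max(count, 1) for t in total]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def coarse_query_mask(query_features, prototype, threshold=0.7):
    """Assumed rule: keep query locations with similarity above the threshold."""
    return [[1 if cosine(feat, prototype) > threshold else 0 for feat in row]
            for row in query_features]

support_features = [[[1.0, 0.0], [0.0, 1.0]]]   # 1 x 2 grid, 2-D features
support_mask = [[1, 0]]
prototype = masked_average(support_features, support_mask)
query_features = [[[2.0, 0.0], [0.0, 3.0], [1.0, 2.0]]]
print(coarse_query_mask(query_features, prototype))  # [[1, 0, 0]]
```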

Q3: (1) Fixed or dynamic; (2) why k_max starts from 12; (3) visual examples of anomalous regions. A3: (1) Experiments show that dynamic prototype numbers outperform fixed ones, but due to space limits, we did not include this comparison. (2) Experiments show that k_max in 0-12 follows similar trends to 12-24, without optimal results. For brevity, we omitted the 0-12 part. (3) This situation occurs in clinical practice and reflects intra-class variations. It may not appear in the three public datasets. Sorry for any confusion caused by unclear wording.

Q4: Detailed description of how local prototypes are generated. A4: For efficiency, we use a Voronoi partition method. The paper briefly covers this in LPG, but details were omitted due to space limits.
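
Since the details are omitted, the following is only a generic sketch of Voronoi-style prototype generation, not the authors' procedure: each foreground pixel is assigned to its nearest seed point, and the features in each cell are averaged into one local prototype. How seeds are chosen is not stated in the rebuttal, so the seeds here are passed in by the caller, and the function name is hypothetical.

```python
def voronoi_prototypes(features, mask, seeds):
    """Average foreground features within each Voronoi cell.

    features[y][x] is a feature vector, mask[y][x] is 0 or 1,
    seeds is a list of (y, x) positions (assumed to be given).
    """
    dim = len(features[0][0])
    sums = [[0.0] * dim for _ in seeds]
    counts = [0] * len(seeds)
    for y, mask_row in enumerate(mask):
        for x, fg in enumerate(mask_row):
            if not fg:
                continue
            # The nearest seed (squared Euclidean distance) defines the cell.
            idx = min(range(len(seeds)),
                      key=lambda i: (seeds[i][0] - y) ** 2 + (seeds[i][1] - x) ** 2)
            counts[idx] += 1
            for d in range(dim):
                sums[idx][d] += features[y][x][d]
    # Cells that contain no foreground pixel produce no prototype.
    return [[s / c for s in vec] for vec, c in zip(sums, counts) if c > 0]

features = [[[1.0], [1.0], [3.0], [3.0]]]   # 1 x 4 grid, 1-D features
mask = [[1, 1, 1, 1]]
print(voronoi_prototypes(features, mask, seeds=[(0, 0), (0, 3)]))
# [[1.0], [3.0]]
```

Each pixel is compared against every seed once, so the cost grows with the number of foreground pixels times the number of seeds, which is consistent with the rebuttal's point that bounding the number of local prototypes bounds the computation.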

Reviewer #3 Q1: Computational burden of Optimal Transport (OT) and Local Feature Generation (LFG). A1: The computational burden of OT and LFG mainly depends on the number of local prototypes. As shown in Eq. 3, we set an upper bound k_max and analyze it through ablation studies. Therefore, the number of local prototypes does not exceed k_max for large-size images, ensuring fast inference.

Q2: k seems to have a large impact on the performance. A2: Though k affects model performance, our results still surpass other methods. Moreover, the model does not collapse or behave erratically outside the optimal k, but rather shows predictable trends, which allows for effective tuning.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This work has received mixed scores. After reading the paper, the reviewers’ concerns and the authors’ rebuttal, I feel that this work can be accepted at MICCAI in its current form. I personally agree with several points raised by R2. Nevertheless, as pointed out by other reviewers, and by my own criteria, I found that the proposed approach brings some novelty compared to existing methods, and the empirical validation is sufficient (even though there are few methods from 2024, while many methods on this topic were presented at MICCAI’24). Thus, I recommend its acceptance.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    SPENet targets intra-class variation in few-shot medical segmentation through two key ideas: (i) Multi-level Prototype Generation that adapts the number of local prototypes to organ size, and (ii) query-guided prototype re-weighting via entropy-regularised optimal transport. Reviewers 1 and 3 recommend acceptance post-rebuttal; Reviewer 2 remains sceptical about novelty but does not challenge the consistent Dice gains over strong baselines on three public datasets. The empirical evidence, methodological soundness, and majority-positive verdicts outweigh the residual clarity and presentation issues. I therefore recommend ACCEPT, and strongly urge the authors to release code and add fuller implementation details to maximise the work’s impact and reproducibility.


