Abstract
Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS) is a potential solution for segmenting medical images with limited annotation using knowledge from other domains. The strong performance of current CD-FSMIS models relies on heavy training over source medical domains, which degrades their universality and ease of deployment. Building on the development of large vision models for natural images, we propose a training-free CD-FSMIS model that introduces the Multi-center Adaptive Uncertainty-aware Prompting (MAUP) strategy to adapt the Segment Anything Model (SAM), a foundation model trained on natural images, to the CD-FSMIS task. Specifically, MAUP consists of three key innovations: (1) K-means clustering-based multi-center prompt generation for comprehensive spatial coverage, (2) uncertainty-aware prompt selection that focuses on challenging regions, and (3) adaptive prompt optimization that dynamically adjusts to the target region's complexity. With the pre-trained DINOv2 feature encoder, MAUP achieves precise segmentation results across three medical datasets without any additional training, outperforming several conventional CD-FSMIS models and a training-free FSMIS model. The source code is available at: https://anonymous.4open.science/r/MAUP-3451.
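To make the abstract's first component concrete, here is a minimal Python sketch of K-means-based multi-center prompt generation from a support-query similarity map. The function name, threshold, and array shapes are illustrative assumptions, not the released MAUP code:

# Hypothetical sketch: cluster high-similarity pixel coordinates with
# K-means and use the cluster centers as spatially diverse positive
# point prompts for SAM. Threshold and k are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def multi_center_prompts(similarity_map: np.ndarray, k: int = 5,
                         threshold: float = 0.7) -> np.ndarray:
    """Return up to k (x, y) point prompts spread over the high-similarity region."""
    ys, xs = np.where(similarity_map >= threshold)
    if len(xs) < k:  # degenerate case: too few candidate pixels to cluster
        return np.stack([xs, ys], axis=1)
    coords = np.stack([xs, ys], axis=1).astype(float)
    centers = KMeans(n_clusters=k, n_init=10).fit(coords).cluster_centers_
    return centers.round().astype(int)  # SAM expects integer pixel coordinates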
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3542_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/YazhouZhu19/MAUP
Link to the Dataset(s)
N/A
BibTeX
@InProceedings{ZhuYaz_MAUP_MICCAI2025,
author = { Zhu, Yazhou and Zhang, Haofeng},
title = {{MAUP: Training-free Multi-center Adaptive Uncertainty-aware Prompting for Cross-domain Few-shot Medical Image Segmentation}},
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15966},
month = {September},
pages = {328--338}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper introduces MAUP, a training-free method for Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS). Its key contributions are:
- No Training Required: Adapts the Segment Anything Model (SAM) to medical images without fine-tuning, reducing reliance on annotated data.
- Smart Prompting: Uses multi-center clustering, uncertainty-aware selection, and adaptive optimization to generate precise prompts for SAM.
- Boundary Refinement: Leverages negative prompts to improve segmentation in low-contrast regions.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
MAUP achieves segmentation without any training, making it efficient and scalable for clinical use.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Unfair experimental comparisons. This paper proposes a few-shot segmentation method based on the large foundation models SAM and DINOv2, while the comparison methods use smaller models (such as ResNet-50). The good performance achieved in this paper may therefore simply be due to more powerful pretraining rather than the proposed strategies. The results in Table 3 also support this speculation: even after removing MMP and NP, the performance of MAUP still exceeds most of the comparison methods.
- For fairness, this paper should discuss and compare some one-shot methods based on SAM, such as PerSAM [1] and Matcher [2]. [1] Personalize Segment Anything Model with One Shot. [2] Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching.
- The term "cross-domain" in the title is confusing. Generally, "cross-domain" refers to pretraining networks where the source domain and the target task do not belong to the same domain (e.g., MRI => CT), or to multi-center settings. The authors do not clarify the setting for this task. Does the source domain refer to natural image pretraining? In that case, using the term "cross-domain" is odd: it would imply that many common algorithms applying ImageNet-pretrained models (such as ResNet-50 or ViT) to the medical field could all be considered "cross-domain" algorithms. Or can the one-shot setting itself be regarded as a kind of cross-domain setting?
- Since this paper can also be seen as a method for automatic prompting of SAM, it is recommended to show the automatically generated positive and negative prompts in Figure 3.
- In Sec. 2.1, "Segment Anything Mode" should be corrected to "Segment Anything Model."
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
See weaknesses.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
The authors provided new experimental results in their response, which violates the rebuttal guidelines: “New/additional experimental results in the rebuttal are not allowed, and breaking this rule is grounds for automatic desk rejection”
In addition to the above issue, the other two reviewers also raised serious concerns about the computational efficiency of this method; this also bears on the fairness of the experimental comparison, since the paper compares only against computationally efficient methods. Therefore, the reviewer will maintain the score of Reject.
Review #2
- Please describe the contribution of the paper
The paper proposes MAUP (Multi-center Adaptive Uncertainty-aware Prompting), a novel training-free framework for Cross-domain Few-shot Medical Image Segmentation (CD-FSMIS). The key contributions are:
- Innovative prompt design for SAM adaptation: MAUP leverages the Segment Anything Model (SAM) and DINOv2 backbone to achieve cross-domain few-shot segmentation without additional training. This addresses the limitations of traditional CD-FSMIS methods that require domain-specific training.
- Multi-center prompting via K-means clustering: Generates spatially diverse point prompts by clustering high-similarity regions, ensuring comprehensive anatomical structure coverage.
- Uncertainty-aware prompt selection: Identifies challenging regions using variance across similarity maps, prioritizing areas with high decision uncertainty (see the sketch after this list).
- Adaptive prompt optimization: Dynamically adjusts the number of prompts based on target region complexity (e.g., area and perimeter), enhancing robustness.
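To make the uncertainty-aware selection above concrete, here is a minimal Python sketch (an illustrative assumption, not the authors' implementation): pixel-wise variance across several similarity maps serves as a decision-uncertainty score, and candidate prompts in the most uncertain regions are kept first.

# Hypothetical sketch of variance-based uncertainty-aware prompt selection.
import numpy as np

def select_uncertain_prompts(sim_maps: np.ndarray,    # (M, H, W) similarity maps
                             candidates: np.ndarray,  # (N, 2) candidate (x, y) points
                             n_keep: int) -> np.ndarray:
    uncertainty = sim_maps.var(axis=0)                # (H, W) pixel-wise variance
    scores = uncertainty[candidates[:, 1], candidates[:, 0]]
    order = np.argsort(scores)[::-1]                  # most uncertain first
    return candidates[order[:n_keep]]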
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel formulation of prompting strategies: The K-means-based multi-center prompting effectively addresses spatial coverage challenges in complex anatomical structures. Unlike prior work that randomly samples prompts, MAUP ensures diverse prompt distribution by clustering regions of high similarity to the support image.
- The uncertainty-aware selection mechanism focuses on ambiguous regions, improving segmentation accuracy in low-contrast areas. This aligns with clinical needs where boundary delineation is critical (e.g., cardiac or abdominal tissues).
- The adaptive prompt quantity adjustment (Equation 7) dynamically balances computational efficiency and coverage, a feature absent in static prompting approaches.
- Effective use of pre-trained models.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The evaluation relies solely on Dice score, which measures volumetric overlap but neglects other critical metrics for medical segmentation, such as the Hausdorff distance (assessing boundary accuracy). These metrics are essential for clinical validation, especially for irregularly shaped organs.
- The K-means clustering for spatial diversity resembles prior work in few-shot segmentation. For instance, [29] uses Voronoi partitioning for regional prototyping, and [3] employs spatial sampling based on regional features. The paper should clarify how MAUP’s dynamic clustering and uncertainty weighting differ from such approaches.
- The computational cost of MAUP (e.g., K-means clustering and similarity map generation) is not discussed, which is critical for deployment in resource-constrained clinical settings. The model's size, inference time, and GPU cost should be analyzed.
- The adaptive prompt count mechanism (Equation 7) is theoretically sound but lacks quantitative validation of its impact. For instance: Does adaptive adjustment reduce prompts in simple cases without sacrificing accuracy?
- Given the stochastic nature of point generation, it is critical to evaluate whether the model’s performance remains consistent across multiple inference runs. The stability of the model under repeated testing should be empirically validated, particularly in terms of key metrics such as Dice scores and computational efficiency.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
MAUP presents a promising training-free approach for CD-FSMIS with innovative prompting strategies. Its strengths lie in spatial diversity, uncertainty-guided adaptation, and strong empirical results. However, addressing concerns around evaluation metrics, computational cost, and the stability of the model's performance could further enhance its impact.
Recommendation: Accept with minor revisions to address the above weaknesses.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have addressed my concerns in their response. I have decided to maintain my initial recommendation to accept this paper.
Review #3
- Please describe the contribution of the paper
This paper proposes MAUP, a training-free prompting framework for cross-domain few-shot medical image segmentation based on SAM and DINOv2. The method includes multi-center region-aware prompting, boundary-aware negative sampling, uncertainty-based prompt selection, and complexity-aware prompt-number control. It achieves strong performance on three diverse datasets under the 1-shot setting.
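One possible reading of the boundary-aware negative sampling mentioned above (an assumption for illustration; the paper's exact procedure may differ) is to dilate the coarse foreground estimate and draw negative point prompts from the ring just outside the boundary:

# Hypothetical sketch: sample negative prompts from a dilation ring
# around the estimated foreground; dilation width and point count are assumed.
import numpy as np
from scipy import ndimage

def boundary_negative_prompts(fg_mask: np.ndarray, n_points: int = 5,
                              dilation_iters: int = 5, seed: int = 0) -> np.ndarray:
    fg = fg_mask.astype(bool)
    ring = ndimage.binary_dilation(fg, iterations=dilation_iters) & ~fg
    ys, xs = np.where(ring)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(xs), size=min(n_points, len(xs)), replace=False)
    return np.stack([xs[idx], ys[idx]], axis=1)       # (x, y) negatives for SAM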
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Training-free design: The method operates without any additional training on medical datasets, making it highly practical for low-resource or fast-deployment scenarios.
- Systematic prompting strategy: MAUP integrates four well-motivated components (multi-center prompting, boundary-aware negative sampling, uncertainty-aware selection, and adaptive prompt quantity), which together enhance segmentation robustness across anatomical complexities.
- Strong empirical results: Despite being training-free, the method outperforms multiple training-based few-shot segmentation models across three diverse datasets (Abd-MRI, Abd-CT, and Card-MRI).
- Forward-looking approach: The adaptation of foundation models like DINOv2 and SAM to medical segmentation reflects a promising direction aligned with recent trends in medical AI.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The "training-free" claim heavily depends on pretrained natural-image models (SAM + DINOv2), which may face domain-shift issues in medical imaging.
- The method relies on heuristic rules (e.g., the K-means prompt count and a fixed dilation) without adaptive or learnable components.
- The baselines are not fully comparable, and the paper lacks an evaluation against other SAM-based training-free methods.
- The high inference cost is not reported, undermining the claimed deployment advantage.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
See strengths.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
R4.Q1: MAUP showed better performance in Hausdorff Distance tests across the Abd-MRI (13.21 mm), Abd-CT (13.45 mm), and Card-MRI (12.98 mm) datasets compared to FAMNet (14.89/14.02/14.12 mm), PANet (15.34/15.88/15.67 mm), SSL-ALP (14.56/14.35/14.32 mm), RPT (14.12/13.95/14.53 mm), PATNet (14.25/13.78/14.21 mm), and IFA (15.81/16.12/15.95 mm).
R4.Q2: Our approach differs from previous methods in that: (1) we adapt prompt counts dynamically based on target complexity (Eq. 7), unlike [29]'s fixed partitioning; (2) we select prompts based on uncertainty (Eq. 8-9), targeting ambiguous regions and improving Dice by 1.3%; (3) our DINOv2 integration enables cross-domain transfer (MRI→CT) without the domain-specific training needed by [29] and [3].
R4.Q3 & R5.Q4: MAUP operates without training, using a frozen DINOv2-ViT-L/14 (~1.2 GB) and SAM-ViT-H (~2.4 GB). The inference process includes: (1) a single DINOv2 forward pass, (2) similarity map calculation, (3) K-means clustering for prompts, (4) uncertainty map calculation, and (5) SAM inference with minimal point prompts. Testing used an NVIDIA 3090. We will add analysis of model sizes, GPU requirements, and inference times in the revision, and will discuss this further in future work.
R4.Q4: Our adaptive prompt mechanism on Abd-MRI outperforms the fixed 8-prompt approach, achieving higher accuracy (67.09% vs. 66.82% DSC) with lower latency (151 ms vs. 168 ms). The system uses fewer prompts (3) for simple cases while maintaining acceptable accuracy (65.91%), and more prompts (11) for complex cases to improve results (66.71%).
R4.Q5: Abd-MRI tests across 10 runs with different random seeds show high consistency. The average Dice score is 66.94% with minimal variation (±0.21%, range 66.58-67.21%). Inference time remains stable (±3.8 ms). All four organs maintain consistent scores (STD < 0.26%).
R5.Q1: "Training-free" means that no training on medical datasets is needed, unlike other CD-FSMIS methods. Our contribution is a prompting strategy that bridges domain gaps without further training. Results show our approach overcomes domain shift without needing medical annotations or additional computational resources.
R5.Q2: While our method uses algorithmic components like K-means, it includes several adaptive elements: the prompt count adjusts dynamically based on target complexity (Eq. 6-7), uncertainty-aware selection adapts to challenging regions, and the prompting strategy automatically adjusts to different anatomical structures.
R5.Q3 & R6.Q2: Comparisons with training-free SAM-based models: Abd-MRI (MAUP: 67.09%, PerSAM: 63.28%, Matcher: 65.83%), Abd-CT (MAUP: 67.46%, PerSAM: 64.05%, Matcher: 65.92%), and Card-MRI (MAUP: 73.13%, PerSAM: 69.41%, Matcher: 71.24%).
R6.Q1: MAUP uses foundation models training-free, with our prompting strategies improving performance (67.09% vs. 65.08%). It addresses vanilla SAM's weaknesses in complex anatomy, low-contrast boundaries, and complexity adaptation. Comparing other approaches offers insights for limited-annotation settings. Additional experiments against vanilla SAM across complexity levels would highlight MAUP's contributions.
R6.Q3: "Cross-domain" is accurate because we transfer knowledge from natural images (on which SAM and DINOv2 are trained) to medical imaging, as stated in our introduction: "SAM, trained on natural images, for the CD-FSMIS task." Unlike standard ImageNet pretraining, we address few-shot learning using our MAUP strategy, specifically designed to bridge domain gaps, consistent with established cross-domain transfer definitions in the research literature.
R6.Q4: We improved Fig. 3 to show both positive prompts (green) from our multi-center clustering and uncertainty-aware selection, and negative prompts (red) from periphery similarity maps across all datasets.
R6.Q5: We will correct the writing in the revision.
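For readers without the paper at hand, the following Python sketch shows one way a complexity-adaptive prompt count like the one defended in R4.Q4 could behave. The isoperimetric complexity proxy, its scaling, and the 3-11 prompt range are assumptions inferred from the numbers quoted in the rebuttal, not the paper's Eq. 6-7:

# Hypothetical sketch: scale the prompt count between n_min and n_max
# using an isoperimetric shape-complexity proxy (about 1.0 for a disc).
import numpy as np

def adaptive_prompt_count(fg_mask: np.ndarray, n_min: int = 3, n_max: int = 11) -> int:
    fg = fg_mask.astype(bool)
    area = fg.sum()
    if area == 0:
        return n_min
    padded = np.pad(fg, 1)  # perimeter: foreground pixels with a background 4-neighbour
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    perimeter = (fg & ~interior).sum()
    complexity = perimeter ** 2 / (4 * np.pi * area)  # ~1 for a disc, larger if ragged
    frac = np.clip((complexity - 1.0) / 4.0, 0.0, 1.0)  # assumed scaling
    return int(round(n_min + frac * (n_max - n_min)))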
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A