

Abstract

Establishing pixel/voxel-level or region-level correspondences is the core challenge in image registration. The latter, also known as region-based correspondence representation, leverages paired regions of interest (ROIs) to enable regional matching while preserving fine-grained capability at pixel/voxel level. Traditionally, this representation is implemented via two steps: segmenting ROIs in each image then matching them between the two images. In this paper, we simplify this into one step by directly “searching for corresponding prompts”, using extensively pre-trained segmentation models (e.g., SAM) for a training-free registration approach, PromptReg. Firstly, we introduce the “corresponding prompt problem”, which aims to identify a corresponding Prompt Y in Image Y for any given visual Prompt X in Image X, such that the two respectively prompt-conditioned segmentations are a pair of corresponding ROIs from the two images. Secondly, we present an “inverse prompt” solution that generates primary and optionally auxiliary prompts, inverting Prompt X into the prompt space of Image Y. Thirdly, we propose a novel registration algorithm that identifies multiple paired corresponding ROIs by marginalizing the inverted Prompt X across both prompt and spatial dimensions. Comprehensive experiments are conducted on five applications of registering 3D prostate MR, 3D abdomen MR, 3D lung CT, 2D histopathology and, as a non-medical example, 2D aerial images. Based on metrics including Dice and target registration errors on anatomical structures, the proposed registration outperforms both intensity-based iterative algorithms and learning-based DDF-predicting networks, even yielding competitive performance with weakly-supervised approaches that require fully-segmented training data.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4020_paper.pdf

SharedIt Link: https://rdcu.be/eHaYr

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04965-0_44

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{HuaShi_Register_MICCAI2025,
        author = { Huang, Shiqi AND Xu, Tingfa AND Yan, Wen AND Barratt, Dean AND Hu, Yipeng},
        title = { { Register Anything: Estimating “Corresponding Prompts” for Segment Anything Model } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {467--477}
}

Reviews

Review #1

Please describe the contribution of the paper
This paper introduces PromptReg, a training-free image registration method that directly identifies corresponding regions of interest (ROIs) between two images using pretrained segmentation models (e.g., SAM), without relying on manual segmentation labels or traditional optimization steps. The authors say that this problem is typically solved in two separate steps:
1. Segmenting ROIs in each image
2. Matching them across images
PromptReg simplifies this by performing both steps simultaneously through:
1. Formulating a new problem: finding a prompt in one image that corresponds to a prompt in another.
2. Solving it using an inverse prompting technique to generate the corresponding prompt.
3. Applying marginalization over prompt and spatial variations to improve robustness.
This could also be rephrased as:
- The segmentation model is used to extract a feature-based representation of the ROI (called the prototype).
- The method then searches the second image for a prompt that leads to similar features when passed through the same model.
- Matching is done via cosine similarity and mathematical inversion, which allows estimating what the corresponding prompt in the second image should be.
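The reviewer's paraphrase (prototype extraction followed by cosine-similarity matching) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature maps are synthetic stand-ins for encoder outputs, and the function names `masked_prototype`, `cosine_similarity_map` and `best_point_prompt` are hypothetical. Taking the arg-max location as a candidate point prompt is a simplification of the inversion described in the paper.

```python
import numpy as np

def masked_prototype(features, mask):
    """Average the feature vectors inside a binary ROI mask.

    features: (H, W, C) feature map; mask: (H, W) boolean ROI.
    """
    return features[mask].mean(axis=0)

def cosine_similarity_map(features, prototype, eps=1e-8):
    """Cosine similarity between each pixel's feature vector and the prototype."""
    feat_norm = np.linalg.norm(features, axis=-1) + eps
    proto_norm = np.linalg.norm(prototype) + eps
    return (features @ prototype) / (feat_norm * proto_norm)

def best_point_prompt(similarity):
    """Return the (row, col) of the most similar location as a candidate point prompt."""
    return np.unravel_index(np.argmax(similarity), similarity.shape)

# Synthetic stand-in for encoder features (assumption): a 4x4 ROI carries a
# fixed feature vector on top of weak background noise.
rng = np.random.default_rng(0)
roi_feature = np.array([1.0, -1.0, 0.5, 2.0, -0.5, 1.5, -2.0, 0.8])

def toy_features(top, left, size=16):
    feats = rng.normal(scale=0.05, size=(size, size, roi_feature.size))
    feats[top:top + 4, left:left + 4] += roi_feature
    return feats

feat_x = toy_features(2, 3)   # ROI at rows 2..5, cols 3..6 in image X
feat_y = toy_features(9, 8)   # same structure at rows 9..12, cols 8..11 in image Y
mask_x = np.zeros((16, 16), dtype=bool)
mask_x[2:6, 3:7] = True

prototype = masked_prototype(feat_x, mask_x)
sim_y = cosine_similarity_map(feat_y, prototype)
row, col = best_point_prompt(sim_y)
```

In this toy setting the selected location falls inside the ROI of image Y; a real pipeline would pass such a location to the segmentation model as a prompt.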
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The method is evaluated across a variety of 2D and 3D medical datasets and shows promising results.
- Comparisons are made with conventional, deep-learning-based, and prompt-based registration methods.
- The conceptual shift from independent segmentation followed by matching, to directly finding corresponding regions, is intuitive.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Clarity and readability: The paper is difficult to follow. The method description feels unnecessarily complex, especially given that the underlying idea is relatively straightforward. A clearer and more structured explanation at the beginning, summarizing the main steps before diving into equations, would significantly improve accessibility.
  - For instance, Figure 2 is hard to interpret, even with the supporting text. The figure is not self-explanatory at all.
- Unclear evaluation setup: It’s unclear how some of the baseline methods are evaluated:
  - For conventional registration methods, is the displacement field used to propagate the segmentation mask from the first image to the second, and then Dice is computed?
  - For TRE, is the centroid of the propagated mask used, or is the centroid of the original ROI propagated?
  - Or is the displacement field used to propagate the prompt, which is then used to re-segment the ROI with SAM?
  If the first approach is used (propagating masks), this may not be a fair comparison, since traditional registration methods are not designed to produce segmentations. In contrast, if prompts are propagated and segmentation is performed again, the comparison would be more appropriate, especially since the core goal is to find corresponding regions, not necessarily to produce accurate masks.
- The paper could benefit from reframing the task more clearly: if the goal is to match corresponding ROIs, then the evaluation should reflect how well a propagated prompt retrieves the same region in the other image, rather than how well the segmentation overlaps.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

The link to the shared repo is not working.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper introduces a potentially useful idea — reframing image registration as a prompt correspondence problem, using pretrained segmentation models like SAM. In principle, directly identifying corresponding ROIs instead of segmenting and matching separately is a sound and intuitive direction. Using prototype features for matching and inversion is a straightforward application of known ideas (e.g., prototypical networks and cosine similarity).

However, this idea, while conceptually simple, is made unnecessarily complex in the paper. The formulation is overloaded with mathematical detail and lacks a clear high-level explanation of the pipeline, making the actual contribution hard to extract. The method seems like it could be described in much simpler terms.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

The paper proposes a new promptable image registration framework based on SAM, which uses prompts of the fixed image to identify corresponding anatomy in the moving image.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The framework of promptable image registration is novel and significant. It finds corresponding anatomy using only prompts given as ROI coordinates.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. Details on how to solve for Z_k^y are missing. What is the meaning of inverting the Jacobian matrix? Is it solved by gradient descent? If so, how many iterations does it need?
2. What is the computational cost of the proposed framework? It seems that prompt marginalization improves accuracy but also increases the time cost. How does that compare to previous approaches?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The framework of promptable image registration is novel and significant. However, some details of the method should be explained in the paper.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

I recommend acceptance of the paper due to its potential applicability.

Review #3

Please describe the contribution of the paper
This paper proposes PromptReg, a novel way of prompting the Segment Anything Model and its derivatives to establish corresponding prompts, thereby matching same-class ROIs. The main contributions of this paper include:
1. Reformulating the image registration problem as an ROI matching problem
2. Proposing a prompt searching mechanism that finds corresponding ROIs between two images
3. A sanity check (marginalization) on prompt and ROI quality
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strengths of this paper include:
1. Thorough mathematical explanation and justification of the method
2. A novel approach that significantly extends a previous paper (SAMReg)
3. Thorough benchmarking on a number of datasets
4. A published anonymous GitHub repo (though the link later expired)
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The method explanation is unclear. Although notations and equations are important, the paper contains no workflow visualization or pseudocode to explain how the algorithm is carried out. Verbal explanations can sometimes be ambiguous.
2. Reducing image content into ROIs is a great technique, and would be most helpful to bring together images across different modalities. This could’ve been the main benefit of the proposed algorithm. However, no inter-modality registration was evaluated in this paper.
3. It is unclear how the 2D model handles 3D inputs. Is the prompting process performed per slice?
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Regardless of all the clarity issues, I believe this is a thorough work and deserves to be presented at the conference. This is a valuable contribution to the research community. I hope the authors come up with better illustrations when presenting or extending to journals.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The rebuttal addressed my concerns.

Just to back R2 up, the link also didn’t work for me during review. It is now working.

Author Feedback

Response to R#2

First, we would like to confirm that the anonymous repository is accessible.

Clarity and readability (Q1):

We appreciate the reviewer’s comments regarding the complexity of the methodology. Our intention was to ensure that the technical details of our novel approach - inverting the corresponding prompt to the image space - are distinct from existing methods (e.g. using separate segmentation and matching steps [16]). This requires a clear definition of “corresponding prompt” (Sec 2) and our proposed PromptReg (Sec 3) detailing a specific prompt inverting algorithm. These are neither trivial nor easy to explain in words alone, hence the technical depth.

That said, we agree that the clarity can be improved. As suggested, we will revise the subheading and section summary for each section. This will highlight that PromptReg comprises two steps: “(inverted) prompt searching” and “prompt marginalization”. The prompt searching involves (1) inferring the corresponding target prompt Z^y_k on the target image and (2) verifying shape consistency, with auxiliary prompts if needed. The prompt marginalization is implemented via an augmentation-based strategy to enhance robustness.
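One possible reading of an augmentation-based marginalization is sketched below, under stated assumptions: the prompt is perturbed by a few offsets, the resulting masks are averaged into a probability map, and the map is thresholded. The segmenter here is a toy stand-in, not SAM, and `toy_segmenter`, `marginalize_over_prompts` and the chosen offsets are hypothetical; the paper's actual augmentations and aggregation rule may differ.

```python
import numpy as np

def toy_segmenter(image, point, radius=6.0):
    """Stand-in for a promptable segmenter (assumption): bright pixels near the point prompt."""
    rows, cols = np.indices(image.shape)
    near = (rows - point[0]) ** 2 + (cols - point[1]) ** 2 <= radius ** 2
    return (near & (image > 0.5)).astype(float)

def marginalize_over_prompts(image, point, segmenter, offsets, threshold=0.5):
    """Average the masks obtained from perturbed prompts, then threshold.

    Each offset costs one extra forward pass, so runtime grows with len(offsets).
    """
    masks = [segmenter(image, (point[0] + dr, point[1] + dc)) for dr, dc in offsets]
    probs = np.mean(masks, axis=0)
    return probs, probs >= threshold

# Toy image with one bright square structure.
image = np.zeros((24, 24))
image[8:16, 8:16] = 1.0
offsets = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2)]
probs, mask = marginalize_over_prompts(image, (12, 12), toy_segmenter, offsets)
```

The averaging step is where the accuracy-speed trade-off arises: more perturbations give a smoother estimate at a proportionally higher cost.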

Evaluation Details (Q2):

We would like to clarify that the details in question follow practices consistent with those used in the compared prior work.

For Q2.1, yes, the dense displacement field (DDF) is used to propagate the segmentation for computing Dice, for all registration methods.
For Q2.2, we warped the mask before computing its centroid when calculating centroid-based TREs.
For Q2.3, the DDF in PromptReg is used solely as a post-hoc evaluation step applied after ROI correspondence is established. The inverted prompts directly produce aligned ROIs; these are not propagated by the DDF.
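The evaluation protocol described in these answers (warp the mask with the DDF, then compute Dice and a centroid-based TRE) can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code: it is 2D, uses nearest-neighbour resampling, and assumes a pull-back convention in which the DDF stores, for each fixed-image pixel, the offset to the sampled moving-image location. All function names are hypothetical.

```python
import numpy as np

def warp_mask_nearest(moving_mask, ddf):
    """Resample a 2D binary mask with a DDF of shape (H, W, 2).

    Assumed convention: warped[r, c] = moving[r + ddf[r, c, 0], c + ddf[r, c, 1]].
    """
    h, w = moving_mask.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_r = np.clip(np.rint(rows + ddf[..., 0]).astype(int), 0, h - 1)
    src_c = np.clip(np.rint(cols + ddf[..., 1]).astype(int), 0, w - 1)
    return moving_mask[src_r, src_c]

def dice(a, b):
    """Dice overlap between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def centroid(mask, spacing=(1.0, 1.0)):
    """Centroid of the foreground pixels, scaled by the pixel spacing."""
    return np.argwhere(mask).mean(axis=0) * np.asarray(spacing)

def centroid_tre(mask_a, mask_b, spacing=(1.0, 1.0)):
    """Distance between the centroids of two masks."""
    return float(np.linalg.norm(centroid(mask_a, spacing) - centroid(mask_b, spacing)))

# Toy pair: the moving ROI is the fixed ROI shifted by (+3, -2) pixels.
fixed = np.zeros((32, 32), dtype=bool)
fixed[10:20, 12:22] = True
moving = np.zeros((32, 32), dtype=bool)
moving[13:23, 10:20] = True

ddf = np.zeros((32, 32, 2))
ddf[..., 0], ddf[..., 1] = 3.0, -2.0   # ideal displacement for this toy case
warped = warp_mask_nearest(moving, ddf)

dice_before, dice_after = dice(moving, fixed), dice(warped, fixed)
tre_before, tre_after = centroid_tre(moving, fixed), centroid_tre(warped, fixed)
```

Computing the centroid after warping, as in `tre_after`, matches the order of operations stated in the answer to Q2.2.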

Evaluation Reframing (Q3):

One of our evaluation methods measures the overlap between the segmentation generated by the proposed inverted prompts and the ground-truth segmentation. This is indeed intended to “reflect how well a propagated prompt retrieves the same region in the other image”, as correctly understood by the reviewer.

Response to R#1

Method Details (Q1):

Z^y_k is computed via Eq. (1) on page 5, in which the Jacobian matrix captures local variations and is computed via a single backward pass (i.e., the chain rule), without iteration.
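To give a rough intuition for what a non-iterative, Jacobian-based inversion can look like, the sketch below applies a single linearized update to a toy differentiable map. It is not a reconstruction of Eq. (1), which is not reproduced on this page: the map `feature`, the matrix `W`, and the pseudo-inverse update rule are all assumptions chosen for illustration, and the Jacobian is written analytically via the chain rule instead of being obtained from a backward pass.

```python
import numpy as np

# Toy parameters of a differentiable map from a 2D prompt to a 4D feature (assumption).
W = np.array([[1.0, 0.5],
              [-0.5, 1.0],
              [0.8, -0.3],
              [0.2, 0.7]])

def feature(z):
    """Stand-in for the prompt-to-feature mapping of a model."""
    return np.tanh(W @ z)

def jacobian(z):
    """Chain rule: d tanh(Wz)/dz = diag(1 - tanh^2(Wz)) @ W."""
    return (1.0 - np.tanh(W @ z) ** 2)[:, None] * W

def invert_once(z_init, target):
    """Single linearized update: z = z_init + pinv(J(z_init)) @ (target - feature(z_init))."""
    residual = target - feature(z_init)
    return z_init + np.linalg.pinv(jacobian(z_init)) @ residual

z_true = np.array([0.30, -0.20])   # prompt that produces the target feature
z_init = np.array([0.25, -0.15])   # initial guess near the solution
z_est = invert_once(z_init, feature(z_true))

err_before = float(np.linalg.norm(z_init - z_true))
err_after = float(np.linalg.norm(z_est - z_true))
```

When the initial guess is close to the solution, one such update already removes most of the error, which is consistent with the authors' statement that no iteration is needed.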

Computational Cost Concern (Q2):

On the MR-prostate dataset, processing a 256×256 image pair without marginalization uses 3021 MB of memory, takes 38 ms, and achieves a Dice score of 70.19 ± 3.22. Adding one marginalization step keeps memory usage similar, increases runtime to 77 ms, and improves performance to 73.33 ± 2.08 Dice. This accuracy-speed trade-off is fully user-controlled. Importantly, PromptReg is training-free and does not require labels at inference time, while maintaining a runtime comparable to SAMReg [16] (64 ms) and GBMReg [Liu 2024] (87 ms).

Response to R#3

Method illustration and details (Q1&3):

We appreciate the reviewer’s suggestion and plan to update the PromptReg part of Fig. 1 to highlight the workflow of the method. For a fair comparison, we apply 2D models following the protocol in [16]: for each slice in I^x, prompts are inverted across a slice range δs in I^y, and the result with the minimal Hausdorff distance is selected. That said, we recommend the use of 3D foundation models for volumetric data, as they show superior performance (Table 1).
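The slice-wise protocol described above can be sketched as a search over a slice window followed by a minimum-Hausdorff selection. The sketch assumes that the distance is measured between the ROI mask of the source slice and each candidate ROI mask in the target volume; the rebuttal does not state which pair of shapes is compared. The names `hausdorff_distance`, `select_slice` and `delta_s` are hypothetical, and the brute-force distance is only suitable for small masks.

```python
import numpy as np

def hausdorff_distance(mask_a, mask_b):
    """Symmetric Hausdorff distance between the foreground pixels of two 2D masks."""
    pts_a, pts_b = np.argwhere(mask_a), np.argwhere(mask_b)
    if len(pts_a) == 0 or len(pts_b) == 0:
        return float("inf")
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

def select_slice(source_mask, candidate_masks, s, delta_s):
    """Search slices s - delta_s .. s + delta_s and keep the closest candidate mask."""
    lo = max(0, s - delta_s)
    hi = min(len(candidate_masks) - 1, s + delta_s)
    distances = {t: hausdorff_distance(source_mask, candidate_masks[t])
                 for t in range(lo, hi + 1)}
    best = min(distances, key=distances.get)
    return best, distances[best]

def square(size, shape=(24, 24), top=4, left=4):
    """Toy ROI: a filled square of the given side length."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + size, left:left + size] = True
    return mask

# Toy target volume: the ROI grows from slice to slice (sides 2, 4, ..., 14).
candidate_masks = [square(2 + 2 * t) for t in range(7)]
source_mask = square(8)            # ROI segmented on slice s of the source volume
best_slice, best_distance = select_slice(source_mask, candidate_masks, s=2, delta_s=2)
```

In the actual method each candidate mask would come from segmenting a target slice with the inverted prompt; here the candidates are precomputed to keep the example self-contained.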

Multi-modality extension (Q2):

This is an insightful suggestion. PromptReg currently relies on a shared encoder, which can limit its effectiveness in multi-modality settings. Future work shall explore modality-specific encoders or prior-informed adaptations to better extend PromptReg to multi-modality registration tasks.

Refs
[Liu 2024] Liu, Xueyu, et al. “Feature-Prompting GBMSeg.” MICCAI, 2024.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

This paper introduces a genuinely novel and potentially impactful approach to image registration by leveraging SAM for prompt correspondence. The core idea is well-received. However, the execution, particularly in explaining the methodology, is severely flawed according to most reviewers. The paper is described as overly complex and difficult to follow, lacking clear visualizations or high-level explanations despite its mathematical detail. The authors should provide more clarification in the rebuttal.
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

I am happy to accept this work that may bring potential fundamental change to the field in the era of LLMs.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

Despite the clarity issues, this paper introduces a genuinely novel, well-motivated, and promising paradigm for medical image registration. It bridges state-of-the-art segmentation models with core MICCAI tasks, performs solid empirical evaluations, and inspires new avenues for research. The paper meets the bar for acceptance and is likely to stimulate valuable discussion at the conference.

Recommendation: Accept


Register Anything: Estimating “Corresponding Prompts” for Segment Anything Model

Author(s):