Abstract
Medical image segmentation is a crucial and time-consuming task in clinical care, where precision is extremely important. The Segment Anything Model (SAM) offers a promising approach, providing an interactive interface based on visual prompting and editing. However, this model and its adaptations for medical images are built for 2D images, whereas a whole medical domain relies on 3D images, such as CT and MRI. Segmenting a volume with a 2D model requires one prompt per slice, making the process tedious. We propose RadSAM, a novel method for segmenting 3D objects with a 2D model from a single prompt, based on an iterative inference pipeline that reconstructs the 3D mask slice by slice. We introduce a benchmark to evaluate the model's ability to segment 3D objects in CT images from a single prompt, and we evaluate the model's out-of-domain transfer and editing capabilities. We demonstrate the effectiveness of our approach against state-of-the-art 2D and 3D models using the AMOS abdominal organ segmentation dataset.
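The iterative slice-by-slice pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation; `predict_slice` is a hypothetical stand-in for any promptable 2D segmentation model that returns a binary mask, and the stopping rule (halt when a slice comes back empty) is an assumption.

```python
import numpy as np

def segment_volume(volume, start_slice, initial_prompt, predict_slice):
    """Sketch of RadSAM-style iterative 3D inference from one prompt.

    `predict_slice(image_2d, prompt)` is a hypothetical callable for a
    promptable 2D segmentation model returning a boolean mask.
    """
    depth = volume.shape[0]
    masks = [None] * depth
    # Segment the prompted slice first, from the user-provided prompt.
    masks[start_slice] = predict_slice(volume[start_slice], initial_prompt)
    # Propagate upward: each predicted mask prompts the next slice.
    for z in range(start_slice + 1, depth):
        prev = masks[z - 1]
        if not prev.any():  # object presumably ended; stop propagating
            break
        masks[z] = predict_slice(volume[z], prev)
    # Propagate downward symmetrically.
    for z in range(start_slice - 1, -1, -1):
        prev = masks[z + 1]
        if not prev.any():
            break
        masks[z] = predict_slice(volume[z], prev)
    empty = np.zeros(volume.shape[1:], dtype=bool)
    return np.stack([m if m is not None else empty for m in masks])
```

Because only a 2D model is ever invoked, the 3D mask is assembled at inference time without any 3D training, which is the core idea of the paper.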
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4118_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/raidium-med/radsam
Link to the Dataset(s)
TotalSegmentator: https://zenodo.org/records/10047292
AMOS: https://amos22.grand-challenge.org/
BibTex
@InProceedings{KhlJul_RadSAM_MICCAI2025,
author = { Khlaut, Julien and Ferreres, Elodie and Tordjman, Daniel and Philippe, Helene and Boeken, Tom and Manceron, Pierre and Dancette, Corentin},
title = { { RadSAM: Segmenting 3D radiological images with a 2D promptable model } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15963},
month = {September},
pages = {432--442}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper extends the Segment Anything Model to 3D images via a slice-propagation method.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The major strength is the adaptation of a 2D slice based method to do 3D volume segmentation. The method has been evaluated on an open dataset with promising results.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The major weakness of the paper is that the method has limited novelty. The methodological contribution, namely the slice-wise propagation of previous estimations, is somewhat limited. While it is a valid paper with promising results, I believe it would be better suited to one of the MICCAI workshops than to the main conference.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The methodological contribution of the paper is limited.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
I do believe that the authors have some fair points in their rebuttal.
Review #2
- Please describe the contribution of the paper
- The authors proposed a radiological SAM (RadSAM) for 3D medical segmentation using a single prompt.
- They also introduced an iterative inference strategy based on the mask prompt in a slice-to-slice manner. The proposed RadSAM supports prompt editing for segmentation correction.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed method is easy but effective.
- The motivation and writing of this paper are good and easy-to-follow.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Novelty of this paper seems moderate.
- Some key experimental settings (e.g., training details) are missing. Details refer to Q10
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- “SAM-2 was recently introduced for interactive video segmentation, requiring an iterative pipeline during training.” However, the proposed RadSAM also requires an iterative process.
- In my opinion, the whole process of RadSAM is very similar to SAM2, which may limit the method's novelty. For example, both require only one prompt for the whole volume or video. Besides, both support different prompt types (i.e., mask, point, and box). Moreover, like RadSAM, SAM2 also allows prompt editing to prevent error accumulation.
- No training details (e.g., how to train the 2D segmentation model with a mask prompt as the first input, or how to fine-tune the existing model to better fit the required tasks) can be found in the manuscript, including batch size, learning rate, optimizer, training strategies, etc.
- “We generate a bounding box around this mask for models that do not support the initial mask prompting, such as SAM or MedSAM.” Why? I have checked the code and think previous SAMs (at least the original SAM) should be able to support a mask prompt via the mask-prompt branch (SamPredictor -> predict_torch -> mask_input).
- The authors compared SAM, MedSAM, nnU-Net and SAM-Med3D. For a fair comparison, all of them should be fine-tuned on the AMOS dataset instead of using their original frozen weights.
- SAM2 should be considered as a key competitor (including medical SAM2).
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Limited innovation, lack of key experiments.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
I appreciate the authors’ response. While some of my concerns have been addressed, I still find the novelty of this work (e.g., new prompt type, etc.) to be limited. Therefore, I recommend rejection.
Review #3
- Please describe the contribution of the paper
The authors propose RadSAM, a method for segmenting 3D radiological images using a 2D promptable model that overcomes the tedious requirement of providing prompts for each slice. Their key innovation is an iterative inference pipeline that reconstructs 3D masks slice-by-slice from a single initial prompt, using a new mask-prompting capability where predictions from one slice guide segmentation of adjacent slices. The approach achieves improved performance on CT organ segmentation tasks, surpassing existing 2D models (MedSAM) and 3D models (SAM-Med3D) while maintaining lower computational requirements.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors introduce an innovative approach that enables a 2D model to segment 3D structures from a single prompt.
- The paper presents a new type of prompt (mask prompt) that enables passing segmentation information between adjacent slices.
- The paper provides extensive benchmarking on multiple datasets (AMOS and TotalSegmentator), with detailed comparisons across different prompt types, editing capabilities, and transfer learning scenarios.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The approach is only validated on CT scans, despite mentioning MRI in the introduction. This narrow focus leaves questions about performance across other important 3D medical imaging modalities.
- The paper doesn’t provide computational efficiency metrics (inference time, memory usage) compared to alternatives, which is important for clinical deployment considerations.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
In general, the strengths slightly outweigh the weaknesses.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
Q1 (R1, R2): Limited novelty A: Our contribution is not limited to an iterative inference pipeline. We also propose a new type of prompt (a binary mask) and a training procedure that teaches the model to correct those input masks without any additional prompt. These have a major impact on the final results, as shown in Table 3a: using the iterative pipeline without this mask prompt, the Dice score drops from 84.99 to 66.56.
Q2 (R2): Experiment details on mask prompting and training parameters are missing. A: For mask prompting, as described in Sec. 3.2, we apply a set of random perturbations to the input mask (rotation, scaling, translation, erosion, and dilation) and train the model to correct it. We will add the exact augmentation parameters in the final version for reproducibility: rotation 5.0 degrees, relative translation 15%, scaling 10%, shear 5, dilation 5 steps, and erosion 5 steps. The scale of those perturbations is very important because it prevents the model from collapsing into reproducing the input mask exactly. It also enables the model to segment a structure with a mask prompt from a neighboring slice. We will also add all the training hyperparameters for reproducibility (lr: 1e-5, batch size: 2, trained on 16 V100s for 7 epochs with the Adam optimizer).
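As a rough illustration of the perturbation scheme described in Q2, here is a toy NumPy sketch of two of the listed augmentations (random translation and dilation). The helper names, the wrap-around translation via `np.roll`, and the 4-neighbourhood dilation are simplifications for illustration, not the paper's actual implementation; rotation, scaling, shear, and erosion would be added analogously.

```python
import numpy as np

def dilate(mask, steps=1):
    """Binary dilation via shifted ORs (4-neighbourhood), pure NumPy."""
    out = mask.copy()
    for _ in range(steps):
        out = (out
               | np.roll(out, -1, axis=0) | np.roll(out, 1, axis=0)
               | np.roll(out, -1, axis=1) | np.roll(out, 1, axis=1))
    return out

def perturb_mask(mask, rng, max_shift=0.15, dilation_steps=5):
    """Toy version of the rebuttal's mask perturbations: random
    translation (up to `max_shift` of the image size, wrapping at the
    border for simplicity) followed by random dilation."""
    h, w = mask.shape
    dy = rng.integers(-int(max_shift * h), int(max_shift * h) + 1)
    dx = rng.integers(-int(max_shift * w), int(max_shift * w) + 1)
    shifted = np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return dilate(shifted, steps=rng.integers(0, dilation_steps + 1))
```

Training the model to undo such perturbations is what lets a mask predicted on one slice serve as a usable prompt for the adjacent slice, whose ground truth differs by a similar small deformation.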
Q3 (R2): “SAM2 requires iterative training, but RadSAM does not” A: SAM2 requires a set of consecutive images as input during training because it forwards frames iteratively and backpropagates through the whole sequence. This requires more memory than our training pipeline, where we train only in 2D, using a perturbed ground-truth mask as input. The iterative process is applied only during inference. This makes training more efficient; the main achievement of the paper is to segment 3D objects without specifically training the model in 3D [see Q8 for efficiency metrics].
Q4 (R2): SAM / MedSAM should also work with mask A: These models are not trained to segment objects from only a mask prompt. The mask is only used during the editing pipeline. We evaluated SAM with only a mask input, and the output is close to random.
Q5 (R2): Compared models should be fine-tuned on the AMOS dataset A: For the scores we report, nnU-Net is fully trained on AMOS. For SAM-Med3D and MedSAM, AMOS is part of the training set. Only SAM is not trained on AMOS. However, evaluating RadSAM with bounding boxes as the iterative prompt is equivalent to a “SAM fine-tuned on AMOS”, and the result can be seen in Table 3a: the Dice score drops from 84.99 to 66.56.
Q6 (R2): “SAM2 should be considered a key competitor” A: We agree with this. The main difference is our simpler training pipeline, as explained in Q3; re-training SAM2 is more costly, as it requires feeding a whole volume. We will make this clearer, and if necessary, we can add a comparison to MedSAM2 in the final version. Other works have adapted SAM2 for medical usage (e.g., MedSAM2) but do not provide detailed scores on academic datasets such as AMOS or TotalSegmentator.
Q7 (R3): Only valid for CT, not MRI A: We only evaluate CT in this paper for simplicity, but the model could also be easily trained for MRI data with an adapted pre-processing. We will consider this in future work and evaluate the model on more diverse and challenging tasks, such as tumor segmentation.
Q8 (R3): Compute efficiency metrics missing. A: RadSAM has the same compute profile as SAM, meaning it can process 23.42 images per second according to our measurements on an RTX 4090. To infer a volume, it needs N/23.42 seconds, where N is the number of slices. By the same measurements, SAM2-b can process 33.37 images per second; however, during training it needs to do that N times for a video of size N, making it harder to train. According to their paper, SAM-Med3D processes volumes in around 2 seconds (0.5 volumes per second).
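The per-volume latency arithmetic in Q8 is simply N divided by throughput. A one-line helper makes the comparison concrete; the throughput figures are the rebuttal's reported RTX 4090 numbers, while the 200-slice volume is an assumed example.

```python
def volume_inference_seconds(num_slices: int, slices_per_second: float) -> float:
    """Latency of slice-wise inference: N slices at a fixed throughput."""
    return num_slices / slices_per_second

# For an assumed 200-slice CT volume:
radsam_s = volume_inference_seconds(200, 23.42)  # RadSAM/SAM throughput, ~8.5 s
sam2_s = volume_inference_seconds(200, 33.37)    # SAM2-b throughput, ~6.0 s
```

At these rates a typical abdominal CT completes in well under ten seconds with either slice-wise model, so the practical distinction the rebuttal draws is about training cost, not inference speed.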
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
Some major concerns remained even after the rebuttal by the authors. Due to limited technical novelty and lack of key experiments, I recommend rejection of this paper.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
After a careful review of all the comments and the authors’ rebuttal, I recommend acceptance of this paper.
Strengths of the submission include:
A novel prompting strategy: The paper introduces the mask prompt to propagate segmentation information across 2D slices, enabling a 2D model to handle 3D volume segmentation with just a single prompt. This is a creative and practically valuable contribution, especially in medical imaging where annotation burden is high.
Comprehensive empirical validation: The authors present extensive experiments across multiple datasets (e.g., AMOS and TotalSegmentator) and scenarios, including different prompt types, editing strategies, and transfer learning settings. These evaluations support the robustness and applicability of the proposed method.
Practical utility and plug-and-play design: The method is compatible with various foundation models (e.g., SAM, MedSAM), enabling real-world adaptability without requiring architectural changes.
Reviewer Concerns:
Some reviewers expressed concerns about the methodological novelty due to similarities with SAM-2. However, the rebuttal clarifies that SAM-2 is designed for video segmentation and not tailored to the medical domain, while RadSAM innovatively adapts a 2D segmentation model for volumetric tasks with a single prompt and enables editing for error mitigation—features that are distinct in implementation and motivation.
The lack of training detail is noted and should be improved in the camera-ready version for better reproducibility. Similarly, while a comparison to SAM-2 would strengthen the paper, the current benchmarking against established baselines still provides strong empirical evidence.