Abstract
Hyperspectral imaging (HSI) provides rich spectral information for medical imaging, yet it faces significant challenges due to data limitations and hardware variations. We introduce SAMSA, a novel interactive segmentation framework that combines an RGB foundation model with spectral analysis. SAMSA efficiently utilizes user clicks to guide both RGB segmentation and spectral similarity computations. The method addresses key limitations of HSI segmentation through a unique spectral feature fusion strategy that operates independently of spectral band count and resolution. Performance evaluation on publicly available datasets shows 81.0% 1-click and 93.4% 5-click DICE on a neurosurgical hyperspectral dataset, and 81.1% 1-click and 89.2% 5-click DICE on an intraoperative porcine hyperspectral dataset. Experimental results demonstrate SAMSA’s effectiveness in few-shot and zero-shot learning scenarios with minimal training examples. Our approach enables seamless integration of datasets with different spectral characteristics, providing a flexible framework for hyperspectral medical image analysis.
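For context on the spectral similarity computations mentioned in the abstract, the sketch below illustrates a standard spectral-angle similarity between a clicked pixel's spectrum and every pixel of a hyperspectral cube. The array shapes, the normalisation to [0, 1], and the function name are illustrative assumptions, not the authors' implementation.

import numpy as np

def spectral_angle_map(cube, click_yx):
    """cube: (H, W, B) hyperspectral image; returns an (H, W) similarity map in [0, 1]."""
    ref = cube[click_yx]                                  # (B,) spectrum at the clicked pixel
    dots = np.einsum("hwb,b->hw", cube, ref)              # per-pixel dot product with the reference
    norms = np.linalg.norm(cube, axis=-1) * np.linalg.norm(ref) + 1e-8
    angles = np.arccos(np.clip(dots / norms, -1.0, 1.0))  # spectral angle in radians
    return 1.0 - angles / np.pi                           # small angle -> high similarity

# Example: similarity of every pixel to the spectrum at a clicked location.
cube = np.random.rand(128, 128, 100).astype(np.float32)
similarity = spectral_angle_map(cube, (64, 64))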
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1453_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/CVRS-Hamlyn/SAMSA
Link to the Dataset(s)
HeiPorSPECTRAL dataset: https://heiporspectral.org
HSI Brain Dataset: https://hsibraindatabase.iuma.ulpgc.es/
BibTex
@InProceedings{RodAlf_SAMSA_MICCAI2025,
author = { Roddan, Alfie and Czempiel, Tobias and Xu, Chi and Elson, Daniel S. and Giannarou, Stamatia},
title = { { SAMSA: Segment Anything Model enhanced with Spectral Angles for Hyperspectral Interactive Medical Image Segmentation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15968},
month = {September},
pages = {480 -- 490}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper presents a novel framework named SAMSA, which extends the Segment Anything Model 2 (SAM2) by integrating spectral analysis. The proposed method combines an RGB-based foundation model with spectral information through a unique spectral feature fusion strategy.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well written, clearly structured, and presents a well-motivated technical problem. The authors effectively communicate their contributions and the relevance of their work.
- A key strength lies in the proposed integration of spectral information into SAM2 via the SAMSA framework, which introduces a spectral feature fusion strategy. The way user interaction (clicks) is used to guide the RGB-based model while simultaneously serving as a reference for spectral comparison is interesting.
- The methodology seems to be sound and logically designed.
- The evaluation is thorough (with comparison between baseline models and fusion models) and demonstrates the method’s potential.
- The results demonstrate that the proposed method generalizes well to previously unseen labels and across datasets.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- While the integration of spectral data with an RGB-based foundation model is a valuable direction, the authors should clarify how their feature fusion strategy differs from or improves upon existing approaches in the literature.
- The authors did not address the influence of the location of the clicks in the method. This should at least be discussed in the paper.
- The comparison with other state-of-the-art segmentation methods could be more comprehensive.
- Although the validation demonstrates strong performance, Table 1 lacks statistical analysis to support the significance of the reported improvements.
- The paper would benefit from the inclusion of more qualitative visual examples of the segmentation outputs.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper addresses a relevant problem and presents a well-structured methodology that combines a foundation RGB model with spectral analysis through an original fusion strategy. The results are promising and show good generalization across unseen labels and datasets. However, it is important that the authors more clearly describe the novelty and advantages of their proposed fusion approach in comparison to existing methods.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper proposes an interactive segmentation approach for intraoperative hyperspectral imaging (HSI). The approach is SAM2-based and requires users to provide clicks. It uses two input modalities, hyperspectral and pseudo RGB images, and overcomes data limitations by leveraging RGB foundation models. The authors performed experiments and compared segmentation metrics across different input modalities, models, clicks, and datasets to demonstrate the effectiveness of their method.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Pipeline: The pipeline is complete and clear, with good illustrations of the input, model design, and output.
- Method: The hyperspectral and pseudo RGB images are fused to improve the segmentation performance. A set of fusion methods are compared.
- Training: Training of this model only needs a small amount of data, which can overcome the data scarcity faced in this field.
- Experiments: The experiments are complete and the results are good. The authors have a comprehensive discussion on their experiments.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Applicability: The applicability of intraoperative HSI segmentation is unclear. How can this technique help surgeons? How can this interactive segmentation be applied in the clinic if it requires clicks during surgery?
- Method: Do the clicks apply to a single frame only, or do they persist over a period of frames? Can this method be used during a real surgery, where the whole video cannot be accessed? How robust is the technique to the clicked points, i.e., how sensitive is it to picking appropriate points? How the learning takes place was also unclear.
- Speed: The inference speed is not reported. Can this method perform inference in real-time?
- Dataset: The equipment for data collection is not mentioned. Were the images in one dataset captured with the same device? Is there sensitivity depending on the acquisition device?
- Results: The tumor segmentation result in Table 2 is weak (only 0.576) after this class is removed from the training set. This does not demonstrate that the method can overcome data limitations when encountering unseen classes.
On a clinical note: the introduction oversells HSI as being available in the operating room, which it is not.
On a methodological note: the results do not yet meet clinical standards, and the inference speed is not provided.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The pipeline is nice, but there are limitations. See comments above.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
SAMSA, an interactive hyperspectral medical image segmentation framework, integrates an RGB-based foundation model with spectral analysis, thereby utilizing the advantages of both approaches. It supports few-shot and zero-shot learning, requiring only a small number of training instances to address the limitations of hyperspectral imaging data. Its spectral feature fusion technique is independent of spectral band count and resolution, effectively handling hardware variability and ensuring stable performance across diverse acquisition settings. Segmentation is performed interactively—fusing expert knowledge during the process—guided by user clicks. The system demonstrates competitive performance on both neurosurgical and intraoperative porcine datasets.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The interactive nature of SAMSA enables incorporation of expert knowledge during segmentation, facilitated by the prompt-based capability of SAM2. As a foundation model trained on a large dataset, SAM2 effectively captures spatial and contextual features.
- Integrating spectral similarity maps with spatial RGB features improves segmentation robustness, particularly in low-contrast or spectrally ambiguous regions. This complementary use of spectral and spatial cues helps overcome limitations of RGB-only models like SAM2, especially in challenging medical imaging scenarios.
- The paper evaluates both late fusion (multiplicative and UNet-based) and early fusion, ultimately proposing an early fusion strategy where spectral information is incorporated directly into the decoder’s feature space. This design is well-motivated and empirically shown to outperform late fusion, validating the architectural choice.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- While the figure provides high-level intuition of where fusion happens (i.e., early, inside the decoder), it still lacks the specific architectural or implementation detail necessary for clear technical reproducibility.
- SAMSA integrates spectral information into the upscaling path of the SAM2 mask decoder by fusing the spectral similarity map with high-resolution encoder features (S₀). However, the paper does not explore whether this is the most effective stage for fusion. An ablation study across different decoder levels (e.g., low-, mid-, and high-resolution stages) would strengthen the architectural justification and clarify whether earlier or deeper fusion might yield better performance.
- Using vanilla histogram equalization to enhance spectral images seems rudimentary; exploring more advanced techniques like CLAHE would have been beneficial. Methodological details regarding this step are missing. If there are other feature enhancement techniques available, please mention them.
- The paper lacks direct comparisons with existing state-of-the-art HSI segmentation methods on the same datasets (e.g., HiB or HeiPor), which would help contextualize the performance improvements of SAMSA.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- The references are inconsistently formatted and should be revised to conform with the LNCS style. For example, authors cite the original U-Net paper as an arXiv preprint instead of the MICCAI publication. Similarly, they have only included DOIs for some references.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- A more detailed description of the proposed fusion module is needed.
- Clarification of the specific spectral feature enhancement techniques used would help understand the work.
- Although the paper compares several internal baselines (e.g., SAM2Base, SAM2Tuned), it lacks quantitative comparisons with prior published methods on the same datasets (HiB or HeiPor). This makes it difficult to contextualize how much improvement SAMSA offers over existing HSI segmentation literature.
- The references are inconsistently formatted and should be revised to conform with the LNCS style. For instance, the original U-Net paper is cited as an arXiv preprint rather than the MICCAI publication, and DOIs are included only for some references.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We appreciate the reviewers’ thoughtful and constructive comments. Below, we address each point and describe changes made to the manuscript accordingly.
Reviewer 1
- Clarification on feature fusion strategy: We clarified how our spectral fusion differs from prior work, highlighting that existing approaches typically convert other modalities to grayscale, whereas we preserve modality-specific information and apply spectral fusion techniques not yet explored in multimodal segmentation.
- Click location influence: To standardize input and isolate the impact of spectral fusion, we place the first click at the center of the largest connected component and seed clicks identically across models. A detailed study of click placement is outside our scope.
- Comparison with SOTA segmentation methods: Interactive segmentation depends on user input, making direct comparison to automated methods non-trivial. We agree that bridging this gap is an important direction for future work.
- Statistical significance of results: We chose not to include formal statistical testing, as the improvements are consistent and large across conditions, and further analysis would not meaningfully alter our conclusions.
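For illustration, here is a minimal sketch (assumed, not the released code) of the click-seeding protocol described above: take the largest connected component of the target mask and place the first click at a central interior point. The use of scipy.ndimage and the distance-transform notion of "center" are assumptions of this sketch.

import numpy as np
from scipy import ndimage

def first_click(mask):
    """mask: (H, W) boolean target mask; returns (row, col) of the seed click."""
    labels, n = ndimage.label(mask)
    if n == 0:
        raise ValueError("empty mask")
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))  # pixel count per component
    largest = labels == (int(np.argmax(sizes)) + 1)           # keep the largest component
    depth = ndimage.distance_transform_edt(largest)           # distance to the component boundary
    return tuple(np.unravel_index(int(np.argmax(depth)), depth.shape))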
Reviewer 2
- Clinical applicability: Our method supports two primary use cases: efficient annotation and real-time surgical guidance. Clicks can be delivered through existing sterile-compatible interfaces such as foot pedals or voice commands.
- Click duration and learning: Our method is compatible with single-frame and video settings. Learning is not performed online; instead, the pretrained model is guided by click-based masks and spectral similarity. Robustness to click location is discussed in our response to Reviewer 1.
- Inference speed: The additional spectral operations introduce only ~0.037% computational overhead relative to SAM2's ViT-H encoder, preserving real-time performance.
- Dataset consistency: We used two datasets acquired with different systems to evaluate robustness across hardware variations. Details are provided in the Dataset section.
- Zero-shot tumor segmentation: This experiment demonstrates the benefit of spectral fusion for generalization to unseen classes, rather than clinical readiness.
- Claims about HSI in the OR: We already noted limited clinical adoption and have further softened the language to reflect this more clearly.
- Clinical readiness: Thank you for your comment. Regarding performance, our model achieves a macro DICE of approximately 90% with five clicks and around 80% with a single click. While these results may not yet meet the threshold for clinical deployment, especially when compared to rigorous clinical standards, we believe they demonstrate strong potential. Given that the current gold standard for intraoperative tissue characterization is frozen section analysis, a time- and labour-intensive process, our findings suggest that interactive segmentation models, even at this stage, could offer meaningful support in surgical decision-making.
Reviewer 3
- Reproducibility and architecture detail: We included the GitHub repository in the camera-ready version for full reproducibility.
- Justification for fusion location: We compared early and late fusion strategies. Decoder-level fusion performed best, likely due to better preservation of pretrained encoder features. Deeper exploration of fusion stages is a promising future direction.
- Spectral enhancement: We used simple histogram equalization to isolate and highlight the benefits of spectral fusion without additional complex preprocessing.
- Comparison with HSI segmentation methods: As discussed in our response to Reviewer 1, aligning evaluation protocols for interactive vs. automated methods remains a challenge, and we clarify this in the manuscript.
- Reference formatting: All references have been updated to follow LNCS style. The U-Net citation was corrected, and DOIs were added where available.
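As a rough illustration of the decoder-level early fusion discussed above, the PyTorch sketch below projects a click-derived spectral similarity map and adds it to high-resolution features before mask upscaling. The channel count, the 1x1 projection, and the additive fusion are illustrative guesses, not the paper's exact design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralFusion(nn.Module):
    def __init__(self, feat_channels=32):
        super().__init__()
        self.proj = nn.Conv2d(1, feat_channels, kernel_size=1)  # lift the (B, 1, H, W) map to feature space

    def forward(self, feats, sim_map):
        """feats: (B, C, h, w) high-resolution features; sim_map: (B, 1, H, W) spectral similarity."""
        sim = F.interpolate(sim_map, size=feats.shape[-2:], mode="bilinear", align_corners=False)
        return feats + self.proj(sim)                           # fuse before the mask upscaling path

# Example usage with dummy tensors.
fusion = SpectralFusion(32)
fused = fusion(torch.randn(1, 32, 64, 64), torch.rand(1, 1, 256, 256))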
We will also address all minor comments, including formatting and reference corrections, in the final version.
Meta-Review
Meta-review #1
- Your recommendation
Provisional Accept
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A