Abstract

The Segment Anything Model (SAM) is a powerful foundation model which has shown impressive performance for generic image segmentation. However, directly applying SAM to liver tumor segmentation presents challenges due to the domain gap between natural images and medical images, and the requirement of labor-intensive manual prompt generation. To address these challenges, we first investigate text-promptable liver tumor segmentation by Couinaud segment, where the Couinaud segment prompt can be automatically extracted from radiology reports to reduce massive manual efforts. Moreover, we propose a novel CouinaudSAM to adapt SAM for liver tumor segmentation. Specifically, we achieve this by: 1) a superpixel-guided prompt generation approach to effectively transform the Couinaud segment prompt into a SAM-acceptable point prompt; and 2) a difficulty-aware prompt sampling strategy to make model training more effective and efficient. Experimental results on the public liver tumor segmentation dataset demonstrate that our method outperforms the other state-of-the-art methods.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0139_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Lyu_SuperpixelGuided_MICCAI2024,
        author = { Lyu, Fei and Xu, Jingwen and Zhu, Ye and Wong, Grace Lai-Hung and Yuen, Pong C.},
        title = { { Superpixel-Guided Segment Anything Model for Liver Tumor Segmentation with Couinaud Segment Prompt } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15008},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a novel text prompt utilizing Couinaud segments and proposes a new method called CouinaudSAM to fully leverage the Couinaud segment prompts and adapt them for SAM. Experimental results on a liver tumor segmentation dataset demonstrate that this approach can achieve superior performance compared to other state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The method is novel overall. Firstly, the authors explore the application of a recent foundation model, SAM, in liver tumor segmentation.

    2. The authors utilize couinaud segments as prior knowledge and convert them into point prompts through superpixel-guided prompt generation.

    3. The paper is well-written and the figures are well-illustrated, making it easy to understand.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In the experimental section, the comparison of methods used by the authors is unfair. Other methods do not accept prompts as input. Although the authors claim that the results of other methods will be refined within the couinaud segmentation masks derived from the prompts, this comparison remains unfair. Providing region-level information as guidance at the input stage can significantly boost the model’s performance. Considering that the performance improvement of CouinaudSAM is actually limited (around 2% on the MSD08-161 dataset and 5% on the MSD08-142 dataset), the effectiveness of CouinaudSAM is uncertain.

    2. Although many SAM-based methods are mentioned in the introduction section of the paper, none of them are compared in the experimental stage, especially SAM-based methods that can achieve automatic segmentation.

    3. The comparisons in the ablation study are insufficient. The authors should compare the case where region-level guidance based on Couinaud segmentation is provided and used as a box prompt, which is then input into the fine-tuned SAM. The core idea of this paper is to convert region-level guidance into more precise point prompts. Without this comparison, the effectiveness of the proposed method also remains uncertain.

    4. The method seems to be very time-consuming during inference. The time required for SLIC and Couinaud segment mask generation is unclear, and when N_t is large, multiple inferences are needed to aggregate the final results. The authors do not report a time comparison with other methods. In clinical applications, this method may not be cost-effective, as it trades more time and region-level guidance information for a small performance improvement.

    5. The authors only perform comparisons on a single dataset. They should consider using the LiTS dataset for further evaluation.

    6. The implementation of Slice Prompt and Volume Prompt is unclear for other prompt-unsupported networks, and does not seem to be explicitly explained in other parts of the paper.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The experimental comparison should either provide the same region-level guidance to other methods, for example by cropping the foreground regions of the Couinaud segmentation as the network input, or clearly state the limitations of the comparison.

    2. Comparing CouinaudSAM with other SAM-based automatic segmentation methods mentioned in the introduction would further validate the proposed method’s effectiveness.

    3. The ablation study should also include a comparison with region-level guidance provided as a box prompt to the fine-tuned SAM, as this is crucial to demonstrate the core idea of converting region-level guidance into point prompts.

    4. The authors should provide more details about the inference time for each step and compare the overall inference time with other methods, considering the trade-off between time, region-level guidance, and performance improvement.

    5. Evaluating the method on additional datasets like LiTS would also demonstrate its generalizability and robustness.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Based on the strengths and weaknesses discussed, I recommend a weak reject for this paper. While the paper is well-written and presents a novel approach, there are several major factors that led to this decision. Firstly, the experimental comparisons with other methods are unfair due to the difference in input information, making it difficult to assess the true effectiveness of CouinaudSAM. Additionally, the lack of comparisons with other SAM-based automatic segmentation methods and the absence of a crucial ablation study comparing region-level guidance as a box prompt to the fine-tuned SAM raise concerns about the validity of the proposed method. Furthermore, the uncertainty regarding inference time and the limited evaluation on a single dataset suggest that more comprehensive experiments are needed to demonstrate the method’s clinical feasibility and generalizability.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors have justified their technical contributions in their rebuttal. Other unclear details have also been addressed. With the fair technical contributions and good results, I raised the score to weak accept. The authors mentioned in Q8 that the results were “refined within the Couinaud segmentation masks derived from the prompts.” However, it remains unclear to me what the difference is between these refinements under the “slice” and “volume” conditions. It would be helpful if the authors could provide further clarification on this point in the camera-ready version of the paper.



Review #2

  • Please describe the contribution of the paper

    This paper proposes CouinaudSAM, which introduces a novel text prompt, enabling SAM to adapt to liver tumor segmentation tasks. Specifically, this paper first presents a superpixel-guided prompt generation approach to convert the Couinaud segment text prompt into a SAM-acceptable point prompt. Then, this paper introduces a difficulty-aware prompt sampling strategy to improve the efficiency of model training. Trained and validated on the MSD08 dataset, CouinaudSAM outperforms the other state-of-the-art methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. It is innovative to convert Couinaud segment text prompt into dual point prompt, which can provide position information.
    2. Couinaud segment prompt can be automatically extracted from radiology reports. Compared with manual annotation, CouinaudSAM can reduce the annotation cost and maintain competitive performance.
    3. This paper is well-presented, and the figures provide a clear understanding of the workflow, data, and Couinaud segment prompt.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The proposed method is only evaluated on the MSD08 dataset, which may not be sufficient to generalize the results to other datasets or other organs.
    2. Need more explanation in the comparison with the state-of-the-art method.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. It would be beneficial to explore the generalizability of the proposed framework to other segmentation tasks and to use larger publicly available datasets for evaluation.
    2. This paper should provide an analysis of hyper-parameters, such as the balanced sampling ratio.
    3. Discuss more about the pros and cons of each method.
    4. In this paper, the authors claim that Superpixel can avoid missing small tumors. Could you please provide some visualization results?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents good results that demonstrate the effectiveness of the proposed approach. This indicates the potential impact of the method in the field of liver tumor segmentation. Overall, this paper has more contributions than the weakness points.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    This paper demonstrates the effectiveness of the proposed approach through promising results, suggesting its potential impact on liver tumor segmentation.



Review #3

  • Please describe the contribution of the paper

    The authors propose a new type of text prompt using the Couinaud segment, aiming to avoid laborious manual prompt creation. Specifically, they propose CouinaudSAM to take full advantage of the Couinaud segment prompt and make it adaptable to SAM. Experimental results on the liver tumor segmentation dataset demonstrate the effectiveness of CouinaudSAM.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The motivation is natural and aligns with medical anatomy knowledge.
    2. CouinaudSAM is novel and effective.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In Figure 3, the improvement of the proposed method over the grid-based generation of point prompts does not seem very significant. Additionally, there remains a certain gap compared to the manual approach.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    no

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. In the introduction, the paper claims “it can avoid missing small tumors and also benefit the model training with more informative point prompts.” Will this approach lead to more false positives?

    2. The task of liver tumor segmentation is relatively simple. It would be better if the method could be extended to the segmentation of other types of tumors.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is well-presented in terms of motivation, experimental design, and organization, so I am giving it a weak accept.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The rebuttal addressed some of my concerns. Considering the overall presentation of the paper, I am maintaining my rating as weak accept.




Author Feedback

We would like to thank the reviewers for their insightful comments. In what follows are the major concerns and our responses.

•Q1(R1,R3,R4). The generalizability to other segmentation tasks and the use of more datasets for evaluation. #A1: Additional experiments were performed but not listed due to the page limit. (1) Experiments on more datasets: We also performed liver tumor segmentation on the LiTS dataset, and the results support the same conclusion that CouinaudSAM is superior. We will add the conclusions of these experiments to the results section. (2) Experiments on other segmentation tasks: We also worked on lung tumor segmentation and found that CouinaudSAM is applicable, since the lung can be divided into 5 lobes, and that it shows good results. We did not add it because this work focuses on liver tumors, but the extension to other tasks was discussed in the original conclusion section.

•Q2(R1,R4). More comparison with the SOTA methods, such as SAM-based automatic methods. #A2: In Fig.3(b), we have provided the results of the original SAM-based automatic methods with a 32x32 or 64x64 regular grid of points, and CouinaudSAM shows superiority. For other SAM variants, the major difference lies in the parameter-efficient transfer learning (PETL) approach: SAMed uses LoRA (DSC: 66.26%), MA-SAM uses FacT (DSC: 67.93%), and MSA uses Adapter (DSC: 64.59%). Since PETL is not the major contribution of our work, we follow MA-SAM since FacT shows the best result. Moreover, we cannot directly compare pre-trained SAM variants, because the test dataset is already involved in their training, which may lead to an unfair comparison.

•Q3(R1,R3). Superpixel can avoid missing small tumors. #A3: Our motivation is to obtain point prompts that represent meaningful regions based on superpixels rather than a regular grid. A small tumor may be missed when the grid is sparse, but it may still be caught using superpixels, since the tumor region is different from its surrounding region. Thank you for your suggestion; we will change it to ‘Superpixel can help reduce the risk of missing small tumors’, since our previous expression may be too strong.
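
The contrast between a sparse regular grid and superpixel-derived points can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the superpixel label map is assumed to be precomputed (for example by SLIC), the function names `grid_points` and `superpixel_points` are hypothetical, and picking the member pixel nearest the centroid is one possible choice of representative point.

```python
# Illustrative sketch only: names and the point-selection rule are assumptions.

def grid_points(height, width, step):
    """Regular grid of (row, col) points, one every `step` pixels."""
    offset = step // 2
    return [(r, c) for r in range(offset, height, step)
                   for c in range(offset, width, step)]

def superpixel_points(labels, segment_mask):
    """One (row, col) point per superpixel inside the prompted segment.

    labels: 2D list of superpixel ids (assumed precomputed, e.g. by SLIC).
    segment_mask: 2D list of 0/1 marking the prompted Couinaud segment.
    The point is the member pixel closest to the superpixel centroid, so it
    always lies inside the superpixel.
    """
    members = {}
    for r, row in enumerate(labels):
        for c, sp in enumerate(row):
            if segment_mask[r][c]:
                members.setdefault(sp, []).append((r, c))
    points = []
    for sp in sorted(members):
        pix = members[sp]
        cr = sum(p[0] for p in pix) / len(pix)
        cc = sum(p[1] for p in pix) / len(pix)
        points.append(min(pix, key=lambda p: (p[0] - cr) ** 2 + (p[1] - cc) ** 2))
    return points

# Toy 8x8 slice: superpixel 1 is the surrounding tissue, superpixel 2 is a
# 2x2 region that differs from its surroundings (a stand-in for a small tumor).
labels = [[1] * 8 for _ in range(8)]
for r in (2, 3):
    for c in (5, 6):
        labels[r][c] = 2
segment_mask = [[1] * 8 for _ in range(8)]
tumor = {(r, c) for r in (2, 3) for c in (5, 6)}

sparse_grid = grid_points(8, 8, step=8)   # a single point at (4, 4)
sp_points = superpixel_points(labels, segment_mask)

grid_hits = [p for p in sparse_grid if p in tumor]   # grid misses the region
sp_hits = [p for p in sp_points if p in tumor]       # one point lands inside it
```

In this toy case the sparse grid places no point on the small region, while the superpixel-based sampler places exactly one there. Whether this holds on real data depends on whether the superpixel algorithm actually separates the tumor from its surroundings.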

•Q4(R1). Analysis of hyper-parameters. #A4: Hyper-parameter analysis is not listed due to the page limit. For example, we conducted experiments using balanced sampling ratios from 0.1 to 0.9, and 0.6 was selected since it gives the best performance.

•Q5(R3). In Fig.3, the improvement over grid-based point prompt is not significant, and there exists a gap compared to the manual approach. #A5: Our method is superior to grid-based methods from 3 perspectives: (1) Better performance. (2) Less inference time. (3) When the resolution of the grid is higher, it brings a higher false positive rate and benefits more from the refinement of the Couinaud segment mask. Compared to the manual approach, our method shows inferior results, but it also greatly reduces the manual effort of point prompt generation, which is the goal of this work.

•Q6(R4). Convert Couinaud segmentation into a box prompt. #A6: We have performed your suggested experiments but the results are not as good as expected. When the tumor exists in all segments of the liver, then the bounding box derived from the prompt covers the whole liver region. The region-level guidance may not be significant since the tumor segmentation result is finally refined within the liver region.
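
The box-prompt argument above can be made concrete with a small sketch. It assumes a 2D map of Couinaud segment ids (0 for pixels outside the liver) and a hypothetical helper `box_from_segments`; neither the helper nor the toy layout comes from the paper.

```python
# Illustrative sketch only: the helper name and toy segment layout are assumptions.

def box_from_segments(couinaud, prompted):
    """Tight (r0, c0, r1, c1) box around all pixels whose segment id is prompted.

    couinaud: 2D list of segment ids (0 = outside the liver).
    prompted: set of segment ids named in the prompt.
    Returns None when no pixel matches.
    """
    coords = [(r, c) for r, row in enumerate(couinaud)
                     for c, v in enumerate(row) if v in prompted]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

# Toy 6x6 slice: the liver occupies rows 1-4 and columns 1-4, split into two
# segments (ids 2 and 7 are arbitrary labels chosen for this example).
couinaud = [[0] * 6 for _ in range(6)]
for r in range(1, 5):
    for c in range(1, 5):
        couinaud[r][c] = 2 if c <= 2 else 7

single_segment_box = box_from_segments(couinaud, {7})     # narrower than the liver
all_segments_box = box_from_segments(couinaud, {2, 7})    # prompt names every segment
whole_liver_box = box_from_segments(couinaud, set(range(1, 9)))
```

When the prompt names every segment present, the derived box coincides with the liver bounding box, so it adds no localization beyond the liver mask itself; a prompt naming a single segment yields a tighter box.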

•Q7(R4). Inference time comparison. #A7: We use SLIC to generate superpixels within the liver. It is performed on a CPU machine and the results are saved locally, which takes around 20 seconds per volume. The total inference time for the 32 test volumes in MSD08-142 is 12 min for CouinaudSAM and 19 min for the 32x32 grid, so our method takes less time than the original SAM-based automatic segmentation methods.

•Q8(R4). Slice Prompt and Volume Prompt. #A8: It was explained in Section 3.3 (second-to-last sentence). The results of other prompt-unsupported networks are refined within the Couinaud segmentation masks derived from the prompts.
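
One plausible reading of "refined within the Couinaud segmentation masks" is that predicted tumor pixels outside the prompted segments are discarded. The sketch below illustrates that reading only; it is an assumption, the function names are hypothetical, and it does not address how the slice-level and volume-level conditions differ.

```python
# Illustrative sketch only: masking is an assumed interpretation of "refined".

def refine_within_segments(pred, couinaud, prompted):
    """Keep predicted tumor pixels only where the segment id is prompted."""
    return [[int(bool(p) and s in prompted) for p, s in zip(prow, srow)]
            for prow, srow in zip(pred, couinaud)]

def dice(a, b):
    """Dice similarity coefficient between two binary 2D masks."""
    inter = sum(1 for ra, rb in zip(a, b) for x, y in zip(ra, rb) if x and y)
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 2 * inter / total if total else 1.0

# Toy 4x4 slice: columns 0-1 belong to segment 2, columns 2-3 to segment 7
# (arbitrary ids for this example). The tumor lies in segment 7.
couinaud = [[2, 2, 7, 7] for _ in range(4)]
gt = [[0, 0, 0, 0],
      [0, 0, 1, 1],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]
pred = [[0, 0, 0, 0],
        [0, 0, 1, 1],
        [1, 0, 0, 0],   # false positive in segment 2
        [0, 0, 0, 0]]

refined = refine_within_segments(pred, couinaud, prompted={7})
dice_before = dice(pred, gt)      # 2 * 2 / (3 + 2) = 0.8
dice_after = dice(refined, gt)    # the false positive is removed
```

The example also shows why such refinement can favor any method that produces false positives outside the prompted segments, which is relevant to the fairness concern about comparing prompted and unprompted networks.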




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper introduces CouinaudSAM, a novel method that uses Couinaud segment text prompts to adapt SAM for liver tumor segmentation. The approach employs a superpixel-guided prompt generation technique to convert text prompts into SAM-acceptable point prompts and a difficulty-aware prompt sampling strategy to enhance training efficiency. Validated on the MSD08 dataset, CouinaudSAM shows superior performance compared to state-of-the-art methods. Strengths of the paper include the innovative prompt generation, automation of annotation from radiology reports, comprehensive presentation, and the novel application of SAM aligned with medical anatomy knowledge. However, weaknesses include limited evaluation on only the MSD08 dataset, insufficient comparative analysis with other SAM-based methods, gaps in the ablation study, potentially time-consuming inference, unclear implementation details for other networks, and the need for further evaluation on additional datasets like LiTS.

    Given the consensus to accept among all reviews and that the merits outweigh the weaknesses, the meta-reviewer recommends accept.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper received mixed initial reviews: 2 WA, 1 WR. After the rebuttal, the WR was raised to WA. The paper has merits: it is clinically meaningful to incorporate liver Couinaud segment text as a prompt in SAM to segment liver tumors, it is well presented, and it shows improved results. However, some concerns about the experimental results initially raised by R4 are still valid. 1) Lack of comparison to models that use the Couinaud segment mask as guidance at the input stage, which usually significantly boosts the model’s performance. Considering that the performance improvement of CouinaudSAM is relatively minor (~2% on the MSD08-161 dataset), this comparison is important to validate the contribution of the proposed method. 2) The proposed Couinaud segment prompt has a limited improvement over the regular grid prompt generation scheme (71.22% vs 72.08%). 3) It is unknown whether the compared methods such as nnUNet are trained in 3D or 2D.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



