Abstract

Source-Free Domain Adaptation (SFDA) is important for dealing with domain shift in medical image segmentation when neither the source data nor labels of target domain images are available. However, existing SFDA methods have limited performance due to insufficient supervision and unreliable pseudo labels. To address these issues, we propose a novel Iterative Pseudo Label Correction (IPLC) SFDA framework guided by the Segment Anything Model (SAM) for medical image segmentation. Specifically, with a pre-trained source model and SAM, we propose multiple random sampling and entropy estimation to obtain robust pseudo labels and mitigate noise. We introduce mean negative curvature minimization to provide stronger constraints and achieve smoother segmentation. We also propose an Iterative Correction Learning (ICL) strategy that iteratively generates reliable pseudo labels with updated prompts for domain adaptation. Experiments on a public multi-site heart MRI segmentation dataset (M&MS) demonstrate that our method effectively improves the quality of pseudo labels and outperforms several state-of-the-art SFDA methods. The code is available at https://github.com/HiLab-git/IPLC.
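As a rough illustration of the multiple random sampling and entropy estimation described in the abstract, the sketch below samples K random point-prompt sets from a foreground mask and fuses K probability maps by weighting each with the inverse of its mean entropy (lower entropy, i.e. higher confidence, gets higher weight). The function names, the fusion rule, and the use of NumPy are illustrative assumptions, not the authors' implementation; in the paper the K predictions come from SAM given the K sampled prompts.

```python
import numpy as np

def sample_prompts(foreground_mask, num_points, k, rng):
    """Randomly sample `num_points` foreground pixels K times,
    yielding K different point-prompt sets (illustrative MRS step)."""
    ys, xs = np.nonzero(foreground_mask)
    idx = np.stack([ys, xs], axis=1)
    return [idx[rng.choice(len(idx), size=num_points, replace=False)]
            for _ in range(k)]

def fuse_pseudo_labels(prob_maps):
    """Fuse K binary-class probability maps into one pseudo label,
    weighting each map by the inverse of its mean pixel-wise entropy.
    Assumed fusion rule, not the paper's exact formula."""
    eps = 1e-8
    probs = np.stack(prob_maps)                       # (K, H, W)
    ent = -(probs * np.log(probs + eps)
            + (1 - probs) * np.log(1 - probs + eps))  # per-pixel entropy
    weights = 1.0 / (ent.mean(axis=(1, 2)) + eps)     # one weight per map
    weights /= weights.sum()
    fused = np.tensordot(weights, probs, axes=1)      # weighted average
    return (fused > 0.5).astype(np.uint8)
```

The ICL strategy would then re-run this loop: train on the fused pseudo labels, re-predict the foreground mask with the updated model, and resample prompts from it.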

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1958_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1958_supp.pdf

Link to the Code Repository

https://github.com/HiLab-git/IPLC

Link to the Dataset(s)

https://www.ub.edu/mnms/

BibTex

@InProceedings{Zha_IPLC_MICCAI2024,
        author = { Zhang, Guoning and Qi, Xiaoran and Yan, Bo and Wang, Guotai},
        title = { { IPLC: Iterative Pseudo Label Correction Guided by SAM for Source-Free Domain Adaptation in Medical Image Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposed a new SFDA model for medical image segmentation. Given that many SFDA models rely on pseudo labels, the authors suggested an iterative correction mechanism guided by the Segment Anything Model to increase the quality of these labels. They further randomly sampled prompts and computed their entropy weights. They also proposed a new mean negative curvature minimization for smoother segmentation edges. The model was evaluated on a public dataset, and the results and ablation studies support the claims.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Very straightforward and easy to read
    2. The proposed model is well described
    3. Many current SFDA models rely on pseudo labels, and in many cases these labels can misguide target training; therefore, improving the quality of pseudo labels is indeed necessary.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Evaluation on a single segmentation dataset is somewhat limited.
    2. The choice of one scanner vendor as the source domain and the rest as the target domains is questionable. A fairer comparison would treat each vendor once as source and once as target.
    3. In the ablation studies, the average improvement from adding the different components to SAM was small.
    4. Did the authors run any statistical experiments to evaluate the shift between vendors (rather than reporting the Target-only comparison with Source-only)?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Did the authors try any other medical pre-training instead of SAM-Med2D?
    2. In the tables, indicating whether higher or lower values are better for each metric would be helpful.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses a very important problem in medical imaging where there is domain shift but access to the source data for adaptation is limited due to privacy reasons. Within this framework, the quality of pseudo labels can be very important. Using existing medical SAM models and improving the pseudo labels for further performance gains on target training is impactful.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces a new framework for source-free domain adaptation in medical image segmentation. This method enhances pseudo label robustness through advanced sampling and entropy weighting, incorporates curvature minimization for detailed results, and employs an iterative strategy to refine labels and improve adaptation. The validation of the proposed framework on a public dataset, in comparison with existing methodologies, has demonstrated its effectiveness and impressive performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The manuscript is clearly composed and straightforward to understand.
    2. Problem is well-formulated, and the extensive results effectively demonstrate the effectiveness.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The experimental details on the comparison and ablation study are unclear.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors provide adequate implementation details, facilitating reproduction of the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. There are some typos and errors. For instance, the background has not been removed in Figure 1. There is a typo in “Lable” in Figure 1.
    2. It is unclear whether the background component is included in Equation 5.
    3. In Table 1, the use of the t-test to compare ‘source only’ with other methods, including the proposed method, raises questions regarding its suitability. What is the rationale for this choice?
    4. The discussion of recent works in Section 3.2 appears to duplicate what is already covered in the introduction.
    5. In Table 2, it is not entirely clear how the methodology was applied when iterations are excluded, especially regarding the inclusion of L_curv in the ICL methodology as described in Section 2.4.
    6. Figure 3 displays varying trends across different domains, raising concerns about whether the random seed was fixed during the experiments to ensure consistent and comparable conditions.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a novel method and solves a practical problem.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    To address the negative effects of insufficient supervision signals and noisy pseudo-labels in existing Source-Free Domain Adaptation (SFDA) methods, the paper exploited a SAM-Med2D-guided SFDA pipeline to adapt the pre-trained model to the target domain. Additionally, the paper introduced a curvature-based complementary loss as a shape prior constraint in the domain adaptation process. The proposed method was evaluated on the multi-site heart MRI segmentation (M&MS) dataset and demonstrated to outperform other SFDA methods on segmentation tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper introduces a novel prompt generation method for the SAM-Med2D model, enabling SAM-Med2D to provide pseudo labels for target organs without additional fine-tuning.
    2. This paper conducted experiments demonstrating that the proposed method outperforms other recent SFDA methods on the multi-site heart MRI segmentation dataset.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The predicted pseudo-labels of the proposed method lack consistency and explainability, which hinders its application in real-world scenarios. According to the methodology section, the pseudo-label generated by the SAM-Med2D model directly depends on the prompts generated by the Multiple Random Sampling (MRS) strategy, which randomly samples X points in the foreground binary map K times to generate K different prompts. The generation of prompts is thus partially based on uncontrollable randomness, which makes it difficult to ensure reproducibility. Moreover, the prompts sampled by the MRS strategy are hard to interpret; there are no principles or constraints guiding the selection of foreground point prompts.
    2. The paper did not clarify the medical data used to train the SAM-Med2D model and its possible relevance to the proposed method. As far as I know, SAM-Med2D has been fine-tuned on MRI data for the myocardium segmentation task, so there is a potential risk of test data leakage.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I did not see code or a link in the submission, so I am not sure how to comment on reproducibility. However, I am deeply concerned about the reproducibility and consistency of the pseudo-labels generated by the proposed method.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The authors could consider conducting pseudo-label analysis with a comprehensive set of evaluation metrics to examine the consistency of the pseudo-labels.
    2. The authors could consider replacing the SAM-Med2D model with another medical SAM method based on few-shot fine-tuning to avoid concerns about data leakage.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation for leveraging the domain generalization ability of SAM to help the pre-trained model overcome the domain shift makes sense. However, I am still concerned about the data leakage problem and the consistency of the pseudo-label generation of the proposed method. Therefore, I decided to weakly accept the paper, the final decision will highly depend on the rebuttal.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We warmly thank the reviewers for their positive and constructive comments. They note that our method is “novel” (R3, R4) and “impactful” (R1), that our paper is “well-written and easy to follow” (R1, R3), and that our experiments are “comprehensive” (R3, R4). In future research, we will conduct experiments using more pre-trained SAMs and extend our evaluation to other datasets to further demonstrate our proposed method’s effectiveness. We thank the reviewers for pointing out the typos and we will correct them in the final manuscript. More experimental details will be included in the upcoming code release.




Meta-Review

Meta-review not available, early accepted paper.


