Abstract

Conventional 3D medical image segmentation methods typically require learning heavy 3D networks (e.g., 3D-UNet), as well as large amounts of in-domain data with accurate pixel/voxel-level labels to avoid overfitting. These solutions are thus not only extremely time- and labor-expensive, but may also easily fail to generalize to objects unseen during training. To alleviate this issue, we present MSFSeg, a novel few-shot 3D segmentation framework with a lightweight multi-surrogate fusion (MSF) module. MSFSeg is able to automatically segment 3D objects/organs unseen during training, provided with one or a few annotated 2D slices or 3D sequence segments, via learning dense query-support organ/lesion anatomy correlations across patient populations. Our proposed MSF module mines comprehensive and diversified morphology correlations between unlabeled and the few labeled slices/sequences through multiple designated surrogates, making it able to generate accurate cross-domain 3D segmentation masks given annotated slices or sequences. We demonstrate the effectiveness of our proposed framework by showing superior performance on conventional few-shot segmentation benchmarks compared to prior art, and remarkable cross-domain cross-volume segmentation performance on proprietary 3D segmentation datasets for challenging entities, i.e., tubular structures, with only limited 2D or 3D labels.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3732_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3732_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zhe_FewShot_MICCAI2024,
        author = { Zheng, Meng and Planche, Benjamin and Gao, Zhongpai and Chen, Terrence and Radke, Richard J. and Wu, Ziyan},
        title = { { Few-Shot 3D Volumetric Segmentation with Multi-Surrogate Fusion } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a novel method, MSFSeg, for volumetric segmentation given only a few labeled support images. This can be used to reduce the time- and labor-expensive work of image segmentation for multiple clinical tasks. The MSFSeg pipeline uses a multi-surrogate fusion-informed few-shot network, using a few 2D slices or 2D sequences as support. The method is evaluated on abdominal CT and MRI segmentation benchmarks and few-shot weakly-supervised 3D segmentation, and compared with several previous methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A novel method for 3D segmentation is presented, outperforming previous methods. Thorough evaluation is done on several tasks and comparing with several methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Details about implementation and architecture are lacking. The paper refers to “supplementary material”, but no supplementary material is available except the video with segmentations (which is not referred to in the paper).
    • Testing for statistically significant differences has not been done.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    In the paper, it’s stated that details for implementation and architecture are given in the supplementary material, but no supplementary material is available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Table 1: abbreviations LK, RK are not introduced. The horizontal line between “Super-pixel” and “or -voxel” doesn’t make sense and can be removed.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-written and the method is thoroughly evaluated, being compared to multiple other methods. However, details about implementation are missing (the supplementary file?).

  • Reviewer confidence

    Not confident (1)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces a few-shot 3D segmentation pipeline that leverages Multi-Surrogate Fusion (MSF) to enhance the prediction of 3D masks with limited labeled support data. By exploring dense foreground/background pixel relationships and mining semantic features across multiple support slices/sequences, MSFSeg excels in predicting unseen 3D objects with just a few support 2D slices or sequences.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. This paper employs a novel multi-surrogate fusion approach to achieve 1-shot or 5-shot few-shot 3D segmentation, and it supports 2D or 3D input as the support set.

    2. The method proposed by the authors achieved the best results on the Abdomen-CT and CHAOS-MRI datasets and also outperformed VAT on their proprietary dataset, which was unseen during training.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. Is there any relevant research that serves as a theoretical basis for multi-surrogate fusion?

    2. Why was VAT not included as a baseline in the comparative experiments of Table 1?

    3. In the section discussing innovative aspects, the paper mentions that it successfully segments challenging objects such as tubular structures across different domains. Can you provide visual results to demonstrate this capability?

    4. The figure caption for Fig. 2 should explain the excessive use of different colors for the connecting lines.

    5. The paper initially provides the dimensions of the variables, but fails to specify sizes in later sections (for example, the input and output dimensions in the multi-surrogate fusion part).

    6. Can the experimental section provide evidence that the multi-surrogate fusion is capable of extracting information related to coherence, diversity, and stability?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    no

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See main weaknesses

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The introduction of MSFSeg with Multi-Surrogate Fusion addresses the challenges of few-shot 3D segmentation by leveraging diverse morphology correlations and comprehensive anatomy information across patient populations.

    2. MSFSeg’s ability to perform well in cross-domain segmentation tasks, including challenging entities like tubular structures, with only limited 2D or 3D labels, highlights its robustness and generalizability.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposed a few-shot weakly-supervised 3D segmentation framework with the self-attention mechanism and the designed multi-surrogate fusion solution. Experiments on cross-domain datasets reveal the effectiveness of the framework.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The query features are aggregated by the designed MSF to capture consistent and patient-specific morphology correlations across different patient data.

    Sufficient experiments demonstrate the effectiveness of the method.

    The paper is very well organized and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The analysis on the selection of the surrogates in the MSF method is insufficient, and the presentation of the method could be better as seen in the detailed comments below.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Figure 1 and Figure 2 use the lung lobe as an example; however, no experiments on lung lobe datasets were found in this paper. Why did the authors choose this way to present the method? Did the authors conduct experiments on lung datasets?

    My main concern is the selection of the surrogates. When the number of shots is 1, the coherence, diversity, and stabilization surrogates are all the same surrogate. How can this situation be handled to acquire more morphological surrogate information?

    The selection criterion for the support images/sequences is not clear. Are the slices selected fully at random within a volume, or across patients?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of the paper is satisfactory and the experiments are sufficient to prove the effectiveness of the framework. Including more technical details and removing the misleading presentation could improve the quality of the paper. The aforementioned points led to this overall score.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper proposes MSFSeg, a novel few-shot 3D segmentation framework with a lightweight multi-surrogate fusion (MSF). MSFSeg can automatically segment 3D objects/organs (unseen during training) provided with one or a few annotated 2D slices or 3D sequence segments, via learning dense query-support organ/lesion anatomy correlations across patient populations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper presents a novel Multi-Surrogate Fusion (MSF) method for few-shot 3D segmentation, significantly enhancing the ability to handle limited annotated data by capturing diversified morphology correlations across patients. This method demonstrates clinical feasibility through robust validation on complex anatomical structures in medical imaging, showing superior performance over existing models. The innovative use of annotated slices from different scans and patients as support during training showcases an original approach to improving generalization in medical image analysis.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The network lacks some details required for reproducibility. This may be due to the missing supplementary information.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?
    • The authors refer to additional details in the supplementary materials, but the supplementary materials again contain the same paper. This may be an oversight. Please add it for reproducibility.
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Does MSFSeg take 2D query images from 3D volumes? In the introduction, the authors contrast their method with existing methods as follows: “Existing few-shot 3D-segmentation solutions [2, 19, 20, 24, 31] often rely on 2D FSS networks that treat each slice individually”. However, in 2.1, the authors say “we base the proposed MSFSeg on existing few-shot segmentation pipelines [18,24], which takes one 2D query image of size h×w”
    • Minor - please switch the colors of cross and ticks in Table 1 and other places.
    • Missing supplementary information.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper has novelty, strong experimental validation and is very well-written with detailed results.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We are grateful to the reviewers (@R1,R3,R4,R5) for the constructive feedback, as well as the recognition of the strengths of our work such as the satisfying novelty (@R3,R4,R5), extensive experimental evaluations (@R1,R3,R4,R5), and state-of-the-art performance on few-shot segmentation tasks (@R1,R3,R5).

Concerns and Questions:

– Supplementary material, implementation Details (@R1,R3,R4,R5). We apologize for the confusion. We accidentally uploaded the wrong file in our initial submission and will add implementation details back in the final version of our main paper.

– Statistically significant difference (@R1). Thank you for your suggestion. We will add statistical analysis of our experiments in the final version.
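
One common way to carry out the kind of significance testing requested here is a paired permutation (sign-flip) test on per-volume scores from two methods. The sketch below is only an illustration of that general technique; the page does not say which test the authors plan to use, the function name `paired_permutation_test` is hypothetical, and the Dice values are invented toy numbers, not results from the paper.

```python
import random

def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-case scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired (same length)")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        # Small tolerance guards against floating-point ties.
        if abs(sum(flipped) / len(flipped)) >= observed - 1e-12:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)

# Toy per-volume Dice scores, invented for illustration only (NOT from the paper).
dice_method_a = [0.81, 0.77, 0.84, 0.79, 0.88, 0.73, 0.80, 0.86]
dice_method_b = [0.78, 0.75, 0.80, 0.74, 0.85, 0.72, 0.76, 0.81]
p_value = paired_permutation_test(dice_method_a, dice_method_b)
print(f"estimated two-sided p-value: {p_value:.4f}")
```

The test is paired because both methods are scored on the same volumes, so per-volume difficulty cancels out in the differences.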

– Typos, formats and figure presentation (@R1,R3,R5). We apologize for the confusion. We will revise accordingly in the final version.

– VAT as 3D few-shot segmentation baseline (@R3). 1. In Table 1, we evaluate against conventional 1-/5-shot (2D) segmentation methods on the Abdomen-CT and CHAOS-MRI datasets, and our method is not based on VAT. Moreover, VAT does not directly provide evaluation results on these two benchmarks. Thus we did not reproduce VAT on conventional 2D few-shot segmentation tasks. 2. Since most existing few-shot segmentation methods do not have an open-source codebase available, we chose to adapt VAT (code available) to the 3D few-shot segmentation case as our baseline for performance evaluation on few-shot 3D segmentation tasks.

– Visualization of tubular data segmentation (@R3). The visualization of few-shot 3D segmentation on tubular data is presented in the videos included in the uploaded supplementary zip file.

– One-shot case (@R4). We apologize for omitting the details in our main paper. For 1-shot case, we randomly apply data augmentation techniques, e.g. horizontal flipping, affine transformations or color jittering and input “n” number of augmented support images (as pseudo n-shot case) to explore comprehensive morphology information from the single support.
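
The pseudo n-shot idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: images and masks are plain nested lists, only horizontal flipping and a global intensity scaling are shown (affine transformations are left out for brevity), and the names `hflip`, `jitter_intensity`, and `make_pseudo_n_shot` are hypothetical.

```python
import random

def hflip(image):
    """Mirror a 2D array (list of rows) left to right."""
    return [row[::-1] for row in image]

def jitter_intensity(image, rng, max_scale=0.1):
    """Scale all intensities by one random factor in [1 - max_scale, 1 + max_scale]."""
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    return [[v * scale for v in row] for row in image]

def make_pseudo_n_shot(image, mask, n, seed=0):
    """Turn one annotated support slice into n augmented (image, mask) pairs."""
    rng = random.Random(seed)
    supports = []
    for _ in range(n):
        img = [row[:] for row in image]
        msk = [row[:] for row in mask]
        if rng.random() < 0.5:
            # Geometric transforms must be applied to image and mask together.
            img, msk = hflip(img), hflip(msk)
        # Intensity changes touch the image only; the label stays unchanged.
        img = jitter_intensity(img, rng)
        supports.append((img, msk))
    return supports

# Tiny toy slice and mask, for illustration only.
image = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
mask = [[0, 1, 1], [0, 0, 1]]
supports = make_pseudo_n_shot(image, mask, n=5, seed=1)
print(len(supports), "augmented support pairs")
```

The key design point is that geometric augmentations are shared between image and mask, whereas intensity augmentations leave the annotation untouched.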

– Selection mechanism of the supports (@R4). For the conventional 2D FSS task (Tab. 1), we randomly select n support slices within the same data volume for each query image (intra-patient), for fair comparison with existing FSS methods. For few-shot 3D segmentation on our proprietary data (Tab. 2), intra-volume and inter-volume refer to selecting support images/sequences from the same patient volume and from different patient volumes, respectively.
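
The two selection modes can be illustrated with a short sketch. It assumes a hypothetical dictionary mapping patient IDs to annotated slice indices and a hypothetical function `select_supports`; excluding the query slice itself in the intra-volume case is an assumption made here, not something the rebuttal states.

```python
import random

def select_supports(volumes, query_patient, query_slice, n, mode, seed=0):
    """Randomly pick n (patient_id, slice_index) supports for one query slice.

    volumes: dict mapping patient_id -> list of annotated slice indices.
    mode: "intra" draws from the query's own volume, "inter" from other patients.
    """
    rng = random.Random(seed)
    if mode == "intra":
        # Assumption: the query slice itself is never used as its own support.
        candidates = [(query_patient, s) for s in volumes[query_patient]
                      if s != query_slice]
    elif mode == "inter":
        candidates = [(p, s) for p, slices in volumes.items()
                      if p != query_patient for s in slices]
    else:
        raise ValueError("mode must be 'intra' or 'inter'")
    if len(candidates) < n:
        raise ValueError("not enough candidate supports")
    # Sampling without replacement gives n distinct supports.
    return rng.sample(candidates, n)

# Hypothetical patient IDs and slice indices, for illustration only.
volumes = {"p1": [0, 1, 2, 3, 4], "p2": [0, 1, 2], "p3": [5, 6]}
intra = select_supports(volumes, "p1", 2, n=3, mode="intra")
inter = select_supports(volumes, "p1", 2, n=3, mode="inter")
print("intra:", intra)
print("inter:", inter)
```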

– Support in 2D/3D (@R5). We apologize for the confusion. Our proposed MSFSeg is compatible with support in both 2D and 3D formats. Though conventional FSS methods typically take 2D support (n-shot refers to n slices), we design our MSFSeg starting from a conventional 2D FSS baseline network and employ multi-head attention followed by multi-surrogate fusion to distill information across different support sequences. Specifically, our MSFSeg takes n support sequences, with each support sequence containing d_i (c.f. Section 2.1A, d_i >= 1) slices, where the task reduces to conventional 2D few-shot segmentation when d_i=1. The multi-head attention module in MSFSeg allows a flexible input length for each support sequence (d_i >= 1) and aggregates intra-sequence information by attention computation. We then apply multi-surrogate fusion to explore inter-patient morphology information across different support sequences for improved 3D few-shot segmentation.
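
To make the variable-length point concrete, the sketch below shows how attention can pool a support sequence of any length d_i >= 1 into one fixed-size summary per sequence. It is a deliberately simplified, single-head, pure-Python illustration with one feature vector per slice; the actual MSFSeg architecture is not described on this page, so the names `attend` and `summarize_supports`, the feature sizes, and the toy values are all assumptions. The multi-surrogate fusion step is not sketched because its surrogates are not defined here.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query_vec, slice_feats):
    """Scaled dot-product attention of one query vector over d_i slice features."""
    scale = math.sqrt(len(query_vec))
    scores = [sum(q * k for q, k in zip(query_vec, feat)) / scale
              for feat in slice_feats]
    weights = softmax(scores)
    # Weighted sum of slice features: output size does not depend on d_i.
    return [sum(w * feat[j] for w, feat in zip(weights, slice_feats))
            for j in range(len(query_vec))]

def summarize_supports(query_vec, support_sequences):
    """Return one fixed-size summary per support sequence (n summaries in total)."""
    return [attend(query_vec, seq) for seq in support_sequences]

# Toy features, for illustration only.
query = [1.0, 0.0, 0.5]
support_sequences = [
    [[0.9, 0.1, 0.4]],                                     # d_i = 1 (plain 2D support)
    [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]],   # d_i = 3
]
summaries = summarize_supports(query, support_sequences)
print(len(summaries), "summaries of size", len(summaries[0]))
```

With d_i = 1 the attention weight is trivially 1, so the summary equals the single slice feature, which mirrors the statement that the task reduces to conventional 2D few-shot segmentation in that case.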




Meta-Review

Meta-review not available, early accepted paper.


