Abstract

Medical ultrasound imaging is ubiquitous, but manual analysis struggles to keep pace. Automated segmentation can help but requires large labeled datasets, which are scarce. Semi-supervised learning leveraging both unlabeled and limited labeled data is a promising approach. State-of-the-art methods use consistency regularization or pseudo-labeling but grow increasingly complex. Without sufficient labels, these models often latch onto artifacts or allow anatomically implausible segmentations. In this paper, we present a simple yet effective pseudo-labeling method with an adversarially learned shape prior to regularize segmentations. Specifically, we devise an encoder-twin-decoder network where the shape prior acts as an implicit shape model, penalizing anatomically implausible but not ground-truth-deviating predictions. Without bells and whistles, our simple approach achieves state-of-the-art performance on two benchmarks under different partition protocols. We provide a strong baseline for future semi-supervised medical image segmentation. Code is available at https://github.com/WUTCM-Lab/Shape-Prior-Semi-Seg.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2948_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/WUTCM-Lab/Shape-Prior-Semi-Seg

Link to the Dataset(s)

https://github.com/haifangong/TRFE-Net-for-thyroid-nodule-segmentation
https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset

BibTex

@InProceedings{Che_Striving_MICCAI2024,
        author = { Chen, Yaxiong and Wang, Yujie and Zheng, Zixuan and Hu, Jingliang and Shi, Yilei and Xiong, Shengwu and Zhu, Xiao Xiang and Mou, Lichao},
        title = { { Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    1. The author proposed a simple yet effective semi-supervised segmentation model for pseudo-labeling.
    2. The author incorporated shape-prior regularization into the network framework, which serves to evaluate and enhance the plausibility of the segmentation map, thereby improving segmentation performance.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The model is simple and straightforward, featuring only a shared encoder, twin decoders, and a pre-trained GAN, facilitating its reproducibility.
    2. The array of comparison methods is rich and comprehensive, providing a solid benchmark for future research.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper does not sufficiently explain each component of the formulation and the model; for instance, it does not clarify the terms m_i, or define the variables x, \sim{x}, and \hat{x}.
    2. The method’s description lacks clarity; e.g., Section 2.2, which states “At training time, the discriminator aims…while the generator… Therefore, we can define our DSR…”, conflates the general mechanism of GANs with the specific reasoning for introducing DSR, without a clear causal link between the two.
    3. The presentation of results is not easily interpretable. The choice to use a radar chart in Fig. 4 to summarize the findings is puzzling, as it does not clearly display the Dice scores and there is only one variable, the ratio of labeled samples, with no observable trend across different ratios. This raises questions about the usefulness of employing a radar chart.
    4. The algorithm lacks novelty. The encoder-twin-decoder architecture is already well-established in pseudo-labeling methods, and the designed DSR module appears to contribute minimally, enhancing performance by 1% as shown in the ablation study.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The author could provide detailed explanations of the method, specifically defining each symbol used in the formulation.
    2. The author could reformat the results to enable a clearer comparison.
    3. I have some questions regarding the DSR module. Can the discriminator effectively perform on the \sim{x} generated by the semi-supervised image segmentation network since the discriminator is pre-trained alongside its own generator? Can the framework integrate the segmentation model as the generator instead of training the GAN separately? These clarifications could enhance the understanding and potential application of the DSR module in the study.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Although the model is simple, it lacks novelty. Additionally, the methods section is inadequately written, as several components of the formulas are not clearly defined, making it difficult to understand their function.
    2. The paper lacks explanations and interpretations of the hyper-parameters and the results, which undermines its clarity and utility.
  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    The author responded to the question; however, I still think the technical contribution is not sufficient. Based on this, I am inclined to maintain my original opinion.



Review #2

  • Please describe the contribution of the paper

    This paper devises an encoder-twin-decoder network where the shape prior acts as an implicit shape model, penalizing anatomically implausible but not ground-truth-deviating predictions. Without bells and whistles, our simple approach achieves state-of-the-art performance on two benchmarks under different partition protocols.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-organized and written.
    2. The designed architecture leverages both labeled and unlabeled data.
    3. The proposed method achieves good performance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Could you please elaborate on how to conduct dropout on the feature representations and pass the perturbed features through to predict a segmentation mask?
    2. What’s the benefit of using the dropout and perturbation strategy instead of EMA (Exponential Moving Average) which is usually used in semi-supervised scenarios?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see both the strengths and weaknesses sections.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The rebuttal addressed most of my concerns and I tend to vote for acceptance.



Review #3

  • Please describe the contribution of the paper

    The paper introduces a straightforward, pseudo-labeling-based network that diverges from recent trends of complex methodologies, aiming to improve ultrasound image segmentation. It centers on a clean and straightforward encoder-twin-decoder architecture that outperforms more complex designs. The key innovation is incorporating an adversarially learned shape prior, which acts as a regularizer for the segmentation process. This prior penalizes unrealistic network outputs rather than deviations from ground truths, enhancing the robustness of semi-supervised learning. The network provides a strong baseline for future research, validated through extensive experiments and ablation studies on two public ultrasound datasets, demonstrating its effectiveness in handling artifacts and ambiguous lesion boundaries.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The simple method makes significant improvements over previous work.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The GAN model may determine whether a shape is plausible. How can the GAN model identify the location of the mask? The segmentation mask should also be accurate in its location.
    2. Can you use the diffusion model to do the same thing?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please refer to the weaknesses.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please refer to the weaknesses.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank the reviewers for providing detailed and constructive comments.

Code (R1&R3&R4) We promise to make our code publicly available.

Reviewer #1 Q1 How to perturb feature representations? We randomly zero out elements in feature maps output by the encoder \mathcal{E} with a probability of 0.3, obtaining perturbed feature representations, which are then fed into the decoder \mathcal{D}_p.
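
The perturbation described in this answer can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `perturb_features`, the nested-list feature map, and the toy sizes are assumptions made for illustration, and only the drop probability of 0.3 comes from the rebuttal. The rebuttal does not say whether surviving activations are rescaled (as standard dropout layers such as `torch.nn.Dropout` do during training), so this sketch leaves them unchanged.

```python
import random


def perturb_features(feature_map, drop_prob=0.3, rng=None):
    """Zero each element independently with probability drop_prob.

    feature_map is a 2-D list standing in for one channel of the encoder
    output (a hypothetical stand-in, not the authors' tensor layout).
    Surviving values are left unscaled, which is an assumption.
    """
    rng = rng or random.Random()
    return [
        [0.0 if rng.random() < drop_prob else value for value in row]
        for row in feature_map
    ]


# Toy 100x100 feature map of ones; a seeded generator keeps the run repeatable.
rng = random.Random(0)
features = [[1.0] * 100 for _ in range(100)]
perturbed = perturb_features(features, drop_prob=0.3, rng=rng)

# Fraction of zeroed elements; it should land close to drop_prob.
zero_fraction = sum(v == 0.0 for row in perturbed for v in row) / 10000
print(f"zeroed fraction: {zero_fraction:.3f}")
```

In the described network, the perturbed features would then be passed to the decoder \mathcal{D}_p, while the unperturbed features go to \mathcal{D}_l.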

Q2 EMA In previous experiments, we found that EMA is unsuitable for our architecture. Our model includes a shared encoder \mathcal{E} and two decoders \mathcal{D}_l and \mathcal{D}_p. During training, \mathcal{D}_l and \mathcal{E} are updated with labels, while \mathcal{D}_p and \mathcal{E} are updated with pseudo-labels and DSR. With EMA, \mathcal{D}_l does not actively update, preventing \mathcal{E}.

Reviewer #3 Q1 How can the GAN model identify the location of the mask? GAN’s discriminator is designed to assess the plausibility of shape masks without considering their locations. In our semi-supervised segmentation network, it serves as a shape constraint, while the location constraint is handled by two cross entropy losses (see Fig. 2).

Q2 Can you use the diffusion model to do the same thing? We cannot do that. The GAN’s discriminator scores masks, which we use to guide the segmentation model. Diffusion models, on the other hand, lack this component.

Reviewer #4 Q1 The paper does not sufficiently explain each component of the formulation and the model. In “2 METHOD”, these variables have been defined. For example, in Eq. (1), we mention x~P_r, \sim{x}~P_g, and \hat{x}~P_x. We state that P_g and P_r represent distributions of generated and true masks, respectively, while P_x is an additional term from [14]. After Eq. (4), we clarify that m_i is a Dropout mask.
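
For readers without the paper at hand, the three sampling distributions named in this answer match the standard gradient-penalty Wasserstein GAN critic objective. The form below is that standard objective, written under the assumption that [14] refers to it and that the paper's Eq. (1) follows it; the symbol D for the discriminator and the weight \lambda are notation chosen here, and \tilde{x} corresponds to the \sim{x} used in the reviews.

```latex
\mathcal{L}_{D}
  = \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
  - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
  + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big]
```

In this formulation, \hat{x} is sampled uniformly along straight lines between pairs of points drawn from P_r and P_g, and the last term penalizes critic gradients whose norm deviates from 1.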

Q2 The clear causal link between the general mechanism of GAN and the specific rationale for introducing DSR. We will polish the expression as follows: “… After training, the discriminator gains the ability to assess the plausibility of shape masks. Therefore, we can define DSR as …”

Q3 Radar charts. In the final version, we will replace the two radar charts in Fig. 4 with a table to better present results of ablation studies.

Q4 Algorithmic novelty. 1. Our encoder-twin-decoder network is not widely used in pseudo-labeling-based semi-supervised image segmentation (CPS, CVPR’21; U2PL, CVPR’22; R. Yi, et al., TIP, 2022; CU2L, MICCAI’23; H. Basak, et al., CVPR’23). 2. We aim to counter the trend of complex architectures in semi-supervised image segmentation and explore a simple yet effective model for ultrasound images. 3. In our experiments, complex models performed poorly. Compared to SOTA methods, our approach achieves better results on two ultrasound datasets. 4. Our DSR is designed to enhance the visual quality of segmentation results. It corrects some errors that do not conform to anatomical principles, as shown in Fig. 1. Although these improvements are not very evident in Dice and IoU (because of the limited number of pixels involved), they significantly enhance the visual quality of segmentations.
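
The point about Dice and IoU being insensitive to small corrections can be checked with a toy calculation. The sketch below is illustrative only: the mask sizes, the 3x3 spurious blob, and the helper names `make_mask` and `dice_and_iou` are assumptions, not data or code from the paper. It compares a prediction containing a small, anatomically implausible detached blob against the same prediction with the blob removed.

```python
def make_mask(size, boxes):
    """Build a size x size binary mask with the given (r0, r1, c0, c1) boxes set to 1."""
    mask = [[0] * size for _ in range(size)]
    for r0, r1, c0, c1 in boxes:
        for r in range(r0, r1):
            for c in range(c0, c1):
                mask[r][c] = 1
    return mask


def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks of equal shape."""
    inter = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    n_pred = sum(map(sum, pred))
    n_target = sum(map(sum, target))
    union = n_pred + n_target - inter
    dice = 2 * inter / (n_pred + n_target) if (n_pred + n_target) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


# Hypothetical 64x64 example: a 30x30 lesion, a slightly under-segmented
# prediction (30x28), and the same prediction plus a detached 3x3 blob.
target = make_mask(64, [(17, 47, 17, 47)])
clean = make_mask(64, [(17, 47, 17, 45)])
with_blob = make_mask(64, [(17, 47, 17, 45), (2, 5, 2, 5)])

dice_clean, iou_clean = dice_and_iou(clean, target)
dice_blob, iou_blob = dice_and_iou(with_blob, target)
print(f"Dice: {dice_blob:.4f} -> {dice_clean:.4f}")
print(f"IoU:  {iou_blob:.4f} -> {iou_clean:.4f}")
```

Removing the blob changes both metrics by less than one percentage point in this toy setting, even though the blob is the kind of error that is visually obvious in a segmentation overlay.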

Q5 Can the discriminator effectively perform on the \sim{x} generated by the semi-supervised image segmentation? Although the discriminator is pre-trained, it is trained on the same data distribution as the segmentation network, and thus can effectively work on \sim{x}. Also, the visualization results in Fig. 5 confirm this.

Q6 Can the framework integrate the segmentation model as the generator instead of training the GAN separately? In fact, we tried to do this, but experimental results were not satisfactory. We observed that this end-to-end training process is very unstable. Therefore, in order to achieve better results, we opt for a pre-training approach to obtain DSR. Of course, this is an interesting problem, and we plan to further explore optimization strategies for end-to-end learning of our semi-supervised segmentation framework in future work.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper introduces an encoder-twin-decoder network for ultrasound image segmentation, incorporating an adversarially learned shape prior to improve the plausibility of segmentation maps. The approach is straightforward yet effective, achieving state-of-the-art performance on two benchmarks. Key strengths include its simplicity, effective use of semi-supervised learning, and comprehensive evaluation on public datasets. The innovative use of a shape prior is noteworthy. However, the paper lacks detailed explanations of the training process, comparisons with alternative models, and analysis of computational efficiency. After the rebuttal, most of the concerns seem to have been well addressed, leading to a borderline accept overall.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    This paper introduces an encoder-twin-decoder network for ultrasound image segmentation, incorporating an adversarially learned shape prior to improve the plausibility of segmentation maps. The approach is straightforward yet effective, achieving state-of-the-art performance on two benchmarks. Key strengths include its simplicity, effective use of semi-supervised learning, and comprehensive evaluation on public datasets. The innovative use of a shape prior is noteworthy. However, the paper lacks detailed explanations of the training process, comparisons with alternative models, and analysis of computational efficiency. After the rebuttal, most of the concerns seem to have been well addressed, leading to a borderline accept overall.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


