Abstract

Semi-supervised medical image segmentation, crucial for medical research, enhances model generalization using unlabeled data with minimal labeled data. Current methods face edge uncertainty and struggle to learn specific shapes from pixel classification alone. To address these issues, we propose a two-stage knowledge distillation approach that employs a teacher model to distill information from labeled data, enhancing the student model with unlabeled data. In the first stage, we use true labels to augment data and sharpen target edges to make teacher predictions more confident. In the second stage, we freeze the teacher model parameters to generate pseudo labels for unlabeled data and guide the student model to learn. By feeding the original background image to the teacher and the enhanced image to the student, the student model learns the information hidden under the mantle and the overall shape of the segmented target. Experimental results on the Left Atrium dataset surpass existing methods. Our overlay mantle-free training method enables segmentation based on learned shape information even in data loss scenarios, exhibiting improved edge segmentation accuracy. The code is available at https://github.com/vigilliu/OMF.
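
The overlay step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function name `overlay`, the toy 2D arrays, and the ClassMix-style label mixing are assumptions made for illustration only. The labeled target region of one image (the "mantle") is pasted onto a background image; the teacher would then receive the uncovered background and the student the overlaid image.

```python
import numpy as np

def overlay(background, source, source_mask):
    """Paste the labeled target region of `source` (the "mantle") onto `background`."""
    mantle = source_mask.astype(bool)
    return np.where(mantle, source, background)

# Toy 2D arrays stand in for image volumes; all values are illustrative.
rng = np.random.default_rng(0)
background = rng.random((4, 4))           # image that will be partly covered
source = rng.random((4, 4))               # labeled image that supplies the mantle
source_label = np.zeros((4, 4), dtype=np.uint8)
source_label[1:3, 1:3] = 1                # target region annotated in `source`
background_label = np.zeros((4, 4), dtype=np.uint8)
background_label[0:2, 0:2] = 1            # target region annotated in `background`

# Assumed stage-one usage: mix image and label with the same mantle.
mixed_image = overlay(background, source, source_label)
mixed_label = overlay(background_label, source_label, source_label)

# Differentiated inputs: the teacher sees the uncovered image,
# the student sees the overlaid one.
teacher_input, student_input = background, mixed_image
```

Because the mantle follows the label boundary, the pasted region has a sharp, known edge, which is one plausible reading of how the augmentation makes edge predictions more confident.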

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0481_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0481_supp.pdf

Link to the Code Repository

https://github.com/vigilliu/OMF

Link to the Dataset(s)

https://github.com/yulequan/UA-MT/tree/master/data

BibTex

@InProceedings{Liu_Overlay_MICCAI2024,
        author = { Liu, Jiacheng and Qian, Wenhua and Cao, Jinde and Liu, Peng},
        title = { { Overlay Mantle-Free for Semi-Supervised Medical Image Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15010},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    Authors propose a way to perform semi-supervised segmentation using a student-teacher approach. Model learns an induced shape prior allowing it to perform better within context of missing data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Approach is presented with good clarity and is understandable.
    2. Good comparisons with other approaches showcasing trust in method.
    3. Smart use of mixup data augmentation to make the model more aware about the shape of the data.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Major Weaknesses.

    Lack of novelty

    Although the authors have come up with a smart way of performing augmentation so that unlabelled data can be leveraged to improve segmentation performance, the novelty is lacking. Both the augmentation and the student-teacher model ideas have been done before.

    Lack of evaluation

    1. The authors have compared with several different methods on the left atrium dataset, which is great. However, since this method is showcased as a general-purpose method for semi-supervised segmentation, evaluation on different datasets (e.g., femur bone, pancreas) and different modalities (CT, X-ray) is important to show that the method is indeed general.

    2. In general, the motivation of getting a good segmentation even with missing portions does not come through in the results. Instead of adding masks to LA data and showcasing results, it would be very important to show this on actual data where a portion of the anatomy is degraded or occluded.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Since the method is presented as a general-purpose semi-supervised segmentation model, it needs to be evaluated on more organs, especially considering that the technical novelty is at a system level.

    • Especially on a real-world dataset where the anatomy is not clearly visible from the sensor input; this is the major selling point of the paper.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Given that the paper offers only application-level novelty, the evaluation does not provide enough trust that this method will outperform SOTA consistently.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper
    1. A novel data augmentation method was proposed to crop and concatenate images along segmentation edges based on labels, boosting confidence in edge predictions and addressing uncertainty.
    2. A method was developed to design differentiated inputs and fix the parameters of the teacher model during knowledge distillation, thereby allowing the student to perceive the underlying shape of segmentation targets by removing the mantle.
    3. The proposed network achieves state-of-the-art performance in semi-supervised segmentation tasks on the LA database.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A data enhancement method named Overlay Mantle-Free is proposed to enhance the segmentation ability of semi-supervised models.
    2. A method was developed to design differentiated inputs and fix the parameters of the teacher model during knowledge distillation, thereby allowing the student to perceive the underlying shape of segmentation targets by removing the mantle.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The segmentation target named in the title of this paper is medical images, but the proposed method is only applied to the LA dataset. Obviously, a single LA dataset cannot represent medical images in general. Segmentation on the LA dataset is a relatively simple binary segmentation task. Although the method achieves excellent performance, this cannot prove its generalization ability and robustness on other medical image data.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    None.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The method in this paper performs data enhancement on labeled and unlabeled data respectively, and also covers post-processing, but experiments are only conducted on LA dataset. Although this method is better than BCP, it cannot prove the generalization performance of this method on other medical data. More medical image datasets such as Pancreas-NIH and ACDC dataset should be included to demonstrate the effectiveness of the proposed method.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents a method called OMF, which proposes a data enhancement and training method that can achieve excellent performance. It is novel to predict the segmentation result of the masked image. However, this model is only tested on LA dataset, and the performance without post-processing is not explained.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The author answers the questions raised above. Since there will be no major changes to the paper, the previous score is maintained.



Review #3

  • Please describe the contribution of the paper

    This manuscript introduces a novel two-stage knowledge distillation framework for semi-supervised medical image segmentation, which aims to improve edge accuracy and generalization using both labeled and unlabeled data. The approach, termed Overlay Mantle-Free (OMF), is innovative and addresses common challenges in medical image segmentation, such as edge uncertainty and the need for a detailed understanding of anatomical shapes in images. The methodology is robust, incorporating advanced techniques like differentiated input handling and novel data augmentation strategies to enhance the model’s learning capabilities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Innovative Methodology: The two-stage knowledge distillation approach, coupled with differentiated inputs for the teacher and student models, is a clever strategy that potentially enhances the learning efficacy of the model under semi-supervised settings.

    2. Effective Use of Unlabeled Data: The method effectively leverages unlabeled data by using it to train the student model with pseudo-labels generated by the teacher model, which is a significant advantage in medical image analysis where labeled data can be scarce and expensive to obtain.

    3. Focus on Edge Accuracy: By focusing on improving edge accuracy through targeted data augmentation and processing strategies, the manuscript addresses a critical area in medical image segmentation that directly impacts the clinical usefulness of segmentation results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Experimental Validation: While the experimental results are promising, the manuscript would benefit from a broader validation across various datasets or more diverse medical imaging modalities to establish the generalizability of the proposed method.

    2. For the quantitative results, multiple runs (different selections of labeled data) should be performed and the standard deviation should be reported to show robustness. For example, in Tab. 1, why did the performance of BCP drop even though the amount of labeled data increased?

    3. What are the advantages of the proposed methods compared to others? The performance improvements as reported appear to be marginal.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please refer to the main weaknesses of the paper. The most important problem here is that the advantages of the proposed method should be clarified, given that the performance improvements as reported appear to be marginal.

    Also, the experiments should be run multiple times (with different selections of labeled data); otherwise the results may be biased.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall idea is good, but more experiments are needed to verify the robustness of the proposed method.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    My questions are partially resolved. However, the missing standard deviation has not been explained.
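
For reference, the multi-run reporting requested in this review could look like the following minimal sketch. The helper names `dice` and `summarize` and the toy masks are hypothetical, and the numbers are placeholders rather than results from the paper. Each "run" stands for one random selection of labeled data; the Dice scores are then summarized as mean and sample standard deviation.

```python
from statistics import mean, stdev

def dice(pred, target):
    """Dice coefficient for two flat binary sequences of equal length."""
    intersection = sum(p and t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total

def summarize(scores):
    """Return (mean, sample standard deviation) over repeated runs."""
    return mean(scores), stdev(scores)

# Toy masks only: one target and the predictions of three hypothetical runs.
target = [1, 1, 0, 0]
run_predictions = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
scores = [dice(pred, target) for pred in run_predictions]
avg, std = summarize(scores)
print(f"Dice: {avg:.4f} +/- {std:.4f} over {len(scores)} runs")
```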




Author Feedback

We appreciate the feedback from reviewers R1, R3, and R4.

@R1(motivation of missing data): We would like to address a possible misunderstanding. Masking is a straightforward way to show whether the trained model has learned the overall shape. Specifically, the model trained by our method is not limited to pixel-by-pixel classification but can perceive that a missing part violates the overall shape of the LA, as shown in the supplementary materials. Prediction on the masked part is similar to MAE and is more of an interpretation experiment than a comparison. Thank you for your suggestion on “the anatomy is not clear”, which inspired us to discuss this method in ambiguous situations in later extensions.

@R1(Q3: lack of novelty, Q8: application-level novelty): Thank you for your comment. The innovation of this paper does not lie in the combination of augmentation and the student-teacher model, which is a common approach in previous works such as Mean-Teacher, UA-MT, BCP, etc. Firstly, we introduce an Overlay strategy. It is a new angle on data augmentation for addressing edge uncertainty and an easy-to-use yet effective scheme. Secondly, we innovate by removing the EMA between student and teacher that other related works use. Instead, we pre-train the teacher, fix its parameters, and design a Mantle-Free distillation. This helps to generate pseudo-labels and to learn the difference between background and overlay images.
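
The contrast drawn here between an EMA teacher and a fixed teacher can be sketched as follows. This is an illustration rather than the authors' code: the parameter dictionaries, the function names, and the decay value `alpha=0.9` are assumptions. In Mean-Teacher-style training the teacher weights track the student as an exponential moving average; with a frozen teacher, the weights obtained in the first stage are simply left unchanged.

```python
def ema_update(teacher, student, alpha=0.99):
    """Mean-Teacher-style step: teacher <- alpha * teacher + (1 - alpha) * student."""
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}

def frozen_update(teacher, student):
    """Frozen teacher: pre-trained parameters are returned unchanged."""
    return dict(teacher)

# Single scalar "weights" stand in for full network parameters.
teacher = {"w": 1.0}
student = {"w": 0.0}

ema_teacher = ema_update(teacher, student, alpha=0.9)   # drifts toward the student
frozen_teacher = frozen_update(teacher, student)        # stays at the stage-one value
```

With a frozen teacher the pseudo-labels for a given unlabeled image do not change during the second stage, whereas an EMA teacher's pseudo-labels shift as the student trains.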

@R1(Q8: augmentation novelty)@R3(Q6.3:marginal improvement): Due to continuous advancements, the performance of semi-supervised methods on LA is close to fully supervised, making marginal improvements challenging. However, we think the key point is that the Overlay data augmentation method and the Mantle-Free differentiated inputs for the teacher and student models are worth discussing. Notably, “S1” in Table 2 shows that V-net using 10% labeled data with our Overlay data augmentation method achieves a 4.35% Dice improvement compared to using the original images and labels, and a 1.06% greater improvement than the regular method even with 20% labeled data (compare “S1” in Table 2 with the second row of Table 1). We will highlight the unique advantages of the proposed methods in the revised version.

@R4(Q6:binary classification segmentation task weakness): Thank you for your constructive advice. We have to admit that this manuscript has the limitation that it only discusses binary segmentation tasks, but the idea of Overlay is inspired by “ClassMix”, which was originally a multi-class data augmentation. For multi-class datasets such as ACDC, the solution is to randomly select one of the three class labels to make a mantle and paste it on the background image. Inspired by your feedback, we plan to explore multi-class research.
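
The multi-class extension outlined in this reply could be sketched roughly as below. Everything here is a hedged illustration: the function name `random_class_mantle`, the label encoding (0 for background, 1 to 3 for foreground classes), and the toy arrays are assumptions, not part of the paper or its code.

```python
import numpy as np

def random_class_mantle(label, rng, foreground_classes=(1, 2, 3)):
    """Pick one foreground class present in `label` and return its boolean mask."""
    present = [c for c in foreground_classes if np.any(label == c)]
    if not present:
        # No foreground in this label: return an empty mantle.
        return np.zeros(label.shape, dtype=bool), None
    chosen = int(rng.choice(present))
    return label == chosen, chosen

rng = np.random.default_rng(0)
source_label = np.array([[0, 1, 1],
                         [2, 2, 0],
                         [0, 3, 3]])       # assumed encoding: 0 = background
source = np.full((3, 3), 9.0)              # toy labeled image supplying the mantle
background = np.zeros((3, 3))              # toy image to be covered

mantle, chosen = random_class_mantle(source_label, rng)
mixed = np.where(mantle, source, background)   # paste the mantle on the background
```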

@R3:(the performance of BCP dropped): Thank you for your careful analysis, and we apologize for any confusion. We attempted to reproduce the result but did not achieve the same result as the original paper. To avoid lowering the baseline, we used the result provided by the original paper for the 10% setting. The original paper did not provide data for the 20% setting, so we obtained the 20% result using the same settings as for 10%. Our work also uses the same parameters for the 10% and 20% settings in the code.

@R4(post-processing): Thank you for bringing this to our attention. We should add an ablation study for NMS to ensure rigor.

@R1(evaluation lack)@R3(Q6.1, Q12:experimental validation) @R4(Q6:robustness of this method, Q10: more datasets): LA is commonly used in previous work and clearly illustrates how the Overlay mask works. Even though we observed excellent results, robustness needs further testing. If permitted, future work will include additional datasets such as ACDC for multi-class segmentation. We also plan to explore applying OMF to different modalities (CT, X-ray, WSI).

Finally, we would like to thank all of the reviewers for their careful reading and suggestions.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors have addressed the major concerns raised by the reviewers and promised to revise the paper accordingly. Positive reviews outweigh negative opinions.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The authors have addressed the major concerns raised by the reviewers and promised to revise the paper accordingly. Positive reviews outweigh negative opinions.


