Abstract

Limited labeled data hinder the application of deep learning in the medical domain. In clinical practice, abundant unlabeled data are not used effectively, and semi-supervised learning (SSL) is a promising way to leverage them. However, existing SSL methods ignore frequency-domain and region-level information, which is important for lesion regions that are located at low frequencies and exhibit significant scale changes. In this paper, we introduce two consistency regularization strategies for semi-supervised medical image segmentation: frequency domain consistency (FDC), which assists feature learning in the frequency domain, and multi-granularity region similarity consistency (MRSC), which performs multi-scale, region-level learning of local context features. With the proposed FDC and MRSC, we can leverage their powerful feature representation capability effectively and efficiently. Extensive experiments on two medical image segmentation datasets show that our approach achieves large performance gains and exceeds other state-of-the-art methods. Code will be available.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0245_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{He_FRCNet_MICCAI2024,
        author = { He, Along and Li, Tao and Wu, Yanlin and Zou, Ke and Fu, Huazhu},
        title = { { FRCNet: Frequency and Region Consistency for Semi-supervised Medical Image Segmentation } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15008},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors propose two consistency regularization strategies, frequency-based and region-based regularization, for semi-supervised segmentation. These methods were tested on two public datasets using various unlabeled ratios.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-organized, with clear wording and figures that facilitate easy comprehension for readers.
    2. The author introduced frequency-based and region-based consistency methods as additional supervisory mechanisms for a semi-supervised segmentation system.
    3. The experiments and ablation studies conducted are comprehensive and demonstrate the superior performance of the proposed method, as well as the enhancements achieved through the introduced consistency strategies.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. For the frequency-based consistency, the author employs the Discrete Cosine Transform (DCT) to extract frequency information. I am curious about why the author chose the DCT over the Fast Fourier Transform (FFT) or Wavelet Transform.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See weakness.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Novelty and sufficient experiments.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors introduce two consistency loss terms to improve the training of a model for medical image segmentation, in particular in the low label regime. With one strategy, the authors introduce a consistency loss in the frequency domain between teacher and student. With another, they introduce a consistency loss of multi-scale features. The authors evaluated their method on two medical image segmentation datasets and compare it to other SSL methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • As far as I can tell, this is the first formulation of a frequency consistency loss for medical image segmentation
    • While multi-resolution context has been shown to be beneficial for medical image segmentation, the authors show a new approach to introduce this context into their model
    • The comparisons appear thorough and the authors provide informative ablation experiments
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The text and figures are in need of some improvement (see below)
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • The text contains multiple minor grammatical errors; please run it through a grammar checker
    • 2.3 first paragraph: I could not follow the argument made here. When referring to other(?) methods, please cite them (at least examples).
    • Fig. 2: The other methods appear to be doing a lot worse than the proposed method but that is not reflected in the scores. Is this truly a representative example? Please make sure you are showing a representative example
    • Fig. 3: It is not described what images 3,4,7,8 are showing. I assume that they show probability maps - I encourage adding a colorbar. Beyond that, as the evaluation is in the segmented domain, it would be useful to show the segmentation results (as well)
    • In the ablation studies, it is mentioned that “no trainable parameters are introduced.” Are the transformers for MRSC pretrained? Please clarify that in the text

    • Suggestion for Fig. 1: Color the boxes on the left side the same way you color them on the right. This would make it easier to identify what the right side refers to
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors introduce new methods for their identified task and advance over existing methods. The authors provide good ablation experiments.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper introduces a semi-supervised methodology for medical image segmentation, targeting anatomical structures, organs, or lesions for clinical analysis and diagnosis. It employs the FRCNet framework, consisting of Frequency Domain Consistency (FDC) and Multi-granularity Region Similarity Consistency (MRSC) components. This approach mitigates the need for extensive manual labeling of medical data by utilizing semi-supervised techniques.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This study introduces a semi-supervised approach for image segmentation, employing two consistency regularization strategies: FDC and MRSC. The fusion of region-level and frequency domain information represents a noteworthy advancement in image processing for segmentation tasks. This method demonstrates the capability to learn from ample unlabeled data, addressing issues of insufficient annotation effectively. Notably, it offers an open framework easily combinable with existing SSL methods. The integration of contextual relationships, particularly elucidated in the multi-granularity region similarity consistency, ensures efficient understanding of local regions. The experimental results, accompanied by additional ablation studies, are compelling, and the authors also showed that the proposed approach significantly outperformed state-of-the-art methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Firstly, regarding MRSC, the paper claims that “SSL methods based on pixel-level consistency cannot model the relationship between local regions well”, but no evidence is cited for this. Secondly, the authors adopt several evaluation metrics to establish the significance of the proposed method; however, it is unclear whether these results are clinically sufficient given the metrics used. Thirdly, the authors could improve the results and discussion sections to enhance reader comprehension.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper offers comprehensive insights into the proposed approach, detailing its methodology and experiments conducted to assess its efficacy compared to state-of-the-art techniques. This endeavor by the authors paves the way for potential advancements in method reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Given that the primary innovation of this study lies in leveraging localized information to improve semi-supervised segmentation, I suggest the author enhance clarity in Section 2.2, particularly elucidating how the integration of transformer blocks with self-attention layers aids in amplifying signals within the frequency domain for image segmentation purposes.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I recommend accepting the paper due to its intriguing strategy for addressing common challenges in image segmentation through a semi-supervised approach. First, the authors made a clear proposal and were able to show evidence of significant improvement by comparing with other related work. Second, the contribution described is significant and also has wider application for interested readers.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Q1: Grammatical errors, Fig. 1, Fig. 2 and Fig. 3 (R1) We have done a thorough grammar check and revised the first paragraph of Section 2.3. We will change the colors in Fig. 1 in the revision. In Fig. 2, we select images with small lesions (first row) and low contrast (second row) as representative examples to show the effectiveness of our method. In Fig. 3, the two input images are a skin image (the former) and a polyp image (the latter); images 3, 4, 7 and 8 are feature maps of the input images, shown to illustrate the feature learning ability of our proposed frequency domain consistency.

Q2: Are the transformers for MRSC pretrained? Please clarify that in the text (R1) As shown in Fig. 1, MRSC does not contain any trainable parameters; it only includes feature transformations (downsampling, a flatten operation and region similarity computation). We describe them in Section 2.3, Multi-granularity Region Similarity Consistency.
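
The three parameter-free steps named in this answer (downsampling, flattening and region similarity computation) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: average pooling, cosine similarity, a mean-squared consistency term, the granularities (2, 4) and all function names are assumptions, and the toy feature maps are made up.

```python
import math


def avg_pool(fmap, size):
    """Average-pool an H x W x C feature map (nested lists) into size x size regions."""
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    rh, rw = h // size, w // size
    pooled = []
    for i in range(size):
        row = []
        for j in range(size):
            cells = [fmap[y][x]
                     for y in range(i * rh, (i + 1) * rh)
                     for x in range(j * rw, (j + 1) * rw)]
            row.append([sum(v[k] for v in cells) / len(cells) for k in range(c)])
        pooled.append(row)
    return pooled


def region_similarity(fmap, size):
    """Downsample, flatten to region vectors, and return their cosine-similarity matrix."""
    regions = [vec for row in avg_pool(fmap, size) for vec in row]  # flatten
    norms = [math.sqrt(sum(v * v for v in r)) or 1.0 for r in regions]
    return [[sum(a * b for a, b in zip(ri, rj)) / (ni * nj)
             for rj, nj in zip(regions, norms)]
            for ri, ni in zip(regions, norms)]


def mrsc_loss(student, teacher, sizes=(2, 4)):
    """Mean squared gap between student and teacher similarity matrices,
    averaged over the assumed granularities in `sizes`."""
    total = 0.0
    for s in sizes:
        sim_s = region_similarity(student, s)
        sim_t = region_similarity(teacher, s)
        n = len(sim_s)
        total += sum((a - b) ** 2
                     for rs, rt in zip(sim_s, sim_t)
                     for a, b in zip(rs, rt)) / (n * n)
    return total / len(sizes)


# Hypothetical 4 x 4 feature maps with 2 channels, for illustration only.
student = [[[float(y + x), float(y * x + 1)] for x in range(4)] for y in range(4)]
teacher = [[[float(y + x) + 0.5 * (x % 2), float(y * x + 1)] for x in range(4)]
           for y in range(4)]

print(mrsc_loss(student, student))
print(mrsc_loss(student, teacher))
```

None of these functions holds a learnable weight, which matches the statement that MRSC adds no trainable parameters; the loss is zero when the two maps induce identical region similarities and positive otherwise.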

Q3: No cited evidence, evaluation metrics, improve the result and discussion sections, enhance clarity in Section 2.2 (R3) MT [19] and URPC [15] are based on pixel-level consistency and cannot model the relationship between local regions well; we will cite them in the revision. For the segmentation evaluation metrics, we follow URPC [15] and SASSNet [10]. We will enhance the clarity of Section 2.2 in the revision.

Q4: I am curious about why the author chose the DCT over the Fast Fourier Transform (FFT) or Wavelet Transform (R5) The DCT generates real-valued coefficients, which simplifies subsequent processing and is more intuitive than the complex-valued coefficients of the FFT. The Wavelet Transform is very effective at removing image noise. Therefore, we empirically employ the DCT for the frequency-based consistency. To be more rigorous, we will study the differences among these frequency transformations in future work.
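
The point about real versus complex coefficients can be checked on a small signal. The sketch below uses naive O(n^2) textbook forms of the unnormalized DCT-II and the discrete Fourier transform (an FFT computes the same values as the latter, only faster); the 1-D profile is a made-up example and nothing here is taken from the paper's code.

```python
import cmath
import math


def dct2(signal):
    """Unnormalized DCT-II of a real 1-D sequence (naive form)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * i * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


# Hypothetical 1-D intensity profile, chosen only for illustration.
profile = [0.1, 0.4, 0.9, 0.7, 0.3, 0.2, 0.6, 0.8]

dct_coeffs = dct2(profile)
dft_coeffs = dft(profile)

# Every DCT coefficient is a plain float ...
print(all(isinstance(c, float) for c in dct_coeffs))
# ... while the DFT of the same real signal has non-zero imaginary parts.
print(any(abs(c.imag) > 1e-9 for c in dft_coeffs))
```

With real coefficients, a consistency term can compare two coefficient arrays directly, whereas complex coefficients would first require choosing how to handle magnitude and phase.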




Meta-Review

Meta-review not available, early accepted paper.


