Abstract

Early diagnosis of colorectal cancer (CRC) is crucial for improving survival and quality of life. While computed tomography (CT) is a key diagnostic tool, manually screening colon tumors is time-consuming and repetitive for radiologists. Recently, deep learning has shown promise in medical image analysis, but its clinical application is limited by the lack of model explainability and the need for large numbers of finely annotated samples. In this paper, we propose a loose lesion location self-supervision enhanced CRC diagnosis framework to reduce the requirement of fine sample annotations and improve the reliability of prediction results. For both non-contrast and contrast CT, despite potential deviations in imaging positions, the lesion location should be nearly consistent in images of both modalities at the same sequence position. In addition, lesion locations in two successive slices of the same modality are relatively close. Therefore, a self-supervision mechanism is devised to enforce lesion location consistency at both the temporal and modality levels of CT, reducing the need for fine annotations and enhancing the interpretability of diagnostics. Furthermore, this paper introduces a mask correction loopback strategy to reinforce the interdependence between category label and lesion location, ensuring the reliability of diagnosis. To verify our method's effectiveness, we collect data from 3,178 CRC patients and 887 healthy controls. Experimental results show that the proposed method not only provides reliable lesion localization but also enhances the classification performance by 1-2%, offering an effective diagnostic tool for CRC. Code is available at https://github.com/Gaotianhong/LooseLocationSS.
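The core idea of the abstract, enforcing lesion-location agreement across paired modalities and across successive slices, can be sketched as a simple consistency penalty. This is an illustrative sketch only, not the authors' loss: the function name, the use of predicted lesion-center coordinates, and the plain L2 formulation are assumptions for exposition.

```python
import numpy as np

def location_consistency_loss(centers_nc, centers_c):
    """Hypothetical sketch of loose lesion location self-supervision.

    centers_nc, centers_c: (N, 2) arrays of predicted lesion-center
    coordinates for N sequence-aligned non-contrast and contrast CT
    slices. Returns a scalar combining modality-level and temporal-level
    consistency penalties (mean L2 distances).
    """
    centers_nc = np.asarray(centers_nc, dtype=float)
    centers_c = np.asarray(centers_c, dtype=float)
    # Modality consistency: at the same sequence position, lesion centers
    # should roughly agree between the two modalities.
    modality = np.linalg.norm(centers_nc - centers_c, axis=1).mean()
    # Temporal consistency: centers in successive slices of the same
    # modality should be close.
    temporal = (np.linalg.norm(np.diff(centers_nc, axis=0), axis=1).mean()
                + np.linalg.norm(np.diff(centers_c, axis=0), axis=1).mean()) / 2
    return modality + temporal
```

Because the consistency is "loose" (approximate agreement rather than exact supervision), a penalty of this shape can be trained without per-slice bounding-box labels.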

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1379_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1379_supp.pdf

Link to the Code Repository

https://github.com/Gaotianhong/LooseLocationSS

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Gao_Loose_MICCAI2024,
        author = { Gao, Tianhong and Song, Jie and Yu, Xiaotian and Zhang, Shengxuming and Liang, Wenjie and Zhang, Hongbin and Li, Ziqian and Zhang, Wenzhuo and Zhang, Xiuming and Zhong, Zipeng and Song, Mingli and Feng, Zunlei},
        title = { { Loose Lesion Location Self-supervision Enhanced Colorectal Cancer Diagnosis } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a self-supervision framework aimed at the early diagnosis of colorectal cancer. The principal contributions of this study include the development of a loose lesion location self-supervision mechanism and a mask correction loopback strategy. The effectiveness of the proposed method is validated on an in-house dataset, demonstrating satisfactory improvements in early diagnostic accuracy.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is in good shape. I personally enjoyed Fig. 1, which is informative. The paper is also well-written and easy to understand. The idea is novel and interesting, and the method achieves diagnosis while considering explainability.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The title of Section 2.3 could potentially lead to misunderstandings. 2) It would improve the table representing the main results if it included the types and quantities of annotations used. Providing this detail would facilitate a better understanding of the supervised signals employed during training. 3) Comparing this method directly with other network architectures may not provide a fair assessment. It would be advisable to integrate the proposed method with other backbone networks to ensure a more equitable comparison. 4) Could you specify the backbone networks utilized for GradCAM and EigenCAM? This information would help in understanding the basis of their performance metrics.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The author has provided the corresponding code, which is suitable for use. However, the effectiveness of this method is closely tied to the dataset used. I recommend that the author clearly address the availability of the dataset and consider making it openly accessible to facilitate further research and validation.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    More details are expected for Tab. 1.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think this paper is good and the explored problem is important. I prefer to give an acceptance upon a high-quality response.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors present a loose lesion location self-supervision enhanced colorectal cancer (CRC) diagnosis framework to reduce the requirement of fine sample annotations and improve the reliability of prediction results. For testing purposes, the authors collected data from 3,168 CRC patients and 887 healthy controls. The framework was validated on slice- and patient-level classification tasks, achieving competitive results when compared with different state-of-the-art CNNs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors clearly explain the problem (unmet need) and what has been done in this regard, and present a clean methodology. The framework presents a novel idea; it is characterized by its simplicity, how it reduces the need for labeled samples, and how it enhances interpretability.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Authors should include a statistical analysis to provide quantitative evidence to demonstrate better performance and support sentences like “our method excels in slice-level classification tasks” or “our method achieves the best performance in terms of …”.
    • Lack of clarity: The authors should clearly present the ablation study as part of the experimental design.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • The abstract is quite long and does not end with quantitative results that would allow the authors to draw conclusions about the presented work.
    • The main limitation for translating the methodology into a clinical scenario could be the availability of paired contrast and non-contrast images when applying this framework to new patients.
    • In the experiments section, I suggest including a couple of sentences to detail what experiments the reader will find in the results. SOTA means state of the art, but it was never defined. The section name should be changed to “Experiments and Results”, or a separate results section should be added.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The presented framework is a novel idea and would be helpful in clinical scenarios. Results demonstrate good performance against common CNNs.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors introduce a framework for diagnosing colorectal cancer (CRC) using computed tomography (CT) scans (contrast-enhanced and non-contrast-enhanced). It incorporates a loose lesion location self-supervision mechanism and a mask correction loopback strategy. The proposed method focuses on enhancing the interpretability of deep learning models using weakly labeled data. These elements aim to ensure consistency in lesion localization across successive CT slices and between non-contrast and contrast-enhanced scans, thereby improving the reliability of the diagnostic results. The authors validated the method on a large in-house dataset, showing improved classification and localization performance with limited annotation requirements.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1-The method is novel and utilizes self-supervision to align lesion locations across different modalities and successive slices, which is a significant advancement in minimizing the need for detailed annotations. 2-By linking the classification results directly to the lesion locations through a loopback mechanism, the model not only becomes more interpretable but also ensures the reliability of the predictions, which is crucial for clinical applications. 3- The extensive dataset comprising both CRC patients and healthy controls, along with rigorous testing, provides a robust validation of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The method has been applied only to in-house data, and the authors have not mentioned whether they will publish the data.
    2. Due to the nature of the topic, either the related works have not been explored or there are few works in this domain. Please elaborate.
    3. While the results are promising, the paper could further discuss the application of the framework to other types of cancers or diseases where similar imaging characteristics are observed, to demonstrate broader applicability.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors do provide the code, but the availability of the data is not mentioned anywhere.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1. To strengthen the evaluation and demonstrate the generalizability of the proposed method, applying it to publicly available colorectal cancer datasets like the Medical Segmentation Decathlon (Venous Phase CT) would be valuable. This would allow for direct comparisons with existing state-of-the-art approaches and provide a broader perspective on the method's performance.

    2. The decision on the number of annotated slices is somewhat ambiguous (Section 3.3). The authors neither mentioned the rationale behind it nor experimented to see the impact of decreased supervision.

    3. Overall, the paper is well-organized and presents a clear methodology and substantial experimental results. Although the authors mention this in future work, more discussion of potential limitations and the feasibility of deployment in diverse clinical environments would enhance the paper's impact.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Due to novelty of method and niche medical application to colorectal cancer classification and detection (multicenter data).

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank all reviewers and ACs for dedicating time and effort to review the paper and providing insightful comments and advice. We would like to address questions as follows.

To Reviewer #1 Q1: Include a statistical analysis in the abstract. A1: Thanks for the advice. We will add quantitative results to the abstract and simplify its content in the revision. Q2: Clarify the ablation study from experimental design. A2: Sorry for the lack of clarity. The ablation study evaluates three loss function terms (Section 2.3) to verify their effectiveness. We will revise for clarity. Q3: The main limitation is needing both contrast and non-contrast images in a clinical scenario. A3: Sorry for the confusion. Single modality data is actually sufficient for clinical applications. Training with temporal and modality consistency reduces reliance on fine annotations. During testing, the lesion area is predicted by mapping the highest lesion probability patch onto the original CT image. Q4: Include intro of results, define SOTA, and rename experiments section. A4: Thanks. We will revise according to your valuable advice.
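The test-time localization described in A3 above, mapping the patch with the highest lesion probability back onto the original CT image, can be sketched roughly as follows. The grid layout, function name, and shapes are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def locate_lesion(patch_probs, image_shape):
    """Map the patch with the highest lesion probability back to a
    bounding box in original-image coordinates.

    patch_probs: (gh, gw) grid of per-patch lesion probabilities.
    image_shape: (H, W) of the original CT slice; patches are assumed
    to tile the image in an axis-aligned grid (hypothetical sketch).
    Returns (row0, col0, row1, col1) with exclusive max bounds.
    """
    gh, gw = patch_probs.shape
    H, W = image_shape
    ph, pw = H // gh, W // gw              # patch size in pixels
    idx = int(np.argmax(patch_probs))      # flat index of best patch
    r, c = divmod(idx, gw)                 # grid row/column of best patch
    return (r * ph, c * pw, (r + 1) * ph, (c + 1) * pw)
```

This illustrates why a single modality suffices at test time: the modality and temporal consistency terms are training-time constraints only, and inference reduces to reading off the most lesion-like patch.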

To Reviewer #3 Q1: Section 2.3 title could be misleading. A1: Sorry for the confusion. We will change it to “Complete Framework” for better understanding. Q2: Include annotation types and quantities in the table. A2: Thanks. Annotation types and quantities are detailed in Sections 3.1 and 3.3: Unsupervised: GradCAM, EigenCAM. Fully supervised: Sahoo, ResNet50, MobileNetV3-L, RegNetY-128G, EfficientNetV2-L, ConvNeXt-L, ConvNeXtV2-L. 8.3% supervised: Ours (8.3% of bounding boxes used). We will clarify these in the revised table. Q3: Integrate the method with other backbones for fairer comparison. A3: Thanks for constructive advice. We integrated our method with the ConvNeXtV2-L backbone (see “Ours” in Table 1 and 2), showing a 1.4% increase in classification performance. Integration with other backbones also improved performance by 1-2% and showed promising localization results. Due to space limitations, we present only ConvNeXtV2-L results. Q4: Specify the backbone used for GradCAM and EigenCAM. A4: We used the ConvNeXtV2-L backbone for both. We will add detailed descriptions in the final version. Q5: Address dataset availability. A5: The dataset is being desensitized and reviewed for public release.

To Reviewer #4 Q1: Few related works exist due to the topic’s nature. Please elaborate. A1: This underexplored domain has few related works. As stated in Section 1, most CRC CT diagnosis methods lack interpretability or require extensive labeling. Our work addresses these issues with loose location self-supervision and mask correction loopback, aiding clinical CRC diagnosis. Q2: Dataset availability and applying the method to public CRC datasets. A2: Thanks for the valuable feedback. Our dataset is being desensitized and reviewed for public release. We used Task 10 from the Medical Segmentation Decathlon for localization, obtaining bounding boxes from segmentation masks. Testing pre-trained models (ConvNeXtV2-L backbone), our method achieved a P-IoU of 47.09%, compared to 42.83% with full supervision, validating our approach’s generalizability. Q3: Ambiguity in choosing the number of annotated slices. A3: We apologize for the confusion. We used 6,304 non-contrast CT slices (8.3% of the total) with lesion location bounding box labeling. We also tested with 1%, 5%, and 10% of the annotated data, achieving Accuracy and P-IoU of 87.23%/27.02%, 90.04%/50.86%, and 91.25%/63.64%, respectively. These results validate that 8.3% is an effective choice. More details will be added in the revision. Q4: More discussion on limitations and feasibility in diverse clinical environments. A4: Thanks. The main limitation is that for diseases with significant location variations, such as large cardiovascular curvature, the CT imaging position may deviate, introducing erroneous constraints. Future improvements will consider additional criteria to avoid these inaccuracies.
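A2 above mentions deriving bounding boxes from the Medical Segmentation Decathlon segmentation masks and evaluating with an IoU-based localization metric (P-IoU; its exact definition is in the paper). A generic sketch of the two ingredients, tight box extraction from a binary mask and box IoU, under assumed function names:

```python
import numpy as np

def mask_to_bbox(mask):
    """Tight (row0, col0, row1, col1) box around the nonzero pixels of a
    2-D binary mask; exclusive max bounds. Illustrative only."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return (int(r0), int(c0), int(r1) + 1, int(c1) + 1)

def box_iou(a, b):
    """Intersection over union of two (row0, col0, row1, col1) boxes."""
    ir0, ic0 = max(a[0], b[0]), max(a[1], b[1])
    ir1, ic1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ir1 - ir0) * max(0, ic1 - ic0)
    area = lambda x: (x[2] - x[0]) * (x[3] - x[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0
```

Converting voxel-wise masks into boxes this way lets a detection-style method be scored on a segmentation benchmark without re-annotating the data.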




Meta-Review

Meta-review not available, early accepted paper.


