Abstract

Cervical cancer poses a severe threat to women’s health globally. As a non-invasive imaging modality, cervical optical coherence tomography (OCT) rapidly generates micrometer-resolution images from the cervix, nearly comparable to histopathology. However, the scarcity of high-quality labeled OCT images and the inevitable speckle noise impede deep-learning models from extracting discriminative features of high-risk lesion images. This study utilizes segmentation masks and bounding boxes to construct prior activation maps (PAMs) that encode pathologists’ diagnostic insights into different cervical disease categories in OCT images. These PAMs guide the classification model in producing reasonable class activation maps during training, enhancing interpretability and performance to meet gynecologists’ needs. Experiments using five-fold cross-validation demonstrate that the PAM-guided classification model boosts the classification of high-risk lesions on three datasets. Moreover, our method enhances histopathology-based interpretability to assist gynecologists in analyzing cervical OCT images efficiently, advancing the integration of deep learning into clinical practice.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1307_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1307_supp.pdf

Link to the Code Repository

https://github.com/ssea-lab/AMGuided_Cervical_OCT_Classification

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Wan_Prior_MICCAI2024,
        author = { Wang, Qingbin and Wong, Wai Chon and Yin, Mi and Ma, Yutao},
        title = { { Prior Activation Map Guided Cervical OCT Image Classification } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper leverages the knowledge and skills of pathologists in analyzing cervical OCT images to construct prior activation maps (PAMs). These PAMs guide the classification model to produce reasonable class activation maps during training, enhancing interpretability and performance to meet gynecologists’ needs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is written with good clarity.
    2. The method has clinical significance.
    3. The dataset is large, the experiments are well designed, and the visualizations look good.
    4. The experimental results validate that adding PAM improves performance as well as interpretability.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper does not include comparisons with other methods. There are existing methods that use similar data for the same task; e.g., [14] is already an interpretable approach that can classify cervical OCT. I can increase my score if I am not correct here.
    2. The improvement from the PAM seems to be marginal.

    [14] Wang, Qingbin, et al. “Cross-Attention Based Multi-Resolution Feature Fusion Model for Self-Supervised Cervical OCT Image Classification.” IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023).

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Considering that there is already a paper that addresses almost the same task, I think a comparison is needed, in terms of both performance and interpretable CAM maps.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The PAM is a reasonable idea to improve interpretability, but the paper does not provide a full comparison with other methods; only ablations are given. So I recommend a weak reject.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The rebuttal makes some sense. I have now increased my score to 4 (weak accept). This paper ranked 2nd in my rebuttal stack.



Review #2

  • Please describe the contribution of the paper

    This paper addresses the scarcity of high-quality annotations and the presence of speckle noise in cervical optical coherence tomography (OCT) images, both of which hinder deep-learning models from accurately extracting features of high-risk lesions, as well as the lack of interpretability in traditional deep-learning models. By utilizing segmentation masks and bounding boxes, pathologists’ diagnostic insights are encoded for the different categories of cervical diseases in cervical OCT images. By guiding the classification model to generate reasonable class activation maps during training, the model becomes easier to understand and interpret. The PAM-guided classification model shows significant performance improvement in the classification of high-risk lesions. By enhancing tissue-pathology-based interpretability, it helps gynecologists analyze cervical OCT images more effectively, thus promoting the application and integration of deep-learning technology in clinical practice.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. By introducing the concept of Prior Activation Maps (PAM), this study proposes a novel approach to address issues in cervical OCT image analysis. PAM not only improves the performance of deep learning models but also enhances their interpretability, thereby increasing doctors’ trust in the models.
    2. The paper’s structure is reasonable, and the method description is logical and clear, with authentic experimental data.
    3. The authors validate the effectiveness of the proposed method through five-fold cross-validation experiments and demonstrate improved classification performance on multiple datasets. This experimental evidence enhances the credibility and persuasiveness of the research findings.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The proposed process for generating PAMs might be relatively complex, requiring manual generation of segmentation masks and bounding boxes. This could increase the deployment cost and time consumption of the model, limiting its feasibility in actual clinical applications.
    2. While the paper mentions the impact of speckle noise on deep-learning models, it does not thoroughly discuss how to handle this type of noise accurately. If speckle noise is not effectively addressed, it could reduce the model’s performance and reliability.
    3. Although the authors used five-fold cross-validation to validate the effectiveness of the proposed method, they could include some related comparison methods.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The method mentions calculating distances, such as computing the shortest distance between the current epithelial pixel and all BM pixels. In Step 3, the variable a^BM likely represents a specific parameter related to the BM pixels, but without further context or definition, it’s challenging to provide a precise formula.
    2. The authors should clarify the purpose of generating the heatmap in the overall process, since the heatmap is not used in the training framework.
    3. The evaluation metrics should be defined clearly.
    4. In addition to using five-fold cross-validation, other evaluation metrics and methods could be considered to assess the model’s performance, such as ROC curve analysis, the confusion matrix, sensitivity, and specificity, to comprehensively understand the model’s performance and limitations. The model’s parameter count and inference speed should be provided to ensure real-time responsiveness in actual medical scenarios.
    5. If the segmentation masks and bounding boxes are considered prior knowledge, Fig. 1(a) should be revised to avoid misunderstanding. In Fig. 1 it seems there is another step to extract prior knowledge from the segmentation masks and bounding boxes.
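    For reference, the binary metrics mentioned in item 4 reduce to simple confusion-matrix ratios. The sketch below is a generic illustration, not the authors’ evaluation code; all names are illustrative:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics for a binary
    (e.g., high-risk vs. non-high-risk) classification task."""
    sensitivity = tp / (tp + fn)                  # true-positive rate (recall)
    specificity = tn / (tn + fp)                  # true-negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)    # overall correctness
    return sensitivity, specificity, accuracy
```

    Reporting these alongside the AUC and the raw confusion matrix would give the comprehensive view requested above.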
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method proposed in this paper is novel and significant, as it introduces Prior Activation Maps (PAM) for the first time to enhance the interpretability and performance of deep learning models in cervical OCT image analysis by utilizing segmentation masks and bounding boxes. Constructing PAMs based on pathologists’ insights adds a new dimension to the model training process, potentially leading to more clinically relevant and interpretable results.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have addressed my concerns. I would keep my recommendation to accept this paper.



Review #3

  • Please describe the contribution of the paper

    In this paper, the authors present a method that uses both segmentation and classification heads to train an OCT-image-to-cervical-disease classifier. First, they propose a method to generate prior activation maps (PAMs) in a semi-automatic manner. Upon generating PAMs, they train a model with OCT images as input and a training objective that combines a classification loss (5-class) and a reconstruction loss (against PAMs). Using three datasets (one labeled dataset for training, two auxiliary datasets for validation), the authors demonstrate that PAM-guided classification boosts the performance of high-risk lesion classification across all datasets used, and that PAM guidance results in better performance (by comparing classification accuracies) and more attention on relevant regions (by comparing Grad-CAM activations).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • technically sound, good evaluation, reads well.
    • includes detailed steps for the PAM generation process and training process
    • includes detailed figures and captions
    • In addition to the main dataset, the authors have utilized two external datasets (Huaxi dataset and Xiangya dataset) for validation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • What’s the rationale behind merging image data with the segmentation mask? Why not train against the segmentation mask directly?

    This paper lacks adequate references to the following claims.

    • “However, most gynecologists are not yet well-versed in diagnostic features in cervical OCT images.”
    • “Cervical OCT images inherently contain coherent noise … obstructs deep-learning models from identifying distinctive features in high-risk cervical lesion images … overfit more readily to the noise, resulting in poor generalizability …”
    • “… GradCAM [10] visualizations … tend to be unstable and occasionally inaccurate.”
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    What’s the rationale behind merging image data with the segmentation mask? Why not train against the segmentation mask directly?

    Try to substantiate any strong claims made in the introduction section.

    Please provide references to the two validation datasets, if available.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • technically sound, good evaluation, reads well.
    • includes detailed steps for the PAM generation process and training process
    • includes detailed figures and captions
    • In addition to the main dataset, the authors have utilized two external datasets (Huaxi dataset and Xiangya dataset) for validation.
  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank all reviewers and ACs for their detailed and constructive comments. Reviewer: #1

  1. The value of a^BM was set to 1, as explained in the 2nd paragraph in Section 3.1.
  2. The purpose of depicting the heatmap in Fig.1(a) is to show that the accurately generated heatmap can provide good visualization and interpretability.
  3. The definitions of the metrics, including accuracy, sensitivity, specificity, AUC, and the confusion matrix, will be added in the final version.
  4. Model parameters and inference speed: ResNet-18 (11.18M params, 0.004s/image) and ConvNeXt_Pico (8.54M params, 0.006s/image).
  5. We will add a function set symbol on the “Prior Knowledge” (text-like icon) to avoid misunderstandings in Fig. 1(a).
  6. Generating PAMs is not complex. The segmentation model performed well with 4,200 images, and bounding boxes needed only one rectangle per image.
  7. Handling of speckle noise: activation values of speckle noise in non-tissue areas are all set to 0, guided by the segmentation masks and bounding boxes. A 7×7 average filter is applied to tissue-related noise before calculating the activation map. We will make this clear in the final version.
  8. Related comparison methods: in our literature review, the existing cervical OCT image classification methods are mainly based on self-supervised learning. This is the first work to propose the novel PAM to guide the classification model in a supervised manner. We have not found other appropriate benchmarks using similar methods.
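  The numerical core described in items 1 and 7 (activation a^BM = 1 at basement-membrane pixels, epithelial activations derived from the shortest distance to the BM, and zeroed activations in non-tissue areas) can be sketched as follows. This is a minimal illustration only: the exponential decay, the `decay` parameter, and all names are assumptions for the sketch, not the paper’s exact formulation; SciPy’s Euclidean distance transform supplies each pixel’s shortest distance to the BM.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def prior_activation_map(bm_mask, tissue_mask, decay=0.05):
    """Distance-based prior activation sketch.

    bm_mask     : 2-D bool array, True at basement-membrane (BM) pixels
    tissue_mask : 2-D bool array, True inside tissue regions
    decay       : hypothetical rate controlling fall-off away from the BM
    """
    # Shortest Euclidean distance from every pixel to its nearest BM pixel;
    # BM pixels themselves have distance 0, so their activation is a^BM = 1.
    dist_to_bm = distance_transform_edt(~bm_mask)
    pam = np.exp(-decay * dist_to_bm)  # hypothetical monotone decay
    # Activations in non-tissue (pure-speckle) areas are set to 0.
    return np.where(tissue_mask, pam, 0.0)
```

  The 7×7 average filter from item 7 would be applied to the image before activations are computed and is not shown here.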

Reviewer: #3

  1. Why not train against the segmentation mask directly? Only MI, CY, and EP images have distinct structures. HSIL and CC images lack these and can only be annotated with bounding boxes. Using a segmentation model on new images without knowing the category would produce unreasonable results for HSIL and CC, as the model isn’t trained for them.
  2. We will substantiate the strong claims in introduction and add adequate references in the final version.
  3. We can’t publicize the dataset due to regulations and privacy concerns, but we’ll release the code and samples on GitHub.

Reviewer: #4

  1. The method did not compare to the existing method [14]. There are five reasons we did not compare to the work [14]: 1) Different learning manners: the proposed method follows a supervised learning manner, while work [14] focused on improving classification via self-supervised pre-training. 2) Different motivations: work [14] did not take any measure to address the issue of inaccurate and unstable interpretability in cervical OCT image classification. 3) Different datasets: work [14] used a large amount of unlabeled data for self-supervised pre-training, while we used only a small amount of labeled data. 4) Different image sizes: work [14] used 600×600 image patches, while we used the entire OCT frame of 761×1200. Its three-branch ViT-B network (86M params, with quadratic complexity) has extremely high computational and memory requirements; our RTX A6000 GPU (48 GB) can barely handle full-frame training with a batch size of 2 without memory overflow. 5) Different visualization methods: work [14] used similarity matrices from multi-head attention for visualization, which is only applicable to Transformers; instead, we used the GradCAM method for CNNs.
  2. The improvement from the PAM seems to be marginal. As shown in Table 2, our method significantly increased sensitivity (an average increase of 4.13%, with a maximum of 7.92%). Despite slight decreases in a few specificity values (<0.8%), binary classification accuracy and AUC values improved across the board. Overall, our model significantly enhances classification performance, especially sensitivity. As shown in Fig. 3, PAM visibly improved the activation areas of GradCAM: for MI, CY, and EP images, PAM constrained attention to the basement membrane, cysts, and protrusions; for HSIL and CC images, PAM accurately directed attention to dark areas, avoiding non-tissue areas.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


