Abstract

Prediction of genetic biomarkers, e.g., microsatellite instability and BRAF in colorectal cancer, is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, and (2) modeling the interaction among the fine-grained pathological components in WSIs. Specifically, we leverage large language models to generate medical prompts that serve as prior knowledge in extracting instances associated with genetic biomarkers. We adopt a coarse-to-fine approach to mine biomarker information within the tumor microenvironment. This involves extracting instances related to genetic biomarkers using coarse medical prior knowledge, grouping pathology instances into fine-grained pathological components, and mining their interactions. Experimental results on two colorectal cancer datasets show the superiority of our method, achieving 91.49% in AUC for MSI classification. The analysis further shows the clinical interpretability of our method. Code is publicly available at https://github.com/DeepMed-Lab-ECNU/PromptBio.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0622_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/DeepMed-Lab-ECNU/PromptBio

Link to the Dataset(s)

https://portal.gdc.cancer.gov/

https://pdc.cancer.gov/pdc/

BibTex

@InProceedings{Zha_Prompting_MICCAI2024,
        author = { Zhang, Ling and Yun, Boxiang and Xie, Xingran and Li, Qingli and Li, Xinxing and Wang, Yan},
        title = { { Prompting Whole Slide Image Based Genetic Biomarker Prediction } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15004},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors propose a genetic biomarker prediction method via prompting techniques. The experimental results on two CRC cohorts show that the proposed method achieves higher classification accuracy than the compared methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Predicting the microsatellite instability and the mutation of gene BRAF based on the histopathological images.
    2. Applying the LLM model, i.e., PLIP, to select the ROIs in WSIs that are associated with different TME components.
    3. Evaluating the performance of the proposed method on both the TCGA and CPTAC datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The novelty of the proposed method is limited. Although applying the PLIP model to select patches of a specific category is feasible, it is hard to judge whether the selected patches meet the requirement.
    2. Applying a Transformer to characterize the association among different patches is already common in existing studies, such as the following: a. Wang, X., Yang, S., Zhang, J., Wang, M., Zhang, J., Yang, W., Huang, J. and Han, X., 2022. Transformer-based unsupervised contrastive learning for histopathological image classification. Medical image analysis, 81, p.102559. b. Xu, H., Xu, Q., Cong, F., Kang, J., Han, C., Liu, Z., Madabhushi, A. and Lu, C., 2023. Vision transformers for computational histopathology. IEEE Reviews in Biomedical Engineering.
    3. The experiments are conducted without cross-validation, so it is hard to say that the proposed method is significantly superior to the compared methods.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please see the comments above

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The novelty of the proposed method is limited and does not meet the standard of the MICCAI conference.
    2. Without cross-validation, it is hard to judge the superiority of the proposed method.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a multi-modality model for genetic biomarker prediction from a vision-language model (VLM) and whole slide images. The pipeline comprises 3 steps: (1) selection of patches belonging to 9 tissue classes using PLIP, (2) fine-grained pathological component grouping based on the prototype features extracted from pathology text prompts, and (3) Transformer-based pathological component mutual attention. The proposed method was evaluated on three datasets, and the results showed superior performance over other pre-existing methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1). Well written and easy to follow.
    (2). A simple solution to integrate multi-modal information for an important clinical problem.
    (3). Comprehensive experimental evaluation, and the improvements in performance are with large margins in comparison with other methods.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    From the methodological point of view, the novelty is limited. This paper is not the first to introduce the concept of integrating image and text for pathological analysis [1] [2]. Also, the techniques adopted have been widely used in related work; e.g., the grouping method is borrowed from [3], and step 3, which uses a Transformer for multi-token aggregation, is a common operation.

    [1] Qu et al, “The rise of ai language pathologists: Exploring two-level prompt learning for few-shot weakly-supervised whole slide image classification”

    [2] Lu et al, “Visual Language Pretrained Multiple Instance Zero-Shot Transfer for Histopathology Images”.

    [3] Chen et al., “Multimodal co-attention transformer for survival prediction in gigapixel whole slide images”

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The selection procedure in Fig. 1 should be described in more detail.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the novelty in methodology is limited, this is a well-written paper, and the experiments and the corresponding analysis are relatively thorough.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    The rebuttal is not very convincing. The authors do not address my concerns regarding the novelty w.r.t. pre-existing works. (1) Integrating LLM features with WSI is not a new concept. (2) Transferring the idea used in [3] from genetic & WSI to text & WSI is straightforward. Overall, the implementation details may differ slightly from existing works, but no significant contribution is introduced.



Review #3

  • Please describe the contribution of the paper

    The authors propose a framework that significantly advances the automatic prediction of genetic biomarkers from Whole Slide Images (WSIs), demonstrating a substantial improvement over existing state-of-the-art methods. Specifically, this method leverages the vision-language model, PLIP, to effectively select and integrate features from WSIs and associated textual data. For the textual analysis, the framework utilizes ChatGPT to generate four pathology text prompts that encapsulate the salient features of the WSIs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The effective utilization of the current vision-language model to perform instance selection, showcasing the model’s capability in accurately identifying relevant instances within the data. 2) The innovative adoption of a Large Language Model (LLM) to generate comprehensive and precise pathology descriptions, enhancing the interpretability and utility of the findings.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    First, it lacks a detailed discussion on the influence of the number of pathology text descriptions, which could provide further insight into the generalizability and robustness of the model. Additionally, the results presented in Table 2 from the ablation study reveal an approximate 4% drop in performance when using the Selection + PGG combination on the TCGA(MSI) dataset, a point that remains unexplained. Furthermore, there is no exploration of the combination of Selection+PCIM in the study. Also, 0.9156 is better than 0.9149, so it should be the value shown in bold.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The ablation study should be conducted to thoroughly investigate the influence and impact of each component within the proposed framework. This detailed examination is crucial for demonstrating the individual contributions and effects of the components to the overall efficacy of the model.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall score for this paper is primarily influenced by the innovative approach of the method and the extensive experimental validation provided to support its effectiveness.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Rebuttal: We thank all reviewers and the AC for constructive feedback. First, we do not simply integrate image and text or apply a Transformer to integrate multiple tokens. Our main contributions are (1) a coarse-grained pathological instance selection strategy, (2) a prompt-guided fine-grained pathological component (PC) grouping strategy, and (3) a strategy for mining the hierarchical contextual interaction within and between PCs, which enhances tumor microenvironment (TME) modeling. We will release the code once accepted. Next, we will address the major concerns one by one.

1) @R1 @R3: the novelty. We argue our method is novel. We propose a coarse-to-fine approach to mine biomarker information, introduce text prompts to group instances, and focus on modeling the TME hierarchically. 1.1) Compared to [1], we do not simply integrate image and text. [1] guides the aggregation of instances by using the similarity between image and text features as pooling weights, whereas we group instances into different PCs based on similarity. We compared ours with [1] in the 8th row of Table 1 in the paper, showing ours is superior. Compared to [2], our motivation for integrating image and text differs. [2] aims to align image and text features, whereas we introduce aligned text features to group instances into PCs in the TME. 1.2) Compared to [3], (a) the grouping objects differ: [3] groups genes, but we group instances of WSIs. (b) The grouping criteria differ: [3] groups genes based on similar biological functions, but we group instances guided by text prompts related to genetic biomarkers. (c) The grouping purposes differ: [3] groups genes to reweight WSI embeddings, but we group instances to further learn the intra- and inter-PC interactions in the TME. 1.3) We do not simply merge multiple tokens with a Transformer [a,b]; instead, we focus on modeling the TME hierarchically. The Transformer layers are designed to learn features of the TME hierarchically, covering intra-PC interaction (the first Transformer layer aggregates tokens within each PC) and inter-PC interaction (the second Transformer layer aggregates all pooled tokens of the PCs).
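
The group-then-aggregate idea in point 1 can be pictured with a minimal sketch. It is not the authors' implementation: it assumes each instance is assigned to the text prompt with the highest cosine similarity (the rebuttal only says grouping is "based on similarity"), it uses plain mean pooling as a stand-in for the two Transformer layers, and the function names and 2-D toy embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_by_prompt(instances, prompts):
    """Assign each instance index to its most similar prompt (assumed rule)."""
    groups = {k: [] for k in range(len(prompts))}
    for idx, x in enumerate(instances):
        sims = [cosine(x, p) for p in prompts]
        groups[sims.index(max(sims))].append(idx)
    return groups


def mean_pool(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def hierarchical_pool(instances, groups):
    """Stage 1 pools tokens within each PC; stage 2 pools the PC tokens.

    Mean pooling is only a placeholder for the Transformer layers
    described in the rebuttal.
    """
    pc_tokens = [
        mean_pool([instances[i] for i in idxs])
        for idxs in groups.values()
        if idxs  # skip components that received no instance
    ]
    return mean_pool(pc_tokens)


# Toy example: four instance embeddings, two prompt embeddings.
instances = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]]
prompts = [[1.0, 0.0], [0.0, 1.0]]
groups = group_by_prompt(instances, prompts)
slide_embedding = hierarchical_pool(instances, groups)
```

In this toy case the first two instances fall into the first component and the last two into the second, so the slide-level vector is the mean of the two per-component means rather than the mean over all instances directly.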

2) @R3: the selected patches meet the requirement. 2.1) From medical knowledge: stroma has been proven to be related to gene biomarkers. 2.2) From pathologists: pathologists have validated that the selected patches in Fig. 2 indeed belong to stroma and that they contain inflammatory response and lymphatic infiltration related to gene biomarkers. 2.3) From results: in Table 2, the ablation in the 2nd and last rows shows that selecting stroma patches enhances our method.

3) @R3: without cross-validation. In WSI tasks, including biomarker prediction, experiments are widely conducted with a hard split of the dataset and without cross-validation (Chen R, UNI, Nature Medicine 2024; Lin T, IBMIL, CVPR 2023; Jin T, GiMP, MICCAI 2023; Wang T X, PALHI, ISBI 2020). We follow their settings. Furthermore, with 4-fold cross-validation on the TCGA(MSI) dataset, our method still outperforms the 2nd-best TOP and 3rd-best DSMIL by 5.63% and 5.66% in AUC.
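
For readers unfamiliar with the distinction in point 3, the following sketch shows how a 4-fold cross-validation split differs from a single hard split. It is illustrative only: the rebuttal does not say how the folds were built, so the shuffling, the seed, and the helper name `k_fold_indices` are assumptions, and a real WSI study would typically split at the patient level and stratify by label.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train, val) index lists for k-fold cross-validation.

    Every sample appears in exactly one validation fold, so each
    slide is used for evaluation once across the k runs.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


# Example: 10 hypothetical slides split into 4 folds.
splits = list(k_fold_indices(10, k=4))
```

A reported metric would then be the mean (and spread) of the per-fold AUCs, whereas a hard split gives a single number from one fixed test set.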

4) @R4: ablation. 4.1) Number of text descriptions: we adjust the seed of the inference parameters in GPT-4 and regenerate 5 descriptions. The result difference on the TCGA(MSI) dataset is 0.06%, indicating that varying the number of pathology text descriptions has little impact on the result. 4.2) Explanation of the performance drop using the Selection+PGG combination: the reasons for the performance drop were explained in Sec. 3.3 of the paper (1st line on Page 8). Without PCIM (simply aggregating multiple tokens), the intra- and inter-PC interactions are not mined. 4.3) Exploration of Selection+PCIM: we explored Selection+PCIM in the paper (see “K-means” in the 4th row of Table 2). Without PGG, instances are not grouped into different PCs and PCIM cannot be applied. Replacing PGG with K-means results in a performance drop, showing the importance of PGG. We further replaced PGG with random grouping, which leads to an approximate 4.1% drop on the TCGA(MSI) dataset. Both results indicate that PGG can better model fine-grained PCs related to genetic biomarkers in the TME.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    NA

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NA



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    NA

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    NA


