Abstract

Whole slide images (WSIs) are the foundation of digital pathology for the diagnosis and treatment of carcinomas. Writing pathology reports is laborious and error-prone for inexperienced pathologists. To reduce the workload and improve clinical automation, we investigate how to generate pathology reports from whole slide images. On the data end, we curated the largest WSI-text dataset (PathText). Specifically, we collected nearly 10000 high-quality WSI-text pairs for visual-language models by recognizing and cleaning pathology reports that narrate diagnostic slides in TCGA. On the model end, we propose the multiple instance generative model (MI-Gen), which can produce pathology reports for gigapixel WSIs. We benchmark our model on the largest subset of PathText. Experimental results show that our model can generate pathology reports that contain multiple clinical clues and achieve competitive performance on certain slide-level tasks. We observe that simple semantic extraction from the pathology reports achieves the best performance (an F1 score of 0.838) on BRCA subtyping, surpassing previous state-of-the-art approaches. Our collected dataset and related code are available at https://github.com/cpystan/Wsi-Caption.
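
The abstract mentions "simple semantic extraction" from generated reports for BRCA subtyping but does not spell out the rule on this page. The sketch below is one plausible reading, not the authors' implementation: it assumes a two-class setting (invasive ductal vs. invasive lobular carcinoma) and uses hypothetical keyword patterns and function names (`SUBTYPE_PATTERNS`, `extract_subtype`, `f1_score`) chosen purely for illustration.

```python
import re
from typing import Optional, Sequence

# Hypothetical keyword patterns (an assumption, not taken from the paper).
SUBTYPE_PATTERNS = {
    "IDC": re.compile(r"\b(invasive|infiltrating)\s+ductal\b", re.IGNORECASE),
    "ILC": re.compile(r"\b(invasive|infiltrating)\s+lobular\b", re.IGNORECASE),
}


def extract_subtype(report: str) -> Optional[str]:
    """Return the subtype whose pattern occurs earliest in the report, or None."""
    best = None  # (match position, label)
    for label, pattern in SUBTYPE_PATTERNS.items():
        match = pattern.search(report)
        if match and (best is None or match.start() < best[0]):
            best = (match.start(), label)
    return best[1] if best else None


def f1_score(y_true: Sequence[str], y_pred: Sequence[Optional[str]], positive: str) -> float:
    """Binary F1 for one class, computed from true/false positives and false negatives."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, each generated report would be passed through `extract_subtype` and the predictions scored against slide labels with `f1_score`. How the paper averages F1 across classes is not stated here, so the per-class form above is only one option.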

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0761_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0761_supp.pdf

Link to the Code Repository

https://github.com/cpystan/Wsi-Caption/tree/master

Link to the Dataset(s)

https://drive.google.com/file/d/1MLXUaqH5Yuv7RfyKW1hIWqnecHNqgZQR/view

BibTex

@InProceedings{Che_WsiCaption_MICCAI2024,
        author = { Chen, Pingyi and Li, Honglin and Zhu, Chenglu and Zheng, Sunyi and Shui, Zhongyi and Yang, Lin},
        title = { { WsiCaption: Multiple Instance Generation of Pathology Reports for Gigapixel Whole-Slide Images } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15004},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a method for generating a textual report (beyond captioning) from an entire high-resolution whole slide image. The authors use a decent-sized WSI dataset, a visual extractor, and an encoder-decoder architecture to extract visual features at the patch level, aggregate them through a position-aware module, and use the aggregated features to generate a text report.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths are tackling the challenge of pathology report generation, working with large and high-resolution data, and systematic evaluation against several existing methods / benchmarks / baselines.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness is that the paper goes beyond captioning / report generation to WSI classification, which isn’t needed, and the results there are not convincing because the comparison is made against older techniques (e.g., H2T and CLAM would be fair comparisons for slide-level tasks).

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Code and data will be made publicly available, but details about how they will be made available are not shared in the manuscript. Several implementation details are missing, such as hyperparameters or algorithmic parameters for the various models.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    no further comments, see comments above

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a novel method for generating pathology reports from entire WSIs on a large dataset. The results are convincing, and the comparisons are favourable.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    (1) The paper designs a novel method to automate the generation of pathology reports from WSIs, addressing the laborious and error-prone nature of manual report writing. (2) By curating the largest WSI-text dataset and introducing the MI-Gen model, the study shows competitive performance in generating reports with multiple clinical clues. (3) This method achieves promising performance on certain slide-level tasks such as tumor subtyping, surpassing previous state-of-the-art MIL methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The paper introduces a novel perspective by generating WSI-level reports, which is different from previous methods that rely on small patches and patch-level descriptions. (2) The pipeline for producing WSI-level report datasets will significantly contribute to future research. (3) This paper offers comprehensive quantitative results for various combinations of visual extractors and encoder-decoders, providing a thorough benchmark.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I am curious about the effectiveness of the prompt (“Please summarize the following pathology report:”) used to generate the report. Have other prompts been explored? It would be beneficial to understand how this specific prompt yields reasonable reports.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    To improve the robustness and accuracy of the generated pathology reports, I suggest considering the integration of additional modalities, such as biomarker data, clinical data, genomic data, and more.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The research in this paper is at the forefront of the field. Its impact could be further enhanced by making the source code and dataset available earlier.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper
    1. On the data side, the authors curated a WSI-text dataset (TCGA-PathoText).
    2. On the model side, the authors proposed the multiple instance generative (MI-Gen) model to generate pathology reports from WSIs.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors proposed a pipeline utilizing OCR and an LLM to convert PDF-format pathology reports to text and curated more than 9K WSI-text pairs for the development of visual-language models at the slide level.
    2. The authors proposed the MI-Gen model to generate text from slides.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The experiments are only conducted on TCGA-BRCA. Extensive evaluation on more than one dataset would be more convincing.
    2. How the model is applied to slide-level tasks needs further elaboration.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    accept as it is.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Strong Accept — must be accepted due to excellence (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The topic of WSI captioning is of significant clinical value and research interest. The authors created the largest WSI-text dataset, proposed the MI-Gen model to translate WSIs into reports, and show decent results.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Thank you so much to the reviewers for their appreciation! We will provide more details as you suggested in the final version. The code and dataset are now public at https://github.com/cpystan/Wsi-Caption




Meta-Review

Meta-review not available, early accepted paper.


