Abstract

Diagnosing medical conditions from histopathology data requires a thorough analysis across the various resolutions of Whole Slide Images (WSI). However, existing generative methods fail to consistently represent the hierarchical structure of WSIs due to a focus on high-fidelity patches. To tackle this, we propose Ultra-Resolution Cascaded Diffusion Models (URCDMs) which are capable of synthesising entire histopathology images at high resolutions whilst authentically capturing the details of both the underlying anatomy and pathology at all magnification levels. We evaluate our method on three separate datasets, consisting of brain, breast and kidney tissue, and surpass existing state-of-the-art multi-resolution models. Furthermore, an expert evaluation study was conducted, demonstrating that URCDMs consistently generate outputs across various resolutions that trained evaluators cannot distinguish from real images. All code and additional examples can be found on GitHub.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0770_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0770_supp.pdf

Link to the Code Repository

https://github.com/scechnicka/URCDM

Link to the Dataset(s)

https://www.nejm.org/doi/10.1056/NEJMp1607591

https://www.cancer.gov/ccg/research/genome-sequencing/tcga

BibTex

@InProceedings{Cec_URCDM_MICCAI2024,
        author = { Cechnicka, Sarah and Ball, James and Baugh, Matthew and Reynaud, Hadrien and Simmonds, Naomi and Smith, Andrew P.T. and Horsfield, Catherine and Roufosse, Candice and Kainz, Bernhard},
        title = { { URCDM: Ultra-Resolution Image Synthesis in Histopathology } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15004},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a novel method utilizing Ultra-Resolution Cascaded Diffusion Models (URCDMs) to generate high-quality, realistic histopathology images at the Whole Slide Imaging (WSI) scale, which is a pioneering achievement in the field. The approach effectively captures intricate details at various magnifications and facilitates long-range contextual understanding, overcoming the memory limitations observed in attention-based models. Importantly, it accomplishes this with significantly reduced computational resources, enabling efficient image generation, particularly in data-intensive WSI learning scenarios.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper conducts comprehensive experiments on three diverse datasets.
    2. Addressing the topic of generating WSIs holds clinical significance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Lack of experiments utilizing the proposed methods as an augmentation technique to enhance patch classification and WSI classification.
    2. The computational requirements of the paper are significant, but there is a lack of detailed analysis on this aspect.
    3. The technique described in the paper is not depicted clearly. The meaning of the blue, red, and green boxes in Fig. 1 remains unclear.
    4. Code availability would greatly aid in understanding the proposed method.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    see weakness

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although this task is interesting and significant, the technique is poorly introduced and some important experiments are missing.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors have addressed my first question regarding the clarity of the method description and my second question about computation cost, promising to include the computation cost analysis in the revised manuscript. However, they did not respond to my question about the downstream task. Therefore, I change my rating to weak accept.



Review #2

  • Please describe the contribution of the paper

    The paper introduces an innovative method called Ultra-Resolution Cascaded Diffusion Models (URCDM) to generate high-fidelity, photorealistic histopathology images across multiple magnifications of Whole Slide Images (WSI). This method addresses the challenge of representing the hierarchical structure of WSIs by synthesizing entire images, capturing detailed anatomical and pathological features. Evaluated on three distinct datasets (brain, breast, kidney tissues), URCDMs have shown to surpass existing state-of-the-art models and produce images indistinguishable from real ones by trained evaluators.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. The paper employs a three-stage cascade structure to capture detailed features at multiple magnifications effectively. This novel approach allows for a fine-grained synthesis of images, enhancing the resolution and detail fidelity at each subsequent stage.
    2. Similar to cascaded super-resolution methods, each stage of the image generation process in this paper is conditioned on the output from the previous stage, ensuring continuity and consistency with the high-level structure of the original image. This method significantly enhances the coherence and realism of the synthesized histopathology images, maintaining important contextual relationships.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1. The paper lacks a detailed and clear explanation of the URCDM image generation process, particularly the mechanics of transitioning between different stages of the model. This omission hampers the reader’s understanding of the operational framework and theoretical underpinnings of the proposed method.
    2. The presentation of experimental results is limited to a few examples, which does not sufficiently demonstrate the robustness or effectiveness of the proposed methods across varying conditions or datasets.
    3. The authenticity and applicability of the synthesized images have only been validated on one of the three datasets used in the study. This narrow scope of validation raises concerns about the generalizability of the results across different types of histopathology data.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    It is necessary to provide a detailed description of the overall framework of the proposed Ultra-Resolution Cascaded Diffusion Models. For example, it is not explained in detail how the 1024x1024 image generated in the first stage is upsampled to generate the 6400x6400 image in the second stage.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (1) The authors need to provide a detailed description of how to use the image patches from the previous stage as conditions to ensure that the generated patches are consistent with the high-level structure of the image.
    (2) It is necessary for the authors to explain how to ensure the authenticity of the generated pathological images in order to apply them to downstream tasks such as classification, segmentation, and recognition.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Utilizing diffusion models to generate pathological images is quite common, however, generating whole slide images using these models is a relatively rare endeavor. Therefore, this work is both intriguing and insightful.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    My decision remains unchanged; the rebuttal lacks new perspectives or results that would warrant an improved score.



Review #3

  • Please describe the contribution of the paper

    The authors introduced a new method for generating high-quality synthetic whole slide images (WSI) in histopathology using cascaded diffusion models. The effectiveness of the method was demonstrated through a comprehensive analysis and evaluation by experts.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. They contributed to a new research direction for WSI in pathology, which had previously focused primarily on high-quality patch-by-patch studies.
    2. Their expert evaluation provides a balanced view, highlighting both the strengths and weaknesses of their approach.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Need to synthesize high-resolution WSIs: Given that current pathology analysis algorithms mainly rely on a patch-by-patch approach, I have a concern about the need to generate high-resolution whole slide images (WSIs) that differ from studies such as [1].
    2. Validity of expert assessment: If synthetic WSIs are of such a quality that their authenticity can be easily judged by experts familiar with certain shortcuts, it is questionable whether it is a meaningful experiment to assess their authenticity.

    [1] Aversa, Marco, et al. “Diffinfinite: Large mask-image synthesis via parallel random patch diffusion in histopathology.” Advances in Neural Information Processing Systems 36 (2024).
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Performing downstream analysis using real and synthetic WSIs seems like a good way to validate the usefulness of synthetic WSIs.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experimental results and discussion of the results were a major factor in the overall score.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    This work pioneered WSI generation work, bringing interesting research directions to the domain.




Author Feedback

We thank the reviewers for their thoughtful feedback and time. The reviewers agree on pioneering methodological novelty, commend the new research direction, clinical significance, and thorough evaluation.

(R1, R4) Clarity of method description: We use a cascaded diffusion model approach. (1) The first of three CDMs generates a low-resolution WSI (1024x1024 pixels). (2) The results from the first stage are enhanced by using overlapping patches of the generated low-resolution image as conditioning for the second CDM. This second model uses the spatial context provided by the first to generate higher-resolution images (also 1024x1024 pixels) for each patch’s center. These patches are then stitched together, taking into account the overlaps, to form a medium-resolution image of 6400x6400 pixels. (3) The process is repeated with the third CDM to achieve a final, high-resolution synthetic WSI of 41,344x41,344 pixels. This method allows each stage of magnification to build upon the last, refining details and expanding the image size. Inpainting ensures seamless integration of patches, avoiding artifacts and ensuring that synthetic images are useful for both computational analysis and practical clinical applications. As explained in the caption of Figure 1, the blue box is the conditioning patch, the green box signifies the center patch that is generated, and the red box shows the output of size 1024x1024 for each image. We will use colour font to highlight this in the caption.
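
To make the tile arithmetic in the description above concrete, here is a minimal sketch of how 1024-pixel tiles could be laid out over the 6400x6400 and 41,344x41,344 canvases along one axis. It is illustrative only and is not taken from the authors' code: the function names `tile_origins` and `min_coverage` are hypothetical, and the 128-pixel overlap is an assumption chosen because it makes both quoted image sizes tile exactly (1024 + 6*896 = 6400 and 1024 + 45*896 = 41,344). The rebuttal does not state the overlap, and the actual method generates patch centers and uses inpainting rather than the plain overlapping layout shown here.

```python
def tile_origins(canvas: int, tile: int, overlap: int) -> list[int]:
    """Start offsets of tiles of width `tile` that cover [0, canvas) along one
    axis, with neighbouring tiles sharing at least `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if tile >= canvas:
        return [0]
    stride = tile - overlap
    origins = list(range(0, canvas - tile, stride))
    origins.append(canvas - tile)  # last tile sits flush with the far edge
    return origins


def min_coverage(canvas: int, tile: int, overlap: int) -> int:
    """Smallest number of tiles covering any pixel (at least 1 means no gaps)."""
    counts = [0] * canvas
    for start in tile_origins(canvas, tile, overlap):
        for i in range(start, min(start + tile, canvas)):
            counts[i] += 1
    return min(counts)


if __name__ == "__main__":
    # Sizes quoted in the rebuttal; the overlap of 128 is an assumed value.
    for canvas in (6400, 41344):
        origins = tile_origins(canvas, tile=1024, overlap=128)
        print(canvas, len(origins), min_coverage(canvas, 1024, 128))
```

Under the assumed overlap, this layout places 7 tiles per axis for the medium-resolution stage and 46 per axis for the final stage, which gives a rough sense of how many model evaluations each stage of such a cascade would need.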

(R4) Downstream experiments/computational costs: Our method is particularly useful for niche domains where limited or no public data is available; kidney pathology is thus a prominent example in our work. We also integrate public datasets a) to show that our method scales to other domains and b) to foster reproducibility. To the best of our knowledge, only very few works focus on large-scale dependencies, and we are the first to synthesize at WSI scale. Using synthetic patch data as augmentation has been evaluated in the literature before. As stated in the introduction, our main aim is to make such data publicly available through synthesis and to enhance other data, e.g., for bias mitigation. This is essential when downstream tasks depend on WSI-level assessment, such as structural assessment in the kidney, in contrast to, e.g., localized cell anomalies in cancer. We will clarify computational requirements in the camera-ready version. Note that synthesis only needs to be done once, e.g., before a synthetic dataset is shared.

(R3, R4) Expert validation: To the best of our knowledge, our work is the first to show synthetic data that is essentially indistinguishable from real data for experts. At present, we only have access to kidney pathologists and will expand this evaluation in the future, since human expert user studies need to be carefully balanced with the workload of rare experts. Note that the shortcut mechanism (R3) is extremely rare and cannot be used to reliably identify synthetic data at scale. We will clarify this in the paper.

(R3, R4) usefulness of synthetic WSIs: Many patch-based pathology algorithms rely on “bags of patches” sourced from complete WSIs; therefore, synthesizing the entire WSI is crucial for providing realistic and structurally accurate training data. Furthermore, the generation of high-resolution WSIs mirrors clinical practice and can also be used for human training on top of data sharing, augmentation, and bias mitigation options as discussed above. We will discuss differences of large-mask synthesis (Aversa et al.) vs. whole WSI generation in the paper.

(R1) Only a few examples: We evaluate on three different datasets, and the supplement shows several visual examples. We will add more visual examples to the supplement in the final version.

(R3, R4) We will publish our code together with a synthetic dataset from our kidney data with the camera-ready paper as a public GitHub repository.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Based on the reviews and author feedback, I recommend accepting this paper. The reviewers collectively appreciate the methodological novelty of the Ultra-Resolution Cascaded Diffusion Models (URCDM) which generate high-fidelity, photorealistic histopathology images across multiple magnifications. While there are some concerns about the clarity of the method and limited experimental demonstrations, these issues are not severe enough to outweigh the benefits.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



