Abstract

This study explored the application of implicit neural representations (INRs) to enhance digital histopathological imaging. Traditional imaging methods rely on discretizing the image space into grids, managed through a pyramid file structure to accommodate the large size of whole slide images (WSIs); however, the continuous mapping capability of INRs, utilizing a multi-layer perceptron (MLP) to encode images directly from coordinates, presents a transformative approach. This method promises to streamline WSI management by eliminating the need for down-sampled versions, allowing instantaneous access to any image region at the desired magnification, thereby optimizing memory usage and reducing data storage requirements. Despite their potential, INRs face challenges in accurately representing high spatial frequency components that are pivotal in histopathology. To address this gap, we introduce a novel INR framework that integrates auxiliary convolutional neural networks (CNN) with a standard MLP model. This dual-network approach not only facilitates pixel-level analysis, but also enhances the representation of local spatial variations, which is crucial for accurately rendering the complex patterns found in WSIs. Our experimental findings indicated a substantial improvement in the fidelity of histopathological image representation, as evidenced by a 3-6 dB increase in the peak signal-to-noise ratio compared to existing methods. This advancement underscores the potential of INRs to revolutionize digital histopathology, offering a pathway towards more efficient diagnostic imaging techniques. Our code is available at https://pnu-amilab.github.io/CINR/
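The reported 3-6 dB gain can be read through the standard definition of the peak signal-to-noise ratio. The following is a minimal Python sketch and is not taken from the authors' code; the helper names `psnr` and `mse_ratio_for_gain` and the toy pixel values are illustrative assumptions, and 8-bit images (peak value 255) are assumed.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(reconstruction):
        raise ValueError("inputs must have the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the mean squared error shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


# Toy data (hypothetical): two reconstructions of the same four pixels.
reference = [10.0, 50.0, 200.0, 120.0]
coarse = [14.0, 46.0, 204.0, 116.0]  # every pixel off by 4 -> MSE = 16
fine = [12.0, 48.0, 202.0, 118.0]    # every pixel off by 2 -> MSE = 4

gain_db = psnr(reference, fine) - psnr(reference, coarse)
print(f"PSNR gain: {gain_db:.2f} dB")
print(f"MSE shrink factor for 3 dB: {mse_ratio_for_gain(3.0):.2f}")
print(f"MSE shrink factor for 6 dB: {mse_ratio_for_gain(6.0):.2f}")
```

Because PSNR is logarithmic in the mean squared error, a gain of 3 dB corresponds to roughly halving the error and a gain of 6 dB to roughly quartering it.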

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/4034_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/4034_supp.pdf

Link to the Code Repository

https://github.com/pnu-amilab/CINR

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Lee_Convolutional_MICCAI2024,
        author = { Lee, DongEon and Park, Chunsu and Lee, SeonYeong and Lee, SiYeoul and Kim, MinWoo},
        title = { { Convolutional Implicit Neural Representation of pathology whole-slide images } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15007},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents an innovative approach to INR (Implicit Neural Representation) that utilizes a new coordinate encoding strategy combined with hash tables at multiple resolutions. The authors enhance their model by integrating an auxiliary CNN with the MLP. This dual-network approach extends the scope of contextual analysis beyond individual pixels, enabling a more precise estimation of pixel values and effectively handling local spatial variations. The experimental results demonstrate a notable enhancement in the restoration of high-frequency details in pathological images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    One of the notable strengths of this article lies in its introduction of a groundbreaking position encoding strategy that employs a hash mapping technique to transform raw coordinates into feature vectors. This approach, described in the Positional encoding section, is both efficient and innovative in its implementation. In addition, the paper presents decent experimental results that demonstrate a notable enhancement in the restoration of high-frequency details in pathological images.
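The idea of mapping raw coordinates to feature vectors through hash tables at multiple resolutions can be sketched as follows. This is a hedged illustration only and is not the authors' implementation: the XOR-with-prime hash follows the style commonly used for multi-resolution hash grids, and the table size, number of levels, feature dimension, and the function names `make_tables` and `encode` are all assumptions chosen for the example. In a real model the table entries would be learned; here they are random.

```python
import math
import random

PRIME_Y = 2654435761  # large prime for the spatial hash (an assumed choice)


def make_tables(num_levels=4, table_size=2 ** 10, feature_dim=2, seed=0):
    """Create one table of small random feature vectors per resolution level."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1e-4, 1e-4) for _ in range(feature_dim)]
             for _ in range(table_size)]
            for _ in range(num_levels)]


def encode(x, y, tables, base_resolution=16, growth=2.0):
    """Encode normalized coordinates (x, y) in [0, 1] into one feature vector.

    At each level the four surrounding grid corners are hashed into the table,
    their features are bilinearly interpolated, and the per-level results are
    concatenated.
    """
    encoded = []
    for level, table in enumerate(tables):
        resolution = int(base_resolution * growth ** level)
        gx, gy = x * resolution, y * resolution
        x0, y0 = math.floor(gx), math.floor(gy)
        tx, ty = gx - x0, gy - y0
        feature = [0.0] * len(table[0])
        for cx, wx in ((x0, 1.0 - tx), (x0 + 1, tx)):
            for cy, wy in ((y0, 1.0 - ty), (y0 + 1, ty)):
                index = (cx ^ (cy * PRIME_Y)) % len(table)  # hash the corner
                for k, value in enumerate(table[index]):
                    feature[k] += wx * wy * value
        encoded.extend(feature)
    return encoded


tables = make_tables()
vector = encode(0.3, 0.7, tables)
print(len(vector))  # num_levels * feature_dim entries
```

The concatenated vector is what a downstream network would consume in place of the raw coordinates; coarse levels vary slowly across the image while fine levels can change between neighboring pixels.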

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The experimental results are mainly based on individual WSI samples, lacking support from large-scale datasets, which is not convincing enough. The differences in the visualization results are subtle and difficult to judge with the naked eye. Quantitative evaluation metrics should be adopted to compare the strengths and weaknesses of different methods. The experimental setup is singular. It is recommended to conduct evaluations on more public datasets to comprehensively verify the generalization capability of the method. In summary, the authors need to supplement more comprehensive experimental analysis and conduct tests on multiple public datasets to enhance the credibility and persuasiveness of the experimental results.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Firstly, the paper is motivated by reducing the storage space for WSIs by using a small number of network parameters to store high-resolution WSIs. However, the paper does not provide any data or experiments to demonstrate how much space or cost this method can actually save, deviating from the central idea. Secondly, the experimental section of the paper is also insufficient: the experimental results are mainly based on individual WSI samples, lacking support from large-scale datasets, which is not convincing enough. The differences in the visualization results are subtle and difficult to judge with the naked eye. Quantitative evaluation metrics should be used for comparison. The experimental setup is singular. It is recommended to evaluate on more public datasets to verify the generalization capability of the method. Finally, the article’s explanation of the application of the model is not very clear. I did not understand whether a separate position encoding module and CINR module need to be trained for each WSI that needs to be stored, or if only the position encoding module needs to be trained and the CINR module is universal.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Firstly, the article slightly deviates from the central idea of saving space for high-resolution WSI, without providing any data or experiments to support this claim. Secondly, the authors need to supplement more comprehensive experimental analysis and conduct tests on multiple public datasets to enhance the credibility and persuasiveness of the experimental results. Finally, the authors need further explanations to enhance the readability and logical coherence of the paper.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors clarified the issues raised and gave some constructive definitions of their proposed methods. Although some obstacles remain to using INR for efficient transmission, storage, and reconstruction of whole slide images, this paper provides a new solution of a kind that is rarely explored in this domain. This should be welcomed in the area of computational pathology. I therefore recommend “Accept” for this paper.



Review #2

  • Please describe the contribution of the paper

    Pathology slide images are very large. They are traditionally viewed using image pyramids and image blocks. This study follows the Implicit Neural Representation (INR) direction, in which a region is recovered by a deep network from its input coordinates. Such methods suffer from a lack of detail and of high-frequency content in the recovered image. To address this problem, the authors introduce a novel INR framework that integrates auxiliary convolutional neural networks (CNN) with a standard MLP model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The method is potentially useful for the practical observation of large high resolution whole slide images. The study addresses a valid problem, that is improving detail in the reconstructed pathology images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    It is not explained why an additional path in the deep network with convolutional neural network layers can achieve the task of improving resolution in the reconstructed images.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The reconstructed images in the experiments in Figure 2 and Figure 4 look the same across the different methods. The superiority of the proposed CINR method is only marginally apparent by visual inspection. The study should address this point and bring out the differences, given that the objective is to improve the ability to observe the images.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is useful and is also demonstrated quantitatively, based on standard measures, to perform better than the other baseline methods.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a novel framework called Convolutional Implicit Neural Representation (CINR) for digital histopathology imaging using pathology whole-slide images. This model combines Implicit Neural Representations with Convolutional Neural Networks to enhance the resolution and clarity of histopathological images. The approach notably reduces memory usage and enhances accessibility compared to traditional methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Novel INR Structure: The integration of CNNs with INRs to form CINR is highly innovative in addressing the challenges of high spatial frequency components in histopathology. The authors adeptly explain the connections between convolution and MLP in terms of implicit neural representations. This fusion of neighboring information and point-wise information, captured by convolutions and MLPs respectively, enables accurate representation of the content-rich image. Additionally, the selection of multi-resolution hash grid encoding plus INR also addresses the issue of huge image sizes in Whole Slide Images (WSIs) in a very intuitive and efficient manner.
    • Improvement in Image Representation: The paper demonstrates a significant increase in the peak signal-to-noise ratio and structural similarity index measures, indicating a higher fidelity in image representations compared to pure INR methods.
    • Potential Clinical Implications: By enabling instantaneous access to any region of a Whole Slide Image (WSI) at the desired magnification, this method could potentially streamline workflows in pathological diagnosis, making it a valuable clinical tool.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper is generally very well-written. However, a potential weakness lies in the results comparison. In Fig. 2, the reconstructed images from different methods appear too similar to each other, making it difficult to conduct a meaningful comparison. Additionally, some minor details about the experiments, such as training loss and the number of images in the test dataset, are missing. Adding these details could enhance the clarity and completeness of the experimental evaluation.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Consider including error maps in Fig. 2 to make the differences clearer.
    • Provide information about how the authors train the method and perform inference.
    • Including the total number of images in the dataset will make the quantitative results more meaningful.
    • It would indeed be interesting to compare the proposed representation to existing pyramid structure representations of Whole Slide Images (WSIs), particularly in terms of loading speed and storage size. This comparison could provide valuable insights into the efficiency and effectiveness of the proposed method relative to established approaches.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed Convolutional Implicit Neural Representation (CINR) provides a very interesting method for handling whole-slide images by enhancing image resolution and clarity, reducing memory usage, and improving accessibility. The integration of CNNs with INRs, the demonstrated improvements in image fidelity, and the potential clinical implications are pivotal strengths that significantly outweigh the minor weaknesses related to experimental details. These contributions assure its impact and relevance to both academic research and clinical practice, making a strong case for its acceptance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We appreciate the thoughtful analysis provided by the reviewers. Here are our responses to the concerns raised by Reviewers R1, R3, and R5:

Resolution Improvement with Convolutional Layers (R1): As outlined in Chapter 1, a batch containing all pixels from a single image undergoes processing in the MLP flow that is similar to multi-channel convolution with a 1x1 kernel. Our auxiliary flow introduces convolutions with a 3x3 kernel, which engage both the target pixel and its adjacent positions and thereby effectively handle local spatial fluctuations. This concept is expanded in Chapter 2.2. Our intuition was that encoded vectors from adjacent positions provide additional information about high-frequency components. We appreciate this comment and are working on linking this intuition with mathematical explanations.

Different Methods Look the Same in Result Figures (R1, R3, R5): We acknowledge that differences in the reconstruction results are not distinct to the naked eye, because the WSI sections are still large at their original size while the figure size in the manuscript is limited. To address this, we included difference images and spectral images as alternatives in the paper. We will add zoomed images focusing on specific local spots to highlight differences between methods more clearly.

Lacking Support from Large-Scale Datasets (R3): INR differs from traditional frameworks that use large-scale image sets for training and testing. In INR, the “training set” comprises the pixels within a single image, which, due to the immense size and pixel count of a whole slide image, effectively constitutes a large-scale dataset. Furthermore, in INR, the concept of a “test dataset” is ambiguous, since inference involves restoring an image used in training rather than a separate test image.

More Public Datasets for Testing (R3): Our study utilized a publicly available pathology image dataset from The Cancer Genome Atlas, known for its wide use in numerous studies. We selected five random WSIs for thorough evaluation of our model. Each WSI is substantial, and testing across five independent images provides a robust assessment of reconstruction quality. However, to address potential bias from using a single data source, we agree that incorporating additional datasets could further verify our method’s capability.

Compression (R3): Image compression, introduced as a potential benefit of INR in Chapter 1, requires accurate reconstruction of complex WSIs as a precondition. Currently, our focus is on enhancing image restoration, with discussions on compression and multi-resolution representations planned for future work.

Quantitative Evaluation Metrics (R3): We used reference-based metrics such as PSNR and SSIM, which are standard in INR studies, to demonstrate our framework’s performance.

Encoding and CINR Module (R3): Both the position encoding module and the CINR module are trained for each WSI, with parameters stored accordingly.

Training and Testset Details (R5): The training loss was the MSE of pixel values, with tests conducted on five WSIs as shown in Fig. 3. These details will be elaborated in the revised manuscript.

Training and Inference Processes (R5): During training, each WSI was segmented into overlapping patches, and each batch of 100 patches was processed through the CNN and MLP modules in parallel. During inference, pixel positions are defined by the original resolution and segmented into non-overlapping patches. This detail will be expanded upon in the revised manuscript.

Comparison with Existing WSI Representation (R5): Our method can reconstruct images quickly (around 1 min. per 10k x 10k image). For compression purposes, reducing the encoding and network parameters is an ongoing challenge and beyond the scope of the current study, but it remains a focus for our future research. It would indeed be beneficial to detail the current state of our method regarding loading speed and storage efficiency, especially in comparison with conventional frameworks. Thanks for highlighting this aspect.
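The relationship between the MLP flow and convolution described in the response to R1 can be illustrated with a small, dependency-free Python sketch. It is not the authors' CINR code: biases and activations are omitted, a single layer is shown, and the names `linear_per_pixel`, `conv3x3`, and `max_abs_diff` as well as the toy weights are hypothetical. The sketch checks that a 3x3 kernel whose only non-zero tap is the center reproduces the per-pixel (1x1) layer, and that adding a neighbor tap makes each output depend on adjacent positions.

```python
def linear_per_pixel(features, weight):
    """Apply one linear layer independently at every pixel (a 1x1 convolution).

    features: H x W x C_in nested lists; weight: C_out x C_in nested lists.
    """
    return [[[sum(w * f for w, f in zip(row_w, pixel)) for row_w in weight]
             for pixel in row]
            for row in features]


def conv3x3(features, kernel):
    """3x3 convolution with zero padding.

    kernel[dy][dx] is a C_out x C_in matrix for the offset (dy - 1, dx - 1).
    """
    height, width = len(features), len(features[0])
    c_out = len(kernel[0][0])
    output = [[[0.0] * c_out for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dy in range(3):
                for dx in range(3):
                    ny, nx = y + dy - 1, x + dx - 1
                    if 0 <= ny < height and 0 <= nx < width:
                        for o, row_w in enumerate(kernel[dy][dx]):
                            output[y][x][o] += sum(
                                w * f for w, f in zip(row_w, features[ny][nx]))
    return output


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two H x W x C maps."""
    return max(abs(u - v)
               for row_a, row_b in zip(a, b)
               for px_a, px_b in zip(row_a, row_b)
               for u, v in zip(px_a, px_b))


# Toy encoded feature map (3 x 3 pixels, 2 channels) and layer weights.
features = [[[float(y + x), float(y * x)] for x in range(3)] for y in range(3)]
weight = [[1.0, -2.0], [0.5, 3.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

center_only = [[zeros] * 3 for _ in range(3)]
center_only[1][1] = weight          # only the center tap is non-zero
with_neighbor = [row[:] for row in center_only]
with_neighbor[1][2] = weight        # additionally use the right-hand neighbor

mlp_out = linear_per_pixel(features, weight)
same = max_abs_diff(conv3x3(features, center_only), mlp_out)
changed = max_abs_diff(conv3x3(features, with_neighbor), mlp_out)
print(same, changed)
```

With only the center tap, the convolution output matches the per-pixel layer; once a neighbor tap is non-zero, each output also reflects the encoded vector at the adjacent position, which is the extra local information the auxiliary flow is said to exploit.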




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors addressed the issues raised by the reviewers well during the rebuttal.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The authors addressed the issues raised by the reviewers well during the rebuttal.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors have done a good job in the rebuttal, and all reviewers accepted the work.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The authors have done a good job in the rebuttal, and all reviewers accepted the work.


