Abstract

Immunohistochemical (IHC) staining plays a pivotal role in the evaluation of numerous diseases. However, the standard IHC staining process involves a series of time-consuming and labor-intensive steps, which severely hinders its application in histopathology. With the rapid advancement of deep learning techniques, virtual staining shows promising potential to address this issue. However, it has long been challenging to effectively provide supervision for networks using consecutive tissue slices. To this end, we propose a weakly supervised pathological consistency constraint acting on multiple layers of a GAN. Because receptive fields vary across network layers, weakly paired consecutive slices exhibit different degrees of alignment at each layer. We therefore allocate adaptive weights to different layers to dynamically adjust the supervision strength of the pathological consistency constraint. Additionally, although GANs are effective deep generative models capable of producing high-fidelity images, they suffer from the issue of discriminator failure. To tackle this issue, we propose a discriminator contrastive regularization method, which compels the discriminator to contrast the differences between generated images and real images from consecutive layers, thereby enhancing its capability to distinguish virtual images. Experimental results demonstrate that our method robustly generates IHC images from H&E images and accurately identifies cancer regions, achieving superior results compared to previous methods.
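To make the adaptively weighted multi-layer constraint concrete, a minimal PyTorch sketch is given below. The feature lists, the per-layer weights, and the L1 distance are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import torch.nn.functional as F

def multilayer_consistency_loss(feats_fake, feats_real, layer_weights):
    """Adaptively weighted multi-layer consistency loss (illustrative sketch).

    feats_fake / feats_real: lists of feature maps extracted from several
    layers (e.g. via forward hooks on the generator encoder) for the
    generated IHC image and the weakly paired real IHC image.
    layer_weights: one adaptive weight per layer, reflecting how well the
    consecutive slices align at that layer's receptive field.
    """
    total_w = sum(layer_weights)
    loss = 0.0
    for f_fake, f_real, w in zip(feats_fake, feats_real, layer_weights):
        # Better-aligned layers receive larger weights, i.e. stronger supervision.
        loss = loss + (w / total_w) * F.l1_loss(f_fake, f_real)
    return loss
```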

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3332_paper.pdf

SharedIt Link: https://rdcu.be/dY6io

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72083-3_11

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_Exploiting_MICCAI2024,
        author = { Li, Yueheng and Guan, Xianchao and Wang, Yifeng and Zhang, Yongbing},
        title = { { Exploiting Supervision Information in Weakly Paired Images for IHC Virtual Staining } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15004},
        month = {October},
        pages = {113 -- 122}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a contrastive-learning-based GAN framework for IHC virtual staining. The framework involves two proposed contrastive losses: (1) an adaptive pathological consistency constraint for the generator, and (2) a contrastive regularization loss for the discriminator. The authors validate the effectiveness of these loss functions on virtual staining tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The experimental results show the proposed WPCC, adaptive weights, and the discriminator regularization loss are effective.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Lack of discussion of ASP: ASP is an important baseline method with an adaptive contrastive loss. It would be helpful if the authors discussed this baseline and its differences from the proposed method.
    • Lack of thorough analysis of the results: the authors only provide quantitative comparisons and some visualizations, without discussing the results.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The method seems reproducible according to the method descriptions.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • The consistency constraint doesn’t seem related to pathology. It’s sort of a representation consistency constraint.
    • The authors should discuss the experimental results in more depth. For example, why is the proposed method better than ASP? How much does the regularization help in discriminating between real and fake IHC images? Are there any failure cases? Why does pyramid Pix2Pix outperform the proposed method on HER2 (BCI)?

    I am open to raising my score if my concerns are addressed clearly.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please refer to the above sections

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents a new technique for IHC virtual staining. There are two main problems in virtual staining, namely: (i) ensuring the accuracy of pathological information in virtual images, and (ii) correctly using information from consecutive tissue slices. In this paper, these problems are tackled through two respective techniques: (i) weakly supervised pathological consistency constraints (WPCC), and (ii) adaptive weights for the WPCC. Moreover, the paper addresses a problem common to GAN-based models, namely the performance of the discriminator. To mitigate that problem, the authors propose a discriminator contrastive regularization approach.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper presents a virtual staining method that effectively tackles significant challenges in generating precise IHC images without requiring finely aligned datasets. First, it introduces weakly supervised pathological consistency constraints along with adaptive weights, addressing the critical issue of maintaining pathological accuracy in the generated virtually stained images. Second, the proposed discriminator contrastive regularization strategy targets a common problem affecting the performance of GAN-based discriminators. This approach has the potential to significantly enhance model stability and performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper is generally well-written but requires further clarification in several parts:

    • In the first proposal, there is a mistake in the definition of expression (2): it should involve the embeddings of the patches from the generated images.
    • The motivation behind the selection of specific metrics, such as the Jensen-Shannon divergence and the Pearson correlation coefficient, needs clarification.
    • In the second proposal, the strategy for gradually including the adaptive weights is not clearly explained and deserves further comment. The same concern arises regarding the motivation for the Pearson correlation coefficient. Moreover, the authors do not clarify whether the normalization factor is computed as the sum of all the weights.
    • For the ablation study, the authors have selected only two stains, although there are three other cases in Table 2. Analyzing HER2_MIST would also be interesting to determine whether this type of staining exhibits coherent behavior.
    • There are a few mistakes and typos noted in the Comments section of this review, such as the incorrect definition in expression (2) and potential inconsistencies in variable definitions across sections.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper provides a detailed methodology, but there are several areas where more detailed explanations are necessary to fully reproduce the experiments.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Minor comments and questions are noted below. Regarding the second proposal:

    • As mentioned earlier, the strategy for gradually including the adaptive weights is not clearly explained. Please provide further comments on this concept but move the specific values (41st epoch) to Section 3.

    Regarding the third proposal, which aims to improve the performance of the discriminator:

    • What do the authors mean by “the all but the last layer …”?
    • Please ensure consistency with the definitions of variables that have been proposed. In Section 3.1, “v” and “y” are used for patches of the original and generated images, respectively. However, in Section 3.3, the authors use “y” and “^y”, respectively.
    • Check expression (6) and relate it to the cosine similarity.
    • Loss_Adv in expressions (7) and (8) has not been defined.

    In Section 3 Experiments:

    • It would be valuable to illustrate the stabilization capabilities of the proposed regularization loss averaged across a set of experiments rather than in just a single realization.

    Other minor changes:

    • Page 3: “… through the generator encoder …”
    • Page 3: “… in a similar way …”
    • Page 3: “… from the l-th layer of the …”
    • Page 6: Please check the sentence “While other methods … positive images.” Also, review the link with the previous sentence and the sentence itself.
    • Page 7: I would not use the verb “degenerates”.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Given the novel contributions and the significance of the tackled problems, a weak accept recommendation is justified. However, addressing the highlighted issues regarding clarity, variable consistency, and additional validation could strengthen the paper substantially. The potential impact of the proposed techniques in the field of virtual staining supports the decision to encourage further development and refinement.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a framework for IHC virtual staining from H&E slides, including a multi-layer consistency constraint, adaptive weights for tissue alignment, and a discriminator contrastive regularization loss.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • good writing

    • interesting and clear idea
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper misses some further testing, such as a qualitative assessment of the generated images (fake/real) and the impact of the generated images on a classification task's accuracy.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    We advise the authors to release the source code upon acceptance of the submission, for better reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • It’s not clear which encoders are the same or different. Please revise the text and Fig.1.

    • I suggest including a classification task from generated images as part of the evaluation (if possible, with a public dataset). Also, qualitative testing on real/fake images would strengthen the analysis.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The idea is interesting and the paper is well written, with clear ideas, a good SOTA comparison, ablation experiments, and several datasets (and stainings).

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank the reviewers for all the valuable feedback. We respond to each reviewer's comments individually below.

Reviewer #1: We will evaluate the impact of generated images on the accuracy of classification tasks in our future work. All the encoders in Fig. 1 are the same encoder. We will revise the text to eliminate any confusion.

Reviewer #3: We speculate that the suboptimal performance of ASP may be attributed to the fact that the similarities between the generated and real IHC images cannot accurately represent the degrees of alignment between consecutive slices, because the positive/negative expressions of the generated images largely impact their similarities. Therefore, we utilize H&E and real IHC images to calculate the degrees of alignment.

Reviewer #4: Thank you for your careful review. We will check all the mistakes and typos. The adaptive weights are introduced linearly during the training process. Owing to the page limit, we only present the results of HER2 in BCI and ER in MIST in the ablation experiment. "All but the last layer" means that we obtain embedding vectors from the features of the discriminator's second-to-last layer. Loss_Adv denotes the conventional adversarial loss of the GAN.
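To make these two clarifications concrete, a minimal PyTorch sketch is given below. The linear ramp schedule, the temperature, the softplus penalty, and all function names are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def ramp_adaptive_weights(base_weights, epoch, ramp_epochs=40):
    """Linearly blend uniform weights into the adaptive weights during the
    first `ramp_epochs` epochs; full adaptive weighting afterwards.
    (The 40-epoch schedule is an assumption for illustration.)"""
    alpha = min(epoch / ramp_epochs, 1.0)
    uniform = torch.full_like(base_weights, 1.0 / base_weights.numel())
    return (1.0 - alpha) * uniform + alpha * base_weights

def discriminator_contrastive_reg(emb_real, emb_fake, temperature=0.1):
    """Contrastive regularization on (B, C) embeddings taken from the
    discriminator's second-to-last layer for real and generated IHC patches.
    Here high real-fake cosine similarity is simply penalized as an
    illustrative stand-in for the paper's contrastive formulation."""
    emb_real = F.normalize(emb_real, dim=1)
    emb_fake = F.normalize(emb_fake, dim=1)
    sim = emb_real @ emb_fake.t() / temperature  # (B, B) cosine similarities
    return F.softplus(sim).mean()
```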




Meta-Review

Meta-review not available (early accepted paper).
