Abstract

In computational pathology, Multiple Instance Learning (MIL) is widely applied for classifying gigapixel whole slide images (WSIs) with only image-level labels. Because the size and prominence of positive areas vary significantly across different WSIs, it is difficult for existing methods to learn task-specific features accurately. Additionally, subjective label noise usually affects deep learning frameworks, further hindering the mining of discriminative features. To address this problem, we propose an effective theory that optimizes patch and WSI feature extraction jointly, enhancing feature discriminability. Powered by this theory, we develop an angle-guided MIL framework called PSJA-MIL, which effectively leverages features at both levels. We also focus on eliminating noise between instances and emphasizing feature enhancement within WSIs. We evaluate our approach on the Camelyon17 and TCGA-Liver datasets, comparing it against state-of-the-art methods. The experimental results show significant improvements in accuracy and generalizability, surpassing the latest methods by more than 2%. Code will be available at: https://github.com/sm8754/PSJAMIL.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0962_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/sm8754/PSJAMIL

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Yu_PatchSlide_MICCAI2024,
        author = { Yu, Jiahui and Wang, Xuna and Ma, Tianyu and Li, Xiaoxiao and Xu, Yingke},
        title = { { Patch-Slide Discriminative Joint Learning for Weakly-Supervised Whole Slide Image Representation and Classification } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper focuses on whole slide image classification with multiple instance learning. The major characteristic of the method is the proposed ratio of the mean attention to the maximum attention, which serves as a quality indicator to adjust the angle between the extracted feature and the normalized category centers (derived from the weights of the last fc layer). The proposed method is evaluated on two datasets, on which it empirically outperforms pre-existing methods.
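
    To make the mechanism described above concrete, the following is a minimal PyTorch sketch of the idea as summarized in this review: a mean-over-max attention ratio modulating an angular adjustment against class centers taken from the last fc layer's weights. The function names, the scale and margin values, and the exact way the ratio enters the margin are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def attention_quality_ratio(attn_scores: torch.Tensor) -> torch.Tensor:
    # Ratio of the mean to the maximum attention over the patches of one WSI;
    # values near 1 suggest diffuse attention, values near 0 a few dominant patches.
    return attn_scores.mean() / (attn_scores.max() + 1e-8)

def angle_guided_logits(wsi_feat, fc_weight, ratio, scale=16.0, margin=0.2):
    # Cosine similarity between the WSI feature and the normalized class
    # centers, i.e. the rows of the last fc layer's weight matrix.
    feat = F.normalize(wsi_feat, dim=-1)            # (D,)
    centers = F.normalize(fc_weight, dim=-1)        # (C, D)
    cos = centers @ feat                            # (C,)
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    # The attention ratio acts as a quality indicator that scales the
    # angular adjustment (an assumed, simplified form of the margin).
    return scale * torch.cos(theta + ratio * margin)
```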

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The introduced concept of the attention ratio is interesting, and applying it to adjust the angle in the loss function makes sense to me.
    2. Easy to follow.
    3. Excellent performance on the two datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Patches with small attention scores are labelled as normal instances. Is there any literature or empirical validation to endorse this prior?
    2. The presentation needs further improvement
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In Equation (1), should there be a transpose?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the presentation needs improvement, the idea is an interesting observation.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors use multiple instance learning (MIL) for whole slide image classification in pathology slides. They propose using an angle-based classification method to enhance feature discrimination and perform feature optimization at both the patch and WSI levels. They integrate all of this into a framework called PSJA-MIL and show its effectiveness on the Camelyon17 and TCGA-Liver datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    While the use of cosine-related losses to adjust correlations to normalized confidence scores is common in natural language processing, it is interesting to see vision-based work tackle it for MIL. The ablation studies demonstrate the effectiveness of the angle-based classifier and also of the adaptive cross-entropy losses.

    The authors have done a commendable job of comparing their method against other top MIL frameworks. Table 1 clearly establishes the paper’s superior results. Since there are a large number of hyperparameters and tuning details involved, code release upon acceptance is recommended for the benefit of the MICCAI community.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I do not have many weaknesses to point out. There are a couple of suggestions to enhance readability. For Fig 2, it would be helpful to point out specific regions of interest between the ground truth and the heatmap.

    I am not sure if angle-based classification is the right name for the method. Perhaps prototypical cosine similarity would be a better term, since it connects both to the idea of cosine similarity in the literature and to the idea of matching with the class center (prototype).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The details provided are not sufficient for reproducing the results since, even for baselines, a variety of hyperparameters must be considered. A code release is recommended.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The authors have done a good job of connecting the various concepts and related works. During the review, I found two additional papers which might be useful to add.

    Raswa, Farchan Hakim, Chun-Shien Lu, and Jia-Ching Wang. “Attention-Guided Prototype Mixing: Diversifying Minority Context on Imbalanced Whole Slide Images Classification Learning.” Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2024. This paper examines prototype mixing and uses attention for WSI classification in a similar fashion.

    Barz, Björn, and Joachim Denzler. “Deep learning on small datasets without pre-training using cosine loss.” Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2020. This paper introduces the benefits of the cosine loss for smaller datasets. It would be useful to connect the cosine loss (angle guidance) strategy with this paper.
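
    For reference, a generic cosine loss in the spirit of the Barz and Denzler paper can be written as below; this is an editorial sketch of the cited idea, not the formulation used in the paper under review.

```python
import torch
import torch.nn.functional as F

def cosine_loss(features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # 1 - cos(normalized feature, one-hot class embedding), averaged over the batch.
    pred = F.normalize(features, dim=-1)                    # (B, C)
    onehot = F.one_hot(targets, features.size(-1)).float()  # (B, C)
    return (1.0 - (pred * onehot).sum(dim=-1)).mean()
```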

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The good results and novelty in methodology along with a rich discussion both in the introduction and results section make it a useful work for the MICCAI community.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The manuscript introduces a new patch-slide discriminative joint learning approach for weakly-supervised whole slide image representation, which aims to overcome subjective label noise and enhance the representation capability of task-specific features. Besides, an angle-based classification method is developed to enhance the correlation of sample feature vectors with confidence scores.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method is novel. The patch-slide discrimination joint learning theory is new; it combines contrastive learning between patches with an adaptive cross-entropy loss designed for WSI-level feature learning. Moreover, an angle-based classifier is proposed to improve the discrimination ability of the model. The proposed method is extensively evaluated on two public datasets and compared with several state-of-the-art approaches, achieving promising results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method is not introduced logically or with sufficient detail. For instance, the patch contrastive estimation loss needs more clarification. Besides, some symbols are not explained. The authors should provide evidence for some statements when they introduce the settings of their proposed method.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The proposed method is a little bit complex, so I suggest that the authors provide the code and model for better reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (1) The authors did not introduce how to obtain the attention scores of patches at the beginning of the Method section.
    (2) The batch size is set to 1 in this study. Does K in Eq. (1) denote the number of patches for a certain WSI, and is K a dynamic value across WSIs, as each WSI might contain a different number of patches? I am also not clear about the meaning of Q in Eq. (1), although the authors explain that it is the total set. Is it the same as K?
    (3) The authors state that discernibility directly proportional to the learning intensity would result in overfitting. Is this verified by experiments? Please analyze this further in the manuscript.
    (4) Eq. (1) is not sufficiently explained. Is K the set of patches or the set of patch features? Besides, what is the operation f_{S}^{k} f_{S}^{q}?
    (5) The meaning of t in Y_{n,t} in Eq. (2) is not given.
    (6) The authors state that the patch features are extracted offline in this work. Is this contradictory to the patch-slide discrimination joint learning?
    (7) How did the authors determine the value of the contribution coefficient (0.4)?
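
    To illustrate the quantities asked about above, the following is a generic InfoNCE-style patch contrastive loss for a single WSI (batch size 1). It is not the paper's Eq. (1); here K is read as the per-WSI set of patch features, Q as the full candidate set, and f_S^k f_S^q as an inner product of normalized features. All names and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def patch_contrastive_loss(anchor, positives, candidates, tau=0.07):
    # anchor:     (D,)   feature of one selected patch, f_S^k
    # positives:  (P, D) patch features assumed to share the anchor's label
    # candidates: (Q, D) the full candidate set of patch features
    a = F.normalize(anchor, dim=-1)
    pos = F.normalize(positives, dim=-1)
    cand = F.normalize(candidates, dim=-1)
    pos_logits = pos @ a / tau                   # similarities to positives
    all_logits = cand @ a / tau                  # similarities to all candidates
    log_denom = torch.logsumexp(all_logits, dim=0)
    # Average -log p(positive | anchor) over the positive set.
    return -(pos_logits - log_denom).mean()
```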

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work introduces a new representation learning method for WSI analysis, and it has been extensively evaluated on two public datasets with both quantitative and qualitative analysis. Despite its good performance, the method is not introduced with sufficient details and explanations, which should be improved in the revision.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Thank you for your constructive reviews and recommendations to improve the quality of this work.




Meta-Review

Meta-review not available, early accepted paper.


