Abstract
We present a method for classifying the expertise of a pathologist based on how they allocated their attention during a cancer reading. We approach this decoding task by developing a novel method for predicting the attention of pathologists as they read Whole-Slide Images (WSIs) of prostate tissue and make cancer grade classifications. Our ground-truth measure of a pathologist's attention is the x, y, and z (magnification) movement of their viewport as they navigated through WSIs during readings; to date we have the attention behavior of 43 pathologists reading 123 WSIs. These data revealed that specialists have higher agreement in both their attention and cancer grades compared to general pathologists and residents, suggesting that sufficient information may exist in their attention behavior to classify their expertise level. To attempt this, we trained a transformer-based model to predict the visual attention heatmaps of resident, general, and specialist (Genitourinary; GU) pathologists during Gleason grading. Based solely on a pathologist's attention during a reading, our model predicted resident, general, and specialist expertise with 75.3%, 56.1%, and 77.2% accuracy, respectively, better than chance and baseline models. Our model therefore enables a pathologist's expertise level to be evaluated easily and objectively, which is important for pathology training and competency assessment. Tools developed from our model could be used to help pathology trainees learn to read WSIs like an expert.
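To make the viewport-based attention signal concrete, below is a minimal Python sketch of how viewport logs could be accumulated into a per-slide attention heatmap. The (x, y, w, h, magnification, dwell) record format and the magnification weighting are illustrative assumptions, not the paper's actual preprocessing.

import numpy as np

def viewport_heatmap(events, slide_w, slide_h, grid=256):
    """Accumulate viewport dwell time into a coarse attention heatmap.

    `events` is a hypothetical list of (x, y, w, h, mag, dwell_sec)
    viewport records in slide coordinates; the field layout and the
    magnification weighting are assumptions for illustration only.
    """
    heat = np.zeros((grid, grid), dtype=np.float64)
    sx, sy = grid / slide_w, grid / slide_h
    for x, y, w, h, mag, dwell in events:
        c0, r0 = int(x * sx), int(y * sy)                      # top-left cell
        c1, r1 = int((x + w) * sx) + 1, int((y + h) * sy) + 1  # bottom-right
        # Weight dwell time by magnification so close inspection counts more.
        heat[r0:r1, c0:c1] += dwell * mag
    total = heat.sum()
    return heat / total if total > 0 else heat  # normalize to sum to 1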
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0394_paper.pdf
SharedIt Link: pending
SpringerLink (DOI): pending
Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0394_supp.pdf
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Cha_Decoding_MICCAI2024,
author = { Chakraborty, Souradeep and Gupta, Rajarsi and Yaskiv, Oksana and Friedman, Constantin and Sheuka, Natallia and Perez, Dana and Friedman, Paul and Zelinsky, Gregory and Saltz, Joel and Samaras, Dimitris},
title = { { Decoding the visual attention of pathologists to reveal their level of expertise } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15003},
month = {October},
pages = {pending}
}
Reviews
Review #1
- Please describe the contribution of the paper
Decoding the level of expertise of pathologists from human attention on pathology slides will be helpful for pathology training and competency assessment. Based on a relatively large dataset that includes human attention data recorded from 43 pathologists with different levels of expertise on 123 whole-slide images, the paper demonstrates how a transformer-based architecture can predict the level of expertise from human attention.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- General observations of how human attention differs among various levels of expertise, along with additional information regarding the correlation with diagnosis, were informative and strengthened the motivation for this work.
- The relatively large dataset featuring human attention data from different levels of expertise will be useful to the research community in advancing future research in this direction.
- The accurate prediction of human attention from pathological images (shown in Fig. 5) will contribute to the future development of image recognition models that are well aligned with human attention.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Due to the diverse perspectives on human attention and the varied uses of machine learning techniques, the main focus of this work seems ambiguous. In particular, the logical connection between the motivation to predict human attention with ProstAttFormer (Fig. 3) and to determine expertise level with ExpertiseNet (Fig. 4) is unclear.
- Even though the results of predicting human attention with ProstAttFormer in Fig. 5 appear good, the significance of these results is hard to assess. Given the authors' own observations, the pattern of human attention can vary according to expertise level; hence, I would have expected the expertise level to be input into the network to predict human attention more precisely, but I could not find any explanation regarding this point.
- More seriously, the final result for pathologist expertise classification appears to have been obtained from the same group of pathologists that was used for model training. One can therefore argue that the model may overfit to individual preferences rather than learn general tendencies associated with expertise. Based on this limitation in particular, I am not convinced of the generalizability of this work.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
If the logical connection between ProstAttFormer and ExpertiseNet is enhanced and the generalizability of the results from ExpertiseNet is improved, the overall significance of this work will be increased.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Reject — should be rejected, independent of rebuttal (2)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I consider the most serious issue of this paper to be the lack of external validation of the pathologist expertise classification results.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
The rebuttal articulates the future prospects of this research well, and the work can be understood as an important first step toward that future.
Review #2
- Please describe the contribution of the paper
This paper proposes an approach to classify pathologist expertise based on their attention allocation during a prostate cancer grading task. The manuscript is based on a large dataset reporting the attention behaviour of 43 pathologists on 123 WSIs. The manuscript also presents ProstAttFormer, a vision-transformer (ViT) model that predicts pathologists' attention, and ExpertiseNet, another ViT that predicts pathologist expertise based on their attention.
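To illustrate the input-output contract of the second stage, here is a minimal sketch. This toy CNN classifier is a hypothetical stand-in, not ExpertiseNet itself (which is a ViT and also uses temporal and magnification information); it only shows the mapping from an attention heatmap to one of three expertise classes.

import torch
import torch.nn as nn

class ToyExpertiseClassifier(nn.Module):
    # Hypothetical stand-in for ExpertiseNet: maps a single-channel
    # attention heatmap to logits over {resident, general, specialist}.
    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, heatmap):  # heatmap: (B, 1, H, W)
        return self.head(self.features(heatmap))

logits = ToyExpertiseClassifier()(torch.rand(2, 1, 256, 256))  # shape (2, 3)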
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The manuscript is based on a large dataset reporting the attention behaviour of 43 pathologists on 123 WSIs.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
My main concern is the limited clinical utility of classifying a pathologist as a resident, general, or specialist pathologist, and the lack of qualitative and statistical methods for comparing trajectories between pathologists. Averaged attention heatmaps may not be sufficient to decode visual attention; saccade frequency, scanpath patterns, time to first fixation, etc., are also important.
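To make the suggested metrics concrete, here is a minimal sketch computing two of them from a fixation sequence. The (x, y, duration) record format and the ROI convention are illustrative assumptions; note the paper tracks viewports rather than eye movements.

import numpy as np

def scanpath_metrics(fixations, roi):
    """Toy scanpath metrics from (x, y, duration_sec) fixation rows.

    `roi` is a hypothetical (x0, y0, x1, y1) region of interest.
    """
    f = np.asarray(fixations, dtype=np.float64)
    xy, dur = f[:, :2], f[:, 2]
    total_time = dur.sum()
    # Saccade frequency: transitions between fixations per second viewed.
    saccade_freq = (len(f) - 1) / total_time if total_time > 0 else 0.0
    # Time to first fixation: viewing time elapsed before the ROI is hit.
    in_roi = ((xy[:, 0] >= roi[0]) & (xy[:, 1] >= roi[1]) &
              (xy[:, 0] <= roi[2]) & (xy[:, 1] <= roi[3]))
    ttff = dur[:in_roi.argmax()].sum() if in_roi.any() else None
    return {"saccade_freq_hz": saccade_freq, "time_to_first_fixation_s": ttff}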
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
Some minor/major comments are:
- In ExpertiseNet, it is not clear how the temporal and magnification heatmaps are combined with the image features.
- The scanpath colours are not clear, nor is what they represent.
- Explain why, in Fig. 1, the GU specialist overall spent more time compared to the resident and the general pathologist.
- What are the current statistics regarding variability in cancer diagnosis, and where is the need in the clinic? This should be supported with relevant statistics or a literature review.
- An ablation study should investigate other loss functions in comparison to the cross-correlation loss, e.g., mean squared error (MSE), contrastive, or cross-entropy losses (see the sketch after this list).
- The GU terminology is not defined in the abstract, and a loss and model layers are defined with the same variable.
- Another aspect worth investigating is intra-observer variability, i.e., differences in attention for the same set of cases.
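For context on the loss-ablation suggestion: in saliency prediction, the cross-correlation (CC) loss is typically one minus the Pearson correlation between predicted and ground-truth heatmaps. Below is a minimal PyTorch sketch of that generic formulation (not necessarily the paper's exact loss), with the MSE alternative shown for comparison.

import torch
import torch.nn.functional as F

def cc_loss(pred, target, eps=1e-8):
    # 1 - Pearson correlation between flattened heatmaps, averaged per batch.
    p = pred.flatten(1) - pred.flatten(1).mean(dim=1, keepdim=True)
    t = target.flatten(1) - target.flatten(1).mean(dim=1, keepdim=True)
    cc = (p * t).sum(dim=1) / (p.norm(dim=1) * t.norm(dim=1) + eps)
    return (1.0 - cc).mean()

pred, target = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
print(cc_loss(pred, target), F.mse_loss(pred, target))  # CC loss vs. MSE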
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Reject — could be rejected, dependent on rebuttal (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
As mentioned, the limited clinical utility and the lack of statistical and qualitative methods for comparing attention.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
The authors addressed the comments regarding the motivation of this work and acknowledged the minor changes to be made. Some important experiments are still needed to further examine the visual attention, but I understand they cannot be addressed during the rebuttal phase.
Review #3
- Please describe the contribution of the paper
The authors utilized eye-tracking parameters to characterize pathologists' levels of expertise in assessing WSIs.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Interesting idea - builds upon older studies that found differences in eye-tracking as a function of levels of expertise. Uses interesting analysis methods.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The rationale needs to be better explained. We have known for years, through a number of other studies in pathology and radiology, that eye-tracking parameters differ as a function of training and expertise/experience. Simply predicting someone's level of expertise is not all that useful, new, or interesting; the authors need to think more broadly about what can really be done with this information. How can it be incorporated into training? There are already several studies using eye tracking of experts as models for novices, so that is not new.
- https://www.spiedigitallibrary.org/conference-proceedings-of-spie/11316/1131607/Understanding-digital-pathology-performance-an-eye-tracking-study/10.1117/12.2550513.full
- https://www.spiedigitallibrary.org/journals/journal-of-medical-imaging/volume-9/issue-3/035501/Digital-pathology–the-effect-of-experience-on-visual-search/10.1117/1.JMI.9.3.035501.full
- https://www.spiedigitallibrary.org/conference-proceedings-of-spie/11603/116030Z/Eye-tracking-in-digital-pathology–identifying-expert-and-novice/10.1117/12.2580959.full
- https://jvme.utpjournals.press/doi/full/10.3138/jvme.1115-187r
- https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0103447
- https://www.sciencedirect.com/science/article/pii/S0046817706005302
- https://www.spiedigitallibrary.org/conference-proceedings-of-spie/7966/79660P/Changes-in-visual-search-patterns-of-pathology-residents-as-they/10.1117/12.877735.full
- https://link.springer.com/article/10.1007/s10459-015-9589-x
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
Not really applicable here
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
The rationale needs to be better explained. We have known for years, through a number of other studies in pathology and radiology, that eye-tracking parameters differ as a function of training and expertise/experience. Simply predicting someone's level of expertise is not all that useful, new, or interesting; the authors need to think more broadly about what can really be done with this information. How can it be incorporated into training? There are already several studies using eye tracking of experts as models for novices, so that is not new.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Not a lot new here, but the study uses a good sample size of images and readers. A narrow topic, but interesting. Good as a poster.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Accept — should be accepted, independent of rebuttal (5)
- [Post rebuttal] Please justify your decision
The authors addressed the concerns adequately.
Author Feedback
Pathologist viewing for cancer diagnosis is a complex and specialized cognitive task requiring years of training. We present two independent tasks: predicting pathologist attention (to guide trainee attention based on specialists’) and predicting expertise (for trainee evaluation), both essential technical components towards developing our AI-assisted pathologist training pipeline. We will clarify this in the Introduction.
While non-specialists focus on the upper G3 tumor regions, specialists also allocate more attention to G4 and G5 regions, explaining their higher viewing time.
We’ll define Genitourinary (GU) in the abstract and fix all notation.
Meta-Review
Meta-review #1
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper aims to predict attention / expertise of a pathologist during assessment of whole slide images of prostate tissue.
While the motivation of the approach was not fully clear to the reviewers, this concern has been alleviated after the rebuttal. Similarly, the rebuttal clarifies potential issues with the train / test split for the proposed data set.
Since these concerns have been remedied, and all reviewers value the paper as weak accept or accept, the strengths of the paper (large dataset, use case, human-AI collaboration) and the benefit of discussing these topics at MICCAI justify acceptance from my perspective.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
N/A