Abstract
Prostate cancer is a highly prevalent cancer and ranks as the second leading cause of cancer-related deaths in men globally. Recently, multi-modality transrectal ultrasound (TRUS) has gained significant traction as a valuable technique for guiding prostate biopsies. In this study, we present a novel learning framework for clinically significant prostate cancer (csPCa) classification using multi-modality TRUS. The proposed framework employs two separate 3D ResNet-50 networks to extract distinctive features from B-mode and shear wave elastography (SWE) images. Additionally, an attention module is incorporated to effectively refine the B-mode features and aggregate the extracted features from both modalities. Furthermore, we utilize a few-shot segmentation task to enhance the capacity of the classification encoder. Due to the limited availability of csPCa masks, a prototype correction module is employed to extract representative prototypes of csPCa. The performance of the framework is assessed on a large-scale dataset consisting of 512 TRUS videos with biopsy-proven prostate cancer. The results demonstrate the framework's strong capability in accurately identifying csPCa, achieving an area under the curve (AUC) of 0.86. Moreover, the framework generates visual class activation maps (CAMs), which can serve as valuable assistance for localizing csPCa. These CAM images may offer valuable guidance during TRUS-guided targeted biopsies, enhancing the efficacy of the biopsy procedure. The code is available at https://github.com/2313595986/SmileCode.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0721_paper.pdf
SharedIt Link: https://rdcu.be/dV19p
SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72086-4_68
Supplementary Material: N/A
Link to the Code Repository
https://github.com/2313595986/SmileCode
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Wu_Towards_MICCAI2024,
author = { Wu, Hong and Fu, Juan and Ye, Hongsheng and Zhong, Yuming and Zou, Xuebin and Zhou, Jianhua and Wang, Yi},
title = { { Towards Multi-modality Fusion and Prototype-based Feature Refinement for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15005},
month = {October},
pages = {724 -- 733}
}
Reviews
Review #1
- Please describe the contribution of the paper
In this paper, the authors introduce a novel framework designed for classifying prostate cancer using multi-modality transrectal ultrasound videos. The framework’s primary contributions to model design are centered around three key components: attention-based multi-modality fusion block, auxiliary segmentation branch to assist cancer classification, and prototype-based correction module to enhance robustness under limited csPCa data.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Novel application: this is the first work incorporating multi-modal TRUS videos into a deep learning framework for prostate cancer classification.
2. The proposed method outperforms the existing approach by a large margin. The results of the ablation study confirm the effectiveness of the two key components: the Attention Fusion and Prototype Correction modules.
3. The CAM visualizations show that the proposed method focuses well on the suspicious tumor regions, which supports its clinical feasibility.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. The manuscript contains several typographical errors that could lead to confusion. For instance, the symbol L_neg in Equation (9) is not defined in the Methods section. Additionally, the caption of Figure 5 incorrectly reads “out network” instead of “our network.” I recommend a thorough review and correction of these errors to improve the clarity and professionalism of the paper.
2. The experimental section of the paper appears limited, as it includes only one comparative method. For a more comprehensive understanding of how the proposed method stacks up against existing technology, it is important to include multiple comparative analyses. I suggest reviewing additional literature, such as [1] on the use of ultrasound imaging for prostate cancer classification, and incorporating these findings into a broader comparative study.
3. While the inclusion of Grad-CAM visualizations enhances the interpretability of the proposed method, positioning this as a primary contribution might be overstated, particularly since the paper lacks a detailed explanation or illustration of the CAM generation process in the methodological overview.
[1] Gilany, Mahdi, et al. “TRUSformer: improving prostate cancer detection from micro-ultrasound using attention and self-supervision.” International Journal of Computer Assisted Radiology and Surgery 18.7 (2023): 1193-1200.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
Publishing the data/code would be helpful for reproducibility.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1. The paper should address or compare its findings with previous work [1], which also applies ultrasound images for prostate cancer detection. A discussion of how the current approach differs from or improves upon these earlier results would provide valuable context and strengthen the manuscript.
2. Table 1 currently lacks clarity, making it difficult to identify which results correspond to the proposed method. I recommend reorganizing this table to separate the comparison results and the ablation study into two distinct tables. This change would enhance readability and make it easier for readers to understand the contributions and performance of the proposed method.
3. There is an error on page 6 concerning the performance improvement reported. The statement, “Our method outperformed theirs in terms of AUC by 0.08 when the prostate segmentation module was excluded,” appears to be incorrect. Based on the provided AUC value of 0.83 for the proposed method without Multi-task and PCM, the actual improvement should be reported as 0.05. This discrepancy should be corrected to ensure the accuracy of the reported results.
4. The method section mentions two different types of attention—dimensional attention and adaptive spatial attention. Considering their potential impact on model performance, it would be beneficial to evaluate the effectiveness of each attention mechanism separately in the experiments.
[1] Gilany, Mahdi, et al. “TRUSformer: improving prostate cancer detection from micro-ultrasound using attention and self-supervision.” International Journal of Computer Assisted Radiology and Surgery 18.7 (2023): 1193-1200.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Reject — could be rejected, dependent on rebuttal (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I think the paper has some merits, such as its clinical application and the methodology design. However, comparing the proposed method with only one existing approach does not sufficiently demonstrate its advantages, especially given the presence of other studies related to prostate cancer classification using ultrasound images. Additionally, various errors and typographical mistakes detract from the paper’s professionalism. The author should thoroughly review and revise the manuscript before submission to ensure accuracy and enhance credibility.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #2
- Please describe the contribution of the paper
This study introduces a multi-task learning framework tailored for the classification of clinically significant prostate cancer (csPCa) within the multi-modality TRUS setting. The authors concentrate on harnessing B-mode and shear wave elastography (SWE) image data through an attention module, employing few-shot segmentation to enable the encoder to capture crucial features. They demonstrate the significance of their method in discerning and locating csPCa, though a comparison with recent approaches would further contextualize its effectiveness.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Methodologically, this paper employs a 3D ResNet model to extract pertinent features from both B-mode and SWE images. This process incorporates a dimensional attention module and an adaptive spatial attention module, facilitating accurate fusion of features from the two modalities. The utilization of the Adaptive Spatial Attention Module enables element-wise computation of features from B-mode (FX) and SWE (FE). Additionally, the authors incorporate a prototype correction module to extract representative prototypes of clinically significant prostate cancer (csPCa), given the limited availability of csPCa masks. Notably, the authors evaluate the performance of their proposal using pertinent metrics such as ROC curve (AUC), F1-score (F1), accuracy (Acc), sensitivity (Sen), and specificity (Spe) in comparison with other studies.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Initially, the authors introduce a model for classifying clinically significant prostate cancer (csPCa) without specifying the number of distinct classes considered. Additionally, they claim to address class imbalance solely by adjusting the batch size, even though the imbalance persists. While the authors use important evaluation metrics, it remains unclear whether the proposed method outperforms its counterpart only by 0.08 in AUC or also exhibits superiority in the other metrics. Lastly, comparing this proposal with only one recent work may be insufficient; if deemed necessary, justification should be provided for the significance inferred from such a comparison.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
The authors should clarify several sections of the paper to enhance readers' comprehension of the message. Firstly, they should address whether the introduced prototype correction module imposes any bottleneck on the process. Secondly, after ranking the similarity points as pseudo-labels, it is essential to elucidate the validation process employed to ensure the relevance of this step. Thirdly, the authors need to clarify the discrepancy between the use of CAM visualization and Grad-CAM visualization, both of which were purportedly utilized. Lastly, alongside the evaluation metrics, providing direct classification or segmentation results would allow readers to observe the relationships in the result analysis more clearly.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This study offers a noteworthy contribution by proposing a valuable technique for medical image comprehension. While the work shows promise, significant improvements are necessary. Therefore, I recommend a weak-acceptance, as the overall paper requires further clarification to ensure readers grasp the message and the significance of this work.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Accept — should be accepted, independent of rebuttal (5)
- [Post rebuttal] Please justify your decision
The responses provided have satisfactorily responded to the main concerns raised about this paper. I recommend the acceptance (5) because of the significance and clarity of the proposed work.
Review #3
- Please describe the contribution of the paper
This submission proposes a network to improve the classification of clinically significant prostate cancer (csPCa) by combining transrectal ultrasound (TRUS) data from multiple modalities and refining the features with an attention module. The authors state that they add segmentation as an auxiliary task to enhance the classification.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The authors made a good integration of different modules for a well-known application, i.e., using novel data modalities to classify the clinically significant prostate cancers.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1) Too intuitive when introducing some important modules. 2) Lack of necessary ablation study and parameter sensitivity analysis.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
The authors are encouraged to provide a clear and detailed description of the algorithm to ensure reproducibility.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
Please address my comments on the weaknesses above. In addition, I have the following remarks.
1. For some concepts (such as attention and 3D U-Net) and formulations, please do not borrow them from other papers without a detailed motivation. You should explain why each component is used rather than simpler options.
2. Looking at the total loss, why is there no regularization parameter for L_seg_support? How do you guarantee that the first two losses' values are always at a similar level? What happens if you do not have L_seg_support?
3. It would be better to report your results in the form of mean ± STD after re-conducting your experiments with different splits/random shuffles of the data.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The novelty, writing, and organization of the different parts of the manuscript led me to this recommendation. I am open to changing my score if the authors can address my questions in the feedback stage.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Author Feedback
We cordially thank you for your time and effort in reviewing the submission. We have carefully studied your suggestions and addressed your main concerns.
- Experiments
- more comparison methods (R1, R4)
As officially clarified by MICCAI, ‘New/additional experimental results in the rebuttal are not allowed, and breaking this rule will lead to automatic rejection’, so we cannot add more comparison results. Here we discuss related studies concisely and justify our experimental setting. Firstly, ours is currently the only study using deep learning (DL) for csPCa classification in multi-modality TRUS videos. Secondly, regarding csPCa classification in TRUS, most previous studies employ traditional machine learning methods to analyze 2D regions of interest (ROIs). These methods require manually selected ROIs, which are difficult to reproduce. Thirdly, DL has been applied to this task recently; however, most related works require additional annotation or extra clinical information. For example, [17] is the most relevant study, which requires prostate segmentation masks for all data (we compare against this one). [1], suggested by R4, employs high-resolution micro-US and requires the needle trace region as well as 2D labeling for all data to train the model. ‘SciRep 12, 860 (2022)’ classifies 2D TRUS using images, patient ages, and prostate-specific antigen (PSA) for all data. In contrast, we only use 4 annotated lesion masks to analyze multi-modality TRUS videos.
- more ablation studies (R3, R4)
Most ablation results are shown in Table 1. In addition, we have tested the efficacy of our two attention mechanisms, and each has been demonstrated to be effective in our experiments. To sum up, we commit to making our model, as well as the ablation models, the comparison method, and the parameters, publicly available, making our study easy to reproduce.
- Design Motivation (R3)
The main motivation of our study can be found in the contribution paragraphs and the Conclusion section.
Dimensional Attention Module (DAM): DAM enables the network to extract more global features, which is beneficial for lesion localization. It is designed to achieve global attention while greatly reducing computation.
Adaptive Spatial Attention Module (ASAM): ASAM is used to aggregate features from different modalities. Since each modality contributes differently to the final csPCa classification, we adopt ASAM to measure the voxel-level contributions of each modality.
3D U-Net: we apply a classical 3D U-Net decoder to perform segmentation. The segmentation task serves as an auxiliary task to enhance the classification encoder.
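To illustrate the idea of voxel-level modality weighting described above, here is a minimal sketch. This is not the authors' exact ASAM implementation: the weighting mechanism, function name, and tensor shapes are assumptions made for illustration.

```python
import numpy as np

def adaptive_spatial_fusion(f_b, f_e, w_logits):
    """Fuse B-mode features f_b and SWE features f_e (same shape) with
    voxel-wise weights derived from w_logits; in a real model w_logits
    would come from a small learned network over both feature maps."""
    w = 1.0 / (1.0 + np.exp(-w_logits))   # sigmoid -> per-voxel weight in (0, 1)
    return w * f_b + (1.0 - w) * f_e      # convex combination per voxel

# With zero logits every voxel is weighted 0.5, i.e. a plain average.
f_b = np.ones((2, 4, 4, 4))               # toy (C, D, H, W) B-mode features
f_e = np.zeros_like(f_b)                  # toy SWE features
fused = adaptive_spatial_fusion(f_b, f_e, np.zeros_like(f_b))
```

The convex-combination form guarantees the fused feature stays between the two modality features at every voxel, which matches the stated goal of measuring each modality's voxel-level contribution.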
- Others
R1:
- class number
Two classes: csPCa (346) and non-csPCa (166). During training, the amount of csPCa data (275) is around twice that of non-csPCa (129), so we use undersampling to alleviate this issue.
- evaluation metrics Table 1 shows 5 evaluation metrics. Our method beats [17] in AUC, F1, Acc, Sen.
- PCM bottleneck When query and support prototypes are very close, the PCM runs into a bottleneck.
- pseudo-label The relevance of this step can be verified through the classification metrics, but cannot be directly evaluated due to the scarcity of csPCa masks.
- CAM and Grad-CAM
In this study, CAM refers to the class activation map image, and Grad-CAM is a method to generate a CAM.
R3:
- L_seg_support
L_cls and L_seg_support are both calculated using the cross-entropy loss, so they maintain similar magnitudes. L_seg_support is a pivotal loss for the segmentation task.
- mean+/-STD
We'll update this.
R4:
- L_neg in (9) is corrected to L_seg_neg, as defined in (4).
- We have amended and proofread the paper to improve its clarity.
- CAM visualization
We do not consider this a technical contribution but a clinical one, since it may offer valuable guidance for TRUS-guided targeted biopsies.
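For readers unfamiliar with the CAM/Grad-CAM distinction discussed in this exchange, a minimal Grad-CAM computation can be sketched as below. This is a generic sketch of the standard Grad-CAM method, not the authors' code; shapes and names are assumptions.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Standard Grad-CAM: weight each feature channel by the mean gradient
    of the class score w.r.t. that channel, sum the weighted maps, then ReLU.
    activations, gradients: arrays of shape (K, D, H, W)."""
    weights = gradients.mean(axis=(1, 2, 3))          # one weight per channel
    cam = np.einsum('k,kdhw->dhw', weights, activations)
    return np.maximum(cam, 0.0)                       # keep positive evidence only

# Toy check: channel 0 supports the class (grad 2), channel 1 opposes it (grad -1).
acts = np.ones((2, 1, 2, 2))
grads = np.stack([2 * np.ones((1, 2, 2)), -np.ones((1, 2, 2))])
cam = grad_cam(acts, grads)
```

In practice the resulting map is upsampled to the input resolution and overlaid on the ultrasound frame, which is what yields the localization images referred to as CAMs in the paper.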
- We have reorganized Table 1.
- We have corrected the error on page 6.
Meta-Review
Meta-review #1
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
All the major points raised by the reviewers regarding previous work and the use of, and conclusions from, the CAM visualization have been comprehensively addressed, including corrections and clarifications.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The paper has novel components, especially the multi-modal fusion. Moreover, the rebuttal addresses the major concerns of all reviewers.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).