Abstract

Prostate cancer (PCa) is a leading cause of cancer-related mortality in men, and accurate identification of clinically significant PCa (csPCa) is critical for timely intervention. Transrectal ultrasound (TRUS) is widely used for prostate biopsy; however, its low contrast and anisotropic spatial resolution pose diagnostic challenges. To address these limitations, we propose a novel hybrid-view attention (HVA) network for csPCa classification in 3D TRUS that leverages complementary information from transverse and sagittal views. Our approach integrates a CNN-transformer hybrid architecture, where convolutional layers extract fine-grained local features and transformer-based HVA models global dependencies. Specifically, the HVA comprises intra-view attention to refine features within a single view and cross-view attention to incorporate complementary information across views. Furthermore, a hybrid-view adaptive fusion module dynamically aggregates features along both channel and spatial dimensions, enhancing the overall representation. Experiments are conducted on an in-house dataset containing 590 subjects who underwent prostate biopsy. Comparative and ablation results prove the efficacy of our method. The code is available at https://github.com/mock1ngbrd/HVAN.
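The difference between the intra-view and cross-view attention described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): it uses plain scaled dot-product attention without the learned query/key/value projections or the separate spatial and channel values, and the toy feature values and the function names `softmax` and `attend` are assumptions made for illustration only.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    queries: n_q x d, keys: n_k x d, values: n_k x d_v (lists of lists).
    Returns an n_q x d_v list of lists.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Toy token features for the two views (hypothetical values, 2 tokens x 2 dims).
transverse = [[1.0, 0.0], [0.0, 1.0]]
sagittal = [[0.5, 0.5], [1.0, -1.0]]

# Intra-view: queries, keys and values all come from the same view.
intra_t = attend(transverse, transverse, transverse)
# Cross-view: queries from one view, keys and values from the other view.
cross_t = attend(transverse, sagittal, sagittal)
```

The only change between the two calls is where the keys and values come from, which is the sense in which cross-view attention lets one view borrow information from the other.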

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1144_paper.pdf

SharedIt Link: https://rdcu.be/eHwK9

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04927-8_25

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/mock1ngbrd/HVAN

Link to the Dataset(s)

N/A

BibTex

@InProceedings{FenZet_HybridView_MICCAI2025,
        author = { Feng, Zetian AND Fu, Juan AND Zou, Xuebin AND Ye, Hongsheng AND Wu, Hong AND Zhou, Jianhua AND Wang, Yi},
        title = { { Hybrid-View Attention Network for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15960},
        month = {September},
        pages = {260--269}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes a method for identifying clinically significant cancer in prostate ultrasound images by combining the axial and sagittal views.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The structure of the paper is clean and in general easy to understand. The results show improvement in the ablation study and in comparison with other baseline methods.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. Introduction, paragraph 2: “… Thus, methods to enhance early detection and accurate identification of csPCa in TRUS are critical.” If a cancer can be identified on ultrasound images, it is usually no longer at an early stage.
2. Introduction, paragraph 3: “However, they all focus on the transverse view, neglecting the complementary information in the sagittal view, which can help confirm any suspicious lesions.” Please specify what the complementary information is.
3. In Fig.1(), the input of the CNN is reconstructed from ultrasound images. Please explain the details of the reconstruction.
4. The authors describe the architecture as a CNN-transformer hybrid, but it reads more like a pure CNN with an added attention mechanism, since the model has no typical transformer-like structure such as multi-head attention.
5. Section 3: what is the type of biopsy?
6. Section 4: why is there no validation set in training?
7. In Fig. 3(b), the data were collected from patients who underwent biopsy, but the histopathology images in the visualisation appear to come from whole-gland prostatectomy. Please explain where these data come from.
8. In the abstract, the authors indicate that the code will be published after the review process is complete, as many other studies claim. This does not help the reviewer judge the reproducibility of the study, as many authors release nothing after review. I suggest using an anonymous GitHub repository to demonstrate the reproducibility of the code.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Considering the points listed under weaknesses, this paper should be weakly rejected, depending on the rebuttal.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

Concerns addressed.

Review #2

Please describe the contribution of the paper

This paper proposes a hybrid-view attention network for 3D TRUS csPCa classification, using information from both a single view and orthogonal views. The ablation and comparison results show the effectiveness of the proposed method.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper makes use of two orthogonal views of TRUS scanning by designing the Intra-View Attention and Cross-View Attention modules. The information from the two parts is fused using a Hybrid-View Adaptive Fusion module.

The network architecture is clearly explained with figures and equations. The dataset and experiment implementation are well described. The experimental results show the effectiveness of the proposed network.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. This paper reads more like a code description. For example, “Concat” in Eq. 3, and “AvgPool” in Eq. 5. “:” in Eq. 6. As this paper is mainly about the network design, it is important to explain the reason behind each module and clearly state the input and output, together with dimensions.
2. “Focal loss was employed” - either define the loss in method section or add reference. Same as the evaluation metrics.
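For reference, the focal loss mentioned in point 2 is commonly defined as FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), where p_t is the predicted probability of the true class. A minimal per-sample sketch of the binary form follows; the function name and the default values of `alpha` and `gamma` are illustrative assumptions and are not taken from the paper.

```python
import math

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for a single sample.

    p: predicted probability of the positive class, in [0, 1].
    y: ground-truth label, 0 or 1.
    alpha, gamma: illustrative defaults, not the paper's settings.
    """
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t weights positives by alpha and negatives by (1 - alpha).
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp away from zero so the logarithm stays finite.
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma down-weights samples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` the expression reduces to a class-weighted cross-entropy, which is a convenient sanity check.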
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper proposes a new network architecture for 3D TRUS csPCa classification. A detailed description is given in the Method section. However, this section reads more like a description of the network architecture from the code perspective than one in mathematical notation. The ablation study shows the effectiveness of each module, and the comparison shows improved performance over other SOTA methods.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The authors have addressed most concerns raised by reviews.

Review #3

Please describe the contribution of the paper

In this paper, the authors propose a novel hybrid-view attention network for the classification of clinically significant prostate cancer in 3D transrectal ultrasound (TRUS). The network leverages both sagittal and transverse views of TRUS images, utilizing convolutional neural networks (CNNs) to capture local features and transformers to extract global representations. An intra-view attention module is employed to capture information within each individual view, while a cross-view attention mechanism integrates complementary information from the second view. Finally, these features are combined through a hybrid-view adaptive fusion module, enabling effective integration of multi-view information for improved classification performance.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The model design is clear and well-reasoned, and the paper is well written. The manuscript is well organized, with a clear and detailed description of the proposed method.
- The application is clinically meaningful, addressing an important challenge in prostate cancer diagnosis using TRUS imaging.
- The authors provide a comparative evaluation against state-of-the-art methods, demonstrating the effectiveness of their approach.
- Transparency through ablation study: The inclusion of an ablation study contributes to a deeper understanding of the model’s components. By analyzing the effect of individual modules, the study offers valuable insights for further optimization and development.
- Comprehensive evaluation metrics: The authors adopt a thorough evaluation strategy using multiple metrics, including accuracy, specificity, sensitivity, F1-score, and AUC. This multifaceted assessment enhances the credibility and robustness of the reported performance.
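The metrics named in the last point can be computed from binary labels and predicted scores roughly as follows. This is a generic sketch, not the authors' evaluation code; the 0.5 threshold, the function names, and the toy labels and scores are assumptions for illustration.

```python
def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity and F1 at a fixed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }

def auc(y_true, y_score):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))

# Toy example with hypothetical labels and scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
metrics = classification_metrics(labels, scores)
area = auc(labels, scores)
```

Unlike the other four metrics, AUC does not depend on the chosen threshold, which is one reason it is usually reported alongside them.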
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Limited reproducibility due to non-public data: The evaluation relies on a dataset that is not publicly available, which limits the reproducibility of the study and prevents other researchers from verifying the results or building upon the work.
- Evaluation on a single dataset: While it is acceptable to evaluate a new model on a single dataset as an initial step, testing on multiple datasets is generally recommended. This would help assess the generalizability of the model and reduce the risk of overfitting to the specific characteristics of the dataset used.
- Lack of computational efficiency comparison: The paper does not report or compare the computational time or efficiency of the proposed method against other approaches. Including such analysis would provide a more comprehensive understanding of the model’s practical applicability, especially in clinical settings where runtime may be critical.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

Fig. 1: (c) is not consistent with (a). In (c), one of the inputs to CVT is the sagittal view, while in (a) both inputs of CVA are the outputs of IVAs.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The authors improved the classification of clinically significant prostate cancer (csPCa) by proposing a novel architecture that leverages both sagittal and transverse views of TRUS images. They demonstrated the contribution of each component of the network through a comprehensive ablation study. The proposed method is both technically novel and holds significant clinical relevance.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The authors have thoroughly addressed the reviewers’ concerns and provided satisfactory responses; I therefore recommend that this paper be accepted.

Author Feedback

R1: Thank you for your review; we address each of your concerns below. 1) csPCa in TRUS: In practice, csPCa can be challenging to identify in TRUS, regardless of its stage. Our work primarily focuses on accurate CAD for csPCa by leveraging both transverse and sagittal TRUS. We’ll revise the sentence by deleting “early detection”. 2) Complementary information: As noted in [4], performing TRUS from multiple imaging planes is essential for comprehensive lesion assessment, because lesion characteristics such as echogenicity can differ across views. Additionally, due to the high in-plane spatial resolution and relatively low resolution along the scanning direction, multi-view acquisition compensates for anisotropic imaging limitations, as mentioned in Sec 2.2. Our method leverages this complementary information through intra-view and cross-view attention mechanisms to enhance feature representation. 3) Reconstruction: It is performed directly by the US imaging system. 4) Transformer: The transformer used in our work is a tailored variant that integrates spatial and channel attention mechanisms, inspired by the core principles of the standard transformer architecture. As detailed in Sec 2.2, input features are projected into query, key, spatial value, and channel value representations. In IVA, spatial and channel attention are computed via self-attention mechanisms (Eqs 1&2), while in CVA, cross-attention is applied to integrate information across different views. 5) Biopsy type: TRUS-guided transperineal biopsy. 6) Validation: We adopted a train-test split strategy. Considering the specific task and data amount, we prioritized maximizing training data to enhance model learning, and reserved a set of testing data exclusively for evaluation. To mitigate overfitting, we adopted an early stopping strategy based on the training loss. 7) Data: All patients with csPCa in our dataset underwent biopsy followed by whole-gland prostatectomy. The histopathology images in Fig. 3 are obtained from the prostatectomy specimens of these same patients. 8) Code: As officially stated by MICCAI, we cannot include an external link at this stage. Nonetheless, our method is described in detail and ensures high reproducibility, as recognized by other reviewers. We again promise to release code upon acceptance.
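The early stopping on training loss mentioned in point 6 could, for example, be implemented along the following lines. The rebuttal gives no details, so the patience value, the `min_delta` tolerance, and the class and function names here are hypothetical.

```python
class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # hypothetical value, not from the paper
        self.min_delta = min_delta    # minimum decrease that counts as progress
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        """Record one epoch's loss; return True if training should stop."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

def stop_epoch(losses, patience=3, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None."""
    stopper = EarlyStopping(patience, min_delta)
    for epoch, loss in enumerate(losses):
        if stopper.step(loss):
            return epoch
    return None

# Hypothetical training-loss curve that plateaus after a few epochs.
training_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.69, 0.70, 0.70, 0.70]
stopped_at = stop_epoch(training_losses, patience=3)
```

Because the monitored quantity is the training loss rather than a held-out validation loss, this criterion detects convergence but cannot by itself detect overfitting, which is the concern behind the reviewer's question.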

R2: Thank you for your positive feedback. 1) Clear description: The input/output dimensions will be clearly specified. The IVA leverages the high in-plane resolution of TRUS, while the CVA integrates complementary information from orthogonal views to address the limited resolution along the scanning direction (see Sec 2.2). The HVAF further aggregates spatial and channel-wise features from both views before classification (see Sec 2.3). The overall network design is inspired by TRUS-guided biopsy procedures, where multi-view information is routinely used for comprehensive lesion assessment [4]. 2) We’ll add references for the focal loss and evaluation metrics.

R3: We appreciate your recognition of our work. 1) Non-public data: Due to patient privacy restrictions, the dataset cannot be shared publicly at this time. We are actively exploring comparable open-access datasets for future release. To the best of our knowledge, this is the first study to explore multi-view information in TRUS for prostate cancer. We believe the proposed method can offer a new perspective on multi-view learning in other US-based diagnostic applications and inspire more researchers to work in this field. We’ll add a concise discussion on this. 2) Single dataset: The TRUS dataset with biopsy-proven csPCa is challenging to acquire. While our current study is based on a single-center dataset, its size is adequate (surpassing [16] MICCAI 2024, 590 vs 512). We’ll add a concise discussion on this. 3) Runtime: Our method is the fastest among all comparison methods in our experiments. If allowed, this will be added to the manuscript. 4) Fig. 1: We’ll update Fig. 1 to make it clearer.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

The authors have adequately addressed all major concerns.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Hybrid-View Attention Network for Clinically Significant Prostate Cancer Classification in Transrectal Ultrasound

Author(s):