Abstract

Accurate carotid plaque grading (CPG) is vital for assessing the risk of cardiovascular and cerebrovascular diseases. Because plaques are small and exhibit high intra-class variability, CPG is commonly evaluated in clinical practice using a combination of transverse and longitudinal ultrasound views. However, most existing deep learning-based multi-view classification methods focus on feature fusion across different views, neglecting the importance of representation learning and the differences among class features. To address these issues, we propose a novel Corpus-View-Category Refinement Framework (CVC-RF) that processes information at the Corpus, View, and Category levels, enhancing model performance. Our contribution is four-fold. First, to the best of our knowledge, ours is the first deep-learning-based method for CPG that follows the latest Carotid Plaque-RADS guidelines. Second, we propose a novel center-memory contrastive loss, which enhances the network’s global modeling capability by comparing against representative cluster centers and diverse negative samples at the Corpus level. Third, we design a cascaded down-sampling attention module to fuse multi-scale information and achieve implicit feature interaction at the View level. Finally, a parameter-free mixture-of-experts weighting strategy is introduced that leverages class-clustering knowledge to weight different experts, enabling feature decoupling at the Category level. Experimental results indicate that CVC-RF effectively models global features via multi-level refinement, achieving state-of-the-art performance on the challenging CPG task.
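For illustration, a minimal PyTorch-style sketch of the center-memory contrastive idea described above (a plausible reconstruction from the abstract, not the authors' released code; all names and details are hypothetical):

    import torch
    import torch.nn.functional as F

    def center_memory_contrastive_loss(feats, labels, memory_bank, bank_labels, tau=0.07):
        # feats:       (B, D) L2-normalized embeddings of the current batch
        # labels:      (B,)   class indices
        # memory_bank: (N, D) L2-normalized stored embeddings (assumed to cover all classes)
        # bank_labels: (N,)   class indices of the stored embeddings
        num_classes = int(bank_labels.max()) + 1

        # Positives: the "representative cluster centers", estimated per class
        # from the memory bank.
        centers = torch.stack([
            F.normalize(memory_bank[bank_labels == c].mean(dim=0), dim=0)
            for c in range(num_classes)
        ])                                                          # (C, D)
        pos = (feats * centers[labels]).sum(dim=1) / tau            # (B,)

        # Negatives: all memory entries belonging to other classes.
        sim = feats @ memory_bank.t() / tau                         # (B, N)
        other = labels.unsqueeze(1) != bank_labels.unsqueeze(0)     # (B, N)
        neg = sim.masked_fill(~other, float("-inf"))

        # InfoNCE-style objective: pull samples toward their class center,
        # push them away from other-class memory entries.
        logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
        target = torch.zeros(feats.size(0), dtype=torch.long, device=feats.device)
        return F.cross_entropy(logits, target)

Because positives and negatives come from a memory bank rather than the current batch, informative comparisons remain available even at small batch sizes.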

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0896_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/dndins/CVC-RF

Link to the Dataset(s)

N/A

BibTex

@InProceedings{ZhuZhi_Hierarchical_MICCAI2025,
        author = { Zhu, Zhiyuan and Wang, Jian and Jiang, Yong and Han, Tong and Huang, Yuhao and Zhang, Ang and Yang, Kaiwen and Luo, Mingyuan and Liu, Zhe and Duan, Yaofei and Ni, Dong and Tang, Tianhong and Yang, Xin},
        title = { { Hierarchical Corpus-View-Category Refinement for Carotid Plaque Risk Grading in Ultrasound } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {255 -- 264}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper, the authors propose a novel multi-level refinement framework (CVC-RF) for carotid plaque grading in ultrasound, aligning with the latest Plaque-RADS guidelines. They introduce a Center-Memory Contrastive Loss to enhance global representation learning, a cascaded attention module for multi-scale feature fusion, and a parameter-free mixture-of-experts strategy for class-specific modeling.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper are:

    1. The multi-level refinement strategy is novel and well-motivated, effectively integrating representation learning, view fusion, and category-aware modeling.
    2. The CMCL offers a solid contrastive learning formulation with memory-bank-based class centers, enabling strong representation learning even with small batch sizes.
    3. The design of the DSAM module demonstrates careful attention to the small plaque detection problem, which is critical in CPG tasks.
    4. The MoE weighting strategy is elegant and efficient—it captures class-specific nuances without adding trainable gating networks.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. In Section 2.1, the authors argue that positive and negative comparisons with large batch samples may lead to a deviation from the overall distribution, but there is no experimental evidence provided to support this claim.
    2. The paper uses softmax to assign weights to each expert without introducing additional parameters; however, it does not discuss the accuracy trade-off compared to approaches that use trainable parameters.
    3. The experiments are conducted on a private dataset. Providing validation results on a public dataset would make the findings more convincing.
    4. The implementation code is not provided, so I am uncertain about the reproducibility of the method.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is the first deep learning-based work that follows the Plaque-RADS guidelines for CPG, which shows a certain level of novelty. However, the architecture lacks innovation in the machine learning field, the code is not publicly available, and the experiments are conducted on a private dataset, making the reproducibility of the results uncertain.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Thank you for your detailed response. You have clearly articulated the key innovations of your work. Based on your justification and the supporting experimental validation, I agree that the novelty is well-substantiated and have revised my assessment accordingly.



Review #2

  • Please describe the contribution of the paper

    The paper demonstrates notable innovation by being the first to integrate the gold standard for carotid plaque grading, Plaque-RADS, into the design of a plaque classification diagnostic model. The methodology is also quite novel, as it combines two distinct scanning views and applies a multi-level approach (as defined by the authors) to classify plaques.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors proposed a novel model for carotid artery plaque grading diagnosis, built upon the latest plaque grading standard, Plaque-RADS, which holds significant value for the clinical evaluation of plaques. In terms of methodology, the model primarily integrates contrastive loss (CMCL) to achieve inter-class alignment across two views. Additionally, different experts are designed for specific categories based on the Mixture of Experts (MOE) framework to guide the classification task.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. It is unclear whether the “spacing information” mentioned at the beginning of the Methods section refers specifically to intima-media thickness. The manuscript does not elaborate on how this information is integrated into the feature extraction process. This ambiguity raises the concern that a priori identification of thickening or plaque presence may be required for all samples, which could limit the clinical applicability.
    2. In Section 2.1, the initialization strategy for “mL (or mT)” is not specified. It remains unclear whether M is randomly initialized or predefined, which introduces confusion in interpreting the method.
    3. The comparison methods used in the experiments are general medical image analysis methods, rather than those specifically designed for plaque or intima-media classification (Similar issues appear in the introduction).
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The primary innovation of the paper lies in the overall pipeline, where the authors combine Plaque-RADS for the first time and attempt to optimize the plaque grading model across multiple levels.
    2. The experiments are generally thorough, and the contributions of CMCL and DSAM are effectively visualized.
    3. However, there are aspects that may cause confusion, such as the unclear explanation of “spacing information” and “mL (or mT),” as well as the choice of comparison methods.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have clearly explained the spacing information and addressed the concerns regarding the network details. The methodology presented in the paper is innovative, and the overall structure is well-organized. Therefore, I recommend accepting the paper. A small suggestion is to replace the comparison methods with ones designed specifically for intima-media or plaque classification, which would further strengthen the model’s credibility.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a novel multi-view deep learning framework to enhance carotid plaque grading in clinical ultrasound imaging. The model leverages three levels of learning including Corpus-level contrastive learning for better representation learning via a memory bank-based Center-memory Contrastive Loss (CMCL) to improve global representation learning with small batch sizes, View-level fusion to extract shared information, and Category-level decoupling with mixture-of-experts strategy for class-specific learning. The framework is designed in alignment with the latest Carotid Plaque-RADS guidelines and incorporates hierarchical refinement to capture multi-scale, multi-domain features. Validated on a clinical ultrasound dataset, the method outperforms existing baselines and includes comprehensive ablation studies to demonstrate the contribution of each component within the network.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novel framework: The paper presents a thoughtfully designed deep learning framework that integrates corpus-level, view-level, and category-level learning modules to enhance representation learning and inter-class feature discrimination in multi-view carotid plaque classification. Unlike previous multi-view approaches that primarily focus on feature fusion, this method explicitly models shared and class-specific features across views, resulting in more discriminative representations. Methodological Rigor: The authors conduct comprehensive experiments on a clinical ultrasound dataset, benchmarking their method against several state-of-the-art baselines. Furthermore, detailed ablation studies and explainable visualization analyses (such as t-SNE and Grad-CAM) are included to validate the individual contributions of each component in the framework.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Single dataset evaluation: The model is trained and tested only on a single-center dataset with limited image pairs and an imbalanced class distribution (especially classes 3–4). This limits its demonstrated generalizability. Complexity of architecture: While modular, the combination of memory banks, attention fusion, and MoE may introduce computational overhead. The paper lacks detailed analysis of inference time, memory usage, or deployment feasibility. No statistical tests: Performance is reported via mean values, but no statistical significance testing (e.g., paired t-tests or confidence intervals) is provided to strengthen claims of superiority.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Data and Code Availability: The authors are encouraged to release the code and provide more implementation details for reproducibility. Dataset Generalizability: To better assess real-world performance, testing on public datasets (e.g., https://pmc.ncbi.nlm.nih.gov/articles/PMC10417708/) or multi-center data is highly recommended. Model Scalability: Please consider including training and inference time, number of parameters, and model size to assess real-time deployment potential.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents a well-motivated and technically sound framework for improving carotid plaque grading in clinical ultrasound imaging. Despite some limitations in dataset size and generalizability, the proposed multi-level architecture and learning objectives represent a substantial contribution to the field.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors provided a clear and convincing rebuttal that effectively addressed all key concerns raised in the reviews:

    Single Dataset Evaluation: The authors justified the use of a single, clinically annotated dataset due to the lack of publicly available data aligned with Plaque-RADS, and emphasized its diversity and real-world relevance.

    Model Complexity: The authors clarified the motivation for the model’s architecture and demonstrated that, despite its complexity, it achieves a relatively fast training time (~52 minutes), making it practical.

    Lack of Statistical Analysis: The authors added statistical tests in the rebuttal and showed significant differences across groups, strengthening the validity of their findings.

    Reproducibility: The authors explained the constraints around data sharing and committed to releasing code, improving the transparency and utility of their work for the community.




Author Feedback

We thank all the reviewers for their valuable comments. We address their main concerns below and will improve the writing.

Q1: Code. (R1, R2, R3) We will release the code upon acceptance to ensure solid reproducibility.

Q2: Novelty. (R1) This study builds the largest dataset strictly following the latest Plaque-RADS guidelines. All reviewers recognize that our architecture optimizes at three necessary and complementary levels, and hence fundamentally differs from prior multi-view methods that rely only on view-level cues. Our technical innovations are: (1) CMCL improves feature learning with small batch sizes. (2) DSAM enables smooth multi-scale fusion, enhancing small-object perception. (3) MoE reduces complexity while improving accuracy without adding parameters. Extensive experiments confirm the effectiveness and overall advantage of our approach.

Q3: Dataset. (R1, R2, R3) (1) We will release our paired carotid ultrasound (US) dataset (3314 images with expert labels) upon acceptance to support the community. (2) We reviewed the 17 public carotid US datasets suggested by R2. All of them lack paired views, making them incompatible with our task definition. (3) Experiments indicate that our method effectively mitigates the class imbalance: it improves minority-class performance by 5.86% over the second-best method (Tab. 1), and t-SNE shows better feature separation via CMCL. The class imbalance in our dataset reflects clinical practice, where RADS 3–4 plaques are naturally less common. We will continue data collection to address the imbalance further. (4) A multi-center dataset (≥2000 patients) is under development; we plan to validate our method on it to assess generalizability across centers and imaging protocols.

Q4: Comparisons & Statistical Significance. (R2, R3) Existing plaque and intima-media classification studies use only single-view analysis, whereas the latest guidelines we follow recommend both transverse and longitudinal views. This mismatch limits direct comparability. In Tab. 1, we compared single-view and multi-view baselines; all multi-view methods achieved gains of 1.5%–10.8%. In addition, we conducted paired t-tests, and all p-values were < 0.05, statistically supporting the superiority of the multi-view setting.
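As an illustration of the kind of test reported here, paired t-tests over matched per-fold scores can be run with SciPy (the values below are placeholders, not results from the paper):

    from scipy import stats

    # Placeholder per-fold accuracies for the two settings under comparison;
    # real values would come from matched cross-validation runs.
    multi_view  = [0.86, 0.88, 0.85, 0.87, 0.89]
    single_view = [0.82, 0.84, 0.81, 0.83, 0.85]

    t_stat, p_value = stats.ttest_rel(multi_view, single_view)  # paired t-test
    print(f"t = {t_stat:.3f}, p = {p_value:.4f}")  # p < 0.05 suggests a real difference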

Q5: Parameter-free Gate (PFG) vs Parameter-trained Gate (PTG). (R1) PFG outperforms PTG in accuracy, interpretability, and parameter count. (1) Our earlier experiments showed that PFG outperforms PTG in accuracy (by 1.2%); PTG tends to activate one specific expert while ignoring the others, which degrades accuracy. (2) PFG provides explicit category prior knowledge via the cosine similarity between input image features and class clusters: higher similarity yields a larger weight for the corresponding expert, guiding each expert to learn category-specific cues. PTG, in contrast, relies on implicit weight optimization, which is less interpretable. (3) PFG is parameter-free, whereas PTG requires additional trainable parameters.
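A minimal sketch of such a parameter-free gate, assuming PyTorch and class centers taken from the memory bank (names and details are hypothetical, not the authors' implementation):

    import torch
    import torch.nn.functional as F

    def parameter_free_gate(feat, class_centers, experts, tau=1.0):
        # feat:          (B, D) image features
        # class_centers: (C, D) per-class cluster centers (e.g., from the memory bank)
        # experts:       list of C expert modules, one per category
        # Cosine similarity to each class center provides an explicit category prior.
        sims = F.cosine_similarity(feat.unsqueeze(1),
                                   class_centers.unsqueeze(0), dim=-1)   # (B, C)
        weights = F.softmax(sims / tau, dim=1)                           # (B, C)

        # Weighted sum of expert outputs; no trainable gating parameters are involved.
        outs = torch.stack([expert(feat) for expert in experts], dim=1)  # (B, C, D')
        return (weights.unsqueeze(-1) * outs).sum(dim=1)                 # (B, D')

Higher similarity to a class cluster yields a larger weight for the corresponding expert, which matches the interpretability argument above.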

Q6: Method Details. (R3) Spacing refers to the physical size (in mm) of each pixel; it helps the model perceive plaque size. The spacing value is expanded to a [1, H, W] map, concatenated with the image along the channel dimension, and fed into the model. M consists of two memory banks, mL and mT, both initialized using a ResNet18 (pretrained on ImageNet) followed by a 2-layer MLP. All training images are projected to 128-dim features, normalized, and added to mL and mT as initialization.
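A short sketch of these two details as described (hypothetical PyTorch, for illustration only):

    import torch
    import torch.nn.functional as F

    def add_spacing_channel(image, spacing_mm):
        # image: (B, C, H, W) ultrasound tensor; spacing_mm: (B,) pixel size in mm.
        B, _, H, W = image.shape
        # Broadcast each scalar spacing to a [1, H, W] map and concatenate it
        # along the channel dimension so the model can perceive physical size.
        spacing_map = spacing_mm.view(B, 1, 1, 1).expand(B, 1, H, W)
        return torch.cat([image, spacing_map], dim=1)  # (B, C + 1, H, W)

    @torch.no_grad()
    def init_memory_bank(encoder, projector, loader):
        # encoder: e.g., an ImageNet-pretrained ResNet18 backbone;
        # projector: a 2-layer MLP mapping backbone features to 128 dimensions.
        feats = []
        for images, _ in loader:
            z = F.normalize(projector(encoder(images)), dim=1)  # (B, 128), unit-norm
            feats.append(z)
        return torch.cat(feats)  # (N, 128) initial entries for mL or mT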

Q7: Writing. (R1) We will revise the unclear expression. What we meant is that as the batch size increases, the batch better approximates the global distribution, but some bias always remains unless the entire training set is used as a single batch.

Q8: Computational Overhead. (R2) We prioritized accuracy while keeping complexity moderate. Training on an NVIDIA GeForce RTX 3090 for 100 epochs took 52 minutes. The model infers in 41.96 ms/case, with 18.73M parameters and 16.50 GFLOPs, demonstrating strong deployment potential.
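For reference, figures of this kind are typically obtained with generic measurement code along these lines (an illustrative sketch, not the authors' benchmarking script):

    import time
    import torch

    def count_parameters(model: torch.nn.Module) -> int:
        # Total number of trainable parameters.
        return sum(p.numel() for p in model.parameters() if p.requires_grad)

    @torch.no_grad()
    def time_inference(model, sample, n_runs=100):
        # Rough per-case GPU latency in milliseconds.
        model.eval()
        for _ in range(10):                  # warm-up runs
            model(sample)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(n_runs):
            model(sample)
        torch.cuda.synchronize()
        return (time.perf_counter() - start) / n_runs * 1000.0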




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    The reviewers generally found the paper to be relevant and recognized its potential contributions. However, they all raised their concerns. The authors should carefully address the reviewers’ concerns, particularly those related to method clarity, dataset limitations, and reproducibility, in the rebuttal.

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


