Abstract

Color Doppler echocardiography is a crucial tool for diagnosing mitral regurgitation (MR). Recent studies have explored intelligent methods for MR diagnosis to minimize user dependence and improve accuracy. However, these approaches often fail to align with clinical workflow and may lead to suboptimal accuracy and interpretability. In this study, we introduce an automated MR diagnosis model (MReg) developed on the 4-chamber cardiac color Doppler echocardiography video (A4C-CDV). It follows comprehensive feature mining strategies to detect MR and assess its severity considering clinical realities. Our contribution is threefold. First, we formulate the MR diagnosis as a regression task to capture the continuity and ordinal relationships between categories. Second, we design a feature selection and amplification mechanism to imitate the sonographer’s diagnostic logic for accurate MR grading. Third, inspired by the Mixture-of-Experts concept, we introduce a feature summary module to extract the category-level features, enhancing the representational capacity for more accurate grading. We trained and evaluated our proposed MReg on a large in-house A4C-CDV dataset comprising 1868 cases with three graded regurgitation labels. Compared to other weakly supervised video anomaly detection and supervised classification methods, MReg demonstrated superior performance in MR diagnosis. Our code is available at: https://github.com/cskdstz/MReg.
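
As a rough illustration of the regression formulation mentioned above, here is a minimal sketch (not the authors’ released code; the feature dimension and all names are hypothetical) in which a single continuous output is trained with an MSE loss so that the ordinal relation between grades 0, 1, and 2 is preserved:

import torch
import torch.nn as nn

class GradeRegressor(nn.Module):
    # Toy regression head standing in for the full MReg pipeline.
    def __init__(self, feat_dim=512):
        super().__init__()
        self.head = nn.Linear(feat_dim, 1)  # one continuous severity value

    def forward(self, video_feat):
        return self.head(video_feat).squeeze(-1)

# Regressing the grade index keeps "mild" (1) closer to "normal" (0) than to
# "moderate-to-severe" (2); a plain cross-entropy loss would ignore that order.
model = GradeRegressor()
feats = torch.randn(4, 512)              # hypothetical per-video features
grades = torch.tensor([0., 1., 2., 1.])  # 0=normal, 1=mild, 2=moderate-to-severe
loss = nn.MSELoss()(model(feats), grades)
loss.backward()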

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2951_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/cskdstz/MReg

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LiuZhe_MReg_MICCAI2025,
        author = { Liu, Zhe and Huang, Yuhao and Liu, Lian and Zhang, Chengrui and Lin, Haotian and Han, Tong and Zhu, Zhiyuan and Chen, Yanlin and Chen, Ruiyue and Ni, Dong and Gou, Zhongshan and Yang, Xin},
        title = { { MReg: A Novel Regression Model with MoE-based Video Feature Mining for Mitral Regurgitation Diagnosis } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15968},
        month = {September},
        pages = {384 -- 393}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a video-based model that is formulated as a regression task to predict grades of mitral regurgitation (MR). The method uses 4-chamber cardiac color Doppler echocardiography video and incorporates feature selection and amplification designs. The method is tested on an in-house dataset.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strengths of the paper:

    1. The paper introduces some novelty, as the proposed method deals with 4-chamber cardiac color Doppler echocardiography and tests it on a disease that is not investigated very often.
    2. Results show the potential to outperform state-of-the-art methods, owing to the unique approach.
    3. The model is evaluated with and without individual modules, providing details on each module's contribution to performance.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The paper unfortunately has several major weaknesses:

    1. The formulation of the problem is listed by the authors as a regression problem; however, in the paper itself the model in fact predicts three distinct classes (normal - grade 0; mild MR - grade 1; moderate-to-severe MR - grade 2). It is confusing why the authors frame the method as a regression problem when the methodology clearly focuses on 3 distinct severity classes.
    2. The explanations of the methods are very hard to follow and understand. Many details are given but are not clearly connected.
    3. In the methods section, the authors mention combining features from text and video, but give no clear explanation of what text is used (values extracted from the video, additional reports/medical history).
    4. Several existing methods, such as YOLOv8 and X-CLIP, are referenced as being used, but it is not clear in what way and how/whether they are optimised for the specific purpose.
    5. The feature selection module is very hard to understand: additional stages are introduced and details are given for the loss function, without a clear statement of the inputs/outputs or the process of feature selection.
    6. The role of the regression loss is not clear if three distinct grades are being predicted.
    7. Since the dataset used is an in-house dataset, very little detail is given regarding the imaging conditions, and none on image/video spatial/temporal resolution.
    8. It is not clear why the dataset was divided in such a way that the test set contains almost as many cases as the training set (training 450/296/103, validation 112/74/25 and testing 403/307/98).
    9. The authors do not compare their approach with some standard state-of-the-art methods for video processing, such as 3D CNNs. Is there a reason for that?
    10. Demonstrating the method only on an in-house dataset, with very little detail in the dataset description, may limit its applicability.
    11. The conclusion is written in a very short and general form.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. The abstract is written in a somewhat general form with few details on the proposed methods. It could be improved by including details of the proposed methodology, details of the dataset (number of patients/videos), and numerical results. In this way, the abstract would catch the reader's attention better.

    2. Listing the limitations of the study in the paper would help the reader understand the improvements that could be made to the approach.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper has potential for using the proposed method for the diagnosis of MR. However, because of the unclear formulation of the objectives, the unclear explanations in the methodology, and the lack of detail on the dataset, the paper would need to undergo extensive changes before its methods could be reassessed.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose a novel end-to-end, regression-based framework for detecting mitral regurgitation and estimating its severity using the apical four-chamber echocardiographic view. The framework integrates several innovative components, including a tailored feature selection mechanism, amplification design strategies, and a Mixture-of-Experts-based feature summarization module.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper exhibits the following important strengths:
    - The integration of regression modeling with feature selection, amplification, and a Mixture-of-Experts summarization module represents a novel formulation for echocardiographic analysis.
    - The focus on mitral regurgitation, a common and clinically significant condition, enhances the work’s applicability to real-world practice.
    - The inclusion of p-values to support performance comparisons strengthens the credibility of the findings.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Despite its strengths, the paper has several important weaknesses that limit the assessment of its generalizability and reproducibility:

    • Insufficient dataset and image information: The manuscript lacks key details about the dataset, including imaging device specifications, image quality, resolution, input dimensions, and the age range of subjects. These omissions hinder an evaluation of how well the model might generalize to different clinical environments or patient populations.
    • Unclear justification for design choices: While the method focuses on the A4C view, the rationale for excluding other common views (e.g., parasternal long-axis) is not addressed. Furthermore, the basis for selecting 16 consecutive frames per sample is unclear—was this empirically determined, and if so, how?
    • Limited explanation of preprocessing and model adaptation: The use of YOLOv8 for cropping the ROI is mentioned but not described in detail. The methodology for eliminating irrelevant areas and grouping samples into three bags also lacks sufficient explanation.
    • Missing performance variability metrics: Although mean performance metrics are reported, standard deviations and error bars are not provided. This omission makes it difficult to assess the robustness and variability of the results.
    • No discussion of limitations: The paper does not include a discussion of the method’s limitations or challenges, such as performance on low-quality images or potential failure modes.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The absence of the mentioned information limits the ability to fully evaluate the proposed approach. I recommend providing more comprehensive details, as outlined above.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose MReg, a novel supervised framework for the automatic classification of mitral regurgitation (MR) severity—normal, mild, and moderate-to-severe—using 4-chamber cardiac colour Doppler echocardiography videos, formulated as a regression task. In the preprocessing stage, YOLOv8 is employed to localise the region of interest (ROI), which is subsequently divided into multiple instances via Multi-Instance Learning (MIL) to alleviate the need for pinpointing the frame with peak regurgitation severity.

    The core architecture of MReg comprises three key modules:

    1. Cross-Modal CLIP (XCLIP) is used for feature selection, identifying the instance that exhibits the highest similarity to binary text labels, thereby simulating clinical focus on diagnostically informative frames;

    2. A feature amplification module further refines the selected instance’s representation, enhancing the network’s ability to capture salient diagnostic cues;

    3. A Mixture-of-Experts (MoE) module improves regression performance by adaptively weighting the amplified visual features with category-specific textual embeddings using three learnable experts.

    The proposed MReg framework demonstrates superior performance compared to several state-of-the-art methods. Ablation studies further validate the effectiveness of each individual component in contributing to the overall performance.
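
    For concreteness, a minimal sketch of the three modules summarized above (assumptions: cosine similarity against the “Abnormal” prompt for instance selection, simple additive text fusion inside each expert, and a placeholder for the amplification step; this is not the authors’ released implementation):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def select_instance(clip_feats, binary_text_feats):
        # clip_feats: (num_clips, D); binary_text_feats: (2, D) for "Normal"/"Abnormal".
        # Keep the clip most similar to the "Abnormal" prompt, mimicking the
        # clinician's focus on the cycle with the most severe regurgitation.
        sims = F.cosine_similarity(clip_feats, binary_text_feats[1:2], dim=-1)
        return clip_feats[sims.argmax()]

    class MoESummary(nn.Module):
        # One lightweight expert per severity category; a learned gate weights them.
        def __init__(self, dim=512, num_experts=3):
            super().__init__()
            self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
            self.gate = nn.Linear(dim, num_experts)
            self.regressor = nn.Linear(dim, 1)

        def forward(self, feat, category_text_feats):
            # category_text_feats: (num_experts, dim) embeddings of the grade prompts.
            outs = torch.stack([e(feat + t) for e, t in zip(self.experts, category_text_feats)])
            weights = F.softmax(self.gate(feat), dim=-1)
            return self.regressor((weights.unsqueeze(-1) * outs).sum(0)).squeeze(-1)

    clip_feats = torch.randn(3, 512)     # e.g. three 16-frame clip embeddings from X-CLIP
    binary_text = torch.randn(2, 512)    # "Normal" / "Abnormal" prompt embeddings
    category_text = torch.randn(3, 512)  # "Normal" / "Mild MR" / "Moderate-to-Severe MR"
    selected = select_instance(clip_feats, binary_text)
    amplified = selected                 # placeholder for the paper's amplification module
    score = MoESummary()(amplified, category_text)  # continuous severity estimate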

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Novel and clinically relevant formulation: MReg tackles a challenging real-world problem—mitral regurgitation (MR) severity classification—using Doppler echocardiography, a modality that is widely used but often hard to interpret quantitatively. The regression-based formulation better reflects the continuous nature of MR severity than discrete classification.

    • End-to-end and interpretable design: Incorporates clinically inspired ideas such as selecting the most informative frame (via XCLIP) and amplifying key features, which mirrors how cardiologists focus on salient frames during diagnosis.

    • Effective use of multi-instance learning (MIL): MIL avoids the need for precise manual annotation of keyframes with peak regurgitation severity, improving usability and scalability in real-world settings.

    • Modular architecture with strong empirical support: The three-part design (XCLIP feature selection, amplification, and MoE fusion) is well-motivated, and ablation studies back up the contribution of each component.

    • Strong performance: Outperforms state-of-the-art models, suggesting both technical novelty and practical utility.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Computational complexity and data demands: The integration of multiple advanced modules—YOLOv8, XCLIP, and Mixture-of-Experts (MoE)—results in a computationally intensive pipeline. Additionally, components like XCLIP and MoE are known to be data-hungry, which may pose challenges when fine-tuning on relatively small datasets like echocardiography. This could limit the reproducibility and generalisability of the approach across other clinical tasks or institutions where large-scale annotated data is less accessible—a common constraint in medical imaging.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • Lacks clarity on:
      1. ROI mask generation: How are the masks used for ROI extraction obtained? Are they created using a simple threshold-based algorithm or manually annotated by experts? A brief clarification would help readers better understand the preprocessing pipeline.
      2. Frame distribution and MIL setup: What is the average number of frames per video in the dataset? Additionally, are there overlapping frames between the three instances formed during the Multi-Instance Learning (MIL) step? This detail is important to understand the temporal coverage and variability within each instance.
      3. MoE training details: Is the Mixture-of-Experts (MoE) model pre-trained on a publicly available dataset, or is it trained from scratch within this framework? Clarifying this would help assess both reproducibility and the reliance on external data sources.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (6) Strong Accept — must be accepted due to excellence

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a highly innovative and well-executed framework for mitral regurgitation (MR) severity classification using cardiac colour Doppler echocardiography videos. By effectively integrating multiple cutting-edge components—including YOLOv8 for ROI localisation, CLIP-based instance selection, feature amplification, and a Mixture-of-Experts (MoE) module—the proposed MReg framework tackles a clinically relevant and technically challenging problem with strong methodological novelty. Each module is thoughtfully designed to mimic real-world clinical reasoning, and the overall architecture shows strong performance gains over state-of-the-art baselines. The use of multi-instance learning to handle temporal ambiguity, and the incorporation of cross-modal representations, further enhance the framework’s practicality and robustness. The paper is well-written, clearly structured, and includes thorough ablation studies that validate the contribution of each component. Given its clinical importance, technical innovation, and strong results, I believe this work makes an excellent contribution and should be accepted without reservation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors provided sufficient clarifications on the methodology; I do not see a reason to reject the paper.




Author Feedback

We thank all the reviewers (R) for recognizing our work. We have carefully addressed each comment to improve the manuscript.

Q1. Methodology (#R2) (1) Our method workflow and innovations are sound. It takes A4C-CDV, 2-category text (“Normal”, “Abnormal”), and 3-category text (“Normal”, “Mild MR”, “Moderate-to-Severe MR”) as input. First, the model combines the video feature and 2-category text features (extracted via X-CLIP) to select the most severe MR clip feature in the video bag (Feature Selection mechanism). The selected clip feature is then combined with the 3-category text features to output an MR severity regression value (0~2.0) after the feature amplification and summarization modules. Finally, we classify the value into three categories via thresholds (0.5, 1.5). (2) A regression loss better captures the continuous nature of MR severity, whereas a classification loss disrupts this relationship (as our ablation study shows).
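
For reference, the value-to-grade mapping described in (1) can be sketched as follows (the thresholds 0.5 and 1.5 are taken from this rebuttal; the function name is hypothetical):

def score_to_grade(score: float) -> int:
    # Bin the continuous regression output (0~2.0) into three grades.
    if score < 0.5:
        return 0  # normal
    if score < 1.5:
        return 1  # mild MR
    return 2      # moderate-to-severe MR

assert [score_to_grade(s) for s in (0.1, 0.9, 1.7)] == [0, 1, 2]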

Q2. MIL and Feature Selection (#R2, #R3, #R4) (1) The MIL-based operation is designed to use the most representative clip for diagnosis, avoiding interference from other clips. (2) Feature Selection mimics doctors’ identification of the most severe MR cycle for the final diagnosis. A binary classification loss (with/without MR) supervises the learning. (3) We set 16-frame clips to ensure coverage of ≥1 cardiac cycle. Adults have ~75 heartbeats/minute (~1.25 cycles/second). The FPS of A4C-CDV is ~16, so a 16-frame clip spans >1 cardiac cycle for diagnosis. Also, our acquisition protocol requires A4C-CDV with ≥3 cardiac cycles to cover MR variations, hence each video yields ≥3 clips.
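
A minimal sketch of the clip/bag construction described above, assuming non-overlapping 16-frame clips (whether the paper’s clips overlap is not stated):

import numpy as np

def video_to_bag(frames: np.ndarray, clip_len: int = 16) -> list:
    # frames: (T, H, W, C). At ~16 fps a 16-frame clip spans ~1 s, i.e. more than
    # one cardiac cycle at ~75 bpm; a video with >=3 cycles yields >=3 instances.
    n_clips = len(frames) // clip_len
    return [frames[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]

bag = video_to_bag(np.zeros((52, 224, 224, 3)))  # hypothetical 52-frame A4C-CDV
assert len(bag) == 3 and bag[0].shape[0] == 16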

Q3. Dataset Details (#R2, #R3) (1) Our study obtained ethical approval in 2023. We used Vivid E95/E9 (GE Healthcare) and EPIQ7/IE33/CX50 (Philips Healthcare) ultrasound systems. Doppler imaging followed the ASE guideline, with recommended settings for 2D sector size, color ROI box size, gain adjustment, color map selection, and velocity scale optimization. Volunteer ages range from 18 to 99. (2) While multiple views are recommended for MR diagnosis, A4C-CDV is the most prevalent and accessible view in practice. For elderly or severely impaired patients, other views are infeasible, and clinical diagnosis relies solely on A4C-CDV. Our study therefore prioritizes the A4C view. (3) Collected in two phases, our dataset was split as: training 450/296/103, validation 112/74/25, testing 403/307/98. Initially, 30% of each category was randomly selected for testing, with the remaining data split 4:1 for training/validation. We then augmented the test set with additional cases (most with MR) to enhance model validation.
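
A sketch of the per-category splitting described in (3) (illustrative only; names are hypothetical and this is not the authors’ script):

import random

def split_cases(cases_by_grade: dict, test_frac: float = 0.3, val_frac: float = 0.2, seed: int = 0):
    # For each category: 30% of cases go to the test set first, and the
    # remainder is divided 4:1 into training/validation.
    rng = random.Random(seed)
    train, val, test = [], [], []
    for grade, cases in cases_by_grade.items():
        cases = list(cases)
        rng.shuffle(cases)
        n_test = round(len(cases) * test_frac)
        test += cases[:n_test]
        rest = cases[n_test:]
        n_val = round(len(rest) * val_frac)
        val += rest[:n_val]
        train += rest[n_val:]
    return train, val, test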

Q4. Experiment (#R2, #R3) (1) Due to limited space, we only present Transformer results. CNNs performed worse by ~5% in our testing. This is also supported by a high-impact study by David et al. published in Circulation, which showed that even with 10,000+ cases, CNNs reached only 80% sensitivity for moderate-to-severe MR. This may be because MR diagnosis focuses on dynamic regurgitant regions in videos, and the self-attention mechanism in Transformers fits this need well. (2) We reported metrics from a fixed randomly-sampled set (not cross-validation) and observed significant performance gains. We appreciate the reviewers’ suggestions and will conduct additional validation in the journal version.

Q5. Conclusions and limitations (#R2, #R3) We will add more discussion in the revision: the model was developed with limited MR subcategories, with above-moderate cases amounting to only 25% of the normal data. We will expand each subcategory to improve performance and develop a multi-view diagnostic method.

Q6. ROI (#R2, #R3, #R4) The ROI is a rectangular area covering all four cardiac chambers. It was obtained using YOLOv8 pre-trained on an MR dataset with ROI labels. Furthermore, our approach utilizes YOLOv8 (for pre-computed ROI detection), X-CLIP, and MoE (lightweight experts without pre-training). Only X-CLIP requires substantial training resources.
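
For illustration, pre-computing the ROI with a fine-tuned YOLOv8 detector could look roughly like the sketch below using the ultralytics API (the weight file and frame path are hypothetical; the actual detector and post-processing used in the paper are not released here):

from ultralytics import YOLO
import cv2

detector = YOLO("mreg_roi_yolov8.pt")    # hypothetical fine-tuned weights
frame = cv2.imread("a4c_cdv_frame.png")  # hypothetical example frame
result = detector(frame)[0]
if len(result.boxes) > 0:
    best = result.boxes.conf.argmax()    # keep the most confident detection
    x1, y1, x2, y2 = result.boxes.xyxy[best].int().tolist()
    roi = frame[y1:y2, x1:x2]            # rectangle covering all four chambers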




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Despite one glowing review, it appears that the paper still has deficiencies, as pointed out by the other reviewers, who have not changed their assessment after the rebuttal.


