Abstract

Rheumatic heart disease (RHD) is the leading global cardiac condition, affecting over 54 million people, predominantly in resource-constrained countries. Early detection via color Doppler echocardiography is crucial but often inaccessible due to reliance on specialized cardiologists. Consequently, such data from patients diagnosed with RHD are scarce. To address data limitations in developing robust RHD detection methods, we propose a novel AI-driven approach to synthesize color Doppler echocardiograms with matched B-mode ultrasound using a multi-factor conditioned diffusion model. To our knowledge, this is the first generative AI design for dual-channel color Doppler synthesis. Our model enhances realism by incorporating temporal information for motion consistency and class label for targeted synthesis. We use B-mode ultrasound to visualize anatomical structures and the Doppler-mode fields of view to define blood flow regions across key echocardiographic views (e.g., parasternal and apical). We synthesize one echocardiographic mode from another using cross-view translation to augment data and improve diversity. We evaluated our approach using synthetic data generated from echocardiograms of 589 Ugandan cases and the public CAMUS dataset. Our model outperformed state-of-the-art generative methods in fidelity and structural similarity. We trained and tested an RHD classifier on limited data from different devices. Training with synthetic data significantly improved detection performance compared to a model trained only on real data. These findings highlight the potential of diffusion-based synthetic data to democratize the diagnosis of heart diseases in marginalized populations and low-resource settings. Our approach is scalable, promotes health equity, and contributes to RHD prevention and reduced mortality.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1573_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

CAMUS dataset: https://www.creatis.insa-lyon.fr/Challenge/camus/databases.html

BibTex

@InProceedings{RosPoo_Synthesis_MICCAI2025,
        author = { Roshanitabrizi, Pooneh and Guo, Pengfei and Aharonyan, Artur Arturi and Brown, Kelsey and Broudy, Taylor Gloria and Parida, Abhijeet and Tapp, Austin and Jiang, Zhifan and Tompsett, Alison and Rwebembera, Joselyn and Okello, Emmy and Beaton, Andrea and Roth, Holger R. and Xu, Daguang and Anwar, Syed Muhammad and Sable, Craig A. and Linguraru, Marius George},
        title = { { Synthesis of Pathological Dual-Channel Color Doppler Echocardiograms for Equitable Diagnosis of Heart Diseases } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {589 -- 599}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a multi-conditioned denoising diffusion model (“3.5 D-DM”) that simultaneously generates B-mode and color-Doppler cine loops at 128 × 128 × 32 resolution. Conditions include class label, temporal index, Doppler field-of-view mask, and optionally a B-mode frame. The synthetic videos are used to train a classifier, yielding a better classifier than the one trained on real data only.
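    The conditioning signals listed above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the sinusoidal frame-index embedding, the embedding size, and the choice to stack the FOV mask and B-mode frame as extra input channels are all hypothetical.

```python
import numpy as np

def sinusoidal_embedding(index, dim=8):
    """Embed a scalar index (e.g. a frame index) as sin/cos features.
    The embedding type and size are assumptions, not taken from the paper."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = index * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def build_condition(class_label, frame_index, num_classes=2, dim=8):
    """Concatenate a one-hot class label (normal / RHD) with a frame-index embedding."""
    one_hot = np.zeros(num_classes)
    one_hot[class_label] = 1.0
    return np.concatenate([one_hot, sinusoidal_embedding(frame_index, dim)])

def stack_spatial_conditions(noisy_frame, fov_mask, bmode_frame=None):
    """Stack spatial conditions as extra channels for a denoiser input.
    noisy_frame: (C, H, W); fov_mask and bmode_frame: (H, W)."""
    channels = [noisy_frame, fov_mask[None]]
    if bmode_frame is not None:  # the B-mode condition is optional
        channels.append(bmode_frame[None])
    return np.concatenate(channels, axis=0)

# Example at the 128 x 128 frame size quoted in the review,
# assuming a two-channel Doppler frame.
cond = build_condition(class_label=1, frame_index=5)
x = stack_spatial_conditions(
    np.zeros((2, 128, 128)), np.ones((128, 128)), np.zeros((128, 128))
)
```

    In this sketch the vector condition would be fed to the denoiser alongside the diffusion step embedding, while the spatial conditions travel with the image tensor.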

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well written and easy to follow.
    • The motivation of the paper is promising, addressing the gap between B-mode and Doppler ultrasound and potentially useful for clinical applications.
    • Results showing that the synthetic data can be used to train a classifier that outperforms the one trained on real data only are interesting and relevant.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Some statements in the introduction are overclaimed. The authors claim the proposed method is the “first AI-driven approach for synthesizing color Doppler echocardiography with matched B-mode ultrasound,” but this is not convincing: Sun et al. 2022 already produced paired B-mode + Doppler reconstructions (albeit physics-based), and Zhou et al. 2024 generated controllable multi-modal echo cine with diffusion models.
    • In addition, the architecture name “3.5 D” is catchy but ill-defined. From the text, it is a 2.5D U-Net for Doppler plus a 1D branch for B-mode, which is simply frame-wise generation with temporal concatenation. The paper also claims to have “effectively incorporated temporal information,” which is no different from using a conventional 3D U-Net. The authors should clarify the novelty of the architecture.
    • Authors did not mention the publication of code and dataset.
    Minor comments:
    • Will the resolution of 128 × 128 × 32 be sufficient for clinical applications?
    • How are Doppler FOV masks generated?
    • 516 minutes per echo is a long time for synthesizing a single echo. Maybe a typo?
    • Generated videos are kept only if a pre-trained RHD classifier predicts the conditioned label. This creates a feedback loop: the same feature space used for downstream assessment filters the training data.
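    The filtering rule in the last comment can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the 0.5 decision threshold, and the use of per-video RHD probabilities as input are assumptions.

```python
import numpy as np

def filter_by_label_agreement(probs, conditioned_labels, threshold=0.5):
    """Split synthetic samples by whether a classifier agrees with the conditioning label.

    probs: predicted RHD probability per synthetic video, from some
           pre-trained classifier (hypothetical input).
    conditioned_labels: label each video was conditioned on (0 = normal, 1 = RHD).
    Returns (keep_indices, rejected_indices).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(conditioned_labels, dtype=int)
    predicted = (probs >= threshold).astype(int)
    keep = np.flatnonzero(predicted == labels)
    rejected = np.flatnonzero(predicted != labels)
    return keep, rejected

keep, rejected = filter_by_label_agreement(
    probs=[0.9, 0.2, 0.6, 0.4],
    conditioned_labels=[1, 0, 0, 1],
)
```

    The feedback-loop concern applies when the classifier that produces `probs` shares weights or features with the model used for downstream evaluation; keeping the two models separate avoids it.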
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an interesting problem of generating synthetic ultrasound data, but lack of novelty in the method and overclaims in the introduction hinder the overall contribution.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    A generated cardiac ultrasound database of synthesized color Doppler echocardiograms, focused on rheumatic heart disease (RHD).

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Creation of a generative database based on color flow Doppler and anatomy, built with diffusion models, for RHD and normal subjects.
    • Application to dual-channel color Doppler echocardiogram sequences coupled with B-mode ultrasound.
    • The model’s output exhibits characteristics of RHD, as confirmed by clinicians.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    They do not mention the number of clinicians or which hospital they came from, which is needed to confirm the database as valid, nor whether the clinicians’ assessment was quantitative or qualitative. A subsequent classification was performed, but a further study should be made to show the maximum number of synthetic studies that can be used before the classifier performance decays. They do not mention important details such as reproducibility or whether the real and generated databases will be available.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The study is good per se but there are still details of reproducibility and validation that need to be done.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Questions and comments were answered.



Review #3

  • Please describe the contribution of the paper

    The paper proposes an approach for synthesizing color Doppler echocardiography (2.5D) with matched B-mode (1D) ultrasound using diffusion models (total 3.5D). It leverages multi-factor conditioning for enhanced realism and diagnostic utility. The resulting images look better, with lower FID/FVD and higher SSIM. The resulting model is used for classification (RHD detection), with better results.
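    For reference, the structural similarity index between two image patches x and y is commonly defined as below, where the means, variances, and covariance are computed over the patch and C_1, C_2 are small stabilizing constants. How the paper windows and averages SSIM over frames is not stated in this review, so this shows only the standard per-patch form.

```latex
\mathrm{SSIM}(x, y) =
\frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
     {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

    Higher SSIM indicates closer structural agreement, whereas lower FID/FVD indicates generated distributions closer to the real one.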

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is well-written and easy to read.
    2. The diffusion model conditioning was easier to understand from an application point of view and Fig. 2 is also easy to interpret.
    3. The follow-up application in RHD detection further demonstrates the advantage of the proposed approach.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    There is a limitation to this approach (which is also mentioned in the paper!) - the quality of the synthesis is dependent on the quality of the inputs. However, a pathway to help solve this problem has also been discussed in the paper.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper was a breeze to read and understand. It also compares to state-of-the-art methods like LVDM and 3D LDM. The model seamlessly synthesized paired B-mode and Doppler data, both healthy and pathological, and generated new acquisition views with clinical utility (RHD detection). All in all, it is a good paper, with slightly limited (albeit useful) novelty, and a good application for an initial starting point.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The rebuttal answered all the questions posed by the reviewers (myself and the others), and the authors have managed to solve an existing problem, albeit with an approach that is not completely novel.




Author Feedback

We thank the reviewers for their thoughtful, constructive, and encouraging feedback. The responses and clarifications below will be incorporated into the manuscript.

Significance/Motivation: Our study introduces a novel approach for early RHD detection, a critical global health need in LMICs, where the lives of over 54 million young people are at risk. Mild RHD is treatable if detected early, but it cannot be heard with a stethoscope and requires specialized interpretation of ultrasound, which is a rare skill in low-resource settings. Our paper is inspired by this real-world clinical need and our strong belief that AI-enhanced portable ultrasound will have a huge impact on global health and the lives of the poorest patients. We are excited to have started a screening program for 200,000 schoolchildren in an LMIC (details not shared for anonymity).

Data: Because RHD affects underserved children:
- RHD ultrasound data is rare; data augmentation is essential.
- No public datasets exist.
- Our exceptional dataset is representative of RHD in LMICs and proprietary to the LMIC team.

Claim Clarification and Novelty (R#4): Sun et al. (2022) used physics-based simulations of only healthy cases, while Zhou et al. (2024) synthesized B-mode ultrasound (no Doppler images) using diffusion. In contrast, our method is data-driven and synthesizes both B-mode and color Doppler images, individually or paired, with and without pathology. To our knowledge, this is the first AI-driven approach to generate clinically realistic, pathological, Doppler–B-mode image pairs.

3.5D Architecture (R#4): We define “3.5D” as a hybrid temporal model for ultrasound videos as image sequences. It combines a 2.5D U-Net for Doppler and a 1D encoder for B-mode to capture modality-specific dynamics. Unlike 3D U-Nets with volumetric convolutions suitable for static data (e.g., CT/MRI), our architecture embeds temporal indices for efficient joint spatial-temporal modeling without added computational overhead. Instead of frame-wise generation with temporal concatenation, we use temporal conditioning to enable dynamic interaction across paired inputs. As shown in Table 1, our method generates more realistic images and significantly outperforms 3D LDM, which is based on a latent 3D U-Net.

Filtering & RHD Classifier (R#4, R#2): In response to R#4, we confirm that the RHD classifier used for filtering was not reused for downstream evaluation. Filtering removed mislabeled samples, and repeated failures were manually reviewed and added to training to improve robustness. A separate classifier was trained from scratch for RHD classification. In response to R#2, we initially trained the classifier using 462 synthetic images, matching the real dataset size, and then gradually increased the synthetic data to three times that amount. This led to a significant improvement in classification performance and demonstrated the effectiveness of synthetic data. While we are not allowed to include new results in the rebuttal, we acknowledge that it would have been valuable to assess performance limits and estimate optimal sample sizes.

Resolution and Efficiency (R#4): Doppler FOVs were cropped using the same aspect ratios that retained essential flow for all data. The compact 128×128×32 resolution was sufficient to retain the key diagnostic features, as demonstrated in the downstream classification task.

Reproducibility and Code/Data Release (R#1, R#2, R#4): We appreciate the feedback on clarity and reproducibility. The real data cannot be shared (please see Data), but we will release the de-identified synthetic data and code post-publication, pending ethical review and institutional approval.

Expert Review and Validation (R#2): Two expert cardiologists from *** independently and blindly reviewed the synthesized data to determine if they were real or not and classify them as RHD or normal.

Minor Clarification (R#4): Total training: 516 min; synthesis: 6.5 min/video (GPU); RHD detection: <1 min (CPU laptop).




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


