Abstract

Congenital uterine anomalies (CUAs) can lead to infertility, miscarriage, preterm birth, and an increased risk of pregnancy complications. Compared to traditional 2D ultrasound (US), 3D US can reconstruct the coronal plane, providing a clear visualization of the uterine morphology for assessing CUAs accurately. In this paper, we propose an intelligent system for simultaneous automated plane localization and CUA diagnosis. Our highlights are: 1) we develop a denoising diffusion model with local (plane) and global (volume/text) guidance, using an adaptive weighting strategy to optimize attention allocation to different conditions; 2) we introduce a reinforcement learning-based framework with unsupervised rewards to extract the key slice summary from redundant sequences, fully integrating information across multiple planes to reduce learning difficulty; 3) we provide text-driven uncertainty modeling for coarse prediction, and leverage it to adjust the classification probability for overall performance improvement. Extensive experiments on a large 3D uterine US dataset show the efficacy of our method, in terms of plane localization and CUA diagnosis. Code is available at GitHub (https://github.com/yuhoo0302/CUA-US).

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5180_paper.pdf

SharedIt Link: https://rdcu.be/eHaYI

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04965-0_61

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/MICCAI25-5180/CUA-US

Link to the Dataset(s)

N/A

BibTex

@InProceedings{HuaYuh_Uncertaintyaware_MICCAI2025,
        author = { Huang, Yuhao AND Xu, Yueyue AND Dou, Haoran AND Deng, Jiaxiao AND Yang, Xin AND Zheng, Hongyu AND Ni, Dong},
        title = { { Uncertainty-aware Diffusion and Reinforcement Learning for Joint Plane Localization and Anomaly Diagnosis in 3D Ultrasound } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {650 -- 660}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes a complicated pipeline that uses a diffusion model for plane localization, RL to select slices, and a text condition to further improve detection accuracy. The results section shows improvements over several baseline models on a private dataset.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. I appreciate the authors’ efforts on this paper. Three highlights are presented, and many experiments and results are shown.
2. The results show promising improvements over the baseline methods. In fact, the improvements are quite significant.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The paper is not well written and the details are hard to follow. For example, what is x_t in equation (1), and where do the features f_1, f_2 come from?
2. I respect the authors’ work, but some of the effort is unnecessary. I do not appreciate using a diffusion model for localization; it makes the process unnecessarily complicated. Is a diffusion model really suitable and helpful for localization? RL for slice selection makes sense, and here the authors follow previous works. To me, this paper tries to combine popular techniques, which could make the pipeline unnecessarily complicated.
3. No public dataset is used, which makes the results less convincing. Some of the categories consist of only a few samples.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(2) Reject — should be rejected, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Based on the weaknesses above: although this paper makes an effort to improve pipeline performance, I am concerned about the writing, the unnecessary use of complicated techniques, and the private dataset.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

I thank the authors for the rebuttal. I’m still not convinced and hold the same opinion as before.

Review #2

Please describe the contribution of the paper

The paper proposes a framework for plane localization in 3D pelvic US and diagnosis of congenital uterine abnormalities. The method’s key components are threefold: (i) a conditional diffusion model for plane localization, conditioned on the 3D volume, the slice, and text input; (ii) a reinforcement learning strategy for key slice selection, and (iii) a strategy to improve the classification scores by utilizing the uncertainty score from text condition. The method is evaluated extensively on a private dataset of 677 3D US volumes and compared against multiple previous methods.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well structured. The figures are well chosen and support the paper well. Especially Fig. 1 (overview of the framework) is informative.
- The clinical problem is relevant. The automatic identification of standard planes in 3D US is important and the paper presents an interesting method with good performance.
- The method utilizes state-of-the-art techniques, such as diffusion models and text-image foundation models (BiomedClip).
- The evaluation is extensive with a comparison to multiple other methods and an ablation study. Statistical tests are included to test for significant differences.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Contributions:
- The contributions need to be specified. The diffusion model is based on [7] and the RL strategy on [13,23]. It remains unclear if modifications to the previous works were necessary and what these modifications are. Or is the contribution to use these modules in the same framework for the task of CUA classification?
Method:
- The methodology is hard to follow, especially Sec 2.1. Crucial parts, like the conditions and the representation of the plane function, are only understandable because of Fig. 1. Just an illustration is not enough. The text should adequately describe the method. For example:
  o How exactly is the plane function represented? Just 3 parameters? How does this work for the diffusion process?
  o Architecture of the encoders is missing and in general a description of the conditions.
  o Notation is not precise and often not defined/introduced: the 3 parameters for plane function, \mathcal{D} in sec 2.2, What does f^hat_S=(1,D) mean in sec 2.2?
  o Sec 2.2: Where
Evaluation:
- Standard deviations are missing.
- It is a bit unclear whether the methods chosen for comparison are adequate and state-of-the-art. Some description and justification of why they were chosen is necessary.
- What is the input data to the 2D, 2D+t comparison method? Actual 2D US acquisitions? Or some slices from the 3D US volumes? This is crucial to understand because methods on 2D US are currently the standard. It is important to compare 2D US (better image quality) to 3D US.
- Fig 3: Are the uncertainties in c3-5 from the same sample? Is the text the GT class or the predicted class? r1c4 does not predict normal uterus, does it? And r3c4 and r3c5 both predict arcuate uterus? For a better understanding, it would be good to list the class labels in the fig caption.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper has several limitations but they are mainly related to the method description and clarification of the contributions. Overall, I like the paper and the evaluation seems appropriate.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

I keep my initial recommendation.

Review #3

Please describe the contribution of the paper

The paper proposes a novel joint framework for automated plane localization and congenital uterine anomaly (CUA) diagnosis in 3D ultrasound (US) by integrating three key innovations: (1) an adaptive conditional diffusion model that leverages multi-scale guidance (local plane features, global volume features, and text embeddings) with dynamic weighting to improve plane localization accuracy; (2) a reinforcement learning (RL)-based approach with unsupervised rewards to select informative key slices from redundant sequences, enhancing feature representation for classification; and (3) an uncertainty-aware strategy that refines diagnostic probabilities using text-driven uncertainty scores, boosting overall performance. Validated on a large in-house dataset, the method outperforms existing techniques in both localization and diagnosis tasks, offering a clinically aligned solution that reduces operator dependency and improves interpretability.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper introduces a first-of-its-kind framework that combines a conditional denoising diffusion model with RL-based key slice selection for joint plane localization and anomaly diagnosis in 3D ultrasound. Unlike standard classifiers, this approach uses zero-shot uncertainty estimation from diffusion-generated predictions, correcting misclassifications by assessing the consistency of plane parameters under different text prompts (e.g., “This is a septate uterus”). This bridges generative and discriminative modeling in a clinically interpretable way. The method is validated on a large in-house dataset (677 volumes) with diverse CUA categories, including rare anomalies (e.g., bicornuate uterus). The framework mimics the clinical workflow by unifying plane localization and diagnosis, reducing operator dependency in 3D US, a known challenge due to anatomical complexity. Overall, the paper’s novel formulation (diffusion + RL + uncertainty), rigorous evaluation, and clinical applicability make it a standout contribution in medical image analysis. The release of code and the dataset (though anonymized) further strengthens reproducibility.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The adaptive conditioning and uncertainty strategies could inspire applications beyond ultrasound, such as MRI plane navigation or lesion classification.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

no
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(6) Strong Accept — must be accepted due to excellence
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper excels in novelty, technical depth, and clinical utility, with experiments that convincingly demonstrate superiority over existing methods. It would be a strong candidate for an oral presentation.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We thank all the reviewers for reviewing our work, and recognizing the novelty and solid experiments. Clarifications have been provided to address the comments.

Q1. Diffusion model for localization. (R4) We choose the diffusion model to locate planes due to its strong performance and efficient inference speed (~2s). One recent work (DIFF_MSG) and our results prove its significantly higher performance than existing methods (Tab. 1). We also show that improving localization accuracy via diffusion can better boost congenital uterine anomaly (CUA) diagnosis (>10% F1, Tab. 3).

Q2. Contribution clarification. (R3) (1) Compared to [7], we add a text condition to enhance global semantics and drive uncertainty modeling, along with an adaptive weighting strategy to optimize condition fusion. (2) We design a new pipeline and unsupervised rewards for key slice summarization in 3D ultrasound (US), whereas previous works [13,23] rely on supervised annotations (keyframe/structures/attributes) and are designed for video, not 3D volumes.

Q3. Equation and symbol. (R3, R4) (1) In Eq. 1, x_t should be p_t, which denotes the plane parameters at step t. (2) Features (f_1, …) are extracted by the plane encoder, which takes planes as inputs. (3) Three parameters can define a 2D plane in 3D, using the tangent-point formula in spherical coordinates. r: radial distance from the tangent point to the origin; η and θ: azimuth and elevation angles. (4) D means the feature dimension of a slice. f^hat_S applies max pooling over the S dimension to aggregate features of size (S,D) to (1,D). We will revise the paper to remove any confusion.
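
The tangent-point parameterization and the max pooling described in Q3 can be illustrated with a minimal sketch. It is not taken from the authors' code: the function names are hypothetical, and the angle convention (azimuth η measured in the x-y plane, elevation θ measured from that plane toward z) is an assumption, since the rebuttal does not state which convention the paper uses.

```python
import math

def plane_from_params(r, eta, theta):
    """Return (unit normal, tangent point) for plane parameters (r, eta, theta).

    Assumed convention (not confirmed by the source): eta is the azimuth in
    the x-y plane, theta is the elevation above that plane, both in radians.
    """
    normal = (
        math.cos(theta) * math.cos(eta),
        math.cos(theta) * math.sin(eta),
        math.sin(theta),
    )
    # The tangent point lies at distance r from the origin along the normal.
    tangent_point = tuple(r * c for c in normal)
    return normal, tangent_point

def signed_distance(point, r, eta, theta):
    """Signed distance from a 3D point to the plane {x : normal . x = r}."""
    normal, _ = plane_from_params(r, eta, theta)
    return sum(n * x for n, x in zip(normal, point)) - r

def max_pool_slices(features):
    """Aggregate slice features of size (S, D) into size (1, D) by max pooling over S."""
    return [[max(column) for column in zip(*features)]]
```

For example, `max_pool_slices([[1, 5], [3, 2], [0, 4]])` keeps the largest value of each of the D = 2 feature columns across the S = 3 slices.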

Q4. Dataset. (R2, R4) The few cases in some categories reflect their low real-world incidence, yet our model improves F1-score by ~12% over nnMamba despite data imbalance. We have established multi-center collaboration for future data collection, discussed dataset release with our clinical partners, and will publish the first public CUA dataset upon IRB approval. We will test our method on public MR datasets in a journal paper.

Q5. Method details. (R3) Encoders (EC). Parameter EC: a 1×1 convolution (conv) with group normalization and ReLU activation, followed by another 1×1 conv. Plane EC: ResNet-18. Volume EC: five 3×3×3 conv layers with instance normalization and Leaky-ReLU activation, and a global average pooling layer. Text EC: BiomedClip. Localization process. Diffusion: Gaussian noise is added to the plane parameters (pp; r, η, θ) T times. Conditional denoising: the noisy pp are mapped to an R^32 vector via an MLP. Then, we input the 2D plane sliced from the 3D volume, the 3D volume itself, and the text prompt to the ECs and MLPs to obtain the plane/volume/text conditions (R^32 vectors). All vectors are concatenated and processed by a UNet to update the pp iteratively and obtain the coronal plane. Some details are already included in Secs. 2&3, and the rest will be added in the revision.
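
The diffusion step in Q5 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it uses the standard DDPM closed-form forward process, and the linear noise schedule, its endpoints, T = 1000, and all function names are assumptions not stated in the rebuttal. Only the plane parameters (r, η, θ), the Gaussian noise, and the concatenation of four R^32 vectors come from the text above.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule (the rebuttal does not specify one)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]

def cumulative_alphas(betas):
    """Running product of (1 - beta_t), usually written alpha_bar_t."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def noise_plane_params(p0, t, alpha_bars, rng):
    """Sample p_t from p_0 = (r, eta, theta) in closed form.

    p_t = sqrt(alpha_bar_t) * p_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, I).
    Returns both p_t and the noise eps that a denoiser would learn to predict.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in p0]
    p_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(p0, eps)]
    return p_t, eps

def concat_conditions(noisy_emb, plane_c, volume_c, text_c):
    """Concatenate the embedded noisy parameters with the three condition vectors."""
    return list(noisy_emb) + list(plane_c) + list(volume_c) + list(text_c)
```

With four 32-dimensional inputs, `concat_conditions` yields a 128-dimensional vector, which in the described pipeline would then be processed by the UNet; the encoders and the UNet themselves are omitted here.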

Q6. Evaluation. (R3) (1) We report standard deviations (SD) for localization (Tab. 1), as it computes case-level metrics, whereas classification on a fixed test set typically measures dataset-level metrics and thus cannot yield an SD. (2) We choose typical and state-of-the-art (SOTA) methods for comprehensive comparison. For localization, the traditional ITN and SOTA RL & diffusion methods are included. For classification, different common SOTA models were considered, covering 2D/2.5D/video/3D. The first three take slices from the 3D volumes. While 2D US offers better image quality than 3D, it cannot provide the coronal plane and the surrounding structure observation required for CUA diagnosis, due to pelvic bone obstruction.
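
The distinction in point (1) between case-level and dataset-level metrics can be made concrete with a small sketch. The function names are hypothetical, and macro-averaged F1 is an assumption: the rebuttal does not say which F1 averaging the paper reports.

```python
from statistics import mean, stdev

def case_level_summary(per_case_errors):
    """One error value per test case, so a mean and a sample SD both exist."""
    return mean(per_case_errors), stdev(per_case_errors)

def macro_f1(y_true, y_pred):
    """A single number computed over the whole test set; no per-case spread exists."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return mean(f1_scores)
```

A localization error is defined for each volume, so it averages naturally into mean ± SD, whereas an F1-score is only defined over a set of predictions. An SD for the latter would need resampling (for example, bootstrapping the test set), which the rebuttal does not claim to have done.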

Q7. Fig. 3. (R3) c3-5 are original, uncertainty and adjusted probabilities (prob) from the same sample, with text showing predicted class. For original/adjusted probs, the highest value indicates the predicted class, while for uncertainty, the lowest value represents the most confident prediction. We have checked and confirmed the prediction text is correct; and will add class labels in the caption for clarity.
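
To show how the three columns in Fig. 3 relate, the following is a purely illustrative sketch. The exponential down-weighting rule, the `strength` parameter, and the function name are assumptions made for this example. The rebuttal does not give the paper's actual adjustment formula, so this should not be read as the authors' method. It only shows the stated reading: highest probability wins, lowest uncertainty is most confident.

```python
import math

def adjust_probabilities(probs, uncertainties, strength=1.0):
    """Down-weight each class probability by its uncertainty, then renormalize.

    Hypothetical rule: weight_c = prob_c * exp(-strength * uncertainty_c).
    """
    weights = [p * math.exp(-strength * u) for p, u in zip(probs, uncertainties)]
    total = sum(weights)
    return [w / total for w in weights]

# Original prediction is class 0, but class 0 also has the highest uncertainty.
original = [0.45, 0.40, 0.15]
uncertainty = [0.9, 0.1, 0.5]
adjusted = adjust_probabilities(original, uncertainty)
predicted_before = original.index(max(original))
predicted_after = adjusted.index(max(adjusted))
```

In this toy case the predicted class changes from index 0 to index 1, because the class with the lowest uncertainty gains relative weight after adjustment.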

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Uncertainty-aware Diffusion and Reinforcement Learning for Joint Plane Localization and Anomaly Diagnosis in 3D Ultrasound

Author(s):