Abstract
Aortic stenosis (AS) is a life-threatening condition caused by a narrowing of the aortic valve, leading to impaired blood flow. Despite its high prevalence, access to echocardiography (echo)—the gold-standard diagnostic tool—is often limited due to resource constraints, particularly in rural and underserved areas. Point-of-care ultrasound (POCUS) offers a more accessible alternative but is restricted by operator expertise and the challenge of selecting the most relevant imaging views. To address this, we propose a reinforcement learning (RL)-driven active video acquisition framework that dynamically selects each patient’s most informative echo videos. Unlike traditional methods that rely on a fixed set of videos, our approach continuously evaluates whether additional imaging is needed, optimizing both accuracy and efficiency. Tested on data from 2,572 patients, our method achieves 80.6% classification accuracy while using only 47% of the echo videos compared to a full acquisition. These results demonstrate the potential of active feature acquisition to enhance AS diagnosis, making echocardiographic assessments more efficient, scalable, and personalized.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5008_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{SaaArm_PRECISEAS_MICCAI2025,
author = { Saadat, Armin and Hashemi, Nima and Vaseli, Hooman and Tsang, Michael Y. and Luong, Christina and Van de Panne, Michiel and Tsang, Teresa S. M. and Abolmaesumi, Purang},
title = { { PRECISE-AS: Personalized Reinforcement Learning for Efficient Point-of-Care Echocardiography in Aortic Stenosis Diagnosis } },
booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15973},
month = {September},
}
Reviews
Review #1
- Please describe the contribution of the paper
This manuscript addresses a clinically relevant problem using a technically sound, although not highly novel, method. It proposes a practical solution with convincing empirical results but would benefit from clearer positioning with respect to existing work, especially in RL-based active acquisition.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1/ The manuscript tackles a real bottleneck in cardiovascular care: improving efficiency and accuracy in point-of-care ultrasound (POCUS) for aortic stenosis (AS).
2/ Resource optimization in this context is highly relevant, especially in underserved settings.
3/ The use of reinforcement learning (DDQN) to select echo videos dynamically per patient is well-motivated and practically useful.
4/ The framework offers a clear trade-off between acquisition cost and predictive performance. Combines a pre-trained, interpretable encoder (ProtoASNet) with a Transformer-based classifier and an RL policy.
5/ Incorporates positional encodings and masked attention to allow flexibility for varying video sets.
6/ Evaluated on a sizable, real-world clinical dataset (2,572 patient studies).
7/ PRECISE-AS achieves the same state-of-the-art accuracy as full acquisition while reducing the number of required videos by over 50%.
8/ Extensive ablation studies on the acquisition cost coefficient (λ).
9/ Includes qualitative insights (e.g., diagnostic pathways and video acquisition order).
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1/ Incremental novelty: While well-executed, the core approach is an application of known techniques: active acquisition via DDQN, pre-trained feature extractors, and Transformer-based sequence classifiers.
2/ Closely related works like Bernardino et al., MICCAI 2022 already use RL for modality selection, and this work applies similar principles at a video level within one modality.
3/ There’s no new RL formulation, architectural innovation, or theoretical contribution.
4/ The action space is limited to selecting or skipping a fixed set of 4 videos, all the same imaging type. This restricts the generality of the approach compared to other AFA settings where modality, sensor type, or time-based data may vary.
5/ The study is based entirely on a private dataset, with no generalization assessment on public datasets like EchoNet-Dynamic. The lack of public validation or external reproducibility limits the broader adoption and testing of this method. Validation on more than one dataset is necessary.
6/ There are no comparisons to non-RL AFA methods, such as uncertainty-based sampling, greedy information gain, or mutual information selection.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Refer to the weakness section
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The paper introduces a framework called PRECISE-AS, an active video acquisition framework for echocardiography. The framework is designed to improve the quality of echocardiographic imaging by selecting the most informative videos. It is based on a combination of deep learning and reinforcement learning techniques, claimed to be state-of-the-art in the field.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The motivation of the paper is clear and well-articulated.
- While the framework consists of multiple components, the paper does a good job of explaining each component and how they work together to achieve the overall goal of improving echocardiographic imaging.
- The paper provides a clear evaluation of the framework, including a comparison with existing methods and a discussion of the results.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- I have a hard time differentiating the contributions of the paper from Bernardino et al. 2022; the difference appears to be a frozen feature extractor (ProtoASNet) combined with an off-the-shelf DDQN, and the performance gain over ProtoASNet is not that significant. The paper should clarify the novelty of the proposed method and how it improves upon existing methods.
- The setup of choosing among four pre-recorded videos (2 PLAX and 2 PSAX) might not reflect the real-world scenario. In practice, the problem is more about choosing the best steering angle.
- A sparse terminal reward is used, similar to Bernardino et al. 2022, which could cause noisy credit assignment.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The work shows an interesting application of RL to echocardiography, but the authors should explain how it differs from prior work and better highlight the contribution of the paper.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
The authors propose a framework for active video acquisition in echocardiography, leveraging reinforcement learning to sequentially select videos of diagnostic relevance from each patient. The method achieves comparable performance to baseline approaches while significantly reducing data requirements.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel and clinically relevant approach addressing both the difficult problem of AS diagnosis and the data selection scheme in echocardiography. It is likely that the method has applicability across multiple tasks especially in POCUS settings, or for patient follow-up scenarios.
- Extensive validation comparing to other SOTA methods, demonstrating the incremental improvements of the method and support of the design.
- The diagnostic pathways offer an interesting and potentially valuable post-hoc analysis for investigating view relevance in echocardiographic diagnosis.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Individually, the methodological components show limited novelty. For instance, the reinforcement learning method (DDQN) is not state-of-the-art, and it remains unclear if newer RL approaches could improve the performance further.
- The paper lacks detailed descriptions of data distributions and classification labels, limiting interpretability and reproducibility of results.
- Statistical significance tests comparing differences between the proposed method and baselines are absent, undermining confidence in interpreting relatively small performance improvements.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
See strengths and weaknesses. Interesting approach of interest to the community.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have adequately addressed the concerns raised in my review.
Author Feedback
The reviewers recognized the methodology as technically sound and clinically relevant. Below, we clarify the key points raised and would appreciate your consideration in raising the scores in light of these responses.
NOVELTY: We introduce the first active video acquisition framework for task-driven echo. This is a challenging task, since echo video data are high-dimensional and dynamic in nature. The closest work (Bernardino et al., MICCAI 2022) focused on low-dimensional data (1D signals and tabular information) combined with simple classifiers (support vector machines). Directly applying that framework to high-dimensional data, a more complex classification task, and more advanced classifiers such as the transformers used in our work is not straightforward. To clarify, Bernardino et al. trained a separate classifier and Q-network for each superstate (each representing one possible combination of imaging modalities). With N actions, this requires 2^N model pairs, which does not scale to high-dimensional data. Instead, we train one classifier and one RL policy network, making the framework broadly applicable to more complex problems. To make the RL optimization feasible on high-dimensional data, we take advantage of a prototypical network, which maps the input videos to a more stable embedding space. We use the similarity between the learned embeddings and prototypes to simplify the training.
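The scaling argument in the NOVELTY response can be made concrete with a small count. The sketch below is illustrative only: the function names are hypothetical, and it assumes that each of N acquirable videos is either acquired or not, so that every subset of videos is one superstate. It also shows how a single shared model could receive the same information as a binary availability mask; this is an assumed encoding, not the one confirmed by the paper.

```python
from itertools import combinations


def superstates(n_videos):
    """Enumerate every subset of acquired videos (one superstate per subset)."""
    items = range(n_videos)
    return [
        frozenset(subset)
        for size in range(n_videos + 1)
        for subset in combinations(items, size)
    ]


def model_pairs_per_superstate(n_videos):
    """Number of (classifier, Q-network) pairs if one pair is trained per superstate."""
    return len(superstates(n_videos))


def availability_mask(acquired, n_videos):
    """Encode a superstate as a 0/1 mask that a single shared model could consume."""
    return [1 if i in acquired else 0 for i in range(n_videos)]


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(f"N={n}: {model_pairs_per_superstate(n)} model pairs vs. 1 shared pair")
    print(availability_mask({0, 2}, 4))
```

With the four candidate videos mentioned in the reviews, the per-superstate scheme already implies 16 model pairs, and the count doubles with every additional candidate input.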
GENERALIZABILITY: Our framework is inherently flexible and can accommodate a variable number of inputs and more complex classification tasks. To increase the action space, one can simply adjust the output layer of the policy network. Also, to increase the number of input types, one needs to train the corresponding encoders of those inputs to project them to a shared embedding space. With transformers, the classification is also flexible, where new inputs can be integrated as additional tokens in the sequence. Collectively, our design choices lead to a more generalizable approach than prior art.
DDQN: DDQN provides a stable and effective baseline, and advances in discrete-action RL methods beyond that are often about sample efficiency. Given that all MDP transitions come from our fixed-size data set, more sample-efficient variants may yield faster RL training, but are unlikely to provide significant gains to asymptotic performance for the final policy, given sufficient training time.
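For readers less familiar with DDQN, the following minimal sketch shows the standard Double DQN bootstrap target: the online network chooses the next action and the target network evaluates it. The function name, the list-based Q-values, and the discount factor are illustrative assumptions and do not reflect the authors' implementation or hyperparameters.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN target for one transition.

    q_online_next and q_target_next hold Q-values for every action in the
    next state, from the online and target networks respectively.
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for three actions in the next state.
    print(double_dqn_target(1.0, False, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5))
```

Decoupling selection from evaluation is what reduces the overestimation bias of plain DQN; it does not by itself change sample efficiency, which is consistent with the point made in the response above.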
R3 commented on optimizing the best steering angle for a single echo image acquisition. This is a different problem than what we have focused on, i.e., optimizing the entire echo acquisition workflow. Single echo view acquisition optimization has been tackled before by a number of groups and is also commercially available (using real-time image quality assessment to provide feedback to a user).
R2 requested evaluation on public datasets. Unfortunately, no publicly available dataset offers multiple echo videos from different views per patient, which is essential for evaluating active video acquisition frameworks like ours. E.g., EchoNet-Dynamic only provides AP4 videos.
R1/R3 stated that the performance gain compared to ProtoASNet is small, and R1 asked for statistical significance tests. We underscore that our method matches ProtoASNet in evaluation metrics, while requiring <50% of echo videos, as R2 highlighted. We will add a full statistical significance analysis to the paper.
R2/R3 correctly pointed out that no new RL method is introduced. However, RL methods are by default designed to be general, i.e., to require minimal domain-specific adaptations, and are commonly evaluated in terms of their aggregate performance across a variety of problems.
Regarding the descriptions of classification labels, our dataset includes 1088 normal, 575 early, and 909 significant AS cases. We reported balanced accuracy and weighted F1 score to account for class imbalance. We will add this to the paper.
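As a quick check on why balanced accuracy matters for this class distribution, the sketch below uses the class counts stated above and a hypothetical classifier that always predicts "normal". Balanced accuracy is computed as the mean of per-class recall; the helper name and the constant-prediction baseline are illustrative assumptions, not results from the paper.

```python
from collections import Counter

# Class counts as reported in the author feedback (total 2,572 studies).
CLASS_COUNTS = {"normal": 1088, "early": 575, "significant": 909}


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class contributes equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical baseline: always predict the majority class.
y_true = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]
y_pred = ["normal"] * len(y_true)

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

if __name__ == "__main__":
    print(f"total studies: {len(y_true)}")
    print(f"plain accuracy: {plain_accuracy:.3f}")
    print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

The majority-class baseline reaches a plain accuracy of 1088/2572 but a balanced accuracy of only one third, which is why a class-balanced metric is the more informative choice here.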
We will release the source code publicly.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This was a tough call. My recommendation tried to merge the following:
- The reviewers were split two Against vs one For, with experienced reviewers on both sides. There were several meaningful issues raised in the reviews.
- The For vote was the most engaged.
- The rebuttal was excellent.
- However, importantly, the rebuttal did not affect the Against votes.
- There is a question about whether the advance beyond Bernardino et al. constitutes sufficient “innovation”. I am all in favor of incremental improvement, so I can see it either way, but two reviewers were evidently unconvinced.
- I had a couple questions of my own about the clinical need: a. The main benefit of the proposed method appears to be a reduced number of videos required (Table 1). But there is no explanation as to how this is a benefit in the clinical context. Would it improve care and, if so, how? b. The Abstract appears to argue that the proposed method would enable care by less-trained operators and in rural and underserved areas (which often de facto means less well trained operators). However, the dataset was collected by trained sonographers, so it’s unclear whether or how the results point to effective use by less well trained operators. c. That is, the value-add in clinical terms is not clear.
- I was also constrained by the overall Accept/Not accept ratios encouraged in our instructions. Misc comment to authors: In references, capitals can be retained by using braces {Canada} in the .bib file.