Abstract
Echocardiography is the only technique capable of real-time imaging of the heart and is vital for diagnosing the majority of cardiac diseases. However, there is a severe shortage of experienced cardiac sonographers, due to the heart’s complex structure and significant operational challenges. To mitigate this situation, we present a Cardiac Copilot system capable of providing real-time probe movement guidance to assist less experienced sonographers in conducting freehand echocardiography. This system can enable non-experts, especially in primary departments and medically underserved areas, to perform cardiac ultrasound examinations, potentially improving global healthcare delivery. The core innovation lies in proposing a data-driven world model, named Cardiac Dreamer, for representing cardiac spatial structures. This world model can provide structure features of any cardiac planes around the current probe position in the latent space, serving as a precise navigation map for autonomous plane localization. We train our model with real-world ultrasound data and corresponding probe motion from 110 routine clinical scans with 151K sample pairs by three certified sonographers. Evaluations on three standard planes with 37K sample pairs demonstrate that the world model can reduce navigation errors by up to 33% and exhibit more stable performance.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0053_paper.pdf
SharedIt Link: https://rdcu.be/dVZek
SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72378-0_18
Supplementary Material: N/A
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Jia_Cardiac_MICCAI2024,
author = { Jiang, Haojun and Sun, Zhenguo and Jia, Ning and Li, Meng and Sun, Yu and Luo, Shaqi and Song, Shiji and Huang, Gao},
title = { { Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15001},
month = {October},
pages = {190--199}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper proposes a data-driven model for automatic guidance of an ultrasound probe to find three target standard planes. It was trained and validated on 125 clinical scans.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed project is definitely an interesting approach that directly links the features of consecutive poses. The data is acquired in a complex setup with a robotic arm and the acquired dataset seems to be very valuable. Unfortunately, it does not seem to be published.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The major weakness is the baseline (which is the same model but without the Cardiac Dreamer) and the limited comparison to other similar approaches. It is not clear how feasible the approach is, given that the authors do not state how accurate it needs to be for clinical translation.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
Not all implementation details are listed (i.e., 3 ‘custom’ fully connected layers without giving more insights)
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
- What was the acquisition protocol for the training and test data? Were the clinicians trained or given instructions?
- Are there prerequisites for the probe location at the start?
- The ideal translation of the ultrasound image is ambiguous and dependent on zoom/tilt and depth, can the authors comment on how they evaluate whether the translation made sense?
- The authors write that Fig.3 shows “significantly” lower AE but the figure makes it hard to see that (except maybe at the very end). Which significance test did the authors use?
- Minor: ‘Cadiac’ dreamer in the figure
- What is the desired MAE for clinical application?
- The related work should contain the probe guidance paper by Pasdeloup et al. ‘Real-Time Echocardiography Guidance for Optimized Apical Standard Views’
- How exactly did you compute the combined Absolute error given that you have translational and rotational components?
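The significance question above could be addressed with a paired test on per-sample errors. The following is a minimal sketch of a two-sided sign-flip permutation test using only the Python standard library. The function name `paired_permutation_test` and the error values are hypothetical illustrations; they are not taken from the paper, which does not state which test (if any) was used.

```python
import random
from statistics import mean

def paired_permutation_test(errors_a, errors_b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired per-sample errors.

    Returns the observed mean difference (a - b) and an approximate p-value.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value

# Hypothetical per-sample absolute errors (made-up numbers, not from the paper).
baseline = [9.1, 8.4, 10.2, 7.9, 9.6, 8.8, 10.5, 9.0]
with_world_model = [7.2, 7.9, 8.1, 7.5, 8.0, 7.1, 8.9, 7.7]
diff, p = paired_permutation_test(baseline, with_world_model)
print(f"mean difference = {diff:.3f}, p = {p:.4f}")
```

A paired test is used here because both models would be evaluated on the same test samples, so the per-sample differences carry the relevant information.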
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I picked the score because of my raised concerns that are still outweighed by the strengths.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #2
- Please describe the contribution of the paper
• This paper presented an ultrasound probe guidance solution for helping clinicians effectively locate the standard imaging planes in cardiac applications.
• Specifically, the proposed solution first trained a policy network to encode the current image and generate an initial 6D guidance signal; then the encoded image features were input into a transformer-based network (“Cardiac Dreamer” world model) to generate a refinement signal which combines with the initial signal as the final guidance output.
• The methods were trained and evaluated on a real-world expert demonstration dataset and showed that the solution with the Cardiac Dreamer’s guidance signal refinement outperforms the basic policy network output.
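The two-stage data flow described above can be sketched as follows. This is only an illustration of the structure, under stated assumptions: the names `guide`, `toy_policy`, and `toy_refiner` are hypothetical, the stand-in functions are not trained models, and combining the two signals by element-wise addition is an assumption, since the review does not say how they are combined.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]

def guide(
    image: Sequence[float],
    policy: Callable[[Sequence[float]], Tuple[Vector, Vector]],
    refiner: Callable[[Vector, Vector], Vector],
) -> Vector:
    """Return a 6D guidance signal: an initial estimate plus a refinement."""
    features, initial = policy(image)        # encode the image, predict an initial 6D action
    refinement = refiner(features, initial)  # stand-in for the world-model correction
    if not (len(initial) == len(refinement) == 6):
        raise ValueError("expected 6D actions")
    # Assumption: the two signals are combined by element-wise addition.
    return [a + r for a, r in zip(initial, refinement)]

# Toy stand-ins so the sketch runs; they are not the networks from the paper.
def toy_policy(image):
    features = [sum(image) / len(image)]
    return features, [1.0, 0.0, -2.0, 5.0, 0.0, 0.0]

def toy_refiner(features, initial):
    return [-0.5 * features[0]] * 6

result = guide([0.2, 0.4, 0.6], toy_policy, toy_refiner)
print(result)
```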
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
• The use of a transformer-based world model to “imagine” and examine the initial estimated guidance pose is novel. The experiments have also demonstrated its effectiveness in reducing the target plane guidance signal errors. It is promising to see this framework being applied beyond the evaluated 3 standard planes as the world model densely represents the spatial structure of the heart.
• The writing is clear and easy to follow. Figures and tables are also well organized.
• The evaluation is done on a real-world large-scale dataset, which contains 125 scans with 188K samples (image & pose) acquired from healthy adult male volunteers.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
• As mentioned in the introduction, cardiac ultrasound is challenging because of “the need for very precise adjustments”. However, the paper did not discuss whether the current results achieved the required accuracy. For the final results with Cardiac Dreamer, a MAE of >6mm of translational error with >9deg rotational error may still result in distinct view appearance based on which target plane it is imaging.
• Some evaluation (guidance error vs. how far from the target plane) needs more careful treatment. For Fig. 3, computing the MAE for a 6D vector containing both rotation & translation is not very interpretable (e.g., using radians instead of degrees may show different results).
• Missing the discussion on algorithm processing speed: the framerate of guidance signal can be an important evaluation metric especially for cardiac applications because the view is always dynamic. We are unable to know whether this algorithm framework can run in real-time with commercially available hardware.
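The unit-dependence concern about a combined 6D MAE can be made concrete with a small sketch. The error values below are made up for illustration and are not from the paper; `combined_mae` and `split_mae` are hypothetical helper names. With these numbers, the ranking of the two hypothetical models flips when the rotational components are expressed in radians instead of degrees, while the separately reported errors stay interpretable.

```python
import math

def combined_mae(errors, rotation_scale=1.0):
    """Mean absolute error over all six components of each 6D error vector.

    Each error is (tx, ty, tz, rx, ry, rz); rotation_scale converts the
    rotational part to another unit before averaging.
    """
    total = 0.0
    for tx, ty, tz, rx, ry, rz in errors:
        total += abs(tx) + abs(ty) + abs(tz)
        total += rotation_scale * (abs(rx) + abs(ry) + abs(rz))
    return total / (6 * len(errors))

def split_mae(errors):
    """Translation and rotation MAE reported separately, each in its own unit."""
    n = 3 * len(errors)
    trans = sum(abs(v) for e in errors for v in e[:3]) / n
    rot = sum(abs(v) for e in errors for v in e[3:]) / n
    return trans, rot

# Made-up errors in (mm, mm, mm, deg, deg, deg); not taken from the paper.
model_a = [(2.0, 2.0, 2.0, 12.0, 12.0, 12.0)]  # good translation, poor rotation
model_b = [(8.0, 8.0, 8.0, 4.0, 4.0, 4.0)]     # poor translation, good rotation

deg_to_rad = math.pi / 180.0
# Rotation in degrees: model_b has the lower combined error.
print(combined_mae(model_a), combined_mae(model_b))
# Rotation in radians: model_a has the lower combined error.
print(combined_mae(model_a, deg_to_rad), combined_mae(model_b, deg_to_rad))
# Separate reporting avoids mixing units.
print(split_mae(model_a), split_mae(model_b))
```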
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
• Please comment on the final algorithm performance: is the current accuracy sufficient for the clinical task? How will you evaluate the success of the work?
• The notation (a_{t,0}, a_{t,1}, …) in Fig.2 suggests that the “dream” process can run iteratively. The output of Cardiac Dreamer is a “refinement” signal to the initial policy network output. This refinement process can be repeated using the output f_{t,0} and a_{t,1} to generate f_{t,1} and a_{t,2}. If the computation is fast enough (not sure from the text), it can be run multiple times and the average of a_{t,i} computed as the final guidance signal. So why “dream only once”? Some discussion of the current implementation strategy could be helpful.
• Please explain why a hand-over-hand robot setup is used for data collection instead of free-hand tracking methods such as optical tracking or EM tracking. Cardiac ultrasound imaging usually requires fine maneuvering of the probe to locate the target plane; having the probe attached to a robot can introduce inconvenience for sonographers.
• Please specify the scanning protocol: does the probe first start at some random location, and does the sonographer then maneuver it to search for the target plane? If these are all expert operators, does this dataset also include the images/poses that novice users could potentially reach (i.e., did you specifically ask the operators to move randomly)?
• As the authors mentioned “particularly significant individual variations” for heart structures in the Introduction, please specify whether the test data is selected from unseen volunteers’ data to evaluate algorithm generalizability. In the original text, the number of volunteers for data collection is not mentioned and it was only stated “The dataset was split into 110 scans (about 151K images) for training and 15 scans (about 37K images) for testing”.
• “The long training period … resulted in a significant shortage of skilled cardiac sonographers, …” It could be helpful to add some relevant references for a stronger statement.
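The question about unseen volunteers amounts to asking for a subject-level split, where every scan from a given volunteer falls entirely in either the training or the test set. A minimal sketch follows, assuming each scan record carries a volunteer identifier. The function name `split_by_subject`, the ID format, and the 10-volunteer layout are hypothetical; the paper does not report the number of volunteers.

```python
import random
from collections import defaultdict

def split_by_subject(scans, test_fraction=0.2, seed=0):
    """Split scans so that no subject appears in both train and test.

    `scans` is a list of (scan_id, subject_id) pairs.
    """
    by_subject = defaultdict(list)
    for scan_id, subject_id in scans:
        by_subject[subject_id].append(scan_id)
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    # Hold out whole subjects, at least one.
    n_test = max(1, round(test_fraction * len(subjects)))
    test_subjects = set(subjects[:n_test])
    train = [s for subj in subjects[n_test:] for s in by_subject[subj]]
    test = [s for subj in subjects[:n_test] for s in by_subject[subj]]
    return train, test, test_subjects

# Hypothetical layout: 10 volunteers with 3 scans each (made-up IDs).
scans = [(f"scan_{v}_{k}", f"vol_{v}") for v in range(10) for k in range(3)]
train, test, held_out = split_by_subject(scans)
print(len(train), len(test), sorted(held_out))
```

Splitting at the image or scan level instead would let frames from the same heart appear on both sides, which tends to overstate generalization to new individuals.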
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
• In general, this work is very interesting. It introduces a novel way of learning the dense spatial structure of the heart using a transformer-based world model, which refines the output of the baseline (a direct regression of motion towards target) and is shown to achieve more stable results and lower error metric.
• However, the main problem is that it is unknown whether the current performance of the algorithm would be sufficient for the clinical task given the >6mm & >9deg MAE in one axis. Some discussion on the actual usability of the work is necessary, as the cardiac imaging requires precise adjustments for getting standard views. Also, the framerate of running this algorithm is unknown, which leaves a question on the real-time guidance capability. In addition, more careful treatment is needed to present the “guidance error vs. how far from the target plane” evaluation results.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
This paper proposed an automatic probe guidance system for freehand echocardiography. The system adopted the data-driven imitation learning strategy and outputs a 6D guidance signal of the probe to locate the target plane. The probe pose information was learned from the internal sensors inside a robotic arm. The results showed superior general performance in terms of navigation guidance accuracy and consistency.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
It proposed a novel probe navigation system that provides a 6D guidance signal in echocardiography. The system is also validated using large-scale real-world ultrasound scans. The paper is overall well-written and technically sound. The theme of the paper is to provide scanning guidance for cardiac sonographers, which is clinically meaningful.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
My only concern is that there seems to be no discussion of the limitations of the work or of any failure cases. Regarding clinical feasibility, how easily could it be adapted to a clinical echocardiography environment? Does it require calibration of the probe to know its geometry?
- Please rate the clarity and organization of this paper
Excellent
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
Rather than a determinative single-step to reach the target plane, how likely is the system to be adapted to a guidance policy with step-by-step guidance approaching the final plane, that might benefit trainee sonographers?
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Accept — should be accepted, independent of rebuttal (5)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This work has innovative applications in clinical echocardiography. The experiments are also self-contained.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Author Feedback
We are grateful for the positive and insightful feedback received from the reviewers and the AC. We would like to answer the following points raised in the reviews.
- @R1/R3/R4 Clinical validation: (1) Indeed, the value of clinical translation is very important. Our ultimate goal is to ensure that this algorithm is effective in clinical settings and helps trainees acquire stronger echocardiographic scanning skills; (2) In future work, we will validate the performance of the algorithm in real medical scenarios, such as comparing the scanning/diagnostic outcomes of novices before and after using the navigation algorithm.
- @R1/R3 Data collection protocol: (1) The training and testing data were collected following the same protocol, and the test set included individuals not seen during training to evaluate the algorithm’s generalization; (2) We required the sonographer with 10 years of clinical experience to locate the standard views in the order of PLAX, PSAX-AV, and PSAX-MV, ensuring close contact with the patient’s body surface and obtaining the clearest possible ultrasound images; (3) There were no strict requirements for the initial position of the probe. It was generally placed near the heart side of the center point of the line between the two nipples.
- @R1 I don’t fully understand the meaning of “ideal translation.” I interpret the question from two perspectives: (1) The path to the standard view is not unique; (2) Some parameters affect ultrasound image quality. (1) Our framework is target-oriented and does not focus on the intermediate path. If the network’s output adjustment brings the probe closer to the target view, the action is considered valuable. (2) Based on clinical practice, each individual requires different zoom and depth parameters for optimal image quality. To align with clinical scenarios, we required sonographers to adjust the parameters to an optimal state during data collection.
- @R1 Related work: We will discuss and reference them in our article.
- @R1/R3 Comprehensibility of Fig.3: Thank you for pointing out the unclear part. To enhance the comprehensibility of Fig.3, we will try to separate the translation and rotation data or display the results across six dimensions to improve the clarity of the figure.
- @R3 Real-time inference: (1) We tested on an NVIDIA RTX 3090, and the inference time for a single image is approximately 59.4 ms; (2) Considering that humans cannot receive and execute instructions at a frequency of 15 Hz (15 instructions per second), our inference speed fully meets practical application requirements.
- @R3 Why dream only once: (1) Each step taken by the sonographer aims to directly reach the standard views. Thus, the motivation behind the design of Dreamer is also target-oriented, aiming to reach the target position in a single step; (2) Since Dreamer is trained to reach the target in one step, it outputs the necessary adjustments after the first inference. Further adjustments based on these features will result in an action close to zero, indicating that multi-step adjustments do not provide additional benefits.
- @R3 Data collection device: (1) We also considered optical tracking, but during the procedure, we found that the markers were easily obscured, affecting data accuracy; (2) Electromagnetic tracking is susceptible to interference from metal instruments and medical equipment, resulting in insufficient accuracy; (3) Additionally, we designed an impedance control algorithm that enhances the convenience of the doctor’s operation.
- @R4 Step-by-step guidance: (1) Each step taken by the sonographer aims to directly reach the standard views, and our algorithm is designed with this motivation; (2) Meanwhile, when the user adjusts the probe based on the algorithm’s guidance and has not yet reached the standard view, the algorithm can output further adjustment signals based on the current ultrasound image, thereby achieving a multi-step navigation effect.
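Two points in the feedback above lend themselves to a short sketch: the guidance rate implied by the reported 59.4 ms inference time, and the multi-step effect of re-querying a single-step, target-oriented predictor after each imperfect adjustment. Everything below is a toy model under stated assumptions: `navigate`, `oracle`, `imperfect_user`, the target pose, the 70% execution factor, and the tolerance are hypothetical and are not the authors' implementation.

```python
def max_guidance_rate_hz(inference_ms):
    """Upper bound on guidance updates per second for a given inference time."""
    return 1000.0 / inference_ms

def navigate(pose, predict_action, execute, tolerance=0.5, max_steps=20):
    """Re-query the predictor after each executed adjustment until it is small."""
    for step in range(max_steps):
        action = predict_action(pose)
        if max(abs(a) for a in action) < tolerance:
            return pose, step
        pose = execute(pose, action)
    return pose, max_steps

# Hypothetical 6D target pose (made-up values).
TARGET = [10.0, -5.0, 0.0, 15.0, 0.0, -8.0]

def oracle(pose):
    # Stand-in for the model: predicts the full move to the target in one step.
    return [t - p for t, p in zip(TARGET, pose)]

def imperfect_user(pose, action):
    # Assumption: the operator carries out only 70% of each suggested move.
    return [p + 0.7 * a for p, a in zip(pose, action)]

rate = max_guidance_rate_hz(59.4)
final_pose, steps = navigate([0.0] * 6, oracle, imperfect_user)
print(f"rate = {rate:.1f} Hz, steps = {steps}")
```

In this toy setting the residual shrinks by a constant factor per step, so a single-step predictor still converges when the operator follows it only approximately.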
Meta-Review
Meta-review not available, early accepted paper.