
Abstract

Autonomous navigation for mechanical thrombectomy (MT) remains a critical challenge due to the complexity of vascular anatomy and the need for precise, real-time decision-making. Reinforcement learning (RL)-based approaches have demonstrated potential in automating endovascular navigation, but current methods often struggle with generalization across multiple patient vasculatures and long-horizon tasks. We propose a world model for autonomous endovascular navigation using TD-MPC2, a model-based RL algorithm. We trained a single RL agent across multiple endovascular navigation tasks in ten real patient vasculatures, comparing performance against the state-of-the-art Soft Actor-Critic (SAC) method. Results indicate that TD-MPC2 significantly outperforms SAC in multi-task learning, achieving a 65% mean success rate compared to SAC’s 37% (p < 0.001), with notable improvements in path ratio. TD-MPC2 exhibited increased procedure times, suggesting a trade-off between success rate and execution speed. These findings highlight the potential of world models for improving autonomous endovascular navigation and lay the foundation for future research in generalizable AI-driven robotic interventions.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3014_paper.pdf

SharedIt Link: https://rdcu.be/eHw2u

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-05114-1_65

Supplementary Material: https://papers.miccai.org/miccai-2025/supp/3014_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{RobHar_World_MICCAI2025,
        author = { Robertshaw, Harry AND Wu, Han-Ru AND Granados, Alejandro AND Booth, Thomas C.},
        title = { { World Model for AI Autonomous Navigation in Mechanical Thrombectomy } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15968},
        month = {September},
        pages = {680 -- 690}
}

Reviews

Review #1

Please describe the contribution of the paper

The paper addresses the clinical context of robotic catheter navigation for mechanical thrombectomy using reinforcement learning. RL algorithms are difficult to configure, perform poorly on long-horizon tasks, and hardly adapt to different tasks (this requires large models that are prohibitively expensive to train). The paper develops a world model to benefit from a virtual setting for multi-task training and shows how it can improve catheter navigation from the iliac artery up to the left or right internal carotid arteries. Ten virtual geometric models are built from actual patient data. This dataset is augmented with geometric transforms such as scaling. A single catheter model is used. RL agents trained with SAC (model-free) and TD-MPC2 (model-based) are compared. The latter demonstrates better performance in success rate and path ratio, but displays longer procedure times.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The clinical argument is solid and the addressed problem is genuinely complicated. The virtual setup is carefully built (geometric vascular models, mechanical model for the catheter). The results convincingly show a clear performance improvement.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Serious limiting hypotheses are made, which make it hard to project this work into an actual clinical setting. These include rigid and static vessels, while motion is to be expected around the aortic arch or the aorta (heart and breathing motion), and deformation is very common, if not systematic, in the common and internal carotid arteries. No experiment is provided on data closer to reality, e.g. phantom data. No pathologies such as atheromatous or calcified plaques are considered. Both RL agents considered were previously published, and the contribution appears to lie more in the preparation of the virtual setups needed to construct world models. However, this construction implies design choices (number of patients, parameters for data augmentation…) that are not studied or discussed.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Limiting contact is very important during navigation, and it is a pity that it was not discussed more in the paper, although there are indications that the authors have considered this aspect (erroneous mention of mean contact force in Table 1). The authors only considered a policy based on the position of the tip in X-ray images. How hard would it be to add limits on contact force and duration? The authors suggest that this would help reduce procedure time. Do they have arguments or preliminary results to discuss?

Since world models are considered here, why could such a model not rely on 3D positions of the catheter tip?

In Table 2, results are worse for A2L and A3L: is this related to the poorer performance of the base agents for these single tasks (see Table 1)?

Minor comments:
- Table 1: mean force is mentioned in the caption but is not a column. Please rearrange the order of the columns or their description so that they match (path ratio and procedure time are inverted).
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The contribution consists of data collection and some methodological aspects of building world models for reinforcement learning. The innovation is not clearly established.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

I thank the authors for their answers, which clarify a number of important points raised in the reviews. First, I acknowledge the modification proposed for the TD-MPC2 world model, which I had overlooked. Second, I appreciate the authors’ intent to share the code and simulation scenes. Third, opening the discussion to other models such as DreamerV3, with hands-on, even if preliminary, experience is also appreciated. Finally, I concur with R2 that RL approaches should be represented and discussed at MICCAI. Therefore I update my opinion to accept the paper.

Review #2

Please describe the contribution of the paper

The paper presents a reinforcement learning approach for automatic catheter navigation. Rather than using images as input, the proposed method relies on the pose of the catheter tip as the observation.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The reinforcement learning approach demonstrates promising results.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The reinforcement learning agent currently makes navigation decisions based only on the pose of the catheter tip, without incorporating any visual input. This setup may be overly simplified compared to real clinical practice, where physicians rely on fluoroscopic imaging—and sometimes contrast agents, especially at bifurcations—for guidance. Expanding the observation space could improve realism and applicability.
2. It would be helpful if the paper provided more details on how the coordinate system is defined. Since the policy relies entirely on the tip position, a clearer explanation would enhance the reproducibility and clarity of the method.
3. For future adaptation to real-world scenarios, it would be valuable to discuss what technology might be used to determine the position of the three points at the catheter tip. This could help bridge the gap between simulation and practical deployment.
4. This work appears to serve primarily as a proof of concept demonstrating the feasibility of using reinforcement learning for catheter navigation. While this is a worthwhile direction, the core simulation environment has already been introduced in a previous paper by the authors, which somewhat limits the novelty of the current submission in its present form.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper explores an interesting application of reinforcement learning for catheter navigation, and the direction is promising. However, the current setup is overly simplified and lacks key elements that are important for real-world relevance, such as visual input and realistic observation spaces. The reliance on tip position alone does not reflect clinical practice, and important implementation details—like the coordinate system definition—are missing. Additionally, since the simulation environment has already been introduced in prior work by the authors, the novelty of the current contribution is somewhat limited. With further development and clarification, this work could become more impactful.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The authors propose a world model based on TD-MPC2 for mechanical thrombectomy. The approach is evaluated in a SOFA simulation environment, employing different vasculatures of real patients. They find that TD-MPC2 outperforms SAC, achieving 65% success rate, but at a longer procedure time.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Realism: Simulation is informed by real-world data. For instance, guidewire/catheter behavior was tested using a tensile testing machine.
- First study to use world models for endovascular surgery.
- Discussion raises important points about unseen anatomies and realism.
- Makes use of the Open Source ecosystem, namely 3D Slicer and SOFA.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Simulation code and data are not released, making this work difficult to reproduce.
- Table 1: “Mean force” is mentioned in the caption, but not reported in the table.
- Evaluation could have been more thorough. Only two models were evaluated, whereas others (e.g. DreamerV3) were mentioned in related work.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

It would be great if the SOFA scene (without patient models) and the evaluation frameworks could be made available.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Very interesting study, and the number of CAI papers (especially with RL) is generally low at MICCAI. Interesting discussion, and a good experimental design. However, for an application report, I would expect a more thorough evaluation (more models, unseen anatomies), hence the “weak” accept.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

I already recommended accepting the paper in the first round. The authors promise to add details to the paper and to publish the SOFA simulation code. They also shed light on why certain evaluations could not be included, resolving the points I identified as weaknesses.

Author Feedback

We thank reviewers (R) for their valuable feedback. Below, we address the main concerns, organised into key themes raised.

Contribution: R1,3 To clarify, this work represents the first application of world models for any endovascular use, marking a significant step toward generalisable, AI-driven robotic interventions. World models have recently shown scalability across 80–150 tasks in non-medical synthetic domains [9,7,10], and we design them for a real-world medical use case. Specifically, we decompose the navigation task into sequential sections, improving efficiency and performance across multiple real patient anatomies. Moreover, we modified the TD-MPC2 world model to incorporate an LSTM layer, enabling trajectory-dependent state representations—an extension not explored before. We hope this shifts the field toward multi-task world models over isolated task-specific approaches. (Data augmentation is built upon [15])
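One way to picture the decomposition into sequential sections is as an ordered list of sub-goals that is advanced as each one is reached. The sketch below is purely illustrative and is not the authors' implementation: the `Section` and `SectionScheduler` names, the 2D targets, and the distance tolerance are all assumptions made for this example; only the section labels A1, A2, A3 come from the discussion on this page.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Section:
    """One sub-task of a long route (hypothetical structure)."""
    name: str      # e.g. "A1", "A2", "A3"
    target: Point  # assumed 2D target for this section


class SectionScheduler:
    """Tracks which section of the route the agent is currently solving."""

    def __init__(self, sections: List[Section], tolerance: float) -> None:
        self.sections = list(sections)
        self.tolerance = tolerance  # assumed success radius around a target
        self.index = 0

    @property
    def done(self) -> bool:
        # The full route is complete once every section target was reached.
        return self.index >= len(self.sections)

    @property
    def current(self) -> Optional[Section]:
        return None if self.done else self.sections[self.index]

    def update(self, tip: Point) -> bool:
        """Advance to the next section if the tip reached the current target.

        Returns True when the scheduler moved on to the next section.
        """
        if self.done:
            return False
        tx, ty = self.sections[self.index].target
        if math.hypot(tip[0] - tx, tip[1] - ty) <= self.tolerance:
            self.index += 1
            return True
        return False


# Usage with made-up coordinates: three consecutive sections of one route.
route = [Section("A1", (0.0, 10.0)), Section("A2", (5.0, 20.0)), Section("A3", (8.0, 30.0))]
scheduler = SectionScheduler(route, tolerance=1.0)
scheduler.update((0.0, 9.5))   # within tolerance of the A1 target
print(scheduler.current.name)  # the scheduler now points at "A2"
```

In a multi-task setting, the name of the current section could serve as the task identifier passed to the agent, so that a single policy handles each section in turn.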

Testbed realism: R1 For A1 and A2, vessels are almost entirely rigid in practice. Atheromatous changes from 10 real vasculatures are included. While deformation and motion improve realism, these will be addressed in future in vitro work, as similar in silico setups have made this transition [14,15]. For A3, we acknowledge the concern and will add this to the Discussion, clarifying that rigid vessels were a deliberate choice to prioritise controllability and reproducibility, consistent with SOTA autonomous endovascular navigation setups [15], enabling effective benchmarking of multi-task RL. Given the many variables in clinical translation, we focused on the effectiveness of a multi-task world model as our core contribution.

Contact force: R1,2 Force inclusion was proposed in our previous work [3], which demonstrated reduced forces, higher success rates, and shorter procedure times. Because this was already discussed in [3], and given space constraints, we omitted the non-significant force data from Table 1; all forces were well below the vessel rupture threshold of 1.5 N. We will amend the paper to acknowledge this and correct the Table 1 caption.

Observation space: R1,3 We used normalised 2D Cartesian tracking coordinates, replicating the 2D anteroposterior fluoroscopy used in aortic arch navigation. For clinical translation, this approach would use image-tracking systems that derive device tip positions from fluoroscopic images [Eyberg et al., 2022]. While incorporating visual inputs could improve realism, similar position-based setups have successfully trained in silico models for ex vivo endovascular navigation under fluoroscopy [14]. In the Discussion we will acknowledge that human-like visual input may add value. However, this is very different from the current work, which demonstrates that the current lean-data approach alone works and is valuable to the medical community.
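To make the idea of a position-only observation concrete, the following minimal sketch maps 2D tracking coordinates into the range [-1, 1] and flattens them into a state vector. It is an illustration under stated assumptions, not the authors' code: the function names, the bounding box used for normalisation, and the choice to append the target position are hypothetical; the use of three tracked tip points follows the description in Review #2.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalise_points(points: Sequence[Point], low: Point, high: Point) -> List[Point]:
    """Linearly map each 2D point from the box [low, high] into [-1, 1] per axis."""
    out = []
    for x, y in points:
        nx = 2.0 * (x - low[0]) / (high[0] - low[0]) - 1.0
        ny = 2.0 * (y - low[1]) / (high[1] - low[1]) - 1.0
        out.append((nx, ny))
    return out


def build_observation(tip_points: Sequence[Point], target: Point,
                      low: Point, high: Point) -> List[float]:
    """Flatten normalised tip points plus the target into one state vector.

    Including the target in the observation is an assumption of this sketch.
    """
    pts = normalise_points(list(tip_points) + [target], low, high)
    return [coord for p in pts for coord in p]


# Usage with made-up pixel coordinates in an assumed 100 x 200 image plane.
tip = [(0.0, 0.0), (50.0, 100.0), (100.0, 200.0)]
obs = build_observation(tip, target=(25.0, 50.0), low=(0.0, 0.0), high=(100.0, 200.0))
print(len(obs))  # 3 tip points + 1 target, 2 coordinates each
```

Normalising to a fixed range keeps inputs on a comparable scale across patient anatomies of different sizes, which is a common reason for this preprocessing step in state-based RL.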

Performance: R1 The reduced performance of the base agents reflects clinical scenarios where navigating the left CCA is more challenging and typically requires shaped catheters. This corresponds to the final decreased performance on A2L and A3L and will be clarified in the Discussion.

Reproducibility: R1,2,3 We will include links to the SOFA simulation and RL algorithm code for stEVE [15] and TD-MPC2 [10]. This commitment to reproducibility is emphasised in the Conclusion, where we leverage existing open-source repositories to maximise reproducibility. We will also emphasise this in the Introduction.

Evaluation: R2 SAC is the SOTA for autonomous endovascular interventions, with successful in silico to in vitro/ex vivo translation [14,15]. As mentioned in the Introduction, TD-MPC2 was selected for its superiority over other RL algorithms, including DreamerV3, in state-based tasks [10]. While DreamerV3 excels in image-based tasks, our problem is state-based, and due to the limited number of world models in this domain, comparisons were not possible. We will test DreamerV3 and evaluate on unseen anatomies, and will amend the Discussion to reflect this.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

This paper introduces the first use of world models for robotic catheter navigation in endovascular surgery. While certain limitations around realism and generalizability are acknowledged, the authors make a case for the significance of their methodological innovation and demonstrate convincing improvements in navigation performance across multiple anatomies. The reviewers appreciated the clinical relevance, design, and rebuttal, and while there remains room for improvement in simulation fidelity and evaluation breadth, the paper constitutes a valuable step forward and merits acceptance.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

This paper is a valuable and timely contribution. It introduces a novel application of advanced model-based reinforcement learning to a complex clinical problem, provides a well-grounded simulation environment, and promises to share its code and assets with the community.


World Model for AI Autonomous Navigation in Mechanical Thrombectomy

Author(s):