Abstract

Tubular tree structures, such as blood vessels and airways, are essential in human anatomy, and accurately tracking them while preserving their topology is crucial for various downstream tasks. Trexplorer is a recurrent model designed for centerline tracking in 3D medical images, but it is prone to predicting duplicate branches and terminating tracking prematurely. To address these issues, we present Trexplorer Super, an enhanced version that substantially improves performance through several novel advancements. Evaluating centerline tracking models is challenging due to the lack of public benchmark datasets. To enable thorough evaluation, we develop three centerline datasets, one synthetic and two real, each with increasing difficulty. Using these datasets, we perform a comprehensive comparison of existing state-of-the-art (SOTA) models with our approach. Trexplorer Super outperforms previous SOTA models on every dataset. Our results also highlight that strong performance on synthetic data does not necessarily translate to real datasets. The code and datasets are available at https://github.com/RomStriker/Trexplorer-Super.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0795_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/RomStriker/Trexplorer-Super

Link to the Dataset(s)

https://atm22.grand-challenge.org/

https://parse2022.grand-challenge.org/

BibTex

@InProceedings{NaeRom_Trexplorer_MICCAI2025,
        author = { Naeem, Roman and Hagerman, David and Alvén, Jennifer and Svensson, Lennart and Kahl, Fredrik},
        title = { { Trexplorer Super: Topologically Correct Centerline Tree Tracking of Tubular Objects in CT Volumes } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},
        pages = {600 -- 610}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors address the problem of bronchial tree centerline extraction by proposing three modules or strategies: STT, FCA, and TA. These improvements enhance Trexplorer, making it applicable to human data.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed method achieves favorable results on human data, with clear visualizations and distinct advantages.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. I think this work lacks innovation. Trexplorer has already succeeded at MICCAI 2024, and the proposed method only extends its applicability to human data rather than representing a breakthrough in the field. This is the main reason for the low score. In the ablation study, the most effective STT expands the reference range for past trajectory information without significant technical innovation. Furthermore, the authors only describe FCA’s function but do not detail its implementation. The authors should clarify how it modifies Trexplorer’s Transformer cross-attention mechanism.
    2. I think the clinical significance of this work is unclear. I do not believe that extracting the complete bronchial tree centerline holds significant clinical value. In bronchoscopy path planning, an optimal path to the lesion can be computed using a path planning algorithm with collision detection without extracting all centerlines. Even if the proposed method can extract higher-order centerlines, its clinical relevance is questionable, as lesions in peripheral lung branches are usually accessed via percutaneous biopsy rather than bronchoscopy. Additionally, the authors’ statement gives the impression that a universal method for tubular tree structures has been proposed, and mentions vascular trees, but provides no experiments or discussions to support this.
    3. The extremely high number of training iterations, combined with potentially insufficient training data (as mentioned by the authors), raises concerns about overfitting. Additionally, the current data split consists only of in-distribution test sets, hindering the assessment of overfitting and generalization.
    4. The authors compare their method with only two methods, one of which (Trexplorer) is the baseline they improve upon. Moreover, based on the visualized results, these two methods lack comparability to human data. Despite critiquing methods like DeepVesselNet and RelationFormer in the introduction, experimental results from them should be included to support their claims.
    5. The authors mention that “TA reduces duplicates, resulting in faster inference” but this needs quantitative support. The paper lacks runtime metrics, which should be included to determine whether the method is within an acceptable range for downstream tasks such as preoperative planning or significantly reduces preparation time to enhance its practical relevance, especially compared with Trexplorer.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The improvements introduced in Trexplorer Super lack significant innovation and provide limited clinical relevance. Since Trexplorer has already shown success at MICCAI 2024, additional modifications appear uninspiring for MICCAI 2025. This is the main reason for rejection.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    The authors’ rebuttal addressed my doubts regarding model training; however, discrepancies between the rebuttal and the manuscript introduced further contradictions, which only deepened my confusion. I will maintain my previous decision to reject the paper:

    1. As I previously noted, this work merely applies improvements from Trexplorer (MICCAI 2024) to real-world medical data. This form of modular enhancement lacks sufficient innovation and contribution for MICCAI 2025. The authors list “present, for the first time, results on two new real-world datasets” and the “release of new datasets” (Rebuttal Point 7) as contributions. Yet, these are expected responsibilities and therefore appear weak. Moreover, the authors did not directly respond to my first comment regarding “how FCA modifies Trexplorer’s Transformer cross-attention mechanism.” This is crucial for highlighting the contribution and makes me question the originality of FCA.
    2. Regarding the data volume, the manuscript and rebuttal contain contradictory statements. In the second-to-last paragraph of Section 3.4, the authors state: “Analyzing the inputs revealed no immediately apparent cause, but one possible factor could be insufficient training data.” Yet in the final point of the rebuttal, they write: “Each dataset contains more than a million (~1.4 to 2.2) centerline points. We extract crops around randomly sampled points to create a large training set.” The phrases “insufficient training data” and “large training set” are clearly contradictory. This raises questions about the actual data conditions under which the experiments were conducted and causes me to doubt the reliability of the experimental section.
    3. I remain concerned about the generalizability of the model. There are inconsistencies between the authors’ statements in Rebuttal Points 7 and 8 that confuse me. The authors claim “our method is applicable to hepatic, renal, cerebral, and coronary vessels and other tubular structures” (Rebuttal Point 7), which is a bold and potentially decisive statement. Such strong generalizability would indeed be appealing, but it is unclear whether this claim is supported by only theoretical analysis or actual experimental validation on other organs. In contrast, in Rebuttal Point 8, the authors state: “Testing on out-of-distribution data is infeasible due to anatomical differences (airways vs. arteries) between our datasets and the lack of other public centerline datasets.” This contradiction makes me suspect that the claims in Point 7 are exaggerated. After all, the evaluations in the manuscript are all in-distribution tests on synthetic vascular trees, airway CT, and pulmonary artery CT datasets. The data splits used are inadequate to demonstrate generalization. Therefore, the claimed generalizability to other organs should be supported by sufficient experimental evidence using data from those organs.
    4. I am still concerned about the model’s runtime performance. The authors did not provide a direct comparison of the proposed method’s speed against its baseline Trexplorer. Given that long-range dependencies (FCA) typically come at the cost of higher computational complexity and inference time, it remains unclear whether the proposed method can meet the real-time requirements of clinical preoperative planning (as mentioned in Rebuttal Point 7). This should be further clarified.



Review #2

  • Please describe the contribution of the paper

    The authors propose Trexplorer Super, based on the original Trexplorer model. Three new techniques, Super Trajectory Training, Focal Cross Attention, and Target Augmentation, significantly improve centerline tree tracking of tubular structures (such as blood vessels and airways) in CT images. These improvements effectively address the shortcomings of the Trexplorer model, namely premature termination of tracking and duplicate branch detection, while improving the detection of new branches and maintaining the correctness of the topology. The experimental results indicate that Trexplorer Super outperforms existing state-of-the-art models on multiple datasets.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The continuity training mechanism of Super Trajectory Training overcomes the problems of premature termination and branch omission caused by information loss in Trexplorer by retaining and reusing historical trajectory information across multiple tracking steps. Focal Cross Attention selectively focuses on the focal region in high-resolution image features while preserving broader contextual information, which improves both the refinement and the computational efficiency of the feature representation. Target Augmentation, a data augmentation strategy based on fork point radius, improves the detection of forks and new branches while reducing the generation of duplicate branches.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    1) Insufficient experimental comparison: Only comparing with Vesselformer and the original Trexplorer, without incorporating other SOTA methods, making it difficult to fully demonstrate the advantages of the methods. For example, connectivity-enhanced segmentation methods (e.g., Kirchhoff, Yannick, et al. “Skeleton recall loss for connectivity conserving and resource efficient segmentation of thin tubular structures.” European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2024.) and other tracking methods (e.g., Alblas, Dieuwertje, et al. “SIRE: Scale-invariant, rotation-equivariant estimation of artery orientations using graph neural networks.” Medical image analysis 101 (2025): 103467.).

    2) Missing illustration of Focal Cross Attention and Target Augmentation. Text-only descriptions are insufficient to describe the Focal Cross Attention block and Target Augmentation procedure. Please add figures for better illustration.

    3) The analysis of failed cases is insufficient, only mentioning “a few samples completely failed”, but no direct and obvious reasons have been found, which weakens the reliability of the conclusion.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper shows promising contributions and solid performance, but would benefit from broader comparisons, better method visualization, and more complete analysis.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper focuses on tracking tubular tree structures. The authors build upon an existing tracing model, Trexplorer, and propose three key improvements:

    1) Super Trajectory Training: This training strategy enhances contextual information to guide the prediction of the next step. It extends the input of past nodes to include 10-node-long sub-trajectories, providing more context for the next node prediction.

    2) Focal Cross Attention: This novel approach computes cross attention in a way that captures long-range dependencies without the need to process the entire volume, which can be computationally prohibitive. Instead, the authors propose to use a dedicated network to extract features from a large image region and to limit the cross attention computation to the local area where tracking occurs.

    3) Target Augmentation: The authors propose a method to address the ambiguity concerning the positions of bifurcation points, especially in large-radius branches. They augment the position of branching nodes by adding an offset sampled from a Laplace distribution, with a scale that is proportional to the bifurcation radius.
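
    The Target Augmentation idea summarized in point 3 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `scale_factor` parameter, and the choice to draw an independent offset per axis are assumptions made only for illustration. The one property taken from the description above is that the Laplace scale grows in proportion to the bifurcation radius.

```python
import random


def sample_laplace(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    if scale <= 0:
        return 0.0
    # The absolute value of a Laplace variable is exponential with mean `scale`.
    magnitude = rng.expovariate(1.0 / scale)
    return magnitude if rng.random() < 0.5 else -magnitude


def augment_bifurcation(position, radius, scale_factor=0.5, rng=None):
    """Return a jittered copy of a bifurcation position (x, y, z).

    `scale_factor` is a hypothetical hyperparameter: the Laplace scale is
    scale_factor * radius, so wide branches receive larger offsets.
    """
    if rng is None:
        rng = random.Random()
    scale = scale_factor * radius
    # Assumption: each coordinate receives its own independent offset.
    return tuple(coord + sample_laplace(scale, rng) for coord in position)


if __name__ == "__main__":
    rng = random.Random(0)
    print(augment_bifurcation((10.0, 20.0, 30.0), radius=4.0, rng=rng))
```

    With a radius of zero the position is returned unchanged, which matches the intuition that thin branches leave little ambiguity about where the bifurcation lies.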

    The effectiveness of these improvements in tracking performance is demonstrated through comparisons with state-of-the-art methods, including the original Trexplorer, and by conducting an ablation study. In addition to these methodological contributions, the paper introduces new datasets to train and evaluate their method, which will be made publicly available upon acceptance. This new dataset comprises one synthetic dataset and two real CT scan datasets designed for tracking airways (based on segmentations from the ATM ‘22 dataset) and the pulmonary artery (based on segmentations from the Parse 2022 dataset).

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The approach of using Transformers to track tubular structures sequentially directly from raw images is novel. To my knowledge, there has only been one previous attempt (VesselFormer), which the authors included in their evaluation.

    2) Trexplorer Super shows impressive results compared to the original Trexplorer and the previous state-of-the-art method VesselFormer. As supported by the ablation study, the three new contributions to the Trexplorer model led to a significant performance improvement.

    3) As many downstream applications are sensitive to incorrect topology or missing branches, it is essential to assess the morphological and topological accuracy of the tracking result. The validation conducted in this work includes not only node-level metrics but also branch-level and topological metrics, which makes it very convincing.

    4) The evaluation includes real datasets for the first time, representing a significant advancement over previous work. Even with the proposed improvements, the results show a degradation of the model performance when tested on real images compared to synthetic data. This is an interesting result as it shows that (1) it is essential to evaluate the model performance also on real data and (2) many challenges still remain to achieve tubular structure tracking.

    5) The public release of new datasets also constitutes a strength of this paper.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    1) The method cannot handle cycles in the same way that VesselFormer does, which could be problematic for structures like the circle of Willis. However, there are many trees in biology so this trade-off is acceptable. This also explains the good results observed in the Betti numbers, as the tree-structure constraint is integrated into the model.

    2) The authors explicitly avoided crossings when creating their synthetic dataset, mentioning that “self intersections and intersections between trees [are] artifacts that do not accurately represent real vessel structures”. However, such crossings can happen in real-life datasets, especially in 2D scenarios like retinal images, or in low-resolution 3D images. The method’s ability to resolve crossings is uncertain; I am curious to see how well it would perform on retinal images, for example.
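
    The remark in point 1 about Betti numbers can be made concrete with a small sketch. For a graph, the zeroth Betti number is the number of connected components and the first is the number of independent cycles, computed as edges minus nodes plus components. The function name and the edge-list representation below are illustrative assumptions and are not taken from the paper's evaluation code.

```python
def betti_numbers(num_nodes, edges):
    """Return (b0, b1) for an undirected graph given as a list of (a, b) index pairs."""
    parent = list(range(num_nodes))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = num_nodes
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1

    b0 = components
    b1 = len(edges) - num_nodes + b0  # number of independent cycles
    return b0, b1


if __name__ == "__main__":
    print(betti_numbers(4, [(0, 1), (1, 2), (1, 3)]))  # a tree
    print(betti_numbers(3, [(0, 1), (1, 2), (2, 0)]))  # a single loop
```

    Any prediction constrained to be a tree has no independent cycles by construction, so its first Betti number is always zero, which is consistent with the reviewer's observation.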

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The improvements proposed in this work have led to a significant increase in the performance of the existing Trexplorer model, establishing a new state-of-the-art in tubular tree structure tracking. The new datasets will be useful to the community by making it possible to train and evaluate tracing models on real-life images. For these reasons I believe the paper should be accepted.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have provided a convincing rebuttal putting forward the contributions of their work, especially the new public dataset. I maintain my decision to accept this paper.




Author Feedback

Thank you to the reviewers for their valuable feedback and for recognizing our novel model contributions, which show major improvement over the previous state-of-the-art (SOTA) across multiple new datasets and comprehensive metrics. We will address all minor issues in the revision and respond to the major concerns below.

(R1) Cycles and intersections: We explicitly incorporate tree topology into our model to ensure tree-structured centerline predictions. In future work, the model could detect intersections and cycles by keeping track of visited positions.

(R2) Failure cases: We will provide more details about the failure cases and state that image augmentation could help mitigate them. Note: We avoided augmentation to ensure a fair comparison with Trexplorer.

(R2, R3) Details on FCA and TA: We agree that the current explanation can be improved. We will expand the main figure to include FCA and TA, refine our description for better clarity, and trim less important details in other parts.

(R2, R3) Comparative experiments: Our study includes recent state-of-the-art (SOTA) image-to-graph methods, all of which are directly comparable to our model. We compare against VesselFormer, an extension of RelationFormer (R3). SIRE (R2) requires a segmentation mask to generate seed points for tree extraction, and its code is not publicly available. DeepVesselNet (R3) and ‘Skeleton Recall Loss…’ (R2) are image-to-segmentation methods that output segmentation masks rather than centerline graphs. Unlike our model, they require ground-truth segmentation masks for training and evaluation. Direct comparison with these models is difficult due to output differences and unfair because they rely on dense voxel-level labels, which are significantly harder to obtain.

(R3) Inference times: The quoted statement highlights that fewer duplicates result in fewer endpoints, reducing the number of patches to process. We will clarify this in the paper and provide runtimes for the reported experiments.

(R3) Significance of our contribution: In this work, we present, for the first time, results on two new real-world datasets and show that our proposed method, which incorporates several novel techniques, significantly outperforms previous SOTA approaches. For robust evaluation, we trained all models five times across three datasets to establish a benchmark for future research. Our model innovations, together with the release of new datasets, represent a major contribution to the field.

(R3) Clinical relevance and generality: Full centerline extraction may not be essential for treating peripheral lung lesions, but it is valuable for wide-ranging applications. For lungs, complete airway centerlines help identify anatomical variants, sharp bifurcations, and obstructions, aiding safe bronchoscopy and procedures like foreign body removal. Radius-aware 3D airway maps also support respiratory drug development by modeling drug diffusion and aerosol delivery. For pulmonary arteries, centerlines with radii assist in diagnosing and planning treatment for conditions like AVMs, aneurysms, thrombi, and stenoses, guiding interventions such as thrombectomy and angioplasty. Our method universally extracts centerlines and radii of tubular tree structures in 3D medical images. Validated on synthetic vascular trees, lung airways, and pulmonary arteries in CT scans, it generalizes across diverse tubular systems without any architectural changes. Beyond pulmonary anatomy, our method is applicable to hepatic, renal, cerebral, and coronary vessels and other tubular structures.

(R3) Overfitting and testing: Each dataset contains more than a million (~1.4 to 2.2) centerline points. We extract crops around randomly sampled points to create a large training set, and monitor overfitting using a separate validation set. Testing on out-of-distribution data is infeasible due to anatomical differences (airways vs. arteries) between our datasets and the lack of other public centerline datasets.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper introduces a promising method with seemingly strong empirical results. However, it falls short of the MICCAI standard due to several critical issues outlined below:

    1) Insufficient detail on core contributions (FCA and TA): The proposed modules—FCA and TA—are central to the method, yet their design and implementation are not described in sufficient technical detail. This lack of clarity limits the paper’s value as a technical contribution and makes it difficult to assess the novelty and effectiveness of these components.

    2) Lack of discussion on the ground truth generation and evaluation metric: If the ground truth of the two new datasets were largely produced using Kimimaro, it is important to benchmark against a simple baseline of binary segmentation (which is a simpler task) followed by Kimimaro post-processing. Additionally, known limitations of Kimimaro, such as centerline quality issues, and the choice of F1 score with a strict 1.5 voxel threshold, raise concerns about potential bias toward methods that overfit to training data. A more robust evaluation—e.g., using the SMD metric from the Trexplorer paper, which avoids hard thresholding—would strengthen the experimental validation.
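
    The concern in point 2 about a hard distance threshold can be illustrated with a simplified node-level F1 computation. This sketch is an assumption for illustration only: it counts a point as matched if any point of the other set lies within the threshold, without the one-to-one matching that a full evaluation protocol might use, and `node_f1` is a hypothetical name rather than a function from the paper's code.

```python
import math


def node_f1(predicted, ground_truth, threshold=1.5):
    """Return (precision, recall, f1) for two lists of 3D points.

    A point counts as matched when some point of the other set lies within
    `threshold` voxels (Euclidean distance).
    """
    if not predicted or not ground_truth:
        return 0.0, 0.0, 0.0

    def matched(points, reference):
        return sum(
            1 for p in points
            if any(math.dist(p, q) <= threshold for q in reference)
        )

    precision = matched(predicted, ground_truth) / len(predicted)
    recall = matched(ground_truth, predicted) / len(ground_truth)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pred = [(0.0, 0.0, 0.0)]
    gt = [(1.6, 0.0, 0.0)]
    print(node_f1(pred, gt, threshold=1.5))  # just outside the threshold
    print(node_f1(pred, gt, threshold=2.0))  # inside a looser threshold
```

    In the usage example, a prediction that is 1.6 voxels from the ground truth scores zero at a 1.5 voxel threshold and a perfect score at 2.0, which shows how sensitive a thresholded metric can be to small localization differences.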



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors propose Trexplorer Super, an enhanced model for tracking tubular tree structures, which improves performance over the original Trexplorer. Reviewer #1 and #2 recognize the methodological novelty, evaluation, and the value of the new datasets. While Reviewer #3 questions the clinical relevance and innovation, the paper presents clear technical improvements and contributes useful resources to the community. I recommend acceptance.


