Abstract

Anatomical trees play a central role in clinical diagnosis and treatment planning. However, accurately representing anatomical trees is challenging due to their varying and complex topology and geometry. Traditional methods for representing tree structures, captured using medical imaging, while invaluable for visualizing vascular and bronchial networks, exhibit drawbacks in terms of limited resolution, flexibility, and efficiency. Recently, implicit neural representations (INRs) have emerged as a powerful tool for representing shapes accurately and efficiently. We propose a novel approach, TrIND, for representing anatomical trees using INR, while also capturing the distribution of a set of trees via denoising diffusion in the space of INRs. We accurately capture the intricate geometries and topologies of anatomical trees at any desired resolution. Through extensive qualitative and quantitative evaluation, we demonstrate high-fidelity tree reconstruction with arbitrary resolution yet compact storage, and versatility across anatomical sites and tree complexities. Our code is available at https://github.com/sfu-mial/TreeDiffusion.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2477_paper.pdf

SharedIt Link: https://rdcu.be/dY6f2

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72390-2_33

Supplementary Material: N/A

Link to the Code Repository

https://github.com/sfu-mial/TreeDiffusion

Link to the Dataset(s)

https://vascusynth.cs.sfu.ca

https://github.com/intra3d2019/IntrA

https://www.kaggle.com/datasets/andrewmvd/drive-digital-retinal-images-for-vessel-extraction

https://han-seg2023.grand-challenge.org/

https://www.kaggle.com/datasets/awsaf49/brats2020-training-data

BibTex

@InProceedings{Sin_TrIND_MICCAI2024,
        author = { Sinha, Ashish and Hamarneh, Ghassan},
        title = { { TrIND: Representing Anatomical Trees by Denoising Diffusion of Implicit Neural Fields } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15012},
        month = {October},
        pages = {344--354}
}

Reviews

Review #1

Please describe the contribution of the paper

The key contribution lies in the integration of INRs for faithful representation of complex anatomical structures and the use of diffusion models to capture their statistical distribution. This methodology enables high-fidelity reconstruction of anatomical trees, synthesis of plausible new trees, and segmentation of medical images, showcasing versatility and efficiency in various medical imaging modalities and anatomical complexities.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper combines Implicit Neural Representations (INR) with diffusion models to achieve high-fidelity representation and generation of anatomical tree structures, offering a new methodology for medical image processing.
2. Through comprehensive evaluations across various medical image datasets and anatomical tree structures, the paper showcases the method’s effectiveness and robustness in tasks such as reconstruction, synthesis, and segmentation.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The paper lacks comparison with existing state-of-the-art medical image processing methods or anatomical tree representation techniques. While the paper introduces an approach that combines INR with diffusion models, it is necessary to benchmark it against established techniques. This includes some explicit anatomical tree representation methods or other reconstruction methods based on deep learning, such as:

[1] Park, Sihwa, et al. “3D Teeth Reconstruction from Panoramic Radiographs Using Neural Implicit Functions.” International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2023.

[2] Han, Zeyu, et al. “Contrastive diffusion model with auxiliary guidance for coarse-to-fine PET reconstruction.” International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2023.

In addition, I don’t think this is the ‘first work to use implicit neural fields for faithful representation of topologically-complex anatomical trees’, as stated in the Conclusion. What about works like ref [1] and the one below? Khan, M.O., Fang, Y. (2022). Implicit Neural Representations for Medical Imaging Segmentation. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1. The caption for Figure 5 does not match the image display, with an additional explanation “(e)” included.
2. Please add corresponding comparative experiments to enhance the persuasiveness of the paper.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Reject — could be rejected, dependent on rebuttal (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper introduces an approach by combining INR with diffusion models and extensively evaluates it across tasks like anatomical tree reconstruction, synthesis, and image segmentation. However, it lacks some comparative experiments, making it difficult for readers to intuitively understand the strengths and weaknesses of the proposed method compared to existing techniques.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Reject — could be rejected, dependent on rebuttal (3)
[Post rebuttal] Please justify your decision

This paper introduces an approach by combining INR with diffusion models and extensively evaluates it across tasks like anatomical tree reconstruction, synthesis, and image segmentation. However, it lacks some comparative experiments, making it difficult for readers to intuitively understand the strengths and weaknesses of the proposed method compared to existing techniques. Thank you for the rebuttal, but it didn’t address my concerns. Hence, I maintain my decision.

Review #2

Please describe the contribution of the paper

The authors propose a novel methodology for representing and processing anatomical trees in medical images, employing Implicit Neural Representations (INRs). Additionally, they extend this representation by integrating diffusion models to learn the distribution of these tree-like structures, and thus generate novel, plausible trees with complex topologies. They have conducted qualitative and quantitative evaluations to demonstrate the fidelity of tree reconstruction at arbitrary resolutions, while maintaining versatility across various tree complexities.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The concept of utilizing INRs to model 3D anatomical trees is intriguing. The capability of INRs to learn an implicit function for each volume and their flexibility in being sampled at any resolution could provide insights and precision in the visualization and analysis of complex anatomical structures.
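
To illustrate what "sampled at any resolution" means in practice, here is a minimal sketch. It assumes a toy analytic occupancy function (a straight tube) standing in for a trained INR; the names `tube_occupancy` and `sample_on_grid` are hypothetical and do not come from the paper or its code.

```python
import numpy as np

def tube_occupancy(xyz, radius=0.2):
    """Toy implicit function: 1 inside a tube along the z-axis, 0 outside.

    Stand-in for a trained INR; any callable mapping (..., 3) -> (...) works.
    """
    dist_to_axis = np.linalg.norm(xyz[..., :2], axis=-1)
    return (dist_to_axis <= radius).astype(np.float32)

def sample_on_grid(field, resolution):
    """Evaluate a continuous field on a resolution^3 grid over [-1, 1]^3."""
    axis = np.linspace(-1.0, 1.0, resolution)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    return field(grid)

# The same function yields volumes of different sizes without retraining.
coarse = sample_on_grid(tube_occupancy, 16)
fine = sample_on_grid(tube_occupancy, 64)
print(coarse.shape, fine.shape)
```

The point of the sketch is that the grid is chosen at query time, so resolution is a property of the sampling step rather than of the stored representation.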
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The authors’ approach to generating novel tree structures, while innovative, is not conditioned on ensuring anatomical correctness. This leads to results that might not always align with true anatomical structures. While the paper integrates advanced models from the broader computer vision community, the application seems to drift from clinical relevance.
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
- The authors want to represent anatomical trees because they are important in clinical diagnosis and treatment planning. While in the paper they claim that they generate plausible novel trees with complex topology, the generation of the tree-like structures is unconditioned, thus having no guarantee that they will be anatomically correct. This aspect makes them impractical in clinical settings.
- Regarding the methodology, the paper successfully combines various advanced models from the computer vision community, yet it does not present a clear novelty in its approach. A novel integration or unique adaptation tailored to medical imaging would enhance the contribution significantly.
- The results presented in Figure 6 from the segmentation experiment are expected, as INR networks typically overfit to target data. Although this shows technical capability, it may not adequately demonstrate the method’s effectiveness in realistic or varied clinical scenarios.
- While a big focus is put on the footprint of the network, it would have been more beneficial if more evaluations were performed on the medical application side, particularly by validating whether the generated results maintain anatomical accuracy.
- Figure 1 in the manuscript lacks a clear sequential order. It would improve readability and understanding if parts e) and f) were separated into a different figure, and a clearer depiction of the sequence for a), b), c), and d) is made.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Reject — could be rejected, dependent on rebuttal (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The use of INRs to model 3D anatomical trees is promising, showing potential for detailed visualization in medical imaging. However, the lack of conditioning on anatomical correctness limits the clinical applicability of the results. Although technically sophisticated, the paper’s relevance to practical medical scenarios is unclear, as it drifts from clinical application.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Accept — could be accepted, dependent on rebuttal (4)
[Post rebuttal] Please justify your decision

Authors successfully addressed my concerns

Review #3

Please describe the contribution of the paper

The authors fit an INR to different kinds of anatomical trees (overfitting an MLP for each tree in the dataset). Later they use diffusion to capture the statistical distribution of the trees and generate novel trees.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1- Broad scope in addressing complex anatomical tree representations beyond vascular trees alone. By encompassing a variety of anatomical structures, including intracranial vasculature, brain structures, head and neck anatomy, retinal vessels, bronchial trees, and the Circle of Willis, the authors demonstrate the versatility and applicability of their proposed method across different anatomical domains. Furthermore, the diversity in the datasets used for evaluation, ranging from synthetic vascular trees to real medical imaging data such as MRI brain scans, CTA scans, retinal fundus images, and CT scans, underscores the robustness and generalizability of the proposed method. By demonstrating its effectiveness across varying dimensionality, complexity, and anatomy, the authors provide compelling evidence of the versatility and efficacy of their approach.

2- Innovative use of a transformer diffusion approach, which not only enables the generation of plausible vascular structures but also offers the potential for capturing the complexity and variability observed in real vascular systems.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

1- Lack of clarity regarding the decision to focus solely on the VascuSynth dataset for synthetic tree generation. While the paper boasts the inclusion of various anatomical trees and datasets, it appears that only one of these datasets is utilized for generating synthetic trees. This decision prompts questions about whether the complexity of other anatomical trees influenced the authors’ choice of datasets for tree synthesis. Furthermore, given the potential intricacies involved in representing diverse anatomical structures, it would be beneficial for the authors to explicitly address any limitations of their proposed model in generating trees beyond the VascuSynth dataset. Clarifying whether the decision to primarily utilize the VascuSynth dataset for synthetic tree generation was influenced by the complexity of vascular trees or other considerations would provide valuable insights into the scope and applicability of the proposed methodology. I recommend that the authors discuss these aspects, including any inherent limitations in the model’s ability to generate trees of varying complexities, to provide a more comprehensive understanding of the proposed approach.

2- Merely stating the values of the evaluation metrics without comparing them to baselines or existing methods is insufficient for assessing the adequacy of the generative model’s performance. While the reported metrics provide some indication of the model’s quality, diversity, and plausibility, meaningful interpretation requires context and comparison with established benchmarks or alternative approaches. Without such comparisons, it’s challenging to determine whether the reported values represent an improvement over existing methods or meet the desired level of performance for the specific task and dataset.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

1- Authors should clarify whether the decision to primarily utilize the VascuSynth dataset for synthetic tree generation was influenced by the complexity of vascular trees or by other considerations; this would provide valuable insights into the scope and applicability of the proposed methodology. I recommend that the authors discuss these aspects, including any inherent limitations in the model’s ability to generate trees of varying complexities, to provide a more comprehensive understanding of the proposed approach.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Accept — should be accepted, independent of rebuttal (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I believe the strengths of the paper outweigh the weaknesses, which can easily be addressed by the authors in the rebuttal.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Accept — should be accepted, independent of rebuttal (5)
[Post rebuttal] Please justify your decision

Authors successfully addressed my concerns

Author Feedback

We thank the reviewers for the constructive feedback and for considering our work “novel/new methodology” (R3/R1), “an innovative use of transformer for diffusion” (R4), “technically sophisticated” (R3), “an intriguing and promising approach for detailed visualization” (R3), “this [INR-based image segmentation] shows technical capability” (R3), Excellent on “clarity and organization” (R3), “demonstrate the versatility and applicability […] across different anatomical domains” (R4); “underscores the robustness and generalizability of the proposed method” (R4). In this response, we focus on addressing the main concerns of R1,3,4.

R1.1: Comparisons with Park et al [1] & Han et al [2].

We first emphasize that our method comprises 2 stages. In stage 1, we overfit a model to each tree in the training set using an INR realized as an MLP, though other networks like [1], SIREN or FFN could also be used. We don’t consider these networks as competing methods but rather alternative design choices for stage 1 of our method. The uniqueness of our method for representing trees lies in flattening the network parameters to 1D vectors (one vector per network overfitted to each tree) and using them in stage 2, where we learn the distribution of these tree-representing vectors using denoising diffusion. Unlike [2], which performs diffusion on 2D PET image slices, we perform it on the INR’s network parameters. Further, as shown in Fig.3 & Tab.2, INR is superior to volume and mesh representation, offering ~60% & ~20% lower memory footprint, and ~82% & ~17% higher reconstruction accuracy, respectively. Hence, we adopt INRs in stage 2 for diffusion. Plus, [2] uses DDPM, which is slower than DDIM that we use [31].
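
The flattening step described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the layer sizes, ReLU/sigmoid activations, and helper names (`init_mlp`, `flatten_params`, `unflatten_params`, `mlp_forward`) are hypothetical, the weights are random rather than overfitted to a tree, and NumPy is used in place of a deep learning framework. It only shows that an MLP's weights and biases can be packed into one 1D vector (the object a diffusion model would operate on) and unpacked again without changing the function the network computes.

```python
import numpy as np

def init_mlp(layer_sizes, seed=0):
    """Create a list of (weight, bias) pairs for a small fully connected net."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def flatten_params(params):
    """Concatenate all weights and biases into a single 1D vector."""
    return np.concatenate([a.ravel() for w, b in params for a in (w, b)])

def unflatten_params(vec, layer_sizes):
    """Rebuild the (weight, bias) list from a flat vector."""
    params, i = [], 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = vec[i:i + n_in * n_out].reshape(n_in, n_out)
        i += n_in * n_out
        b = vec[i:i + n_out]
        i += n_out
        params.append((w, b))
    return params

def mlp_forward(params, xyz):
    """Map 3D coordinates to an occupancy value in (0, 1)."""
    h = xyz
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)           # ReLU hidden layers
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))    # sigmoid output

LAYER_SIZES = [3, 16, 16, 1]  # hypothetical; not the paper's architecture
params = init_mlp(LAYER_SIZES)
theta = flatten_params(params)                    # one vector per fitted tree
restored = unflatten_params(theta, LAYER_SIZES)
points = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 3))
```

In a two-stage setup of this kind, every tree would need the same architecture so that all flattened vectors share one length and one parameter ordering.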

R1.2: Method is not the first to use INR for representing trees and points to [1] & Khan et al [3].

We address [1] above and note that [3] focuses on CT/MRI images for organ segmentation, not anatomical trees, and doesn’t use diffusion for modeling tree statistics or synthesis.

R3.1: Tree-like structures are unconditioned and may be anatomically incorrect.

Although we focus on unconditional tree generation, our quantitative results in Fig.6(c-d) show that our generated samples closely mimic the real data distribution & the qualitative results in Fig.8 show visually plausible generated samples, indicating that the diffusion model learned a plausible tree manifold from INR vectors. Plus, R4 noted, “[method] enables generation of plausible vascular structures”. Conditional generation of trees is a promising future direction.

R3.2: Segmentation exp. shows technical capability, but doesn’t demonstrate method’s effectiveness.

As stated in Sec.3, Fig.7 was merely a proof-of-concept demonstration. Focusing on rigorous assessment of the utility of our representation for segmentation is left for future work.

R4.1: Why use only VascuSynth [9] for tree synthesis?

We used synthetic trees from [9] due to their diverse tree topologies, and showed qualitative samples with varied complexities in Fig.8. In contrast, synthesized trees from IntrA would look visually similar, as all depict the brain’s circle of Willis without easily observable differences.

R4.2: Comparisons with baselines/other methods for tree generation.

Synthesis of anatomical trees is an underexplored field, which motivated our work. As discussed in Sec.1, rule-based methods, e.g., L-system & VascuSynth, are highly complex and are not designed to model, and sample from, the distribution of trees. While many methods extract/segment 3D trees from volumes, they don’t synthesize new trees. Moreover, methods such as “Manifold of Trees” (arXiv:1207.5371) are limited to simple 2D skeletons of trees (and even require domain expertise to do so), making them difficult to render and use in downstream medical applications. As reporting results here is not allowed, as a baseline, we will add to Tab.3 results of performing diffusion on voxel-grids of size 64^3 (262k params, compared to our INR’s 160k, ~1.6x less).
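
The size comparison quoted at the end of this response can be checked with a few lines of arithmetic. The 160k figure is taken as stated in the rebuttal (treated here as exactly 160,000, which is an assumption), and the variable names are illustrative only.

```python
voxel_grid_side = 64
voxel_params = voxel_grid_side ** 3   # 262,144 values, i.e. ~262k
inr_params = 160_000                  # approximate INR size quoted in the rebuttal
ratio = voxel_params / inr_params     # how many times larger the voxel grid is

print(f"{voxel_params} voxel values vs {inr_params} INR parameters: ~{ratio:.1f}x")
```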

Meta-Review

Meta-review #1

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

N/A


TrIND: Representing Anatomical Trees by Denoising Diffusion of Implicit Neural Fields

Author(s):