Abstract

The rhythmic pumping motion of the heart stands as a cornerstone in life, as it circulates blood to the entire human body through a series of carefully timed contractions of the individual chambers. Changes in the size, shape and movement of the chambers can be important markers for cardiac disease, and modeling this in relation to clinical demography or disease is therefore of interest. Existing methods for spatio-temporal modeling of the human heart require shape correspondence over time or suffer from large memory requirements, making them difficult to use for complex anatomies. We introduce a novel conditional generative model, where the shape and movement are modeled implicitly in the form of a spatio-temporal neural distance field and conditioned on clinical demography. The model is based on an auto-decoder architecture and aims to disentangle the individual variations from those related to the clinical demography. It is tested on the left atrium (including the left atrial appendage), where it outperforms current state-of-the-art methods for anatomical sequence completion and generates synthetic sequences that realistically mimic the shape and motion of the real left atrium. In practice, this means we can infer functional measurements from a static image, generate synthetic populations with specified demography or disease, and investigate how non-imaging clinical data affect the shape and motion of cardiac anatomies.
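To make the idea of a spatio-temporal distance field concrete, the following is a minimal sketch and not the authors' model: it swaps the learned network for an analytic signed distance function of a pulsating sphere, evaluates it on a grid at several time points, and derives a volume curve and an emptying fraction from the sign of the field. The function names, the sphere geometry and the choice of emptying fraction as the functional measurement are all assumptions made for illustration.

```python
import math

def sdf_pulsating_sphere(x, y, z, t, r_mean=1.0, amplitude=0.2):
    """Toy stand-in for a learned field: signed distance to a sphere whose
    radius varies over one cycle, with t in [0, 1)."""
    radius = r_mean + amplitude * math.cos(2.0 * math.pi * t)
    return math.sqrt(x * x + y * y + z * z) - radius

def volume_at(t, sdf, half_extent=1.5, n=32):
    """Estimate the enclosed volume at time t by counting grid cells whose
    centre has a negative signed distance."""
    step = 2.0 * half_extent / n
    centres = [-half_extent + (i + 0.5) * step for i in range(n)]
    inside = sum(1 for x in centres for y in centres for z in centres
                 if sdf(x, y, z, t) < 0.0)
    return inside * step ** 3

def emptying_fraction(volumes):
    """(max volume - min volume) / max volume over the cycle."""
    v_max, v_min = max(volumes), min(volumes)
    return (v_max - v_min) / v_max

times = [i / 8 for i in range(8)]
volumes = [volume_at(t, sdf_pulsating_sphere) for t in times]
fraction = emptying_fraction(volumes)
print(f"emptying fraction of the toy shape: {fraction:.2f}")
```

Because the field is continuous in both space and time, the same query function can be sampled at any resolution or time point without storing a voxel grid per frame.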

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1946_paper.pdf

SharedIt Link: https://rdcu.be/dV1Wo

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72384-1_40

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1946_supp.zip

Link to the Code Repository

https://github.com/kristineaajuhl/spatio_temporal_generative_cardiac_model.git

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Sør_Spatiotemporal_MICCAI2024,
        author = { Sørensen, Kristine and Diez, Paula and Margeta, Jan and El Youssef, Yasmin and Pham, Michael and Pedersen, Jonas Jalili and Kühl, Tobias and de Backer, Ole and Kofoed, Klaus and Camara, Oscar and Paulsen, Rasmus},
        title = { { Spatio-temporal neural distance fields for conditional generative modeling of the heart } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        pages = {422--432}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper presents a method for generating atrial-motion mesh models using spatial-temporal Signed Distance Fields (SDF) integrated with clinical demographic data. The proposed method successfully separates clinical demographics from individual variations, facilitating the identification and generation of patients across different cohorts.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The task at hand involves the challenging and critical reconstruction of the left atrium and its appendage, essential for identifying heart strokes. The proposed method utilizes an SDF for efficient mesh model generation and introduces an innovative approach to integrating clinical demography into the process.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The paper lacks a thorough discussion on the core innovations of the proposed method, particularly the representativeness of \(z_c\) and a covariance analysis between \(z_r\) and \(z_c\). The results display disappointing reconstruction fidelity and quality, indicated by a large Hausdorff distance and the absence of consistent normalcy in the generated meshes. Additionally, it lacks segmentation accuracy metrics, such as Dice score, which could be applied through voxelization of the generated meshes.
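As a rough illustration of the Dice evaluation suggested above, the sketch below voxelizes two shapes and computes their overlap. It is a toy under stated assumptions: the shapes are analytic spheres given as signed distance functions rather than generated meshes, and the grid size and the helper names `voxelize` and `dice` are hypothetical.

```python
import math

def voxelize(sdf, half_extent=1.5, n=32):
    """Return the set of grid indices whose cell centre lies inside the
    surface, i.e. where the signed distance is negative."""
    step = 2.0 * half_extent / n
    occupied = set()
    for i in range(n):
        x = -half_extent + (i + 0.5) * step
        for j in range(n):
            y = -half_extent + (j + 0.5) * step
            for k in range(n):
                z = -half_extent + (k + 0.5) * step
                if sdf(x, y, z) < 0.0:
                    occupied.add((i, j, k))
    return occupied

def dice(a, b):
    """Dice score 2|A and B| / (|A| + |B|) between two voxel sets."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))

def sphere(cx, radius):
    """Signed distance function of a sphere centred at (cx, 0, 0)."""
    return lambda x, y, z: math.sqrt((x - cx) ** 2 + y * y + z * z) - radius

reference = voxelize(sphere(0.0, 1.0))
shifted = voxelize(sphere(0.2, 1.0))   # same shape, displaced along x
score = dice(reference, shifted)
print(f"Dice between reference and shifted sphere: {score:.3f}")
```

For real data the same `dice` function would be applied to a voxelized generated mesh and the ground-truth segmentation on a shared grid.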
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1. In the introduction, it’s stated that “the continuous nature of neural distance fields allows for modeling highly complex structures without requiring correspondence between samples.” However, it is not entirely accurate to claim this as an advantage, since dense correspondence may be necessary for certain biomechanical simulations.
2. On page 3, the notation \(f_\theta(d)\) should likely be \(f_\theta(x)\).
3. The notation should use concatenation between \(g_\phi(c_n)\), \(z_r\), and \(s\) to clarify the operations being performed.
4. More details are needed regarding the sampling of \(K = 110,000\) space-time coordinates mentioned on page 3, and how sequence completion is handled. Specifically, is the entire sequence generated from a single \(z_r\)?
5. An ablation study should be provided to demonstrate the effectiveness of adding \(z_c\) to the network compared to using \(z_r\) alone, and the representational ability of \(z_c\) should be validated through downstream tasks, such as demographic classification.
6. An analysis of the covariance between \(z_c\) and \(z_r\) should be included to assess their interdependencies.
7. The distribution of \(z_r\) in both training and generation needs discussion, especially since the similar trends shown in Figure 3 might suggest that \(z_r\) could be sampled from a Gaussian distribution rather than being a trainable variable.
8. Metrics should be provided for evaluating the generated left atrial appendage to enhance the assessment of related diseases.
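The covariance analysis requested in item 6 could look roughly like the sketch below. It uses synthetic latent codes rather than codes from the paper: one residual set is drawn independently of the condition codes, and one deliberately leaks the first condition dimension. All dimensions, sample counts and names are assumptions for illustration.

```python
import random

def cross_covariance(zc, zr):
    """Sample cross-covariance matrix between two sets of latent codes,
    with one row per subject in each set."""
    n = len(zc)
    mean_c = [sum(col) / n for col in zip(*zc)]
    mean_r = [sum(col) / n for col in zip(*zr)]
    return [
        [sum((zc[s][i] - mean_c[i]) * (zr[s][j] - mean_r[j])
             for s in range(n)) / (n - 1)
         for j in range(len(mean_r))]
        for i in range(len(mean_c))
    ]

def max_abs(matrix):
    """Largest absolute entry, a crude summary of the dependence."""
    return max(abs(v) for row in matrix for v in row)

rng = random.Random(0)
n_subjects = 2000
z_c = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(n_subjects)]
# Residual codes that carry no condition information.
z_r_independent = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n_subjects)]
# Residual codes whose first dimension leaks the first condition dimension.
z_r_leaky = [[c[0] + 0.1 * rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
             for c in z_c]

independent = cross_covariance(z_c, z_r_independent)
leaky = cross_covariance(z_c, z_r_leaky)
print(f"max |cov| independent: {max_abs(independent):.3f}")
print(f"max |cov| leaky:       {max_abs(leaky):.3f}")
```

Entries near zero are consistent with linearly disentangled codes, although a zero covariance does not rule out nonlinear dependence.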
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Reject — could be rejected, dependent on rebuttal (3)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper lacks experiments and analysis that verify the necessity of integrating clinical demography into the mesh generation network, which is claimed as the novelty of the proposed method. This oversight raises questions about the impact and effectiveness of including demographic data in the model.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #2

Please describe the contribution of the paper

The authors proposed a conditional generative model using SDF as the representation of the left atrium from 4D CT scans. The performance in sequence completion and generation is validated and compared to recent SOTA models, achieving competitive results.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. A generative SDF representation of the left atrium with better efficiency than voxels.
2. Integration of clinical factors into image generation.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1. A more rigorous formulation of the auto-decoder training is needed. The prior of the latent space used when training the auto-decoder is not clear.
2. More comparison is needed. The authors compared their SDF auto-decoder to the voxel auto-encoder (CHeart) and achieved better results. It should be noted that CHeart can also be used as an auto-decoder model (keeping the decoder) for test-time latent optimisation. This could lead to better results after optimisation.
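Test-time latent optimisation with a frozen decoder, as discussed in the two points above, can be sketched as follows. This is a toy under explicit assumptions: the decoder is a small fixed linear map instead of a trained network, and the squared-norm penalty on the code stands in for a zero-mean Gaussian prior, which the review notes is not specified in the paper. The function names and hyperparameters are hypothetical.

```python
def decode(weights, z):
    """Frozen toy decoder: a linear map from latent code to outputs."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

def optimise_latent(weights, targets, observed, steps=500, lr=0.05,
                    prior_weight=1e-3):
    """Gradient descent on the latent code only; decoder weights stay fixed.
    Loss = sum over observed outputs of (decode(z) - target)^2
           + prior_weight * ||z||^2  (assumed Gaussian prior on z)."""
    z = [0.0] * len(weights[0])
    for _ in range(steps):
        pred = decode(weights, z)
        grad = [2.0 * prior_weight * zi for zi in z]
        for idx in observed:
            residual = pred[idx] - targets[idx]
            for d in range(len(z)):
                grad[d] += 2.0 * residual * weights[idx][d]
        z = [zi - lr * g for zi, g in zip(z, grad)]
    return z

weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
true_z = [0.5, -0.3]
targets = decode(weights, true_z)

# Only the first three outputs are observed; the fourth is to be completed.
z_hat = optimise_latent(weights, targets, observed=[0, 1, 2])
completed = decode(weights, z_hat)
print(f"completed value: {completed[3]:.3f}, ground truth: {targets[3]:.3f}")
```

The same loop applies whether the decoder came from an auto-decoder or from the decoder half of an auto-encoder, which is the comparison the reviewer asks for.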
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

I recommend that the authors publicly release the segmentations/SDFs of their datasets, which would benefit the community (and could be a strength). For example: [1] Savioli, Nicolo; de Marvao, Antonio; O’Regan, Declan (2021), “Cardiac super-resolution label maps”, Mendeley Data, V1, doi: 10.17632/pw87p286yx.1
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
The authors did excellent work on generative modelling of the left atrium. Here are some of my comments:
1. The title is not appropriate: “heart” should be changed to “left atrium”.
2. More comparison or discussion between the auto-encoder and auto-decoder is encouraged, e.g. test-time optimisation of auto-encoder vs. auto-decoder (e.g. [1]).
3. Better mathematical formulation.
[1] Wang, Shuo, et al. Joint motion correction and super resolution for cardiac segmentation via latent optimisation. MICCAI 2021.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Accept — could be accepted, dependent on rebuttal (4)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Good work with novelty.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

Authors propose a generative method for spatio-temporal heart surface models, via neural distance fields. Both shape and movement are modeled implicitly as signed distance fields (SDF), encoded as small MLP neural networks. Additionally, authors encode patient demographics, which do not improve reconstruction errors, but improve functional measurements. Results on a dataset with 667 subjects (301/366 male/female) are quite convincing.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Spatio-temporal model
- Incorporation of demographics as latents, which can be used as generative priors
- Convincing results
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Incomplete consideration of related works (statistical shape models/SSMs)
- Comparison to only one related work from the literature (CHeart), e.g. no SSM method was considered
Please rate the clarity and organization of this paper

Excellent
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

Even though the paper describes the method quite clearly, I would recommend publishing an open-source code repo along with the paper (nothing in that direction was mentioned by the authors).
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
:: Novelty / Related work ::
- I would consider the method marginally novel. Implicit neural representations of surfaces via SDF are well-known, and spatio-temporal extensions have been proposed before as well. The integration of demographics as a supplemental latent space was a nice idea, and it was not very complex to model either.
- In their related works section, and in the results/comparison to SOTA, authors only considered neural methods and mostly neural distance fields for shape modeling and generation. But there is a very rich line of literature on statistical shape modeling (SSM), which was not considered. For future extensions of this manuscript, I would highly recommend incorporating these works. Traditional methods based on point-distribution models (PDM) and PCA can already be used as generative models [Cootes95], and there has been extensive work on solving the correspondence problem; some solutions are quite elegant (e.g. [Davies02]). Today, all of this is readily available as open-source software (e.g. [ShapeWorks] and [SlicerSALT]). There even exist neural extensions (presented at MICCAI 2023, see [Adams23]). I find the work interesting enough for a discussion at MICCAI, but for a journal extension, I would 100% expect a proper handling of this literature field, and comparison to established methods, beyond only CHeart.
:: Soundness ::
- The method design is quite sound, particularly the separation into a latent space for demographics and residuals, and the results demonstrate that this design decision/assumption was realistic.
- Sec. 1: “explicit voxelmap representation however suffer from large memory requirements”: this is not a very compelling argument nowadays, we can easily store and (!) process volumes at 256^3 resolution with deep models (e.g. UNets) in a single GPU with 48GB VRAM (batch-size 1, but doable). This also helps with representing smooth surfaces, e.g. a cropped heart FOV at 256^3 should be enough to have very smooth heart surfaces. Future hardware will make this even easier to handle. But I agree with the authors that even though this can be handled, the cubic memory requirements are not elegant.
- Fig 3: Results in Fig 3 are quite convincing, interesting also to see that the long tails from the real data distributions are not well captured by the model (which is somewhat expected). Authors attribute the long tails to over-/under-segmentations from the pre-processing step.
- Fig 4: here as well, results are convincing.
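The cubic memory growth mentioned in the voxel-map comment above can be checked with a short calculation. The sketch assumes float32 voxels and a hypothetical 20-frame sequence; it counts only raw volume storage, not the network activations that dominate GPU memory when a deep model processes the volume.

```python
def voxel_grid_bytes(resolution, bytes_per_voxel=4, frames=1):
    """Bytes needed to store `frames` dense cubic volumes of side `resolution`."""
    return frames * resolution ** 3 * bytes_per_voxel

MIB = 1024 ** 2
single_256 = voxel_grid_bytes(256)                 # one float32 volume
sequence_256 = voxel_grid_bytes(256, frames=20)    # hypothetical 20-frame sequence
single_512 = voxel_grid_bytes(512)                 # doubling the resolution

for label, size in [("256^3, 1 frame", single_256),
                    ("256^3, 20 frames", sequence_256),
                    ("512^3, 1 frame", single_512)]:
    print(f"{label}: {size / MIB:.0f} MiB")
```

Doubling the side length multiplies storage by eight, which is the scaling the reviewer calls inelegant even when the absolute numbers fit on current hardware.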
:: Clarity ::
- The paper is written very clearly, both regarding methods and results, especially the figures are well crafted.
[Cootes95] Cootes, T. F., Taylor, C. J., Cooper, D. H., & Graham, J. (1995). Active Shape Models-Their Training and Application. In Computer Vision and Image Understanding (Vol. 61, Issue 1, pp. 38–59). Elsevier BV. https://doi.org/10.1006/cviu.1995.1004
[Davies02] R. H. Davies, C. J. Twining, T. F. Cootes, J. C. Waterton and C. J. Taylor, “A minimum description length approach to statistical shape modeling,” in IEEE Transactions on Medical Imaging, vol. 21, no. 5, pp. 525-537, May 2002, doi: 10.1109/TMI.2002.1009388
[Adams23] Adams, J., Elhabian, S.Y. (2023). Can Point Cloud Networks Learn Statistical Shape Models of Anatomies?. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14220. Springer, Cham. https://doi.org/10.1007/978-3-031-43907-0_47
[ShapeWorks] https://sciinstitute.github.io/ShapeWorks/latest/
[SlicerSALT] https://salt.slicer.org/
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Weak Accept — could be accepted, dependent on rebuttal (4)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Interesting method, convincing results. The incomplete review of related literature and comparison to only a single reference method (CHeart) is unfortunate though.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Accept — could be accepted, dependent on rebuttal (4)
[Post rebuttal] Please justify your decision

Authors addressed the points raised in my review. SSM literature will be addressed in the intro, and there’s a partial justification why it wasn’t possible to include in the methods comparison, i.e. including demographics is not obvious in SSMs. I don’t consider this an “ill-posed problem” or that “no single correct correspondence can be defined”. I have worked with SSMs for several years, and I’ve worked with both PDM- and particle-based SSMs, and both should work on the shapes illustrated in the paper and supp. material. For a journal extension, I would highly recommend authors to try SSMs using established toolkits and compare surface recon metrics (ignoring demographics). This can only help corroborate the method’s performance.

Author Feedback

We would like to thank the reviewers for relevant and constructive comments on our manuscript. The main concerns that were raised are addressed in the following points.

Motivation for integrating clinical data: By integrating clinical data into the generative framework, we can explore associations between clinical data and anatomical shape and motion. We consider this an impactful contribution since it allows for investigating how a change in a clinical variable affects the shape and motion and for generating plausible anatomical sequences with characteristics related to the given clinical information. Sequence generation without integration of clinical data would not allow for this kind of controlled generation. For sequence completion, we already included the ablation study on the clinical demography that is requested by reviewer #3 (see Table 1, row 2) and demonstrated that clinical information improved the estimation of the functional parameters. A further improvement is expected if more clinical variables are included in the model.

Thorough discussion of the method and its effectiveness: We recognize that the reviewers would prefer a more in-depth analysis of, e.g., the latent spaces and how they are derived. Due to the limited paper length, we have not been able to include thorough discussions on the underlying mechanisms of an auto-decoder but refer the reader to the seminal DeepSDF paper [17]. We will, however, include a short discussion of the differences to a standard auto-encoder. We appreciate the reviewer’s suggestions for analyzing the covariance of the latent spaces, and we aim to work towards a journal paper where the properties of the latent spaces will be further explored. Despite the simple setup, we argue that the method produces convincing results, where the low chamfer distance suggests an overall good reconstruction accuracy and the meshes vary smoothly in both space and time (see full movies in the supplementary material). The detail level of the meshes can still be improved, but the proposed method outperforms current SOTA by 18-30% across the different metrics on a large test set of 367 unique sequences, which we consider a significant step in the right direction.
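The reviewers point to a large Hausdorff distance while the rebuttal points to a low chamfer distance; the two can coexist because one is a worst-case measure and the other an average. The sketch below shows this on a toy point set. The symmetric-mean convention used for the chamfer distance is one common choice and an assumption here, since the paper's exact definitions are not reproduced on this page.

```python
import math

def nearest_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [min(math.dist(p, q) for q in target) for p in source]

def chamfer_distance(a, b):
    """Symmetric mean nearest-neighbour distance (one common convention)."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return 0.5 * (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba))

def hausdorff_distance(a, b):
    """Largest nearest-neighbour distance in either direction."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))

surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Identical points plus a single outlier far from the surface.
with_outlier = surface + [(0.0, 0.0, 3.0)]

cd = chamfer_distance(surface, with_outlier)
hd = hausdorff_distance(surface, with_outlier)
print(f"chamfer: {cd:.3f}, Hausdorff: {hd:.3f}")
```

A single stray vertex sets the Hausdorff distance while barely moving the chamfer distance, so reporting both gives a fuller picture of reconstruction quality.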

Comparison to other methods and datasets: We have evaluated our method against the CHeart method [22] since it is considered SOTA within conditional generation of anatomical sequences and its goals are similar to ours. Additional comparisons have not been feasible due to limitations on time, paper length and computational resources, as 48 GB GPUs are not standard at most clinical sites. While a separate appendage evaluation would provide additional insights, it has not been performed since automatically finding the appendage ostium in a robust way is not a trivial task.

Statistical shape modeling (SSM): In the introduction, we will add a short paragraph about the great body of SSM work that has been done in this field. Obtaining point correspondence across the complex and diverse left atrial appendage shapes is however an ill-posed problem, where no single correct correspondence can be defined. Integrating clinical data into an SSM is furthermore an unsolved problem. Future extensions of the method might include utilizing the gradients of the neural distance fields to obtain point correspondence over a temporal sequence since this is required in certain applications as correctly pointed out by the reviewers.

Shared code and data: All code will be made publicly available upon acceptance. Due to GDPR, we are however not able to share images or segmentations as the unique shape of the left atrial appendage makes full anonymization impossible.

Additional comments: The typo on page 3 will be corrected and a concatenation will be added to Equations 2 and 3 to correspond to the notation in Figure 1.
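The concatenation referred to here can be illustrated with a small sketch. The paper's Equations 2 and 3 are not reproduced on this page, so everything below is an assumption: the layer sizes, the random stand-in weights, the order of concatenation, and the reading of the inputs as a condition embedding `z_c` from a hypothetical `g_phi`, a per-subject residual code `z_r`, and a space-time coordinate.

```python
import random

def make_layer(rng, n_in, n_out):
    """Random weights and zero biases, standing in for trained parameters."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(layer, v, activate=True):
    """Fully connected layer with optional ReLU."""
    weights, bias = layer
    out = [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if activate else out

def decoder_input(z_c, z_r, coord):
    """Concatenate condition code, residual code and space-time coordinate."""
    return list(z_c) + list(z_r) + list(coord)

rng = random.Random(0)
g_phi = make_layer(rng, 2, 3)              # hypothetical: 2 clinical variables -> 3-dim z_c
f_hidden = make_layer(rng, 3 + 5 + 4, 16)  # hypothetical hidden layer of the distance network
f_out = make_layer(rng, 16, 1)

c_n = [0.63, 1.0]                 # made-up normalised clinical variables
z_r = [0.0] * 5                   # residual code, optimised per subject in an auto-decoder
coord = (0.1, -0.2, 0.3, 0.25)    # (x, y, z, t)

z_c = dense(g_phi, c_n, activate=False)
features = decoder_input(z_c, z_r, coord)
distance = dense(f_out, dense(f_hidden, features), activate=False)[0]
print(f"input length: {len(features)}, predicted distance: {distance:.4f}")
```

Writing the input as an explicit concatenation makes clear that the clinical embedding and the residual code enter the distance network side by side rather than being summed or multiplied.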

We hope this rebuttal clarifies the main concerns and that you will consider accepting the paper for MICCAI 2024.

Meta-Review

Meta-review #1

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A
What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

N/A


Spatio-temporal neural distance fields for conditional generative modeling of the heart

Author(s):