Abstract

Generative modeling of anatomical structures plays a crucial role in virtual imaging trials, which allow researchers to perform studies without the costs and constraints inherent to in vivo and phantom studies. For clinical relevance, generative models should allow targeted control to simulate specific patient populations rather than relying on purely random sampling. In this work, we propose a steerable generative model based on implicit neural representations. Implicit neural representations naturally support topology changes, making them well-suited for anatomical structures with varying topology, such as the thyroid. Our model learns a disentangled latent representation, enabling fine-grained control over shape variations. Evaluation includes reconstruction accuracy and anatomical plausibility. Our results demonstrate that the proposed model achieves high-quality shape generation while enabling targeted anatomical modifications.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2874_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/MIAGroupUT/steerable-shape-synthesis

Link to the Dataset(s)

Created dataset: https://zenodo.org/records/15100852

Used dataset: https://zenodo.org/records/10047292

BibTex

@InProceedings{deBra_Steerable_MICCAI2025,
        author = { de Wilde, Bram and Rietberg, Max T. and Lajoinie, Guillaume and Wolterink, Jelmer M.},
        title = { { Steerable Anatomical Shape Synthesis with Implicit Neural Representations } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15962},
        month = {September},
        page = {638 -- 648}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a method for generation of 3D shapes of the thyroid gland. The paper presents an implicit neural network method that can be conditioned on geometrical descriptions to generate thyroids in a controlled way.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well written and easy to read.

    • Thyroid glands serve as a suitable use-case to demonstrate the INR ability to represent changes in topology.

    • It is not straightforward to evaluate the performance of generative models. The authors make a reasonable validation by basing it on anatomical plausibility.

    • The paper has some good insights into the correlation between the conditions.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The paper has limited novelty compared to [1], which also generates 3D shapes using INRs and conditions on both fixed and conditional latents. This work is not mentioned by the authors.

    • There is no comparison to other SOTA methods, such as the ones mentioned in the introduction [11,12], or to standard generative frameworks, e.g. CGAN [2] or CVAE [3].

    • The introduction focuses on virtual imaging trials. However, it is not described how the generation of shapes benefits a virtual imaging trial.

    • The paper is missing some motivation for why it is interesting to generate shapes based on geometrical conditioning. The confounding factors mentioned in the introduction (skin color in pulse oximetry, BMI in X-ray, etc.) are quite different from the proposed conditions, which are directly related to the geometry.

    [1] Sørensen, et al. “Spatio-Temporal Neural Distance Fields for Conditional Generative Modeling of the Heart.” Proceedings of 27th Conference of Medical Image Computing and Computer Assisted Interventions, 2024, pp. 422–32, https://doi.org/10.1007/978-3-031-72384-1_40.

    [2] Mirza, M., Osindero, S.: “Conditional generative adversarial nets.” arXiv preprint, arXiv:1411.1784 (2014)

    [3] Sohn, K., Lee, H., Yan, X.: “Learning structured output representation using deep conditional generative models”. Advances in neural information processing systems 28 (2015)

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the application is interesting, the novelty seems limited and there is no comparison with other generative methods.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The methodological contribution is limited and despite differences in representations I believe a comparison to other works would be beneficial. The paper is however very well written and offers some interesting insights into disentanglement and correlation of conditions and features. For this reason, I think, we can consider accepting the paper.



Review #2

  • Please describe the contribution of the paper

    Summary of Paper

    • The paper aims to generate different 3D thyroid objects with varying geometry and topology.
    • The main principle is to use a DeepSDF inspired architecture which learns to represent the thyroids.
    • A fixed part in the autodecoder latent allows steering.
    • The results indicate that the steering works quite well.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Strengths

    • The authors show ingenuity through their steerable conditioning method.
    • Fig.4 is a very nice display of the effects caused by altering the different conditioning variables.
    • The manuscript was easy to follow, well done.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Weaknesses

    • A big concern is that the generated meshes might simply reproduce shapes already present in the training set, as identical outputs would naturally maximize the used metrics. The authors should explicitly evaluate whether the newly generated thyroid shapes differ meaningfully from those seen during training (see detailed comments below). While Figure 4 suggests that new shapes are indeed being generated, I find quantitative confirmation very important.
    • Regarding novelty: The method used is mostly DeepSDF with a fixed part in the latent vector used for steering. There are certainly quite a few works that alter or interpolate the latent for steering (in related fields), these should be discussed in the introduction.
    • I find the object (thyroid) that was chosen quite simple. For such a basic shape, I can imagine one could even define an algorithmic approach for data generation.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Comments

    • In the introduction it is stated that INRs represent surfaces as the zero-level set of a SDF. This is generally not correct, as there are other implicit representations (such as occupancy).
    • Although citation [13] is also in a biomedical context, it focuses on single cell representation learning. I don’t find this the ideal citation for work focusing on organ-scale objects. I would either make it clearer why this work is cited (the organic and varying shapes) or remove the mention of the biomedical context.
    • The authors should mention the intuition behind adding the L2 regularization to the latent code (it is mentioned in the DeepSDF paper that is cited).
    • I recommend that the authors upload the anonymized code to an anonymized git repository for future submissions. I as the reviewer have no way of confidently telling if there will be actual code available.
    • In 3.1 I find it quite difficult to tell if an error of 1.6mm is good or bad, as I am lacking an understanding of the size variance of the thyroid. From what I understand they can be very small (in which case an error of 1.6mm is quite large?). Perhaps the authors could provide the reader with the average (+ standard deviation) size of the evaluated thyroids?
    • So 3.1 is mostly answering the question: How well does the MLP learn to represent the training data. I believe it would have been highly valuable to vary the MLP size to find out if a large model would have resulted in a decreased error.
    • I see a major flaw in the evaluation of 3.2; perhaps the authors could comment on whether I understand this correctly. A model that simply outputs the exact shapes from the training data would maximize the evaluated score, but it would completely miss the point of the paper (which is to generate new meshes). The authors must also verify that the newly generated thyroids are actually different from the training dataset. For example, by measuring the similarity between thyroids in the training data and then showing the similarity of newly generated thyroids to the original dataset. They should be equally different.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a clear and well-executed approach to steerable 3D thyroid shape generation using a DeepSDF-style model. While the idea is good and visual results are promising, I am concerned about the novelty and especially the lack of evaluation for shape diversity beyond the training set.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Response to Comment 1:
    The author acknowledged the missing citations. The provided argumentation seems reasonable.

    Response to Comment 2:
    The reasoning provided is not convincing to me. Particularly for simple geometry such as thyroids, the voxel-grid resolution is unlikely to cause issues. Moreover, the authors do not use positional encoding themselves, which likely limits the learning of high-frequency details anyway.

    Response to Comment 3:
    The authors presented new experimental results in their rebuttal. While these results appear to address the concern I raised, I must emphasize that introducing new data during the rebuttal phase may not be in line with the rebuttal guidelines. As I, as a reviewer, do not make this call, I will not consider these new results in my evaluation. My score and assessment are based solely on the original submission and clarifications that are within scope of the rebuttal policy. Although not quantitative, I do believe that even without the new results, Fig. 4 does show that diverse/new thyroids are generated.

    Response to Comment 4:
    Reference 7 does not describe an approach for generating random thyroid shapes. Rather, it describes a deterministic thyroid shape, with complexity stemming from the chosen representation. Thus, it does not adequately support the authors’ claims. A simpler approach uses template thyroids randomly deformed by control parameters, avoiding the need for explicit mathematical descriptions. Such an approach, however, may only be effective for simple shapes. Since this remains conjectural, and since learning from data is arguably more elegant, I will overlook this response.

    Thank you for clarifying the issue regarding size.



Review #3

  • Please describe the contribution of the paper

    The paper introduces a method by which the morphology of thyroid volumes modeled by an INR can be steered via disentangled high-level features.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper clearly establishes the problem statement. The description of their method is concise and clear.

    While the individual tools employed by the paper aren’t novel themselves (introducing high-level features into the conditioning, disentangling, shape modeling via INR), the way these are combined under an INR setting is novel, especially in the medical setting. The paper clearly highlights the benefits of the method and how the approach overcomes currently unsolved issues in INR-based modeling in the medical domain.

    Extensive evaluation is provided on the disentangling and correlation of features. I appreciate the section on reconstruction quality to verify that the added conditioning has no impact on the modeling process’ performance.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    While the size of the conditioning vector is given (N=64), it is not clear to me how large the fixed part of the vector is. Are the fixed features presented as 3 scalars (volume, IA, symmetry) and concatenated with the trainable feature vector? Are they projected into a larger vector and then added to the trainable vector? While as a reviewer familiar with the INR field, I am aware of the general architectural aspects, I think a simple overview figure could have gone a long way for unfamiliar readers.

    The discussion section touches upon the frequency bias of ReLU-based INRs and the relative smoothness of the thyroid occupancy volumes. However, for a 2025 INR paper, I would expect the decision not to use, e.g., a positional encoder to be discussed in the methods section (as opposed to as an afterthought on future research at the end of the paper).

    Furthermore, the paper appears to input the raw anatomical feature values into the network. I can understand the reason for not using positional encoders for the anatomical features as they are low-frequency signals, but I would expect these types of methodological decisions to be explicitly described in an INR paper.

    While the paper has a surface-level introduction on INRs, I’d argue it lacks an overview of medical applications, especially for a submission to MICCAI. Previous MICCAI submissions have explored the idea of anatomical feature-based conditioning [1]. That paper offers an extensive background section that may be of use to the authors. Additionally, other fields of machine learning, such as GANs and VAEs, have long explored vector-conditioned networks using disentangled features, yet this paper introduces disentangled conditioning without any mention of previous works.

    [1] Dannecker, M., Kyriakopoulou, V., Cordero-Grande, L., Price, A.N., Hajnal, J.V., Rueckert, D. (2024). CINA: Conditional Implicit Neural Atlas for Spatio-Temporal Representation of Fetal Brains. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Grammar: Page1, 2nd sentence: “…, but since recreating human tissue is complex, thus experiments with high mimicking accuracy remain costly.” The ‘thus’ should not be there.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Paper has clear problem statement, methodology and evaluation. Only minor critiques can be made about the clarity and motivation of the methodology, as well as the thoroughness of the background section.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Overall, the concerns brought forward by all 3 reviewers were minimal. In my opinion, the authors clarified the majority of the reviewers’ main concerns in their rebuttal.




Author Feedback

We thank the reviewers for their valuable feedback and thoughtful suggestions. Below, we address the main issues identified by the (meta-)reviewers.

  1. Novelty: The reviewers correctly identify that our work follows the autodecoder approach proposed in DeepSDF [1]. Similarly, other works in medical shape modelling have used this approach [2,3], including the work by Sørensen et al. [4]. We thank the reviewers for pointing out this missed reference. While the work by Sørensen et al. also conditions shape synthesis on fixed codes, in their case representing clinical demography (gender, age and systolic blood pressure), the novelty in our work is in the explicit disentanglement of fixed and learned latent space dimensions through an explicit correlation loss. With our experiments, we show that including this correlation loss can lead to improved results. In our case, this led to an improvement in terms of correlation between conditioned and generated features, e.g. PCC = 0.88 without correlation loss versus PCC = 0.93 with correlation loss for the isthmus area.

  2. Comparison to GAN and VAE models: For the INR-based shape representations that we consider here, it is common practice to use an autodecoder approach [1], as encoding of continuous shapes (as required in a GAN or VAE) is challenging. The benefit of this INR approach is that we can obtain smooth surfaces that are not limited to any resolution. In contrast, CGAN [5] and CVAE [6] models for shape synthesis performing latent space disentanglement use voxel-based representations, which are limited to the chosen resolution. Hence, a comparison with CGAN/CVAE would not only compare models, but also representations, which we consider beyond the scope of the work and leave for future work.

  3. Shape variation: To verify that we are in fact generating new shapes, we performed the evaluation proposed by R3. We computed the Chamfer distance between all shapes in the training set. This showed that the average Chamfer distance to the closest shape was 4.13 ± 1.33 (std). When randomly generating 1000 shapes, the average distance to the closest shape in the training set was 3.67 ± 0.51, 3.72 ± 0.95, and 3.69 ± 0.76 for the baseline, fixed and correlated versions, respectively. This indicates that synthesized shapes are different from training shapes.

  4. Other: Regarding the complexity of the synthesized shapes (R3), a prior model-based approach [7] generates thyroids that are a strong simplification of the actual representative thyroids shown in Fig 1 of our work. The “butterfly-shape” of a typical thyroid is not easily captured in equations, let alone the diversity across the patient population.
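The shape-variation check described above (distance from each shape to its closest training shape) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the random point clouds standing in for thyroid surfaces, and the particular Chamfer variant (mean of unsquared nearest-point distances, averaged over both directions) are all assumptions, since the rebuttal does not specify them.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: mean unsquared nearest-point distance, averaged
    over both directions. Other conventions (sums, squared distances) exist.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def nearest_neighbor_distances(queries, references, skip_self=False):
    """For each query shape, the Chamfer distance to the closest reference.

    With skip_self=True, query i is not compared to reference i, which is
    needed when measuring the training set against itself.
    """
    out = []
    for i, q in enumerate(queries):
        dists = [
            chamfer_distance(q, r)
            for j, r in enumerate(references)
            if not (skip_self and i == j)
        ]
        out.append(min(dists))
    return np.array(out)

# Hypothetical stand-in data: random point clouds instead of thyroid surfaces.
rng = np.random.default_rng(0)
train = [rng.normal(size=(200, 3)) * rng.uniform(0.8, 1.2) for _ in range(6)]
generated = [rng.normal(size=(200, 3)) * rng.uniform(0.8, 1.2) for _ in range(4)]

train_nn = nearest_neighbor_distances(train, train, skip_self=True)
gen_nn = nearest_neighbor_distances(generated, train)
print(f"train -> train: {train_nn.mean():.3f} ± {train_nn.std():.3f}")
print(f"generated -> train: {gen_nn.mean():.3f} ± {gen_nn.std():.3f}")
```

A generator that merely copied training shapes would yield nearest-neighbor distances near zero, so comparable magnitudes for the two printed statistics are what the argument relies on.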

Regarding the interpretation of results (R3), we’d like to clarify that the average volume of a thyroid in our training set is 14.52 ± 7.40 mL with their dimensions spanning on average 44.75 ± 9.79 x 29.32 ± 5.91 x 52.18 ± 7.03 mm, meaning that the error of 1.6 mm we note in our manuscript is relatively low.

Regarding the dimensionality of our inputs (R2), latent codes are 64-dimensional, and features are indeed added as 3-vectors.
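As a rough illustration of how a 3-vector of anatomical features might be combined with a learned latent code, and of what a correlation penalty between the two could look like, consider the sketch below. Everything here is an assumption made for illustration: the concatenation, the split into 3 fixed plus 64 learned dimensions, the helper names, and the penalty itself (mean absolute Pearson correlation across a batch) are not taken from the paper, and a training implementation would use a differentiable framework rather than NumPy.

```python
import numpy as np

def build_conditioning(features, learned):
    """Concatenate fixed features (B, 3) with learned latents (B, K).

    Assumption: the features are concatenated in front of the learned code.
    """
    return np.concatenate([features, learned], axis=1)

def correlation_penalty(features, learned, eps=1e-8):
    """Mean absolute Pearson correlation between every fixed feature and
    every learned latent dimension, computed across the batch.

    Hypothetical stand-in for a correlation loss: lower values mean the
    learned dimensions carry less linear information about the features.
    """
    f = (features - features.mean(axis=0)) / (features.std(axis=0) + eps)
    z = (learned - learned.mean(axis=0)) / (learned.std(axis=0) + eps)
    corr = f.T @ z / features.shape[0]  # (3, K) cross-correlation matrix
    return np.abs(corr).mean()

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 3))   # e.g. volume, isthmus area, symmetry
learned = rng.normal(size=(100, 64))   # independent of the features

# An entangled code: the first three learned dimensions copy the features.
entangled = learned.copy()
entangled[:, :3] = features

codes = build_conditioning(features, learned)
p_independent = correlation_penalty(features, learned)
p_entangled = correlation_penalty(features, entangled)
print(codes.shape, p_independent < p_entangled)
```

In this toy setup the entangled code receives the larger penalty, which is the behaviour one would want from a term that discourages learned dimensions from duplicating the steerable features.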

With these comments, we hope that we have answered the (main) points from the reviewers. If accepted, we will incorporate the textual feedback and discuss the missed references. We again would like to thank the reviewers for their valuable feedback and thoughtful suggestions.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    This paper presents a clear and well-written study on generating steerable 3D thyroid shapes using a conditional INR based on DeepSDF. The application is relevant, and the visual results demonstrating control are promising. However, there are significant concerns that temper enthusiasm. The methodological novelty is questionable, particularly in light of recent related work presented at MICCAI using similar conditional INR approaches. Critically, the paper lacks quantitative comparisons to any other generative methods and, most importantly, fails to provide evidence that the generated shapes are truly novel rather than mere reproductions from the training set – a fundamental requirement for evaluating a generative model. While R2 recommends acceptance based on the combination of tools and clarity, R1 and R3 highlight these major gaps in novelty and evaluation. Addressing the novelty claims (especially w.r.t. cited prior work) and providing results of generating diverse, new shapes beyond the training set are crucial for this paper.

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This is a borderline paper for me as the methodological novelty is limited and the experimental setup seems to lack real baseline comparisons. However, the reviewers are in favor of the paper and it is well-executed. As suggested by the reviewers the related work section of the final version needs to be substantially updated prior to submission.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


