Abstract

Accurate analysis of cardiac motion is crucial for evaluating cardiac function. While dynamic cardiac magnetic resonance imaging (CMR) can capture detailed tissue motion throughout the cardiac cycle, fine-grained 4D cardiac motion tracking remains challenging due to the homogeneous nature of myocardial tissue and the lack of distinctive features. Existing approaches can be broadly categorized into image-based and representation-based methods, each with its own limitations. Image-based methods, including both traditional and deep learning-based registration approaches, either struggle with topological consistency or rely heavily on extensive training data. Representation-based methods, while promising, often suffer from loss of image-level details. To address these limitations, we propose Dynamic 3D Gaussian Representation (Dyna3DGR), a novel framework that combines explicit 3D Gaussian representation with implicit neural motion field modeling. Our method simultaneously optimizes cardiac structure and motion in a self-supervised manner, eliminating the need for extensive training data or point-to-point correspondences. Through differentiable volumetric rendering, Dyna3DGR efficiently bridges continuous motion representation with image-space alignment while preserving both topological and temporal consistency. Comprehensive evaluations on the ACDC dataset demonstrate that our approach surpasses state-of-the-art deep learning-based diffeomorphic registration methods in tracking accuracy. The code will be available at https://github.com/windrise/Dyna3DGR.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2303_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/windrise/Dyna3DGR

Link to the Dataset(s)

N/A

BibTex

@InProceedings{FuXue_Dyna3DGR_MICCAI2025,
        author = { Fu, Xueming and Wu, Pei and Li, Yingtai and Luo, Xin and Jiang, Zihang and Mei, Junhao and Lu, Jian and Teng, Gao-Jun and Zhou, S. Kevin},
        title = { { Dyna3DGR: 4D Cardiac Motion Tracking with Dynamic 3D Gaussian Representation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {163 -- 173}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes Dyna3DGR, a novel framework that combines explicit 3D Gaussian representation with implicit neural motion field modeling, simultaneously optimizing cardiac structure and motion reconstruction in a self-supervised manner. The authors conduct experiments on the ACDC dataset to evaluate its performance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Image and representation information is integrated.
    2. The method is clearly presented in both text and figures.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The method is evaluated on only one 2D dataset, ACDC, which is not enough to verify its validity and generalizability. Please justify why 2D cardiac MRI was adopted rather than 3D brain MRI.
    2. The generation of the 3D Gaussian representation is not described in the paper, which makes the results hard to understand and reproduce.
    3. MRI data do not contain camera parameters. How is the rendering performed without them?
    4. Please explain clearly the innovation or improvement compared to reference [10] in the paper.
    5. The authors assert that their framework integrates both image-space and representation-space registration. However, in practice they fuse a 3D Gaussian representation with an implicit neural representation, an approach that is fundamentally coordinate-based. This differs substantially from the “image-based” registration methods that most deep-learning approaches employ. Moreover, the introduction’s description of “image-level registration” does not match the coordinate-based methodology actually implemented.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    As mentioned in section 7, the authors should explain the following aspects:

    1. Why 2D cardiac MRI was adopted rather than 3D brain MRI.
    2. How the 3D Gaussian representation is generated, which is not described.
    3. How the rendering is performed without camera parameters.
    4. What the innovation is relative to reference [10] in the paper.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    All my concerns have been resolved.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a novel cardiac motion tracking framework based on implicit Gaussian representations. The results showed improved motion tracking, reconstruction, and volume preserving performance against previous work by a large margin.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea of using implicit Gaussian representations for cardiac motion tracking is novel.
    2. The method achieves significant improvement in terms of motion tracking, reconstruction accuracy, and volume preserving effect against previous approaches.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The framework is based on instance optimisation, which is time consuming in practice. It would be better to see how the proposed method can be extended to a learning-based framework.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed framework is novel and achieves good performance for cardiac motion tracking. However, there is a concern about the computational burden of the model and why a learning-based variant is not considered.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Despite its instance optimisation scheme, I recommend acceptance of the paper due to its novel formulation.



Review #3

  • Please describe the contribution of the paper

    The authors proposed a 3D Gaussian representation-based volume rendering and motion tracking method for cardiac motion tracking. Empirical results show outstanding performance compared to other methods in comparison.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A novel methodology that employs a 3D Gaussian representation for volume rendering and motion tracking. The proposed method promotes motion field smoothness by interpolating control nodes.
    • Empirical results show significant improvement in anatomical consistency.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The proposed method is a per-instance optimization method. The authors should clarify whether the other methods in the comparison underwent the same test-time optimization process, to ensure a fair comparison. For example, VoxelMorph registration can also be optimized at test time on the given instance by minimizing the consistency and smoothness penalty.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The presented method is an interesting application of 3D Gaussian representation in cardiac motion tracking. The strong empirical results indicate promising clinical implications. However, I have concerns about reproducibility and the fairness of the comparison between methods. Overall, I think the merits outweigh the weaknesses and thus recommend a weak accept.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I am skeptical about the “common evaluation protocol established” in the authors’ rebuttal. Deep learning methods can be seamlessly integrated with instance optimization; see Fig. 8 of the VoxelMorph paper (https://arxiv.org/pdf/1809.05231). However, I still see benefits in the interesting methodology for cardiac motion tracking.




Author Feedback

We appreciate the reviewers’ thorough evaluation and insightful suggestions, especially some thought-provoking proposals (R2,R3,R4). We thank the reviewers for acknowledging the novelty (R2,R3,R4), interest (R4), clear presentation and organization (R3), and good experimental results (R2,R3,R4) of our work. Below, we address each concern raised:

R2: We acknowledge that Dyna3DGR requires more time during inference (~11 minutes) than pre-trained learning-based models. However, this trade-off brings significant performance gains (17.73% increase in Dice score) without requiring extra datasets. Importantly, the task typically does not require real-time processing in clinical settings. Our approach is particularly valuable in scenarios with limited data availability or for abnormal cardiac conditions that fall outside the distribution of standard training sets. We agree with the reviewer that extending our framework to a learning-based approach represents an important direction for future work.
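
For reference, the Dice score cited above measures the overlap between a predicted and a reference segmentation mask. The sketch below shows only the standard definition, 2|A∩B| / (|A| + |B|), on flattened binary masks. The function name `dice_score` and the handling of two empty masks are assumptions made for illustration; the rebuttal does not state how the reported 17.73% figure was aggregated (per structure, per phase, relative or absolute).

```python
def dice_score(pred, target):
    """Dice overlap between two equally sized binary masks given as flat sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A call such as `dice_score(warped_mask, reference_mask)` returns a value between 0 (no overlap) and 1 (identical masks); both argument names are hypothetical.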

R3: Q1: We clarify that cardiac MRI serves as the clinical standard for cardiac function and disease evaluation, and ACDC is a well-established benchmark in this domain. We process each patient’s data as an entire 3D volume (128×128×32) rather than as single 2D slices. 3D brain MRI is unsuitable for motion tracking because the brain remains relatively static during imaging. Q2: We will enhance the existing explanation of how the 3D Gaussian representation is generated. The generation comprises both initialization and optimization processes, as illustrated in Fig. 1(b). For initialization, as described in the Implementation Details section, we derive the initial 3D Gaussian primitive positions through uniform sampling of the ED phase segmentation mask. Control nodes are initialized from these same Gaussian positions. During optimization, the 3D Gaussians are rendered into volumetric data, which is then used to calculate the loss against the ground-truth images. This iterative process gradually refines both Gaussian parameters and control node properties to achieve accurate cardiac structure and motion representation. Q3: Unlike natural image rendering, where colors are view-dependent, medical volumetric data contains intensity values that remain consistent regardless of viewing angle. Our approach implements rendering by aggregating the influence of each 3D Gaussian within the volumetric grid using Equations (1) and (2), without requiring camera parameters. Q4: We extend reference [10] (introduced for novel view synthesis of dynamic scenes) with cardiac-specific innovations for medical imaging: (a) we adapt the Gaussian representation and, critically, the volumetric rendering for cardiac data; (b) we address cardiac motion’s unique characteristics by simultaneously modeling large cardiac contractions and subtle tissue thickness changes, employing an implicit neural motion field to learn both Gaussian position changes (capturing large-scale motion) and radius zoom (representing tissue deformation), which maintains anatomical consistency throughout the cardiac cycle while accurately representing complex deformations. Q5: Our method bridges coordinate-based motion representation with image-space consistency through differentiable volumetric rendering, which indeed differs from deep-learning methods. This creates a feedback loop in which image-space supervision guides the optimization of the motion representation.
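
The camera-free rendering described in the R3 response can be illustrated with a small sketch. It is not the authors' implementation and does not reproduce Equations (1) and (2) of the paper. It assumes isotropic Gaussians with a scalar radius, takes "uniform sampling" of the mask to mean keeping every `stride`-th foreground voxel, and stands in for the implicit neural motion field with explicit per-Gaussian offsets and zoom factors. The names `init_centres`, `render_volume`, `deform`, and `mse` are hypothetical. Pure Python is used for readability; a practical version would be vectorized and differentiable.

```python
import math


def init_centres(mask, stride):
    """Keep every `stride`-th foreground voxel of a 3D mask as a Gaussian centre."""
    foreground = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value
    ]
    return foreground[::stride]


def render_volume(gaussians, shape):
    """Sum every Gaussian's contribution at each voxel; no camera is involved."""
    depth, height, width = shape
    volume = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                total = 0.0
                for (cz, cy, cx), radius, intensity in gaussians:
                    sq_dist = (z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2
                    total += intensity * math.exp(-0.5 * sq_dist / radius ** 2)
                volume[z][y][x] = total
    return volume


def deform(gaussians, offsets, zooms):
    """Move each centre by its offset and scale each radius by its zoom factor.

    In this sketch the offsets and zooms are plain inputs; they stand in for
    the output of a motion model at one time point.
    """
    return [
        ((cz + dz, cy + dy, cx + dx), radius * zoom, intensity)
        for ((cz, cy, cx), radius, intensity), (dz, dy, dx), zoom
        in zip(gaussians, offsets, zooms)
    ]


def mse(volume_a, volume_b):
    """Mean squared intensity difference between two volumes of equal shape."""
    diffs = [
        (a - b) ** 2
        for plane_a, plane_b in zip(volume_a, volume_b)
        for row_a, row_b in zip(plane_a, plane_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(diffs) / len(diffs)
```

In an optimization loop of the kind the rebuttal outlines, the deformed Gaussians would be rendered for each cardiac phase and a loss such as `mse(rendered, observed_frame)` would drive updates to the Gaussian and motion parameters; `observed_frame` is a placeholder name for the acquired volume at that phase.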

R4: Q1: We follow the common evaluation protocol established in the cardiac motion tracking literature, in which traditional registration methods undergo test-time optimization while deep learning-based approaches are evaluated through direct inference without additional optimization. For a fair comparison, we evaluated all methods according to their intended paradigm and followed the same preprocessing protocol. Q2: We will release the source code upon acceptance of the submission.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper presents Dyna3DGR, a novel framework for cardiac motion tracking that integrates explicit 3D Gaussian representations with implicit neural motion fields in a self-supervised setting. The method is applied to 3D cardiac MRI data from the ACDC dataset, and the results demonstrate notable improvements in motion tracking accuracy, structure reconstruction, and volume consistency over prior methods. The use of differentiable volumetric rendering without requiring camera parameters, combined with an implicit representation of motion, constitutes an original and technically sound contribution.

    All three reviewers recognised the novelty and strong empirical performance of the approach. They highlighted the elegant formulation, effective integration of representation and image domains, and clear presentation. Some initial concerns were raised: Reviewer 2 questioned the scalability and computational burden of the instance-optimisation strategy; Reviewer 3 sought clarification on the rendering mechanism, dataset choice, and the distinction from prior work; Reviewer 4 raised a fairness concern regarding the comparison with learning-based baselines. These points were thoroughly addressed in the rebuttal, which clarified that 3D cardiac volumes—not 2D slices—were used, explained the rationale for instance optimisation in this context, and detailed how the Gaussian representation was initialized and refined.

    Although the framework requires longer inference time, the reviewers ultimately agreed that the gains in accuracy and generalisation to limited data settings make this a reasonable trade-off, particularly in non-real-time clinical scenarios. By the end of the discussion, all reviewers converged on acceptance.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper is very well-written and easy to follow. It proposes a novel method that combines 3D Gaussian representation with implicit neural fields. The authors demonstrate both the superiority and robustness of the proposed method.


