Abstract

Longitudinal prediction of infant brain MRIs is crucial for individualized neurodevelopment tracking and disorder forecasting. However, existing methods, such as diffusion-based generative models, often struggle to capture the complex spatiotemporal dynamics of developing brains, leading to unreliable predictions that lack subject-specific, anatomically consistent growth patterns. To address this, we propose a Flexibly Distilled 3D Rectified Flow (FDRF) framework, which integrates anatomical constraints for dual-stream predictions of volumetric images and tissue maps along developmental trajectories. Our framework features an age-conditioned feature fusion module for controllable prediction with targeted age appearances and employs anatomical constraints derived from segmentation labels and high-frequency image details to ensure subject-level spatiotemporal consistency. Additionally, we introduce a flexible distillation of rectified flow, enabling a unified one-step generative model for high-fidelity cross-time predictions while preserving individualized anatomical details. Given 6-month MRIs and tissue maps as the input, our model reliably predicts their spatiotemporal growth at 12 and 24 months, outperforming existing diffusion-based baselines by relatively large margins. Our code is available at https://github.com/ladderlab-xjtu/FDRF.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1363_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/ladderlab-xjtu/FDRF

Link to the Dataset(s)

N/A

BibTeX

@InProceedings{WanHai_Flexibly_MICCAI2025,
        author = {Wang, Haifeng and Ren, Zehua and Chang, Heng and Qiu, Xinmei and Wang, Fan and Lian, Chunfeng and Ma, Jianhua},
        title = {{Flexibly Distilled 3D Rectified Flow with Anatomical Constraints for Developmental Infant Brain MRI Prediction}},
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15974},
        month = {September},
        pages = {231--240}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a new method for simulation of brain aging in infants based on a given MRI scan. The method is conditioned on the target age and employs a two-step training/distillation process. Validation is done on infant data comprising scans at age 6, 12, and 24 months.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall, the paper is written clearly and the method is in large parts well described, which makes it easy to understand. In my eyes, the clarity of the approach is a strength of the paper. The combination of loss functions (MSE, edge, segmentation) used for the distillation is interesting. Notably, data quality is ensured by three radiologists.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    My main concerns are two-fold:

    • The experimental validation is limited to follow-up predictions at 12 and 24 months, always starting at 6 months. While I understand that there are restrictions arising from available longitudinal data, the current experimental setup does not really validate the generalization of the method. This is especially important as the conditioning on the target age via the feature fusion module is a core contribution; in order to assess if it is actually working, I think more diverse inputs are needed.
    • The improvement over previous methods seems to be marginal. In Fig. 2, it is difficult to see a qualitative improvement over previous approaches, even in the highlighted regions (all of them look pretty good). This is also the case for the scores reported in Tables 2 & 3. Hence, I disagree with the authors that there is an improvement “by relatively large margins”. Confidence intervals or statistical tests were not presented but could help in clarifying the significance of the improvement.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Minor comments:

    • typo “000Specifically”
    • further validation of the method would be insightful, e.g., is the distillation step really necessary or could similar results be achieved when using the distillation loss already during training?
    • from Fig. 1, it seems like the velocity is stationary (not depending on t explicitly), whereas it is always written as v(z, t) in the formulas
    • it is unclear how loss function weights were chosen/tuned
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think that the main limitations (weak experimental setting and marginal improvement) cannot be addressed in a rebuttal. Hence, I recommend rejecting the paper.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    I thank the authors for their responses in the rebuttal. However, while I still like the approach and acknowledge its speed and simplicity, I think further validation is needed to show that it actually has the claimed improvement in prediction accuracy over pre-existing methods. Specifically, I would have expected the authors to report statistical tests or at least confidence intervals in the rebuttal (this is not a new result). Merely reporting plain bold numbers in tables is not enough in my eyes, especially given the small margins.
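
    The confidence intervals requested in this review could be obtained with a paired bootstrap over per-subject scores. The following is a minimal Python sketch under stated assumptions, not the authors' evaluation code: the function name `paired_bootstrap_ci` is hypothetical, and the score lists in the usage example are illustrative placeholders, not values from the paper.

```python
import random
import statistics


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-subject difference (a - b).

    scores_a and scores_b hold one metric value per test subject (for
    example PSNR or Dice) for two methods evaluated on the same subjects.
    Returns (mean_difference, ci_lower, ci_upper).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    if not scores_a:
        raise ValueError("at least one subject is required")

    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    rng = random.Random(seed)  # fixed seed so the interval is reproducible

    # Resample subjects with replacement and record the mean difference.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(sample))
    means.sort()

    # Percentile interval, with indices clamped to the valid range.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return statistics.fmean(diffs), means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    # Placeholder per-subject scores, NOT results from the paper.
    method_a = [0.90, 0.80, 0.95, 0.85]
    method_b = [0.70, 0.75, 0.90, 0.60]
    mean_diff, lower, upper = paired_bootstrap_ci(method_a, method_b)
    print(f"mean difference {mean_diff:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

    An interval that excludes zero would support a claim that one method outperforms the other on the test subjects; with a small test set the interval will be wide, which is itself informative.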



Review #2

  • Please describe the contribution of the paper
    1. This paper proposes a flexibly distilled 3D rectified flow, which integrates anatomical constraints for dual-stream predictions of volumetric images and tissue maps along developmental trajectories.
    2. The age-conditioned feature fusion module is designed for controllable prediction of age-specific appearances. A flexible distillation of rectified flow enables the generative model to make high-fidelity cross-time predictions.
    3. FDRF is a unified model that takes only 6-month data as input and predicts the corresponding data at 12 and 24 months. It has great clinical value for individualized neurodevelopment tracking and associated disorder forecasting.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper introduces a dual-stream rectified flow framework that jointly predicts 3D MRI volumes and tissue segmentation maps, addressing a critical gap in longitudinal neurodevelopmental imaging. The flexible distillation reduces inference to one step, and the anatomical constraints ensure that high-frequency anatomical boundaries and tissue labels are preserved, compensating for the oversmoothing caused by distillation.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    As shown in Table 1, the segmentation mask is the main contributor to the performance improvement (RF(FFM)* vs. RF(FFM)). The improvements from the edge loss and the segmentation loss are subtle when generating 3D volumes. In the tissue segmentation task, the difference between FDRF1 and FDRF2 is not significant. Please clarify the necessity of these two anatomical loss functions.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper introduces a dual-stream rectified flow framework that jointly predicts 3D MRI volumes and tissue segmentation maps, addressing a critical gap in longitudinal neurodevelopmental imaging. This design enables accurate and efficient mapping of distributions without relying on classifier-free guidance (CFG) strategies, effectively capturing the essential characteristics of infant brain MRI and segmentation labels across different ages. Although the effectiveness of the designed components is not demonstrated well, the overall methodological framework is novel.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The rebuttal addressed my concern that the difference between FDRF1 and FDRF2 is not significant.



Review #3

  • Please describe the contribution of the paper

    The paper introduces Rectified Flows for generating infant brain images. Using input images and target segmentations from one time point, it predicts images and segmentations at 12 and 24 months, separately. The training process is enhanced with segmentation masks and edge losses, resulting in a one-step prediction of the FM.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The largest major strength: The method is novel and practically useful. The method, specifically the neural network, is also not convoluted, and adds just as much as necessary. The comparisons are mostly on a level playing field.

    Loss formulations: The paper shows that these losses improve the predictions of the proposed method.

    Evaluation results: Results like those in Table 2 are not always reported; they provide another useful metric, since pixel-level metrics are difficult to trust in some cases.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    There are no fatal errors or weaknesses, but a few minor issues:

    • The introduction criticizes diffusion models without providing any sources on these critiques. The paper also does not show any of those claims.

    • The method only predicts a single image at a time, i.e., 6 -> 12 or 6 -> 24. Doing both in one model would be beneficial, as has already been done in some diffusion papers.

    • There is no comparison of sampling times, which could highlight the speed advantages of Flows over Diffusion. Yet, this was claimed in the introduction.

    • Only using DDIM as a baseline is OK, but only when a vanilla DDIM is compared to RFs. We find comparing the final method to DDIM a bit simple, but acceptable.

    • The validation methodology is not clearly described, and it is unclear if multiple splits were used.

    • The evaluation is limited to a single dataset.

    • Minor errors in notation, such as missing t∼U[0,1] in Equation 3, and inaccurate claims about diffusion models.

    • The novelty and distinction of the Feature Fusion Module (FFM) compared to baselines are not clearly established. From our experience DDIM and RFs need the same input dimensionality. We would have wished for further clarification on which architecture was used. Generally, this paper confounds the names of architecture, method, and module.
    • Some sections of the paper are unclear, and some parts are over-claiming. In general, the clarity of the paper could be improved.

    • We would have wished for more details on training, model sizes, and so on. With the current page limit, we would rather have more details for re-implementation, than to have the limited space filled with Flows, where rich sources are available already.

    • The paper’s use of “distillation” is misleading, as it typically refers to student-teacher models. The proposed method involves training Rectified Flows (RF) with a uniform temporal distribution and inferring at t=0.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The application of RFs in the medical field, introduced by the paper as Flexibly Distilled 3D Rectified Flows, is novel and shows substantial improvements over diffusion models. Our experience with similar experiments confirms their efficiency and effectiveness, though the paper lacks further detailed experiments. Still, the results have the potential to significantly impact the longitudinal modeling community. The paper’s weaknesses, primarily due to writing ambiguities, can be addressed. For example, the terminology used, especially “distillation,” is inaccurate and should be revised for clarity.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The paper introduces useful methodological innovations. While modeling all time points within a single model is a promising direction for future work, we believe the current contribution is already valuable.

    As anticipated, RFs outperform DDIMs in terms of efficiency; we were only surprised it was not featured more prominently in the paper.

    Some of the weaknesses we initially identified will be revised. Although we maintain minuscule concerns, such as the appropriateness of the term “distillation,” which the authors only partially addressed, we consider the overall contribution to be substantial.
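
    For readers following the points above about the t∼U[0,1] notation and the meaning of one-step inference, the commonly used rectified flow formulation can be written out as follows. This is a sketch under an assumed convention (z_0 is the source sample, z_1 the target sample, and v_θ the learned velocity network); the paper's own symbols, time direction, and Equation 3 may differ.

```latex
\begin{align}
z_t &= t\, z_1 + (1 - t)\, z_0, \qquad t \sim \mathcal{U}[0, 1], \\
\mathcal{L}_{\mathrm{RF}}(\theta) &= \mathbb{E}_{(z_0, z_1),\, t}
  \left[ \left\| (z_1 - z_0) - v_\theta(z_t, t) \right\|_2^2 \right], \\
\hat{z}_1 &= z_0 + v_\theta(z_0, 0)
  \quad \text{(a single Euler step taken at } t = 0\text{)}.
\end{align}
```

    Under this convention, the single-step estimate is exact only when the learned trajectories are straight, which is why a dedicated training stage that supervises the t = 0 output directly is needed for reliable one-step inference.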




Author Feedback

We sincerely thank the reviewers for their insightful feedback. We have carefully considered all comments and provide the following responses:

  1. Regarding Model Performance, Quantitative Evaluation, and Baselines (Re #1, #2, #3). Concerns: marginal quantitative improvement, difficulty seeing gains, lack of statistical tests, absence of sampling time comparison, clarity on FFM/baselines/architecture. Response:
    • While the overall quantitative improvement might appear subtle (as noted by reviewers), our ablation studies (Table 2/3) clearly demonstrate the effectiveness of our individual design choices and components. We will consider adding statistical tests to our final evaluation.
    • High fidelity and efficiency are key advantages of our 1-step sampling (~0.09s) vs. DDIM’s 50+ steps (~23.29s); we will add a quantitative comparison of sampling time in the final version.
    • FFM is an additional block integrated into the same Diffusion UNet base architecture (from MONAI) used by baselines, ensuring fair comparison.
  2. Regarding Experimental Setup, Data Limitations, and Validation Strategy (Re #2, #3). Concerns: limited prediction time points (6->12/24), lack of diverse inputs, single dataset, unclear validation method. Response:
    • Indeed, the limitation of the IBIS dataset, which only contains data at discrete time points (6, 12, 24 months), restricts the development and evaluation of the current model. Our long-term clinical goal is longitudinal prediction from earlier points and applicability to a wider range of time points. Future research will consider incorporating other datasets, and even real-world clinical data, for validation on a broader scale.
    • Regarding predicting both 12 and 24 months simultaneously: our current method is designed to predict a single target age conditioned via FFM, which aligns with our future goal of predicting any subsequent time point from an earlier one.
    • We acknowledge that our validation methodology, including the data split, was not clearly described. We used a fixed 75%/5%/20% split for training, validation, and testing, and multi-fold cross-validation was not performed due to resource constraints. We will add these details to Section 3.1 in the final version.

  3. Regarding Method Clarity, Formulation, and Implementation Details (Re #1, #2, #3). Response:
    • Regarding the necessity and effectiveness of the extra loss functions (FDRF1/FDRF2): These were primarily included to demonstrate the flexibility of our proposed distillation stage in incorporating various forms of anatomical supervision. While their impact on global metrics for 3D volume generation might appear subtle, they are designed to help preserve high-frequency anatomical details and boundary fidelity, which global metrics like PSNR or SSIM may not fully capture. The similar performance between FDRF1 and FDRF2 in segmentation suggests both loss formulations contribute comparably to integrating anatomical constraints within our framework. Their main role is to showcase the adaptability of our method to different anatomical guidance signals during distillation.
    • Distillation is indispensable for 1-step sampling; using imprecise intermediate outputs for full training supervision was ineffective.
    • RF learns a time-dependent velocity field v(z,t) to achieve the straight path. Loss weights are given in Section 3.1 and were empirically tuned. The notation t~U[0,1] appears after Formula 3; we will review the statements about diffusion models.
    • We clarify that our “distillation” usage refers to the end-to-end training phase targeting the t=0 output, which differs from standard RF distillation but enables 1-step inference. We will further expand the training information in Section 3.1 and condense the introduction to flow.
    • To facilitate reproducibility and future research, we will publicly release our code and trained models.
  4. Regarding Introduction and Motivation (Re #3). Response:
    • We acknowledge this point and will revise the introduction to include appropriate references and refine statements.
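
The relationship between straight trajectories and 1-step sampling mentioned in the responses above can be illustrated with a toy Euler sampler. This is a minimal Python sketch, not the authors' implementation: `euler_sample`, `straight_field`, and `curved_field` are hypothetical names, the two-element lists stand in for image volumes, and the velocity fields are hand-written rather than learned.

```python
def euler_sample(velocity, z0, n_steps):
    """Integrate dz/dt = velocity(z, t) from t = 0 to t = 1 with Euler steps."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    z = list(z0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity(z, t)
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z


def straight_field(z, t):
    """Toy velocity that is constant along the path (a perfectly straight flow)."""
    return [2.0, -1.0]


def curved_field(z, t):
    """Toy velocity that rotates the state, so trajectories are curved."""
    return [-z[1], z[0]]


if __name__ == "__main__":
    # Straight flow: one step and many steps land on the same endpoint.
    print(euler_sample(straight_field, [0.0, 0.0], 1))
    print(euler_sample(straight_field, [0.0, 0.0], 50))
    # Curved flow: a single step overshoots relative to a fine discretization.
    print(euler_sample(curved_field, [1.0, 0.0], 1))
    print(euler_sample(curved_field, [1.0, 0.0], 50))
```

When the velocity is constant along each trajectory, a single Euler step reaches the same endpoint as a fine discretization; when trajectories are curved, one step is inaccurate. This is the general motivation for straightening or distilling a flow before 1-step inference, and the step count is also what drives the sampling-time gap quoted above.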




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper proposes FDRF, a novel framework for predicting infant brain MRI development using 3D rectified flow with anatomical constraints. However, several concerns raised by reviewers justify a rejection decision. The experimental validation is limited to specific time points (6 to 12 and 24 months) due to data constraints, which restricts the assessment of the method’s generalization capabilities. The improvement over existing diffusion-based methods is considered marginal, with qualitative and quantitative results not clearly demonstrating significant advancements. Additionally, the paper lacks sufficient details on the necessity and impact of the proposed anatomical loss functions, and the validation methodology is not clearly described. The authors also did not adequately address the unique characteristics of infant imaging, such as changes in brain volume, which is crucial for accurate neurodevelopmental predictions. Given these unresolved concerns and the lack of convincing responses to the reviewers’ critiques, the paper does not meet the standards required for acceptance.


