Abstract
Geometric deep learning has shown great potential for cortical surface analysis, but its performance often depends on a large-scale training set of cortical surfaces, which are traditionally derived from MRI scans through complex and time-consuming preprocessing pipelines. Although deep learning-based surface reconstruction methods have streamlined this process, they still rely on MRI data, limiting the availability of training data. To address this, we propose CortexGen, a geometric generative framework that synthesizes highly realistic cortical surfaces without requiring MRI scans. CortexGen employs geometric variational encoders to map cortical surfaces into a latent space, where latent flow matching models efficiently learn the true data distribution. This enables a two-stage cortical surface synthesis process: first, deforming an icosahedron-discretized sphere into a coarse cortical surface, and second, refining it into a high-resolution surface. Experiments show that CortexGen generates diverse, realistic cortical surfaces with 163,842 vertices in just 1.4 seconds per surface. Using these synthetic surfaces as augmented training data significantly improved learning-based cortical surface parcellation in few-shot settings. Our code and pretrained models are available at https://github.com/ladderlab-xjtu/CortexGen.
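The figure of 163,842 vertices is consistent with a regularly subdivided icosahedron. As a quick check, and assuming the common scheme in which every subdivision splits each triangle into four, the counts after `level` subdivisions follow closed-form expressions. The function name below is illustrative and not taken from the CortexGen code.

```python
def icosphere_counts(level: int) -> tuple[int, int, int]:
    """Return (vertices, edges, faces) of an icosahedron subdivided `level` times.

    Assumes the usual 1-to-4 triangle split per subdivision.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    faces = 20 * 4 ** level
    edges = 30 * 4 ** level
    vertices = 10 * 4 ** level + 2
    return vertices, edges, faces


for level in range(8):
    v, e, f = icosphere_counts(level)
    # Euler characteristic of a sphere: V - E + F = 2
    assert v - e + f == 2
    print(f"level {level}: {v} vertices, {e} edges, {f} faces")
```

Under this assumption, level 7 gives the 163,842 vertices quoted in the abstract.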
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1498_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/ladderlab-xjtu/CortexGen
Link to the Dataset(s)
N/A
BibTex
@InProceedings{ZhuYua_CortexGen_MICCAI2025,
author = { Zhu, Yuanzhuo and Li, Kehan and Ma, Jianhua and Lian, Chunfeng and Wang, Fan},
title = { { CortexGen: A Geometric Generative Framework for Realistic Cortical Surface Generation Using Latent Flow Matching } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15961},
month = {September},
pages = {110--119}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes CortexGen, a geometric generative framework for synthesizing realistic cortical surfaces without the need for MRI scans. The authors demonstrate that synthetic surfaces generated by CortexGen can be used for data augmentation in few-shot cortical surface parcellation, resulting in modest improvements in segmentation accuracy.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well-written and easy to follow.
- The methodology is intuitive and simple.
- The approach using a latent diffusion model and flow matching to generate highly realistic cortical surfaces is interesting.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Inappropriate architecture design: The problem of generating cortical surface meshes requires a manifold-aware diffusion process, as it is crucial for the generated surfaces to reflect geometric properties (e.g., geodesic distance, mean curvature, etc.) present in the training data. This is not a problem that can be resolved simply by using a geometric auto-encoder architecture. Instead, the diffusion process itself must be performed on the manifold space. However, the current approach appears to directly adopt a diffusion process defined in Euclidean space. For reference, please consider works such as (1) Huang et al., “Riemannian Diffusion Model”, NeurIPS 2022, (2) Lou et al., “Scaling Riemannian Diffusion Models”, NeurIPS 2023.
- Weak clinical or anatomical validation: Although the generated surfaces are utilized in parcellation tasks, there is no direct validation of the anatomical plausibility of the synthetic surfaces (e.g., self-intersection ratio, geometric distance/similarity between real cortical surfaces). This undermines confidence in their potential for clinical application.
- Limited dataset evaluation: All experiments are conducted using only the Baby Connectome Project dataset. This limited scope restricts the generalizability of the findings to other populations (e.g., healthy adults).
- Weak motivation: The motivation for the cortical surface generation task is limited to its use as data augmentation to slightly improve parcellation accuracy, which feels somewhat less appealing and less compelling. From a clinical analysis perspective, it raises the question of whether there could be a stronger rationale for proposing such a generative model. Additionally, it is unclear whether the unconditional generative model proposed in this paper offers any advantage over conditional models. What meaningful interpretation can be made from surfaces generated without any conditions?
- Lack of comparative experiments: The paper lacks qualitative comparisons with related works. Even if there are no existing studies that exactly match the task conducted in this paper, comparisons involving different architectural design choices would help determine the effectiveness of the proposed method.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While the paper addresses an interesting challenge in neuroimaging, namely synthesizing realistic cortical surfaces with modern generative diffusion models, there are several critical concerns that limit the strength and impact of the contribution in its current form.
First, the most significant issue lies in the inappropriate architectural design of the generative process. The paper proposes a diffusion process in the latent space of a geometric variational autoencoder, but this diffusion is defined in standard Euclidean space. This design neglects the intrinsic manifold structure of cortical surfaces, which is essential for faithfully preserving key geometric properties such as geodesic distances and curvatures.
Second, the paper lacks direct clinical or anatomical validation of the generated cortical surfaces. While the authors show improvements in few-shot parcellation performance using the generated data, there is no quantitative assessment of geometric fidelity, such as self-intersection rates or vertex-wise geometric similarity metrics against real surfaces.
Third, all experiments are confined to the Baby Connectome Project dataset, which includes only infant brains. As such, the model’s generalizability to other populations (e.g., healthy adults, patients with neurological disorders) remains entirely untested. Without a broader evaluation, it is difficult to assess the robustness or utility of CortexGen in real-world scenarios.
Fourth, the paper suffers from a weak motivational framing. The main utility of the generative model is framed around modest improvements (less than 1% Dice) in surface parcellation under a few-shot setting. While this demonstrates some value, it falls short of making a compelling case for a standalone generative model for cortical surfaces.
In summary, while the problem addressed by this paper is relevant and the overall pipeline is well-constructed, the technical limitations, weak motivation, and lack of strong validation undermine the significance of the contribution. I believe this work has promise but would benefit greatly from a more rigorous design that accounts for manifold geometry, a clearer clinical use case, and broader experimental validation.
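For reference, the Dice figures discussed above are typically computed per parcel over per-vertex labels and then averaged. The following is a minimal sketch under that assumption; the function name and the toy labels are illustrative, and the paper's exact evaluation protocol may differ.

```python
import numpy as np


def mean_dice(pred: np.ndarray, target: np.ndarray, num_labels: int) -> float:
    """Mean Dice over parcels, given integer per-vertex label arrays.

    Labels absent from both arrays are skipped rather than counted as 0 or 1.
    """
    scores = []
    for label in range(num_labels):
        p = pred == label
        t = target == label
        denom = p.sum() + t.sum()
        if denom == 0:
            continue
        scores.append(2.0 * np.logical_and(p, t).sum() / denom)
    return float(np.mean(scores)) if scores else float("nan")


# Toy example with six vertices and three parcels.
target = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 0, 1, 2, 2, 2])
print(round(mean_dice(pred, target, num_labels=3), 4))
```

With scores averaged this way, a difference of 0.5% Dice corresponds to a change of 0.005 in the returned value.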
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
Unfortunately, this work is not the first to address cortical surface generation. For example, Cortical Surface Diffusion Generative Models (Xie et al.) proposed a generative framework based on diffusion models for cortical surface synthesis. The description of the experimental setup remains insufficient. The authors used “Real_Aug” as a baseline for comparison, but it is not self-contained—there is no explanation of whether this involves rotational augmentation or other procedures. The paper claims improvements in both the diversity and realism of cortical surface generation. However, such claims require quantitative evidence. For diversity, it would be appropriate to report standard generative metrics such as FID score, coverage, or mode collapse rate. As for realism, the authors only evaluate segmentation accuracy on a downstream task, without any direct assessment of the generation quality itself. This does not justify the claim that the model preserves diversity or realism. At the very least, surface-specific measures such as cortical distortion metrics (e.g., triangle flipping, normal consistency, etc.) should have been included. Regarding the marginal performance improvement, the authors refer to Table 1 to claim superior accuracy. However, as I and likely other reviewers observed, the performance gains are marginal. At the very least, a statistical significance analysis should have been provided to support these claims. The authors argue for the motivation of their work based on (1) realistic and diverse cortical surface generation and (2) its potential extension to various downstream tasks. 
However, in order to substantiate the claimed novelty—particularly the effectiveness of the proposed method as a form of data augmentation—it would have been necessary to evaluate it across a wider range of downstream tasks commonly used in clinical and structural cortical surface analysis (e.g., phenotyping predictions, Alzheimer’s disease (AD) diagnosis, cortical thickness estimation, etc.). With the current set of experiments, I find it difficult to identify any significant advantages of the proposed method over conventional, simpler augmentation strategies—especially considering the added complexity of the approach. For claim (1), appropriate generative metrics such as FID scores should have been reported to objectively assess diversity and realism. For claim (2), even if not extending to more challenging tasks like full cortical surface reconstruction, evaluations on widely studied clinical analyses such as phenotyping prediction and AD diagnosis would have been more compelling. Regarding the design of the non-Euclidean architecture, unfortunately, making the model “manifold-aware” does not guarantee that the computation of the vector field in the flow-matching process is intrinsically performed on the manifold itself. While the proposed GVAE framework considers the manifold structure when encoding the input geometric surface, the resulting vector field output is not constrained to lie on the manifold—leading to potential drift-off from the intrinsic surface. Moreover, the definition of noise in the diffusion process fundamentally changes in this setting. In flow-based models, the mathematical formulation typically assumes that noise follows a standard Gaussian distribution. However, in non-Euclidean domains, applying diffusion models as defined in Euclidean space breaks this assumption. As a result, the diffusion process and the associated score function can no longer be properly defined on isotropic noise. 
For these reasons, I maintain my current rating of the manuscript.
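As an illustration of the surface-specific checks mentioned above, the sketch below computes a simple normal-consistency score and the fraction of adjacent triangle pairs whose normals oppose each other. It is only a crude proxy under stated assumptions (a triangle mesh with consistent winding and at least one interior edge); it does not detect self-intersections, which would require triangle-triangle tests. All function names are illustrative.

```python
import numpy as np


def face_normals(vertices: np.ndarray, faces: np.ndarray) -> np.ndarray:
    """Unit normals of each triangle, following the winding order in `faces`."""
    v0, v1, v2 = (vertices[faces[:, i]] for i in range(3))
    n = np.cross(v1 - v0, v2 - v0)
    length = np.linalg.norm(n, axis=1, keepdims=True)
    return n / np.clip(length, 1e-12, None)


def adjacent_face_pairs(faces: np.ndarray) -> np.ndarray:
    """Index pairs of faces that share an edge (edges with exactly two faces)."""
    edge_to_faces = {}
    for f_idx, (a, b, c) in enumerate(faces.tolist()):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces.setdefault((min(u, v), max(u, v)), []).append(f_idx)
    pairs = [fs for fs in edge_to_faces.values() if len(fs) == 2]
    return np.array(pairs, dtype=int).reshape(-1, 2)


def normal_consistency(vertices: np.ndarray, faces: np.ndarray):
    """Return (mean cosine between adjacent normals, fraction with cosine < 0)."""
    normals = face_normals(vertices, faces)
    pairs = adjacent_face_pairs(faces)
    cos = np.einsum("ij,ij->i", normals[pairs[:, 0]], normals[pairs[:, 1]])
    return float(cos.mean()), float((cos < 0).mean())


# Flat square split into two triangles; flipping one winding inverts its normal.
square = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
good = np.array([[0, 1, 2], [0, 2, 3]])
bad = np.array([[0, 1, 2], [0, 3, 2]])
print(normal_consistency(square, good), normal_consistency(square, bad))
```

On a densely sampled smooth surface, a high mean cosine and a near-zero opposing fraction would be expected; sharp folds also lower the score, so the numbers are best compared against the same statistics on real surfaces.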
Review #2
- Please describe the contribution of the paper
This paper proposes CortexGen, a novel deep learning-based geometric generative model for cortical surface synthesis in a coarse-to-fine manner. A geometric variational autoencoder (GVAE) is developed to extract latent representations from cortical surface meshes, following a diffeomorphic mesh deformation module for surface self-reconstruction. A recent flow matching model is trained in the latent space to generate high-quality cortical surfaces. The synthetic cortical surfaces are employed as augmented training data for few-shot cortical surface parcellation tasks.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- This paper presents a novel geometric deep generative model for cortical surface generation for the first time.
- A GVAE and a latent flow matching technique are introduced to generate high-resolution cortical surfaces meshes for data augmentation in downstream applications such as cortical surface parcellation.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- It seems that CortexGen only achieves a minor improvement compared to the baseline, with a large performance gap compared to the fully supervised case.
- There is a lack of experimental comparison to traditional data augmentation approaches and other deep generative models such as GAN and VAE.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall, this paper introduces a novel approach for the generation of cortical surface meshes. My major concern is about experimental evaluation and comparison:
- Table 1 shows that the data augmentation using CortexGen only results in 0.5% improvement in terms of the Dice score. There still exists a large gap (>4.5% Dice) compared to fully supervised cortical surface parcellation.
- It would be better to compare CortexGen with traditional data augmentation methods, such as random affine transformation and non-linear deformation of cortical surface meshes.
- The comparisons to other deep generative models are missing as well, e.g., GAN, VAE, and DDPM [7]. Would CortexGen perform worse or not if the latent vectors are generated only by the GVAE without latent flow matching?
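To make the suggested traditional baseline concrete, here is a minimal sketch of a random similarity transform (rotation, isotropic scaling, translation), which is a restricted case of the affine augmentation mentioned above. The parameter ranges and the function name are hypothetical and chosen only for illustration; non-linear deformation is not covered.

```python
import numpy as np


def random_similarity_transform(vertices, rng, max_angle_deg=10.0,
                                scale_range=(0.9, 1.1), max_shift=2.0):
    """Randomly rotate, scale and translate an (N, 3) vertex array.

    Mesh connectivity (the face list) is unaffected, so it can be reused as-is.
    All ranges are illustrative defaults, not values from the paper.
    """
    # Rotation about a random unit axis via Rodrigues' formula.
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    rotation = np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)

    scale = rng.uniform(*scale_range)
    shift = rng.uniform(-max_shift, max_shift, size=3)

    # Rotate and scale about the centroid, then translate.
    center = vertices.mean(axis=0)
    return (vertices - center) @ rotation.T * scale + center + shift


rng = np.random.default_rng(0)
verts = rng.normal(size=(100, 3))
augmented = random_similarity_transform(verts, rng)
print(augmented.shape)
```

Because the transform is global, it preserves the folding pattern exactly, which is one reason such augmentation adds less shape diversity than a generative model could.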
Other minor points:
- The flow matching approach could be described more clearly in Section 2.2. For example, the inference of flow matching model (i.e., integrating a probability flow ODE defined by a learned velocity field) is not fully explained.
- It is also imprecise to say “a denoising diffusion model is trained via flow matching”. The term “diffusion” refers to the stochastic diffusion process [7,15], while flow matching learns a deterministic continuous normalizing flow [10,11].
- Instead of deforming an initial icosphere to cortical surfaces, it might be better to use a surface template such as FreeSurfer’s fsaverage as an initial surface, which could lead to better quality for surface self-reconstruction.
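The inference step referred to in the first minor point, integrating an ODE defined by a velocity field from noise at t = 0 to a sample at t = 1, can be sketched with a forward Euler solver. This is a toy illustration, not the paper's sampler: `toy_velocity` is a hypothetical analytic stand-in for a learned network, and the step count is arbitrary.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity field v_theta(x, t).

    For the linear path x_t = (1 - t) * x_0 + t * x_1 with x_1 fixed at
    `target`, the exact velocity is (target - x) / (1 - t).
    """
    return (target - x) / (1.0 - t)


def sample_flow(x0, velocity, num_steps=10):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so the toy field is finite
        x = x + dt * velocity(x, t)
    return x


rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])
x0 = rng.normal(size=(4, 3))  # Gaussian noise as the starting latent
x1 = sample_flow(x0, lambda x, t: toy_velocity(x, t, target))
print(np.allclose(x1, target))
```

The trajectory is deterministic given `x0`, which is the distinction from a stochastic diffusion process drawn in the second minor point.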
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The rebuttal has partially addressed my concerns. Overall, I think this is a promising piece of work with potential for future clinical applications.
Review #3
- Please describe the contribution of the paper
This paper presents CortexGen, a geometric generative framework that synthesizes realistic high-resolution cortical surfaces without relying on MRI data. The method combines geometric variational autoencoders (GVAE) and latent flow matching (LFM) to enable two-stage synthesis: deforming an icosahedron sphere to a low-res cortical surface, then refining it to high resolution. The authors demonstrate improved performance in few-shot parcellation tasks using generated surfaces for data augmentation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The methodology is clear and sound, with well-defined architectural and algorithmic descriptions.
- The two-stage synthesis process is well-motivated and aligns with principles of multiscale geometric modeling.
- The method enables fast (1.4s) generation of high-resolution surfaces with 163k vertices, a significant improvement over traditional MRI-based pipelines.
- Experimental results show improvements in few-shot parcellation tasks across two strong baselines, indicating the practical utility of the generated surfaces.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Minor clarity gaps: Some technical details (e.g., exact number of training samples, architecture depth, and hyperparameters) could be more explicit for clarity and reproducibility.
- The experiments are limited to the Baby Connectome Project dataset. I am not sure if generalizability to adult populations is a concern.
- The comparison to alternative generative models for cortical surfaces is very limited.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Marginal accept due to novelty, but with limited validation and limited comparison to existing methods.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
Marginal Accept due to novelty. Generating surface representation is interesting and has not been studied before. I recommend a poster.
Author Feedback
We thank all reviewers for their constructive comments and appreciate their positive assessments of our work. R1: the paper presents “a novel geometric deep generative model for cortical surface generation for the first time”. R2: “The methodology is clear and sound”. R3: “The paper is well-written and easy to follow”. We addressed the major concerns raised by the reviewers below:
*[R1, R2, R3]-Limited comparison & technical details & validation To the best of our knowledge, this work is the first to propose a generative framework specifically designed for cortical surface generation. As a result, direct comparisons with existing models targeting the same objective are limited. During the development of CortexGen, we observed that directly sampling from the GVAE often produced anatomically abnormal surfaces, motivating us to apply DDPM and flow matching in the latent space. Both significantly enhanced sample quality, and we ultimately adopted flow matching for its sampling efficiency. Due to space constraints, we could not include additional comparative experiments. We will update the paper to include more technical details, e.g., hyperparameter settings. Notably, CortexGen was used for data augmentation primarily to showcase the anatomical realism and diversity of its generated cortical surfaces, not to introduce a novel augmentation method. As shown in Table 1, parcellation networks trained exclusively on synthetic surfaces (Gen_aug_1000) achieved comparable or superior accuracy to those trained on real surfaces, supporting our claim.
*[R3]-Weak motivation Indeed, this paper presents an unconditional generative framework. Our use of CortexGen for data augmentation primarily highlights its core characteristics, i.e., the ability to generate realistic and diverse cortical surfaces, not to suggest this is its sole application. By conditioning the generation process, CortexGen can be extended to various tasks, including those challenging for current MRI-based cortical surface reconstruction methods. We consider this work a foundation for future applications.
*[R3]-Non-Euclidean architecture design We should emphasize that GVAE operates on the intrinsic manifold of cortical surfaces rather than in Euclidean space. While vertex coordinates serve as input, mesh connectivity is preserved via 1-ring convolution and pooling. Furthermore, CortexGen learns surface distributions in a compressed latent space, not the original 3D Euclidean space. As the first framework for cortical surface synthesis, CortexGen has a simple and straightforward architecture. In contrast, Riemannian diffusion models are less aligned with this task, while CortexGen fits well, as shown by experiments. We hope this work can serve as a reference for future research pursuing similar goals.
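To illustrate what a connectivity-aware operation of the kind referred to above might look like, the sketch below aggregates features over each vertex's 1-ring neighbourhood derived from the face list. It is a simplified mean-aggregation variant written for illustration only, not the authors' GVAE layer; the function names and weight matrices are hypothetical, and every vertex is assumed to belong to at least one face.

```python
import numpy as np


def one_ring_neighbors(faces, num_vertices):
    """List the 1-ring neighbours of every vertex from a triangle list."""
    nbrs = [set() for _ in range(num_vertices)]
    for a, b, c in np.asarray(faces).tolist():
        nbrs[a].update((b, c))
        nbrs[b].update((a, c))
        nbrs[c].update((a, b))
    return [sorted(n) for n in nbrs]


def one_ring_mean_conv(features, neighbors, w_self, w_ring):
    """out_i = x_i @ w_self + mean over the 1-ring of x_j @ w_ring.

    features: (N, C_in); w_self, w_ring: (C_in, C_out).
    """
    ring_mean = np.stack([features[n].mean(axis=0) for n in neighbors])
    return features @ w_self + ring_mean @ w_ring


# Tetrahedron: every vertex has the other three as its 1-ring.
verts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
faces = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
nbrs = one_ring_neighbors(faces, len(verts))
out = one_ring_mean_conv(verts, nbrs, np.zeros((3, 3)), np.eye(3))
print(out[0])
```

The operation depends on mesh connectivity through `neighbors`, but its output still lives in an ordinary vector space, which is the point of contention between the reviewer and the authors.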
*[R3]-Weak clinical or anatomical validation While we did not explicitly evaluate the anatomical plausibility of the generated cortical surfaces, we computed cortical attributes (i.e., curvature and sulcal depth) of the generated surfaces using FreeSurfer and used them as input to Spherical U-Net. The model achieved parcellation performance comparable to, or even better than, that obtained with real surfaces, suggesting that the generated surfaces exhibit reasonable anatomical plausibility.
*[R1]-Imprecise description Following the suggestion, we will revise the manuscript to improve the clarity of the relevant descriptions.
*[R1]-Choice of initial surface We consider that highly convoluted surfaces, such as FreeSurfer’s fsaverage, have limited plasticity, and using them as surface templates restricts the diversity of the generated surfaces. Therefore, we chose the more malleable and easily accessible icosphere, and deforming it yielded sufficiently accurate self-reconstruction results.
*[R2&R3]-Generalizability to other populations Following the suggestion, we will incorporate data from other age groups in future work to assess the generalizability of CortexGen.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper presents CortexGen, a novel geometric generative framework for synthesizing realistic cortical surfaces without MRI scans. Despite Reviewer #3’s concerns about theoretical rigor and validation, Reviewers #1 and #2 strongly support its methodological novelty and practical relevance. The authors have sufficiently clarified distinctions from related work. Remaining concerns can be addressed in future studies. The paper’s innovative approach justifies acceptance.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This is an extremely borderline paper. It is (AFAIK) the first paper that presents a generative model of cortical surfaces (Xie et al., suggested by R#3, generates feature maps on the surface, such as curvature, not vertex coordinates), and as such would definitely trigger discussions at MICCAI. While R#3 makes important points, I want to point out that many of them were only made post-rebuttal and were therefore not addressable by the authors; I have ignored those. Nonetheless, the generated surfaces are only quantitatively evaluated using a surrogate segmentation task. Segmentation network training may benefit from non-realistic surfaces, which help build in robustness, so improved segmentation scores are not evidence of realism. The quality and realism of the surfaces should have been directly evaluated; at the very least, checks for crossing and inverted triangles should have been made. I therefore tend towards rejection.