Abstract

Most current neuroimaging analyses in studies of brain disorders assume a homogeneous presentation of the disorder such that traditional statistical analysis methods based on Gaussian distributions can be applied. Yet, most brain disorders present with a heterogeneous spectrum of cognitive, behavioral, morphometric as well as functional manifestations. In this paper, we introduce a novel approach called PRADA (Phenotype Representation and Analysis via Discriminative Atypicality) that embraces the heterogeneity of both typical and atypical brain morphometry. This approach employs Multiscale Score Matching Analysis (MSMA), a global and local multiscale out-of-distribution analysis via the gradients of the log density (scores). Combining MSMA and manifold-mapping, we compute a morphospace of brain phenotypes representing deviations from a population of typical subjects. Using these brain phenotypes, disorder-related subtyping can be performed. Furthermore, subject-specific profiles of atypicality can be extracted via Spatial-MSMA and summarized per subtype. We show the application of PRADA to structural MRI data in a study of Autism Spectrum Disorder (ASD). The resulting analysis detects disorder-related subtypes and reveals that subtype-specific structural atypicality correlates with cognitive and behavioral outcomes. These results highlight the potential of PRADA to discover disorder-relevant phenotypes.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2180_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/eonemli/ASD-OOD-Analysis

Link to the Dataset(s)

IBIS-SA: non-public dataset (at the moment, will be made public on NIH’s NDA platform)

ABCD: https://abcdstudy.org/

HCPD: https://www.humanconnectome.org/study/hcp-lifespan-development

BibTex

@InProceedings{OneEmr_Phenotype_MICCAI2025,
        author = { Onemli, Emre and Mahmood, Ahsan and Azrak, Omar and Garic, Dea and Swanson, Meghan R. and Grzadzinski, Rebecca and Mata, Kattia and Shen, Mark D. and Girault, Jessica B. and St. John, Tanya and Pandey, Juhi and Zwaigenbaum, Lonnie and Estes, Annette M. and Shen, Audrey M. and Dager, Stephen R. and Schultz, Robert T. and Botteron, Kelly N. and Evans, Alan C. and Elison, Jed T. and Yacoub, Essa and Kim, Sun Hyung and McKinstry, Robert C. and Gerig, Guido and Hazlett, Heather C. and Marrus, Natasha and Piven, Joseph and Pruett Jr., John R. and Styner, Martin},
        title = { { Phenotype Representation and Analysis via Discriminative Atypicality (PRADA) to capture the structural heterogeneity of Autism Spectrum Disorder } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {472--482}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors introduce the Phenotype Representation and Analysis via Discriminative Atypicality (PRADA) approach, which aims to detect disorder-related subtypes by modeling the heterogeneity of both typical and atypical brain morphometry. Autism Spectrum Disorder (ASD) is the clinical application used to test the proposed approach. One beautiful part of this work is the correlation of image atypicality with the behavioral assessments.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The idea of a morphospace that summarizes brain phenotypes is very interesting, as is the way outlier brains can be analyzed within such a morphospace, providing interpretability for the findings in NDD individuals.
    • The last part of the analysis, which correlates brain atypicality with behavioral measures for two primary phenotypes, provides insightful findings that are consistent with what has been clinically reported for ASD. I hope to see how PRADA could be extended to other NDDs.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Lack of information on image pre-processing (details below) and its impact on the framework, specifically for the second part of the methodology (GHSOM).
    • Lack of discussion of the impact of registration accuracy on the performance of the framework.
    • It is not clear how “T1w and T2w images jointly” were used.
    • How do the authors make sure the identified differences are not due to site/cohort differences rather than natural typical or atypical brain morphometry?
    • The ASD data could be considered limited (55 subjects), especially given the availability of public datasets containing ASD subjects (ABIDE I, II).
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    In general, the study shows promising results. However, with more information to clarify certain details (e.g., preprocessing), more data (feasible given the availability of public ASD datasets), and more rigorous preprocessing and experimentation, this work could become a journal paper.

    Data

    • Although the authors indicate the two datasets used, no information is provided regarding variability in terms of patient sex, scanner/model, magnetic field strength, etc.
    • A major concern about data splitting is whether the identified differences are due to the proposed framework or to differences between datasets, institutions, etc.
    • Training and validation: 1650 control individuals (ABCD & HCP-D) + 82 low familial (IBIS). With respect to the training data, I wonder how the 82 cases are distributed within the morphospace built with the 1650 cases, that is, whether control cases from one site/cohort/institution follow the same distribution as the controls used to construct the typical morphospace.
    • Testing: 55 ASD (IBIS). I consider this dataset too small.

    Methodology

    • Specify the tool used to standardize voxel size (1 mm isotropic resolution).
    • Explain what using both T1w and T2w images jointly means.
    • Besides standardizing voxel size, is there any other preprocessing step before using PRADA, such as intensity normalization or bias field correction?
    • I suggest adding more information about the registration method and the tool(s) used.
    • One question that always arises when working with ASD is how reliable DAS-II, Vineland-II and ADOS are, given the reported inter-reader variability. The authors should discuss this in the results section.
    • Since the PRADA framework was applied within AAL regions, it would be interesting to see the extended analysis derived from these results.

    Validation

    • How do the authors make sure the differences found are not due to site bias rather than natural typical or atypical brain morphometry?
    • How much is the GHSOM method affected by the intensity values used to define the “units”, since intensities depend on the scanner (device, model) with which the images were acquired? This question relates to the image preprocessing mentioned above, but also to how the atypical samples (NDD) were mapped to the morphospace.
    • How much is the method affected by the registration process?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The morphospace constructed with the PRADA framework looks interesting and promising; nevertheless, some serious methodological issues should be resolved before considering its acceptance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have addressed the comments properly. Previous doubts were resolved.



Review #2

  • Please describe the contribution of the paper

    The main contribution of this paper is that the authors propose a novel framework named PRADA to explore structural heterogeneity and disorder-related subtypes in ASD patients.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    I think the main novelty is applying a deep learning network to clinical evaluation to address the challenges arising from traditional imaging analysis. To better adapt the framework, the authors design several techniques, such as Spatial-MSMA and the growing hierarchical self-organizing map.

    Overall, the method design is novel.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    I think the main weakness is in the experiment section. First, there are no numerical results, such as tables or figures, comparing the proposed method with other potential deep-learning-based methods. Second, the visualization results are confusing to me and make it hard to understand the advantages of the proposed method.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the authors propose a novel framework to address challenges that appear in traditional imaging analysis, the overall experiment design is extremely limited.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper introduces PRADA (Phenotype Representation and Analysis via Discriminative Atypicality), a method that combines a score-based out-of-distribution (OOD) framework (Multiscale Score Matching Analysis, MSMA) with a hierarchical self-organizing map (GHSOM) to:

    Quantify individual-level brain “atypicality” from structural MRI.

    Embed atypicality measures into a data-driven morphospace to identify subgroups (or “phenotypes”) within a heterogeneous clinical population (here, Autism Spectrum Disorder).

    Correlate these phenotypes’ regional brain atypicalities with behavioral measures, illustrating potential clinical relevance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novel Use of OOD Detection in Neuroimaging

    While there are existing score-based methods for out-of-distribution detection, applying MSMA and its voxel-wise extension (Spatial-MSMA) to identify structural anomalies in autism offers an original and promising perspective. The paper nicely argues why traditional distribution assumptions (e.g., Gaussian) may be too restrictive for neurodevelopmental disorders.

    Hierarchical Phenotype Representation

    The use of a Growing Hierarchical Self-Organizing Map (GHSOM) is particularly interesting, as it adaptively represents a “morphospace” without requiring users to predefine map size or dimensionality. This helps capture the spectrum of neurodevelopmental heterogeneity more flexibly than standard dimensionality reduction.

    Strong Evaluation with Clinical Measures

    The study goes beyond structural comparisons by linking the identified phenotypes to standard autism-related clinical scales (CBCL, ADOS, Vineland, DAS-II). Demonstrating significant associations between measured brain atypicality and real-world functional or behavioral outcomes strengthens the clinical feasibility of the approach.

    Interpretability and Localization

    The paper includes voxel-wise “atypicality likelihood” maps, explaining where in the brain these anomalies appear. This localizing approach is valuable for clinicians and neuroscientists aiming to interpret structural deviations.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Sample Size for the Test (ASD) Population

    Although the framework is tested on a real-world ASD sample, the group size (55 ASD participants) is relatively modest. This limits the statistical power for fully exploring all possible subtypes and validating the robustness of identified phenotypes.

    Limited Discussion on Specific Structural Interpretations

    While the authors provide spatial maps and correlations, the paper does not deeply discuss how structural differences manifest (e.g., shape vs. intensity anomalies). Future work clarifying these mechanistic details would enhance biological interpretability.

    Incremental Nature of the Method Components

    PRADA builds upon existing concepts: MSMA was previously proposed for anomaly detection, and GHSOM has been known for hierarchical clustering. The paper’s novelty is largely in the careful combination and application to ASD phenotyping. Reviewers seeking an entirely new algorithmic approach might see this as relatively incremental, though still useful.

    Potential Domain Shift Concerns

    The typical controls were drawn from multiple large public datasets (ABCD, HCP-D), while the target group is from IBIS. Despite the authors’ mention of steps to mitigate domain shifts, more extensive validation or ablation analyses would strengthen the claim that the method generalizes well across sites/datasets.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses a challenging, highly relevant problem of heterogeneity in autism by presenting a well-structured pipeline that bridges robust OOD detection with a flexible manifold learning method.

    The experimental findings, linking atypical brain phenotypes to meaningful clinical measures, underscore the translational value.

    Despite a few limitations (test-set size, domain shift concerns), the proposed method constitutes a solid contribution to MICCAI’s objectives.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We’d like to thank the reviewers for their insightful reviews. We agree with the general comment on the lack of detail. Our framework is complex, and MICCAI papers are unfortunately quite limited in space. We will add details but might fall short given the limited size.

We will add information relevant to our answers below to the paper.

Data variability (R1): We agree this information is relevant, and we will add a table with the subject information (scanner, sex, age).

Preprocessing and voxel size (R1): We preprocessed all data in the same way. Preprocessing reduces scanner-specific biases; however, further harmonization is necessary. Images were first intensity-clipped to the 1st–99th percentiles and then brain-masked (ANTsPyNet’s deep-learning extractor). Both T1- and T2-weighted volumes were rigidly registered to the MNI-152 1 mm isotropic template with ANTs’ default parameters. Next, N4 bias-field correction followed by histogram matching (128 bins, 5 match points, MNI-152 template) was applied. Blank background regions were cropped, and values were min–max normalized to [–1, 1].
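
As a rough illustration of the purely intensity-based steps named above (percentile clipping, background cropping, and min–max normalization to [-1, 1]), a minimal NumPy sketch might look as follows. The function names and the boolean `mask` input are assumptions for illustration; brain extraction, rigid registration, N4 correction and histogram matching are done with ANTs/ANTsPyNet in the rebuttal and are not reproduced here.

```python
import numpy as np


def clip_percentiles(vol, lo=1.0, hi=99.0):
    """Clip intensities to the [lo, hi] percentile range of the volume."""
    lo_v, hi_v = np.percentile(vol, [lo, hi])
    return np.clip(vol, lo_v, hi_v)


def crop_background(vol, mask):
    """Crop the volume to the bounding box of a boolean foreground mask.

    `mask` is a hypothetical stand-in for the brain mask produced earlier
    in the pipeline.
    """
    coords = np.argwhere(mask)
    start = coords.min(axis=0)
    stop = coords.max(axis=0) + 1
    slices = tuple(slice(a, b) for a, b in zip(start, stop))
    return vol[slices]


def minmax_to_unit_range(vol):
    """Linearly rescale intensities so that min -> -1 and max -> 1."""
    vmin, vmax = float(vol.min()), float(vol.max())
    if vmax == vmin:
        # Constant image: no range to rescale, return zeros.
        return np.zeros_like(vol, dtype=np.float32)
    scaled = (vol - vmin) / (vmax - vmin)
    return (2.0 * scaled - 1.0).astype(np.float32)
```

In this sketch the three functions would be applied in the order clip, crop, normalize, with the ANTs-based steps from the rebuttal running between the clip and the crop.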

Using T1w and T2w jointly (R1): We treat T1- and T2-w scans as two channels of a single input fed into the U-Net–style denoising score-matching network.
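
To make the two-channel input concrete, the sketch below stacks co-registered T1w and T2w volumes into a single array. The channel-first layout and the function name `stack_contrasts` are assumptions for illustration; the rebuttal does not state the layout the network expects.

```python
import numpy as np


def stack_contrasts(t1, t2):
    """Stack two co-registered volumes into one (2, D, H, W) array.

    Channel 0 holds T1w and channel 1 holds T2w (assumed ordering).
    """
    if t1.shape != t2.shape:
        raise ValueError("T1w and T2w volumes must share the same grid")
    return np.stack([t1, t2], axis=0)
```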

Cohort differences/domain shift (R1, R3): The purpose here is to discover and analyze the phenotypes in IBIS cohorts (LL-typical, HL-typical, HL-ASD), and these were all acquired with the same scanner (Siemens Prisma, VE11c software) and the same sequences. No site differences within the IBIS cohorts were observed.

Data distribution in morphospace (R1): We expect some (consistent) distributional bias of the IBIS data in the morphospace, thus the presented phenotype results apply to IBIS data only. ABCD + HCP-D data distributions show effects of scanner types. In our recently performed experiments, ComBat applied to the score norm vectors showed great success in removing that scanner variability from the morphospace.

ASD subject size (R1, R3): We acknowledge that our IBIS-ASD cohort (n = 55) is modest. ABIDE I/II provide only T1-weighted scans and lack the T2-weighted acquisitions required by our dual-contrast framework (though a single-channel T1w variant could be explored in a separate study). Consequently, direct application to ABIDE is not feasible. To validate the observed ASD phenotypes, we are applying PRADA in future work to the ASD participants within the ABCD study (161 ASD participants), and our preliminary analyses reveal comparable phenotype clustering.

Impact of registration (R1): No registration is needed (other than a coarse rigid registration into template space) for atypicality quantification, morphospace, and phenotype discovery. It is needed only to aggregate typicality maps across subjects to compute regional atypicality scores. Misregistration would tend to blur anomalies and thus affect the heatmap sharpness, but should not majorly affect regional averages. Thus, we do not expect PRADA to be sensitive to registration errors.
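
The regional aggregation step mentioned above can be sketched as a per-label mean of a voxel-wise map over an atlas in the same template space. This is a minimal sketch under assumptions: the function name, the integer-label atlas array (label 0 treated as background) and the use of a plain mean are illustrative choices, not the authors' stated implementation.

```python
import numpy as np


def regional_means(atyp_map, atlas_labels):
    """Average a voxel-wise map within each non-zero atlas label.

    Both arrays are assumed to be on the same voxel grid; label 0 is
    treated as background and skipped.
    """
    if atyp_map.shape != atlas_labels.shape:
        raise ValueError("map and atlas must share the same grid")
    labels = atlas_labels.astype(np.int64).ravel()
    values = atyp_map.astype(np.float64).ravel()
    n = int(labels.max()) + 1
    sums = np.bincount(labels, weights=values, minlength=n)
    counts = np.bincount(labels, minlength=n)
    return {r: sums[r] / counts[r] for r in range(1, n) if counts[r] > 0}
```

Because each regional score pools many voxels, small spatial shifts of an anomaly inside a region leave that region's mean unchanged in this sketch, which is in line with the rebuttal's expectation about registration errors.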

Intensity effect on GHSOM units (R1, R3): GHSOM units are defined by global anomaly score vectors rather than raw intensities. It is noteworthy that our preprocessing steps provide intensity correction and normalization.
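
To illustrate what a global score vector of this kind could look like, the sketch below computes the L2 norm of a score (gradient of the log density) at several noise scales, following the description of MSMA in the abstract. The analytic `gaussian_score` is a toy stand-in for the trained score network (it is the exact score of a standard normal perturbed with noise of scale sigma), and all names here are hypothetical.

```python
import numpy as np


def gaussian_score(x, sigma):
    """Toy score function: gradient of log N(0, (1 + sigma^2) I) at x.

    Stands in for a learned, noise-conditional score network.
    """
    return -x / (1.0 + sigma ** 2)


def score_norm_vector(x, sigmas, score_fn=gaussian_score):
    """Return one score norm per noise scale for a single sample x."""
    return np.array(
        [np.linalg.norm(np.ravel(score_fn(x, s))) for s in sigmas]
    )
```

A vector like this has one entry per noise scale regardless of image size, so a map fitted on it sees a summary of how atypical the sample is rather than its raw voxel intensities.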

Method comparison (R2): To our knowledge, comparable image-based benchmark frameworks for subtype discovery are not available. Existing methods, such as Surreal-GAN and Hydra, are based on tabular data. We agree with the reviewers that ablation studies comparing individual components are a necessary next step.

Clarity/description (R2): In the revised manuscript, we will improve the description of PRADA and highlight its advantages.

Discussion of structural differences (R3): One limitation of PRADA is that it does not provide a direct interpretation of the observed atypicality, i.e., it provides the location and degree of atypicality, but not interpretations of size or shape differences.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    While this paper would benefit from further experimental comparisons and analysis, its interesting and different viewpoint on detecting atypicality in structural MRI with the proposed methods would be of interest to the neuroimaging community. The authors should take care to include the requested clarifications in the final paper.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


