Abstract

Neurological diseases, such as schizophrenia and attention deficit hyperactivity disorder (ADHD), alter functional connectivity (FC) and are often accompanied by cognitive deficits. Leveraging shared neural mechanisms underlying both neurological disease and cognitive deficits can enhance diagnostic accuracy. However, due to the complex neural mechanisms of these conditions, diagnosing them based on FC alone still presents challenges in terms of accuracy and biomarker reliability. To address these challenges, we designed a meta-analysis guided multi-task graph transformer network to simultaneously predict neurological disease and cognitive deficits and examine alterations in brain FC associated with these conditions. The framework employs a graph transformer method as the encoder and integrates a joint attention mechanism to capture shared disease–cognition features while utilizing saliency pooling to extract saliency weights for each task. To enhance the reliability of saliency weights, we incorporate meta-analysis guidance that aggregates data from 470 functional studies in the BrainMap database. Then, we establish reference probability maps for brain activations associated with neurological diseases and cognitive deficits using a Naive Bayes classifier. The saliency weights learned from saliency pooling are then constrained to align with these references using Pearson correlation. Experiments on the COBRE and ADHD-200 datasets indicate that our proposed method outperforms state-of-the-art multi-task learning models in classifying schizophrenia and ADHD, as well as in predicting their related cognitive deficits. Moreover, the biomarkers extracted from our models exhibit biologically meaningful patterns.
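The abstract states that the learned saliency weights are constrained to align with the reference probability maps using Pearson correlation, but it does not give the exact form of the penalty. The sketch below is one plausible reading, not the authors' implementation: the function names, the `1 - r` form of the loss, and the toy values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def alignment_loss(saliency, reference):
    """Hypothetical penalty: 0 for perfect correlation, 2 for perfect anti-correlation."""
    return 1.0 - pearson(saliency, reference)


# Toy per-ROI values (made up): learned saliency weights and a reference probability map.
saliency = [0.1, 0.4, 0.2, 0.9]
reference = [0.2, 0.5, 0.1, 0.8]
loss = alignment_loss(saliency, reference)
```

In a training setting this term would presumably be added to the task losses with a weight (the author feedback refers to such a weight as λ), and would need a differentiable implementation in the chosen deep learning framework.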

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2294_paper.pdf

SharedIt Link: https://rdcu.be/eHc7c

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-05162-2_44

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{XiaJin_Metaanalysis_MICCAI2025,
        author = { Xia, Jing AND Chan, Yi Hao AND Rajapakse, Jagath C.},
        title = { { Meta-analysis guided multi-task graph transformer network for diagnosis of neurological disease and cognitive deficits } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15971},
        month = {September},
        pages = {459--468}
}

Reviews

Review #1

Please describe the contribution of the paper

Introduction of a new deep learning architecture that simultaneously performs two tasks — classification of a neurological disease (schizophrenia or ADHD) and regression of a related cognitive deficit score — using a graph-based transformer encoder on functional connectivity data. This joint learning framework is designed to capture shared brain network features that link disease and cognition, potentially improving overall performance on both tasks. Integration of prior knowledge from large-scale neuroimaging meta-analyses (BrainMap database) into the model’s attention mechanism. The authors derive reference brain activation maps from 470 published functional studies (160 for schizophrenia, 75 for ADHD, and an equal number for associated cognitive domains) using a Naïve Bayes-based meta-analysis. They use these maps to constrain the model’s learned saliency/attention weights via an additional loss, effectively guiding the network to focus on biologically relevant regions. This is a creative use of domain knowledge to improve interpretability and robustness of the model’s attention.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper’s primary strengths lie in its novel multi-task architecture, which combines a joint attention module (cross-attention) and task-specific branches, effectively capturing both shared and unique features to enhance classification and regression tasks. Its meta-analysis integration leverages the BrainMap database to produce region-level prior maps, lending the model external validation and biological credibility. Empirically, the authors present strong results, outperforming several baselines and state-of-the-art multi-task models in both classification metrics and regression metrics. They further provide valuable clinical insight by correlating identified salient regions with well-established literature on schizophrenia and ADHD. Lastly, the methodological transparency—including details on multi-head cross-attention and graph pooling—combined with a promise to release the source code upon acceptance, bolsters the reproducibility of their approach.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The proposed network is quite complex, consisting of graph convolution layers (GPS, GAT), a joint multi-head cross-attention module, saliency pooling, and a meta-analysis guided loss. While each component is justified, the overall architecture has many moving parts and hyperparameters. This complexity might make it challenging to reimplement or tune for newcomers. The model’s success partly relies on having relevant meta-analysis data available. In this work, the authors were able to query BrainMap for schizophrenia, ADHD, and related cognitive domains, yielding 470 studies, which is impressive. However, for other neurological conditions or more niche cognitive metrics, such rich meta-analytic data may not exist. While the combination of components is novel, the building blocks (graph attention networks, transformers, multi-task learning, saliency maps) are drawn from existing techniques. One could argue that the paper’s novelty is more in the clever integration of these elements and use of external data, rather than a fundamentally new algorithmic theory.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method is both novel and well-grounded in neuroimaging data, presenting strong results on two well-known datasets and offering interpretability through meta-analysis. Some moderate details (meta-analysis protocol, comprehensive parameter tuning) could be clarified or expanded during rebuttal. The paper meets MICCAI’s standards for an innovative methodological contribution, with potential for clinical and research impact in the domain of neurological disease diagnosis and cognitive assessment.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

This paper introduces MAG-MT (Meta-Analysis Guided Multi-Task Graph Transformer Network), a framework that simultaneously predicts neurological diseases (schizophrenia and ADHD) and cognitive deficits from functional connectivity data. MAG-MT uses domain knowledge from meta-analyses to guide feature learning and improve the reliability of the identified biomarkers, and employs a graph transformer convolution and a joint attention module to capture shared disease-cognition features.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel usage of meta-analysis knowledge from the BrainMap database (470 studies) to constrain saliency weights to align with the reference probability maps
- Well-designed and explained multi-task pipeline that captures both shared and task-specific features through a common encoder and task-encoders
- Comprehensive experiments conducted on two datasets (COBRE and ADHD-200) with strong performance improvements over baselines
- Very strong clinical relevance and interpretations of the results, showing discriminative brain regions that correlate with BrainMap findings and other clinical findings
- Clear ablation analyses that validate both the joint attention module and meta-analysis constraints
- Visualizations are also very informative and support interpretations
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Not as clear how \mu and \lambda were selected as the best hyperparameters, despite mention of best performance from Fig. 2. Was a specific hyperparameter selection process used?
- Missing some critical experiment details. How were the datasets used (train/val/test split, cross-validation, etc.)? Was best performance based on averaged results?
- Was only the Naive Bayes classifier explored for generating reference probability maps? If so, what limitations could exist?
- It would have been interesting for the authors to also acknowledge non-FC approaches and the additional information they might provide, especially for diseases where brain morphology might play a role.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The authors claimed to release the source code and/or dataset upon acceptance of the submission.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper presents a novel integration of meta-analysis prior knowledge with deep learning that addresses a significant challenge in neuroimaging research - validating reliable biomarkers and clinical interpretability. The authors constrain model training with sufficient domain knowledge to improve neurological disease classification.

The multi-task framework implements shared and task-specific mechanisms between neurological disorders and cognitive deficits, which results in meaningful performance improvements over baselines (2.5-3.8% improvement for schizophrenia classification, 2.9% for ADHD classification).

The technical approach is sound, combining graph transformer convolution with a joint attention module to capture shared features while using saliency pooling to identify discriminative brain regions. The ablation studies also support the value of the joint attention module and meta-analysis constraints.

Most importantly, the clinical interpretability of the results is very strong. The model identifies brain regions that align with established clinical findings about schizophrenia and ADHD, with significant correlations between saliency weights and reference probability maps (r=0.53-0.69, p<0.001).

While the paper has some limitations (modest sample sizes, limited hyperparameter justification, training/test dataset details), I don’t believe these significantly outweigh the contributions of this work.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The paper introduces MAG‑MT, a meta‑analysis guided multi‑task graph transformer that jointly predicts neurological diagnoses (schizophrenia and ADHD) and their associated cognitive deficits by combining a shared GPS convolution encoder with task‑specific GAT layers and a cross‑attention module.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors present a novel incorporation of meta-analysis priors into a graph transformer framework.
- In general the paper is well written and well organised.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- This is one of the few papers I have encountered for which I have almost no major concerns. The paper and methodology are clear and well detailed. The only issue is that no code is provided.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(6) Strong Accept — must be accepted due to excellence
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Good methodology, good presentation of results.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We sincerely thank all reviewers for your time and valuable efforts.

Comment from R1: Some moderate details (meta-analysis protocol, comprehensive parameter tuning) could be clarified or expanded during rebuttal.

a. Meta-analysis protocol: For the meta-analysis, all studies included the keyword ‘fMRI’. For schizophrenia (SZ), cognition-related studies were filtered by the keywords ‘language’, ‘working memory’, and ‘social cognition’ (under Experiments → Imaging Behavioral Domain → Cognition). The publication year was set to ‘after 2011’ to match the number of studies retrieved with the keyword ‘schizophrenia’ (160 studies). For ADHD, cognition-related studies included the keywords ‘attention’, ‘working memory’, and ‘reasoning’, with an age range of 7 to 21 years, to align with the age range of subjects from the ADHD-200 dataset. The publication year was set to after 2013 to match the number of disease-related studies, resulting in the selection of 75 studies.

b. Dimension of node embedding: The dimension of node embeddings was tuned within the range [32, 64, 96, 128, 256]. For SZ classification, the highest accuracy was achieved with a dimension of 64, while for ADHD prediction, the optimal performance was obtained with a dimension of 128. These details will be included for both SZ and ADHD in the final version.

c. Number of attention heads: The number of attention heads was tuned across [1, 2, 3, 4, 6, 11]. The best results across all tasks were obtained when the number was set to 4, which was therefore adopted for all experiments.

d. Hidden nodes in MLP: Both MLP1 and MLP2 consisted of one fully connected layer with dropout and ReLU activation, followed by a second fully connected layer for the final output. The two layers contained 256 and 1 hidden units for prediction tasks, and 256 and 2 hidden units for classification tasks.
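The MLP heads described in item d can be sketched in plain Python as follows. This is only an illustration of the stated layer sizes (256 hidden units, then 1 output for the prediction task or 2 for classification). The input dimension, the dropout rate, the weight initialisation, the order of dropout and ReLU, and all function names are assumptions that the feedback does not specify.

```python
import random


def make_linear(n_in, n_out, rng):
    """Create a fully connected layer as (weights, biases) with small random weights."""
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, layer):
    """Apply a fully connected layer to a single input vector."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def mlp_head(x, fc1, fc2, drop_p=0.0, rng=None):
    """FC -> dropout -> ReLU -> FC. Dropout is active only when drop_p > 0 and rng is given."""
    h = linear(x, fc1)
    if drop_p > 0.0 and rng is not None:
        # Inverted dropout: zero some units and rescale the rest (training mode only).
        h = [0.0 if rng.random() < drop_p else v / (1.0 - drop_p) for v in h]
    h = [max(0.0, v) for v in h]
    return linear(h, fc2)


rng = random.Random(0)
in_dim = 64  # assumed input size; the feedback does not state what feeds the MLPs
x = [rng.gauss(0.0, 1.0) for _ in range(in_dim)]

cls_head = (make_linear(in_dim, 256, rng), make_linear(256, 2, rng))  # classification
reg_head = (make_linear(in_dim, 256, rng), make_linear(256, 1, rng))  # cognitive score

logits = mlp_head(x, *cls_head)  # two class scores
score = mlp_head(x, *reg_head)   # one predicted score
```

A real implementation would use the framework's own linear, dropout, and activation layers; the pure-Python version is kept here only so the layer shapes are easy to follow.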

Comment from R2: The only issue is that no code is provided. The link to our code will be added in the final version.

Comment from R3: Not as clear how μ and λ were selected as the best hyperparameters. The initial choice of μ was based on the number of activated ROIs from the meta-analysis for each condition, approximately 27 for SZ and 37 for ADHD. Accordingly, μ was tuned around 27, with candidate values of 14, 27, 40, 66, and 79. The best performance was achieved with μ = 27. The parameter λ controls the weight of the meta-analysis constraint. A straightforward approach was to set it to 1 initially. We subsequently tested values around 1 and found that setting λ to 1 yielded the best performance.

Comment from R3: How were the datasets used? We conducted 5-fold cross-validation, splitting the data into 80% training and 20% testing. We calculated the mean accuracy and standard deviation across 10 runs. We will add this description in the final version.
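The evaluation protocol described above can be sketched as repeated 5-fold cross-validation. The feedback does not say how the 10 runs relate to the folds, so this sketch assumes each run is one full 5-fold pass with a different shuffle seed, and that the reported mean and standard deviation are taken over the per-run averages. The helper names and the dummy scoring function are hypothetical stand-ins for actual model training.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed):
    """Yield (train, test) index lists for k folds of a shuffled index range."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def repeated_cv(n_samples, evaluate, k=5, n_runs=10):
    """Run k-fold CV n_runs times; return mean and std of the per-run average scores."""
    run_scores = []
    for run in range(n_runs):
        fold_scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k, seed=run)]
        run_scores.append(statistics.mean(fold_scores))
    return statistics.mean(run_scores), statistics.stdev(run_scores)


def dummy_evaluate(train_idx, test_idx):
    """Stand-in for 'train on train_idx, score on test_idx'; returns the held-out fraction."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


mean_score, std_score = repeated_cv(100, dummy_evaluate)
```

With 5 folds, each test fold holds 20% of the samples and the remaining 80% is used for training, which matches the split described in the response. For class-imbalanced data a stratified split would usually be preferable; whether the authors stratified is not stated.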

Comment from R3: Acknowledge non-FC approaches, and what additional information might be produced. As the reviewer suggested, non-FC modalities, such as brain morphology, play a critical role in disease diagnosis. Although our method is based on functional connectivity, the meta-analysis constraint concept can be readily adapted to brain morphology data. The BrainMap database also provides foci MNI coordinates in brain regions associated with neurological disease derived from T1-weighted images, which may be valuable for disease analysis based on brain morphology.

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

The reviewers all agree that the work is novel, with innovative integration of prior meta-analysis information for neuroimage analysis; has potential for impact, given the clinical interpretability; and shows strong experimental results on 2 datasets. Adding clarifications to address the reviewers’ questions will further strengthen the paper.


Meta-analysis guided multi-task graph transformer network for diagnosis of neurological disease and cognitive deficits
