Abstract

Data-sharing in neuroimaging research alleviates the cost and time constraints of collecting large sample sizes at a single location, aiding the development of foundational models with deep learning. Yet, challenges to data sharing, such as data privacy, ownership, and regulatory compliance, exist. Federated learning enables collaborative training across sites while addressing many of these concerns. Connectomes are a promising data type for data sharing and creating foundational models. Yet, the field lacks a single, standardized atlas for constructing connectomes. Connectomes are incomparable between these atlases, limiting the utility of connectomes in federated learning. Further, fully reprocessing raw data in a single pipeline is not a solution when sample sizes range from tens to hundreds of thousands. Dedicated frameworks are needed to efficiently harmonize previously processed connectomes from various atlases for federated learning. We present Federate Learning for Existing Connectomes from Heterogeneous Atlases (FLECHA) to address these challenges. FLECHA learns a mapping between atlas spaces on an independent dataset, enabling the transformation of connectomes to a common target space before federated learning. We assess FLECHA using functional and structural connectomes processed with five atlases from the Human Connectome Project. Our results show improved prediction performance for FLECHA. They also demonstrate the potential of FLECHA to generalize connectome-based models across diverse silos, potentially enhancing the application of deep learning in neuroimaging.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3324_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/qinghaoliang/Federated-learning_across_atlases

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Lia_Overcoming_MICCAI2024,
        author = { Liang, Qinghao and Adkinson, Brendan D. and Jiang, Rongtao and Scheinost, Dustin},
        title = { { Overcoming Atlas Heterogeneity in Federated Learning for Cross-site Connectome-based Predictive Modeling } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15010},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This work proposes a technique to convert connectomes from different atlas spaces into a common atlas space. This conversion allows for federated training across diverse data silos, each containing connectomes generated from varying atlases. The mapping process is learnt on the Yale dataset, while the HCP and HCP-D datasets are used separately for working memory and age prediction within a federated setting. Collaborative training across four silos is conducted using the Coef_avg, FedAvg, and FedProx algorithms, with each silo containing a private connectome dataset from a distinct atlas. The method’s effectiveness is evaluated on a fifth silo, which uses an external atlas. The dataset is partitioned into silos by separating subjects, and experiments are repeated 100 times for robustness. Results indicate that the proposed method enhances prediction performance and generalizability for working memory and age prediction tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper bridges the gap between the Federated Learning setting and the utilization of connectomics data. This is particularly innovative since the use of different atlases among users typically makes such applications challenging.
    • The method for data processing and division is well-explained in the paper. Additionally, the experimental setup, including the separation of subjects, atlases, and the use of different datasets for mapping, training and test, is both fair and robust, closely simulating real-world scenarios.
    • Despite some unclear aspects of the evaluation (as noted in weaknesses), the authors conducted a fair comparison and provided a proper statistical analysis of the results. This strengthens the credibility of the findings and ensures a thorough assessment of the proposed method’s performance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Although subjects are divided into silos, all subjects come from the same dataset, resulting in original fMRI data with a similar distribution in terms of the acquisition system. This aspect is not entirely realistic under real-world use cases, where data from different sources may have more varied distributions.
    • The paper lacks clear descriptions of the tasks and labels used in the evaluation. This makes the evaluation phase somewhat confusing. It would have been beneficial to include a concise explanation of the tasks in the main text to aid in understanding. For example, it is unclear what is being predicted when referring to working memory and age. Are these real-valued scalars? The rationale for choosing these two tasks to demonstrate the method’s effectiveness could be clarified.
    • The paper does not clearly define the label space, making it difficult to understand how the Pearson correlation coefficient is used for evaluation. A clearer explanation of the tasks and their associated labels would provide context for the performance measures used.
    • While the application of the framework may lead to potential advantages in using such types of data, the method itself is a combination of well-known and existing methods. Besides the correct application of the framework, there is no novelty regarding the use of federated algorithms, making the significance of the contribution unclear. It would be beneficial to highlight the novelty, if any, such as the mapping into a common space and the differences from approaches described in references [5] and [12].
    • The authors do not thoroughly discuss the limitations of the approach or provide insights into future directions. This weakens the claims made in the paper and reduces the strength of the conclusions. Including a discussion of limitations and potential areas for future research would strengthen the paper.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It would be beneficial to clearly define the scope of the work, whether it focuses more on the mapping strategy for connectomes to a common atlas space or the application of federated learning in connectomics, or both. Clarifying this will help readers understand the main contributions of the paper. Since the overall approach seems correct, I encourage the authors to provide a more detailed explanation of the rationale behind their choice of tasks and evaluation metrics (why were working memory and age prediction tasks selected, and how do they demonstrate the effectiveness of the proposed method?). Additionally, using more than one metric can provide a more comprehensive assessment of the model’s validity from various perspectives. This approach can enhance the robustness of the evaluation and strengthen the paper’s overall argument.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper addresses an important issue by applying federated learning to connectomics data, the methodology itself is a combination of existing methods. This makes the significance of the contribution unclear. The paper lacks clarity in explaining the tasks (working memory and age prediction) and the related performance metrics.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper presents a simple and neat approach for harmonizing atlases for federated learning for connectome-associated predictive tasks. To mitigate differences in parcellations, the authors propose to learn mappings from each client to an intermediate shared atlas space defined by an independent atlas space. The learning of such mappings is formulated as an optimal transport problem with transportation costs defined on similarities in time series. The derived mappings are then used to harmonize parcellations across clients in federated learning. The effectiveness of the proposed approach is demonstrated on two popular federated learning approaches.
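
As a rough illustration of the optimal-transport idea summarized above, the sketch below estimates a transport plan between the nodes of two atlases from paired time series (assumed here to come from the same scans parcellated with both atlases) and uses it to move node signals from a source atlas to a target atlas. It is a minimal sketch under stated assumptions, not the authors' implementation: the correlation-distance cost, the entropic Sinkhorn solver, the uniform node masses, the regularization value, and all function names are illustrative choices.

```python
import numpy as np

def correlation_cost(src_ts, tgt_ts):
    """Cost between source and target nodes: 1 - Pearson r of their time series.

    src_ts has shape (T, n_src) and tgt_ts has shape (T, n_tgt).
    (Assumed cost; the paper may define it differently.)
    """
    src = (src_ts - src_ts.mean(axis=0)) / src_ts.std(axis=0)
    tgt = (tgt_ts - tgt_ts.mean(axis=0)) / tgt_ts.std(axis=0)
    corr = src.T @ tgt / src_ts.shape[0]
    return 1.0 - corr

def sinkhorn_plan(cost, reg=0.1, n_iter=500):
    """Entropy-regularized transport plan with uniform mass on every node."""
    n_src, n_tgt = cost.shape
    a = np.full(n_src, 1.0 / n_src)   # mass on source nodes
    b = np.full(n_tgt, 1.0 / n_tgt)   # mass on target nodes
    kernel = np.exp(-cost / reg)
    u = np.ones(n_src)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)        # rescale to match target marginals
        u = a / (kernel @ v)          # rescale to match source marginals
    return u[:, None] * kernel * v[None, :]

def map_time_series(src_ts, plan):
    """Barycentric projection: each target node is a weighted mean of source nodes."""
    weights = plan / plan.sum(axis=0, keepdims=True)
    return src_ts @ weights

def connectome(ts):
    """Functional connectome as the node-by-node Pearson correlation matrix."""
    return np.corrcoef(ts, rowvar=False)
```

In this sketch the plan would be fitted once on an independent dataset and then reused at each silo, so that only the mapping, not subject data, needs to be shared.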

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The topic of atlas harmonization for connectome-based tasks is of significant practical value for supporting scientific research.

    The proposed approach is simple, elegant, and is proven effective when applied to two common federated learning approaches. It may inspire other works on mitigating covariate shifts and concept shifts in federated learning.

    This paper is well-written and I enjoy reading it.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The granularity of parcellation for the overall approach is heavily dependent on the selection of the mapped space. Therefore, would there be a risk of performance degradation when there is severe mismatch among the granularities of atlases (e.g., the mapped space is too coarse compared with those of some of the clients)?

    The proposed approach still requires the dataset of the mapped space to be distributed to the clients – readers may argue this framework therefore does not fully get rid of transmitting subject data.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    An extension of this work may benefit from detailed ablation studies on the selection of the target mappings (and therefore the associated datasets), as large gaps in signal distribution and/or granularity of parcellation may negatively affect the performance.

    In future the authors may want to think of approaches for enhancing privacy for the target dataset which for now still needs to be transmitted (e.g., through dataset distillation).

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The problem setting is of practical value; the methodology is simple and neat; the paper is well-written.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I have read through the rebuttal. The raised concerns are clarified at a reasonable level.



Review #3

  • Please describe the contribution of the paper

    The study introduces Federate Learning for Existing Connectomes from Heterogeneous Atlases (FLECHA), a framework designed to overcome atlas heterogeneity in neuroimaging with Federated learning. FLECHA efficiently maps connectomes processed with different atlases to a common space via optimal transport. The evaluation using both functional and structural connectomes demonstrates FLECHA’s prediction performance and its potential to enhance deep learning in neuroimaging.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Novel: The study introduces an innovative approach with its unique method to address atlas heterogeneity in neuroimaging.

    • Great idea: The concept of mapping diverse connectomes to a common space for federated learning is both practical and promising.

    • Great explanations and writing: The paper provides clear and thorough explanations of the methods used, ensuring readers can fully grasp the technical processes involved. Also, the writing is exceptional, offering clarity and engaging content that enhances the reader’s understanding of the research.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I do not have a lot of weaknesses for this paper, mostly nitpicking. That is why I resort to asking questions that are essential to answer in the rebuttal and questions that would be interesting in a discussion of the paper.

    Essential questions to answer:

    • Why is coef_avg not compared in fig. 4? Is it to hide that it beats DNN? Would it work better with the Atlas mapping through OT?
    • Are there patients from the same dataset present in both train and test set?

    Questions that are interesting for discussion (I do not expect answers):

    • Your mapping scales quadratically with the number of connectomes as it is a Many-to-Many mapping. What are options to alleviate the problems with an increasing number of atlases? What about a Many-to-One-to-Many mapping for better scalability in federated learning (training on the One embedding)?
    • How robust is your OT mapping to games of “telephone”, i.e., is the transport from A to B to C the same as A to C?
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I think it is fine without code release.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Trying to improve the paper by nitpicking:

    • Make sure that when citing multiple works at once they are in numerical order and separated by a space from the previous word.
    • “3 Method” section should be renamed to something more sensible, as the methods are introduced in the previous section.
    • “5 Discussion and Conclusions” should be renamed, as there is nothing discussed and only positive aspects are reflected => It is only a Conclusion.
    • Fig. 2 b: the illustration of train/test scenarios is cluttered and potentially confusing. Maybe remove or rework it?
    • Fig. 3 and 4: using the atlas name for both the single-silo methods and the data silo is confusing at first. Maybe use (silo 1,2,…,5) on the x axis in addition to atlas names, or give the X axis and legend proper names like “Silos” and “Methods”.
    • “FLECHA broadens feature exposure and introduces adversarial robustness,” I think the adversarial robustness is an overstatement, as the paper does not evaluate intentional attempts to manipulate the data in order to make the model fail, i.e., specially engineered inputs. I would suggest writing “introduces cross-silo robustness” instead, as this aligns better with what has been shown in the experiments.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall great paper, couldn’t be much happier. However, the answers to the essential questions will determine final rating.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    My voting remains the same.




Author Feedback

Overall, reviewers found our work to be “simple, elegant, and effective (R1)” & a “great paper, couldn’t be much happier (R4)”. The main criticism was from R3 around motivation and novelty. We hope to clarify several points below.

R3-Motivation+Novelty Our method is specifically designed to improve predictive modeling based on connectomes, a widely used biomarker for investigating brain-behavior associations. FLECHA’s novelty lies in overcoming regulatory constraints on sharing raw imaging data, enabling the aggregation of predictive information from distributed heterogeneous connectome data. Refs. [3]+[12] introduced the general mapping methods but did not integrate advanced machine learning, like federated learning. Our goal is to apply federated learning in connectomics while accounting for heterogeneity caused by different atlases.

R3-Choice of Tasks, Labels, & Evaluation Metrics Age and working memory were chosen as the primary predicted variables as they are easy to measure, are common benchmarks in brain-behavior models, and have significant clinical relevance. Structural connectomes have been shown to be strongly correlated with age, making them suitable for predicting age. In contrast, functional connectomes strongly correlate with complex higher-order cognitive functions, like working memory. Age (in months) and working memory scores are treated as continuous variables. Predictive performance is assessed using Pearson correlation or mean square error (MSE) between observed and predicted values. Pearson correlation was chosen for its standardization and comparability across measures and studies. Although not shown due to space constraints, results were similar with MSE. Future work will include more tasks, encompassing classification and other labels.
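
For readers unfamiliar with the two metrics named above, the following minimal sketch computes Pearson correlation and MSE between observed and predicted continuous labels. The function names and the example ages (in months) are hypothetical and are not taken from the paper.

```python
import numpy as np

def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    o_c = o - o.mean()
    p_c = p - p.mean()
    return float((o_c @ p_c) / np.sqrt((o_c @ o_c) * (p_c @ p_c)))

def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean((o - p) ** 2))

# Hypothetical ages in months and two sets of predictions.
age = np.array([100.0, 130.0, 160.0, 190.0, 220.0])
shifted = age + 12.0          # perfectly ordered but biased by one year
print(pearson_r(age, shifted), mse(age, shifted))
```

Pearson correlation is unchanged by shifting or rescaling the predictions, which makes it comparable across measures with different units, whereas MSE also penalizes a constant bias such as the 12-month offset in the example.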

R1-Does Parcellation Granularity Mismatch Degrade Performance The atlases used in our study vary significantly. For example, the Dosenbach atlas is the smallest (by >25%), with 160 nodes, and does not cover the whole brain. Despite these differences, our results show robust performance across federated learning scenarios for that atlas. Further, common atlases are within the 200-400 node range used here, suggesting similar performance for other popular atlases not tested. Still, we expect FLECHA’s performance to degrade at some point. Ablation studies with a broader array of atlases, varying in node number and coverage, will be needed to understand which atlases are too coarse for FLECHA.

R1-Privacy Concerns We want to clarify that FLECHA does not transmit the target dataset. Only parameters need to be transmitted. These parameters include the between-atlas mappings (pre-trained on external data and centralized) and site-specific models. Still, we agree with the reviewer that the privacy-preserving aspect could be enhanced. Future research could explore machine unlearning to remove the impact of a silo from the centralized model.

R3+4-Realism of Data Distribution All participants came from one dataset. Still, no individual was in two silos (i.e., no data leakage). This choice was deliberate for experimental control. It allows us to specifically correct the data heterogeneity problem caused by using different atlases in preprocessing. If different scanners and sites were used, we could not separate which domain shift affects prediction. FLECHA is flexible and can be added to existing methods for scanner-induced domain shifts. Future work is needed to benchmark FLECHA in real-world cases.
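
The no-leakage property described above (no individual in two silos) can be sketched as a random partition of subject IDs into disjoint groups. This is only an illustrative sketch: the function name, the integer subject IDs, and the use of the repetition index as the random seed are assumptions, not details from the paper.

```python
import numpy as np

def split_into_silos(subject_ids, n_silos, seed):
    """Randomly partition subjects into n_silos disjoint, near-equal groups."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(subject_ids)
    return [part.tolist() for part in np.array_split(shuffled, n_silos)]

# Hypothetical example: 1000 subjects, five silos, one split per repetition.
subjects = list(range(1000))
for repetition in range(3):
    silos = split_into_silos(subjects, n_silos=5, seed=repetition)
    print([len(s) for s in silos])
```

Each silo would then keep only the connectomes computed with its own atlas, so atlas choice is the only systematic difference between silos.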

R4-Absence of Coef_avg in Fig 4 We excluded Coef_avg from our DNN model analyses as it is equivalent to FedAvg with only one communication round, leading to underfitting. This exclusion was intended to provide a clearer comparison of more robust configurations.
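
The equivalence stated above can be illustrated on a toy problem. The sketch below assumes a linear model trained by full-batch gradient descent, equal-sized silos (so the server uses a plain mean), a zero initialization, and synthetic data; none of these details come from the paper, and it does not reproduce the DNN experiments. Under those assumptions, averaging independently trained coefficients gives the same weights as FedAvg with a single communication round, while additional rounds reduce the error further.

```python
import numpy as np

def local_train(w, X, y, steps, lr=0.05):
    """Run a few full-batch gradient steps on the squared error at one silo."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def coef_avg(silos, dim, steps):
    """Train each silo independently from the same start, then average once."""
    w0 = np.zeros(dim)
    return np.mean([local_train(w0, X, y, steps) for X, y in silos], axis=0)

def fedavg(silos, dim, rounds, steps):
    """FedAvg with equal-sized silos: local training, then server averaging."""
    w = np.zeros(dim)
    for _ in range(rounds):
        w = np.mean([local_train(w, X, y, steps) for X, y in silos], axis=0)
    return w

def global_mse(w, silos):
    """Mean squared error averaged over all silos."""
    return float(np.mean([np.mean((X @ w - y) ** 2) for X, y in silos]))

# Synthetic data: four silos sharing one underlying linear relationship.
rng = np.random.default_rng(0)
w_true = rng.standard_normal(5)
silos = []
for _ in range(4):
    X = rng.standard_normal((100, 5))
    y = X @ w_true + 0.1 * rng.standard_normal(100)
    silos.append((X, y))

w_coef = coef_avg(silos, dim=5, steps=5)
w_one = fedavg(silos, dim=5, rounds=1, steps=5)
w_many = fedavg(silos, dim=5, rounds=50, steps=5)
print(np.allclose(w_coef, w_one), global_mse(w_one, silos), global_mse(w_many, silos))
```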

R1+3-Limitations and Reproducibility We have room to include a few lines about limitations, reflecting the points above. Also, we did not release code to maintain blinding but will upon the paper’s acceptance.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The reviewers inquired about the motivation, innovation, dataset, and data privacy issues. The authors provided clear explanations regarding their novelty, dataset design rationale, and plans for data sharing. They also committed to enhancing privacy protection measures and publishing their code. Therefore, considering these responses, I recommend acceptance.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The reviewers inquired about the motivation, innovation, dataset, and data privacy issues. The authors provided clear explanations regarding their novelty, dataset design rationale, and plans for data sharing. They also committed to enhancing privacy protection measures and publishing their code. Therefore, considering these responses, I recommend acceptance.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors’ rebuttal has comprehensively addressed reviewers’ concerns within the limited space.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The authors’ rebuttal has comprehensively addressed reviewers’ concerns within the limited space.


