Abstract

Decoding cognitive states from functional magnetic resonance imaging is central to understanding the functional organization of the brain. Within-subject decoding avoids between-subject correspondence problems but requires large sample sizes to make accurate predictions; obtaining such large sample sizes is both challenging and expensive. Here, we investigate an ensemble approach to decoding that combines classifiers trained on data from other subjects to decode cognitive states in a new subject. We compare it with the conventional decoding approach on five different datasets and cognitive tasks. We find that it outperforms the conventional approach by up to 20% in accuracy, especially for datasets with limited per-subject data. The ensemble approach is particularly advantageous when the classifier is trained in voxel space. Furthermore, a Multi-layer Perceptron turns out to be a good default choice as an ensemble method. These results show that the pre-training strategy reduces the need for large amounts of per-subject data.
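
To make the ensemble strategy concrete, here is a minimal sketch of a stacking-style across-subject ensemble (the LinearSVC base classifiers, the MLP combiner, and all function names are illustrative assumptions, not the authors' exact pipeline):

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier

def fit_subject_classifiers(subject_data):
    """Train one base classifier per previously seen subject.
    subject_data: list of (X, y) pairs, one pair per subject."""
    return [LinearSVC().fit(X, y) for X, y in subject_data]

def fit_combiner(classifiers, X_new, y_new):
    """Stack the other-subject classifiers' decision scores and train an
    MLP combiner on the (few) labelled trials of the new subject."""
    scores = np.column_stack(
        [clf.decision_function(X_new) for clf in classifiers])
    return MLPClassifier(max_iter=1000).fit(scores, y_new)

def decode_new_subject(classifiers, combiner, X_test):
    """Decode unseen trials of the new subject with the stacked ensemble."""
    scores = np.column_stack(
        [clf.decision_function(X_test) for clf in classifiers])
    return combiner.predict(scores)
```

With only a handful of labelled trials from the new subject, the combiner merely has to learn how to weight the other subjects' decision scores, a much lower-dimensional problem than fitting a decoder in voxel space from scratch.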

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2040_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2040_supp.pdf

Link to the Code Repository

https://github.com/man-shu/ensemble-fmri

Link to the Dataset(s)

https://doi.org/10.5281/zenodo.12204275

BibTex

@InProceedings{Agg_Acrosssubject_MICCAI2024,
        author = { Aggarwal, Himanshu and Al-Shikhley, Liza and Thirion, Bertrand},
        title = { { Across-subject ensemble-learning alleviates the need for large samples for fMRI decoding } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper investigates an ensemble learning approach for decoding cognitive states from functional magnetic resonance imaging (fMRI) data. The key contribution is the development of a method that leverages classifiers trained on data from multiple subjects to enhance the accuracy of cognitive state predictions in new subjects, especially when per-subject data is limited. The study also provides a theoretical analysis based on bias-variance decompositions, supporting the practical findings. Furthermore, the paper explores the impact of using different feature spaces and classifiers within the ensemble framework, offering insights into the conditions under which ensemble learning is most beneficial.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The primary strengths of this paper include:

    1) Multi-Dataset Empirical Validation: The paper's strength lies in its systematic evaluation across multiple and varied fMRI datasets. This extensive testing not only demonstrates the robustness of the ensemble approach but also provides a comprehensive understanding of its performance under different experimental settings. The lack of comparison with cutting-edge methods like graph neural networks is noted; however, the thorough comparison with traditional methods is still a significant asset.

    2) Theoretical Rigor in Mathematical Explanation: A standout aspect of the paper is the mathematical explanation for the effectiveness of ensemble learning in fMRI decoding tasks. The authors provide a bias-variance analysis that not only justifies the empirical results but also offers a deeper understanding of why and how the ensemble method improves decoding accuracy, particularly in low-sample scenarios.

    3) Practical Implications for Clinical and BCI Applications: The paper highlights the potential practical implications of the proposed ensemble approach, especially in the context of clinical applications and brain-computer interfaces (BCI), where data scarcity is a prevalent challenge. This focus on real-world applicability adds significant value to the work, suggesting that the method could lead to advancements in these domains.
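
    For reference, the generic bias-variance decomposition underlying such arguments can be written as follows (a textbook version for an averaging ensemble of per-subject predictors, not necessarily the exact analysis given in the paper):

```latex
% Averaging ensemble of M per-subject predictors f_m, with mean member
% variance \bar{\sigma}^2 and mean pairwise covariance \bar{c}:
\[
  \bar{f}(x) = \frac{1}{M}\sum_{m=1}^{M} f_m(x), \qquad
  \mathbb{E}\!\left[(\bar{f}(x)-y)^2\right]
  = \underbrace{\bigl(\mathbb{E}[\bar{f}(x)]-y\bigr)^2}_{\text{bias}^2}
  + \underbrace{\frac{\bar{\sigma}^2}{M} + \frac{M-1}{M}\,\bar{c}}_{\text{variance}}
\]
% Averaging leaves the bias roughly at the level of the individual members
% but shrinks the variance toward \bar{c}, so the gain is largest when
% single-subject classifiers are high-variance, i.e. trained on few samples.
```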

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) Limited Comparison with State-of-the-Art Methods: The paper does not include a comparison with the latest deep learning approaches, such as graph neural networks, which are increasingly being used for neuroimaging data analysis (e.g., Li et al., 2023 [1]). This limits the ability to fully assess the novelty and superiority of the proposed ensemble method over the most advanced techniques.

    2) Scope of Evaluation: While the paper provides a robust evaluation across five different datasets, the choice of classifiers (MLP, SVC, and Random Forest) is somewhat traditional. The lack of inclusion of more recent machine learning models may restrict the evaluation's comprehensiveness and the paper's impact on the field.

    3) Clinical Feasibility and Generalizability: The paper suggests potential clinical applications but does not provide a direct demonstration of clinical feasibility or extensive testing on diverse subject populations. This limitation leaves open questions about how the findings generalize beyond the specific datasets used in the study.

    [1] C. Li et al., "Individualized Assessment of Brain Aβ Deposition With fMRI Using Deep Learning," IEEE Journal of Biomedical and Health Informatics, vol. 27, no. 11, pp. 5430-5438, Nov. 2023.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Comparison with Cutting-Edge Techniques: The ensemble learning approach presented is a valuable contribution to fMRI decoding. However, to strengthen the paper's impact, I recommend comparing the proposed method with state-of-the-art techniques, such as graph neural networks, which have demonstrated effectiveness in neuroimaging analyses. This comparison will situate your work within the current research landscape.

    2) Cross-Dataset Generalizability: While the systematic evaluation across five datasets is commendable, assessing the model's generalizability across a broader range of datasets, including those with different subject demographics or clinical conditions, would enhance the paper's conclusions. Consider testing on additional datasets to bolster the evidence for the method's robustness.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The “weak reject” recommendation stems from the paper’s constrained technical innovation, particularly in the context of rapidly evolving machine learning methodologies. While the ensemble approach demonstrates utility in fMRI decoding, the study’s impact is muted by the lack of engagement with the latest advancements, which are pivotal for assessing the true novelty and applicability of the proposed method.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    Thank you for your rebuttal. I apologize for my limited expertise in functional MRI-related tasks, which may prevent me from providing a fully objective and professional evaluation of this paper. Based on my understanding of the submission, the rebuttal, and the opinions of the other reviewers, I see this paper as having more application value than substantial algorithmic innovation. However, as noted in the initial review phase by myself and other reviewers, I suggest the authors expand the discussion on applications such as BCI to enhance its practical value. Considering the above points, I have reassessed this submission as a weak accept at this stage.



Review #2

  • Please describe the contribution of the paper

    This study investigates the potential of employing an ensemble approach to decode cognitive states from fMRI signals. This method leverages patterns learned from an ensemble of other subjects, demonstrating superior performance compared to conventional decoding approaches in four out of the five datasets examined.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This study identifies an intriguing limitation in a recent investigation concerning fMRI-based brain activity encoding. As a contribution, it concentrates on combining models and learning from an ensemble of individuals for decoding fMRI brain activity. By collecting various fMRI datasets and tasks, the study enables robust analysis. Moreover, observed results suggest that this approach enhances accuracy, particularly when data is limited, with the gains becoming more pronounced as the number of subjects in the ensemble increases.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) This study provides an interesting proposal on fMRI activity decoding using ensemble models; however, it is not clear what motivated the authors' choice of models.

    2) It is difficult to reach a fair judgment on the classifiers used for the analysis, since the parameters are not always the same across the selected models.

    3) The discussion section provides very important details on the study; however, the justification of the benefit of this study should be made clearer to enhance reader comprehension.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The authors cite a major motivation for this study from a recent related work; however, the importance of this work for wider applications is not fully clear. Another recommended modification is a rewrite of the conclusion section to improve clarity.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work proposes across-subject ensemble learning for fMRI decoding, with extensive experiments that demonstrate significant findings. I recommend a weak accept because it is important to modify the current version of the manuscript to improve reader comprehension.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The rebuttal responses are satisfactory in responding to the main concerns raised about this paper. I recommend the acceptance (5) because of the significance and clarity of the proposed work.



Review #3

  • Please describe the contribution of the paper

    Application of a combination of models learned from an ensemble of individuals to perform brain decoding of a new subject from fMRI, evaluated on different public datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novel application tested on different publicly available datasets. Performance is tested using proper statistical analysis.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    MAJOR

    Reproducibility

    1) The authors did not report sufficient details on the architecture of the classifiers that constitute the different ensembled models, e.g., MLP (how many layers, number of nodes, activation function), Random Forest (number of decision trees, maximum depth, etc.).

    2) No details are given concerning the hardware and software platform used to implement and train the models (the authors only mention the Seaborn library, used to estimate the uncertainty in the average accuracy).

    Clarity

    3) The description of the data-splitting procedure could be improved. In particular, the statement “Within each dataset, we always test the trained model on 10% of the samples per subject. The training set sizes consist of ten geometrically increasing sub-samples of the remaining 90% samples available for each subject in each dataset.” might erroneously suggest the presence of data leakage when testing the generalization capabilities of the model.

    MINOR

    4) Figures: the x-axis is missing in subplots 1 and 2 of Fig. 2 and Fig. 3.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Described in detail in the weaknesses section.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The clarity of the paper might benefit if the authors provided more details on the labels predicted in each task (e.g., in Table 1).

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The application is innovative and might have impact, as the shortage of large samples is a relevant problem in the medical imaging domain.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors responded satisfactorily to the request




Author Feedback

Thanks to all reviewers for their comments. We found your points valid and constructive, and we are grateful for your help in improving this work. Here we address your comments:

1) Comparison with Graph Neural Networks (GNNs) (reviewer #5): We compared our method with GNNs under the conventional settings for all five datasets. GNNs had the worst average performance (like the MLP): Neuromod 40%, AOMIC 36.6%, Forrest 24.5%, BOLD 30.2%, RSVP-IBC 27.2%. We suspect that this is because such neural network approaches are particularly data-hungry. However, our focus here is on scarce-data conditions (at most 82 trials per class), and it is these conditions that remain of interest for most researchers, simply because not everyone can acquire several hours of fMRI data from hundreds of subjects. Under these conditions, linear classifiers are still the state of the art. Previous GNN studies such as Li et al. 2023 were not operating under scarce data. Further, as mentioned briefly in the discussion section, the proposed ensemble framework allows stacking GNNs trained on individual subjects, which might improve their performance (as it did with the MLP).

2) Clarity:

a) Motivation behind the selection of models (reviewer #3): We include various model families, all with distinct inductive biases. Linear SVC represents the linear classifiers that are still good enough for such classification tasks under scarce data availability; Random Forest represents ensembling methods; and MLP represents neural networks.

b) The benefit of the study (reviewer #3): We have already mentioned the usefulness of this method for BCI applications in the discussion section. However, this method could also be useful in both basic clinical and cognitive research, where a given disorder or cognitive domain could not be decoded accurately due to small sample sizes or low signal-to-noise ratio in conventional settings.

c) Data-splitting procedure (reviewer #4): This could be rephrased as: “Within each dataset, we kept 90% of the data for training and 10% for testing. We varied the size of the training set over 10 geometrically increasing sub-samples of that initial 90% training split and always tested the trained model on the same 10% testing split.”

d) Predicted labels (reviewer #4):
Neuromod – body / face / place / tools images
AOMIC – negative / neutral emotion images and cues for negative / neutral emotion images
Forrest – ambient / country / metal / rocknroll / symphonic music
BOLD5000 – animal / artifact / food / plant images
RSVP-IBC – type of text: jabberwocky / complex / simple / word list / pseudoword list / consonant strings

3) Cross-dataset generalizability (reviewer #5): We had this in mind but were constrained by the lack of availability of good-quality data in which two different subject pools performed the same tasks. We touched upon this point briefly in the discussion section and agree that it would be interesting to investigate this in future work.
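
To illustrate the splitting scheme restated in point 2c above, here is a minimal sketch under the stated 90/10 split with ten geometrically increasing training sizes (the minimum size of 5 samples, the stratification, and all names are illustrative assumptions, not taken from the authors' code):

```python
import numpy as np
from sklearn.model_selection import train_test_split

def training_size_splits(X, y, n_sizes=10, test_frac=0.1, seed=0):
    """Hold out a fixed 10% test split, then yield geometrically
    increasing training subsets drawn from the remaining 90%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_frac, stratify=y, random_state=seed)
    # ten geometrically spaced training-set sizes, up to the full 90% split
    sizes = np.unique(np.geomspace(5, len(X_train), num=n_sizes).astype(int))
    rng = np.random.default_rng(seed)
    for n in sizes:
        idx = rng.choice(len(X_train), size=n, replace=False)
        # train on (X_train[idx], y_train[idx]); always evaluate on the
        # same held-out (X_test, y_test)
        yield X_train[idx], y_train[idx], X_test, y_test
```

Because the 10% test split is fixed before any sub-sampling, growing the training set never touches the held-out data, which is the point raised in the reviewers' data-leakage concern.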
4) Hyperparameters of the models, hardware/software specs (reviewers #3 and #4): We used the default scikit-learn parameters for all models, with a few minor changes:
MLP: 100 hidden layers, ReLU activation, ADAM solver, 1000 iterations
Random Forest: 500 trees; for the maximum depth of a tree, nodes are expanded until all leaves are pure or until all leaves contain fewer than 2 samples
Linear SVC: L2 penalty, squared hinge loss, and the algorithm to solve the dual optimization problem set to automatic
Software: Python v3.11.4, Scikit-Learn v1.3.0, Nilearn v0.10.4, Numpy v1.25.2 and Scipy v1.11.1
Hardware: CPU-based cluster with 72 CPUs and 376 GB of RAM

5) Reproducible code and data links (all reviewers): We did not provide the GitHub repository because that would have conflicted with double-blind review. We will make sure to provide all the details in the camera-ready version for the final submission. All the datasets we used are publicly available and are accompanied by appropriate citations through which the access information can be found.
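
For reproducibility, the reported settings map onto scikit-learn estimators roughly as follows (a sketch only; in particular, "100 hidden layers" is interpreted here as scikit-learn's default hidden_layer_sizes=(100,), i.e., a single hidden layer of 100 units, which is an assumption on our part):

```python
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC

# MLP: ReLU activation and ADAM solver are scikit-learn defaults;
# max_iter raised to 1000 as stated in the rebuttal.
mlp = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                    solver="adam", max_iter=1000)

# Random Forest: 500 trees; nodes expanded until leaves are pure or
# contain fewer than min_samples_split=2 samples (defaults otherwise).
rf = RandomForestClassifier(n_estimators=500, max_depth=None,
                            min_samples_split=2)

# Linear SVC: L2 penalty, squared hinge loss; dual="auto" is available
# from scikit-learn 1.3 onwards.
svc = LinearSVC(penalty="l2", loss="squared_hinge", dual="auto")
```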




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors are able to provide satisfactory replies to the reviewers' comments.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper presents an ensemble learning framework that allows across-subject statistical analysis of the fMRI signal, partially addressing the sample-size issue in traditional GLM analysis. Considering the wide application of GLM for fMRI analysis, this work should be sufficiently impactful to the field.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



