Abstract

Existing studies in fMRI analysis leverage the masked autoencoder, a self-supervised framework, to build models that learn representations and make predictions for various fMRI-related tasks. This involves pretraining the model by reconstructing signals of brain regions that are randomly masked at different time segments and subsequently fine-tuning it for prediction tasks. Though this has improved performance in prediction tasks, we argue that directly applying this framework to fMRI data may yield sub-optimal results. Firstly, random masking is ineffective for highly redundant fMRI data. Secondly, the reconstruction process is not task-aware, ignoring a critical phenomenon: the varying contributions of different brain regions to different prediction tasks. In this work, we propose and demonstrate a hypothesis that learning representations by reconstructing signals from important regions of interest (ROIs) at different time segments can enhance prediction performance. Specifically, we introduce a novel learning framework, Task-Aware Reconstruction Dynamic Representation Learning (TARDRL), to improve prediction performance through task-aware reconstruction. Our approach incorporates an attention-guided masking strategy, which leverages attention maps from the prediction process to guide signal masking during reconstruction, making the reconstruction task-aware. Extensive experiments show that our model outperforms state-of-the-art methods on the ABIDE and ADNI datasets, with high interpretability.
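
The attention-guided masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the `top_k` candidate pool, and the use of a subsampling fraction `mu` (a parameter the reviews mention only in passing) are assumptions made for illustration, and the importance scores are invented.

```python
import random

def attention_guided_mask(roi_scores, num_segments, top_k, mu, seed=0):
    """Return a num_segments x num_rois 0/1 mask (1 = masked).

    Hypothetical sketch: roi_scores stands in for per-ROI importance
    derived from an attention map; mu is the fraction of the top_k
    important ROIs that is masked in each time segment.
    """
    rng = random.Random(seed)
    num_rois = len(roi_scores)
    # Rank ROIs by attention-derived importance, highest first.
    ranked = sorted(range(num_rois), key=lambda i: roi_scores[i], reverse=True)
    candidates = ranked[:top_k]
    num_masked = max(1, round(mu * len(candidates)))
    mask = []
    for _ in range(num_segments):
        # Draw a different subset of important ROIs for every segment.
        chosen = set(rng.sample(candidates, num_masked))
        mask.append([1 if i in chosen else 0 for i in range(num_rois)])
    return mask

scores = [0.05, 0.30, 0.10, 0.25, 0.02, 0.28]  # made-up importance scores
mask = attention_guided_mask(scores, num_segments=3, top_k=3, mu=0.67)
```

In contrast to random masking, only ROIs ranked as important (here indices 1, 3 and 5) can ever be masked, so the reconstruction target is tied to the prediction task.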

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0249_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0249_supp.pdf

Link to the Code Repository

https://github.com/WENXUYUN/TARDRL/

Link to the Dataset(s)

http://preprocessed-connectomes-project.org/abide/
https://adni.loni.usc.edu/

BibTex

@InProceedings{Zha_TARDRL_MICCAI2024,
        author = { Zhao, Yunxi and Nie, Dong and Chen, Geng and Wu, Xia and Zhang, Daoqiang and Wen, Xuyun},
        title = { { TARDRL: Task-Aware Reconstruction for Dynamic Representation Learning of fMRI } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    1. Task-Aware Reconstruction Dynamic Representation Learning (TARDRL), proposed in this paper, introduces a task-aware reconstruction approach that focuses on reconstructing signals from regions of interest (ROIs) that are important for the prediction task, rather than from randomly selected ROIs.
    2. The framework, which incorporates multi-task learning, uses attention maps generated during the prediction phase to identify and mask important ROIs, making the reconstruction process more relevant to the downstream prediction tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    TARDRL introduces a task-aware reconstruction approach. The attention-guided masking strategy is an innovative approach to making the reconstruction process more aligned with the prediction task, potentially leading to better representation learning. The integration of spatial and temporal transformers in the shared encoder is a novel architectural choice that allows for the capture of both spatial and temporal dependencies in fMRI data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The novelty of the method needs to be improved. In deep learning, using attention to guide masking is not a new concept, and the model architecture offers little innovation.
    2. The experiments do not include a comparison with the SOTA method (https://doi.org/10.48550/arXiv.2307.10181).
    3. The methods and experimental settings are not explained in enough detail, for example, how the length of the time segments is determined and why the four categories in ADNI are merged into two.
    4. The font of the formulas in the main framework diagram (Fig. 1) is too small to read clearly.
    5. The motivation is similar to that of a prior work ("Learning transferrable and interpretable representation for brain network", OpenReview). This paper adds a reconstruction task to form a multi-task setup for exploring the model's interpretability. However, the results report only the top two brain areas, which does not strongly validate the reconstruction task; it remains unclear whether the model can really sense the affected brain areas effectively.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    First, the motivation of the article needs to be improved. Second, the readability needs to be improved: the main methods are not explained clearly, which makes the proposed model hard to understand. Third, some details could be refined to raise the quality of the article. Finally, I think the reconstruction task (a self-supervised task) is mainly about learning a strong and robust representation space that generalizes better. This article adds the reconstruction task to the prediction task as a constraint; whether the reconstruction task really learns task-relevant information remains to be examined.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1. Improve the novelty of the article and emphasize what is different about the attention-guided masking. 2. Add comparative experiments against the SOTA. Are the results of the other methods also obtained under this article's experimental settings? 3. Explain the effectiveness of the reconstruction task more clearly. 4. Resize the text in the figures so that it is legible.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    First, the motivation of the article needs to be improved. Second, the readability needs to be improved: the main methods are not explained clearly, which makes the proposed model hard to understand. Third, some details could be refined to raise the quality of the article. Finally, I think the reconstruction task (a self-supervised task) is mainly about learning a strong and robust representation space that generalizes better. This article adds the reconstruction task to the prediction task as a constraint; whether the reconstruction task really learns task-relevant information remains to be examined.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors clearly explain the differences between this work and previous work, as well as the innovation of this article.



Review #2

  • Please describe the contribution of the paper

    This paper presents a novel spatio-temporal transformer framework for representation learning on brain fMRI data. By jointly training the model on prediction and reconstruction tasks, the authors show evidence that the proposed model outperforms several baseline architectures.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed TARDRL architecture is a novel application of the transformer architecture specialized for fMRI data. It encodes fMRI data spatio-temporally using spatial transformer and temporal transformer layers, respectively. The authors show TARDRL’s superior performance in comparison with baseline models.

    2. This paper is one of the first attempts to learn general representations of brain function from fMRI. Although the proposed approach is aided by shared weights learned in a supervised prediction task, the attempt is novel.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Although the architecture is described as task-aware, the authors apply non-overlapping sliding-window truncation to resting-state fMRI data and consider each sliding window a ‘task’.

    2. The authors extract attention weights from the transformer layers and interpret them as the importance of ROIs. This is not a novel idea, and it may suffer from a deficiency related to input magnitudes. For example, an input ROI with small magnitude might receive a high attention weight yet still contribute less than other ROIs. This potential deficiency is shared by attention-based importance scores in general. In my view, it would be better to comment on this matter.
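
The magnitude concern in point 2 can be made concrete with a small sketch. It is not taken from the paper: it assumes, purely for illustration, that an ROI's contribution is approximated by its attention weight multiplied by the norm of its value vector, and the numbers are invented.

```python
import math

def norm_weighted_importance(attn_weights, value_vectors):
    """Scale each attention weight by the L2 norm of its value vector.

    Hypothetical helper: the returned scores are normalized to sum to 1
    so they can be compared directly with the raw attention weights.
    """
    norms = [math.sqrt(sum(x * x for x in v)) for v in value_vectors]
    raw = [a * n for a, n in zip(attn_weights, norms)]
    total = sum(raw)
    if total == 0:
        return [0.0 for _ in raw]
    return [r / total for r in raw]

attn = [0.6, 0.3, 0.1]                          # ROI 0 has the highest attention
values = [[0.1, 0.0], [1.0, 1.0], [2.0, 0.0]]   # but a very small value vector
importance = norm_weighted_importance(attn, values)
```

In this toy case ROI 0 receives the largest attention weight, yet ROI 1 has the largest norm-weighted score, which is the kind of mismatch the reviewer describes.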

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The shared anonymized source code contains only the model architecture. There is no training code and there are no instructions, so reproducibility is currently limited. Do the authors intend to update the repository after the review period?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. This paper presents a task-aware architecture for fMRI data, but the experiments are conducted on resting-state fMRI using fixed-size sliding-window truncation. More discussion of the choice of window size, or an extension of the experiments to task-based fMRI, would clarify the message.

    2. In Table 2, the bold formatting is missing for the ABIDE AUROC entry.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed model architecture is explained clearly and shown to be effective in experiments. Although the presentation of ideas is confusing in places, the method itself is novel and reasonable from my perspective.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work proposes a new fMRI-task-aware region-of-interest (ROI) masking strategy to improve the performance of disease classification and fMRI data reconstruction.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • the authors proposed a novel attention-based masking strategy
    • the use of attention weights to estimate the discriminative ROIs that have the highest contribution to model prediction for individual functional brain network
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • lack of investigation into the effect of μ (subsampling of the important ROIs) on model performance
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • The authors should consider predicting the task-based fMRI activation map on a voxel by voxel basis, instead of the entire task-based fMRI signal volumes. It would be interesting to see how the model performs as compared to baseline models.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors propose a somewhat novel ROI masking strategy to improve model performance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have adequately addressed the comments of the reviewers




Author Feedback

We thank all reviewers for their valuable comments. We will release the complete code upon acceptance.

[R1] Q1 Motivation and novelty: Our motivation differs from that of BrainMAE (Yang et al., 2024). BrainMAE uses the MAE to construct a highly transferable fMRI pre-training model, addressing the challenge of generalizing fMRI features across different tasks (such as applying ASD feature extraction modules to AD). We incorporate the MAE (i.e., fMRI reconstruction) into disease prediction models as an auxiliary task to learn more robust task-oriented fMRI representations, which substantially improves disease detection accuracy relative to the BrainMAE-style approach (see Table 1).
Correspondingly, the innovations of our work include: 1) Unlike the conventional sequential design of MAE followed by a specific prediction task, we set up the MAE and the prediction task in parallel to learn robust task-oriented features. 2) We leverage the attention maps from the specific task to guide the fMRI time series reconstruction, which in turn helps the model learn task-aware fMRI representations, thereby improving disease prediction accuracy. These motivations and innovations will be included in our final version.
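
A minimal sketch of the parallel (rather than sequential) setup described above: a single objective adds a reconstruction term, computed on masked entries only, to the prediction loss. The weighting factor `rec_weight` and the use of mean squared error are assumptions made for illustration; the rebuttal does not specify either.

```python
def masked_mse(original, reconstructed, mask):
    """Mean squared error over masked entries only (mask value 1 = masked)."""
    errors = [
        (o - r) ** 2
        for row_o, row_r, row_m in zip(original, reconstructed, mask)
        for o, r, m in zip(row_o, row_r, row_m)
        if m == 1
    ]
    return sum(errors) / len(errors) if errors else 0.0

def joint_loss(prediction_loss, original, reconstructed, mask, rec_weight=1.0):
    """Hypothetical joint objective: prediction loss plus weighted reconstruction loss."""
    return prediction_loss + rec_weight * masked_mse(original, reconstructed, mask)

signal = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # toy ROI-by-time signal
recon = [[1.0, 2.5, 3.0], [3.0, 5.0, 6.0]]    # toy reconstruction
rec_mask = [[0, 1, 0], [1, 0, 0]]             # entries that were masked
loss = joint_loss(0.7, signal, recon, rec_mask, rec_weight=0.5)
```

Because both terms are optimized together, gradients from the prediction task and from the reconstruction task reach the shared encoder in the same training step, which is the contrast with pretrain-then-fine-tune pipelines that the rebuttal draws.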

[R1] Q2 Effectiveness of task-aware reconstruction: We agree that the model’s effectiveness and interpretability deserve more investigation. For effectiveness, using the Fréchet distance, we found that the discrepancy between the fMRI representations of healthy controls and those of patients is significantly greater for our model than for BrainMAE, indicating that our fMRI reconstruction process perceives disease (task)-related features more effectively. Regarding interpretability, considering that various human behaviors are mediated by neural circuits, the submitted version focused on functional systems and found significant differences in the impact of ASD and AD on these functional systems. Additionally, we visualized the importance of all brain regions in disease classification based on attention maps. The distribution of brain areas aligns with existing findings. All the above results will be included in the final submission.
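
The rebuttal does not say which form of the Fréchet distance was used. As a sketch only, the version below assumes each group's representations are summarized by a Gaussian with diagonal covariance, in which case the distance reduces to the Euclidean distance between the means combined with that between the per-dimension standard deviations. The function names and toy data are hypothetical.

```python
import math

def mean_and_std(samples):
    """Per-dimension mean and (population) standard deviation."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((s[d] - means[d]) ** 2 for s in samples) / n)
        for d in range(dim)
    ]
    return means, stds

def frechet_distance_diag(samples_a, samples_b):
    """Fréchet distance between two diagonal-covariance Gaussian fits."""
    mu_a, sd_a = mean_and_std(samples_a)
    mu_b, sd_b = mean_and_std(samples_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    cov_term = sum((a - b) ** 2 for a, b in zip(sd_a, sd_b))
    return math.sqrt(mean_term + cov_term)

controls = [[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2]]   # toy representations
patients = [[1.0, 0.0], [1.2, 0.2], [0.8, -0.2]]
distance = frechet_distance_diag(controls, patients)
```

A larger distance between the two groups would indicate that the learned representations separate patients from controls more clearly, which is how the rebuttal interprets the comparison with BrainMAE.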

[R6] Q3 Potential deficiencies in attention-based importance scores: We agree with your point that when the model is not well-trained, some ROIs may have high attention weights but low contributions to the prediction. However, if the model is well-trained, this issue can be significantly mitigated. For this reason, many studies use attention maps as indicators of importance (Shi et al., ICML, 2022; Kakogeorgiou et al., ECCV, 2022; Li et al., NIPS, 2022). Additionally, the brain region importance maps for ASD and AD in our interpretability analysis are highly consistent with existing findings, suggesting that our model does not suffer from this issue. This discussion will be included in the final submission.

[R1, R5, R6] Q4 Experiment-related issues: 1) (R1) We conducted the comparison with the SOTA method ComTF and found that our method outperforms ComTF on both the ABIDE and ADNI datasets; 2) (R1) For the parameter settings of all baselines, we used the recommendations from their papers; 3) (R1, R5, R6) An analysis of the sliding window size and μ has been conducted and will be included in the final submission; 4) (R1) Since this paper primarily focuses on testing the model’s predictive ability for different brain diseases (ASD and AD), we did not further subdivide the AD category; 5) (R6) In this paper, the task refers to disease prediction, not task-fMRI data. Representation learning for task-fMRI is an interesting topic, and we will explore it in future work.

[R1,R6] Q5 Writing Issues: In our final revision, we will carefully revise the manuscript, including improving the description of the model (R1), and adjusting the font in figures and tables (R1,R6).




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Post-rebuttal, all reviewers agree on the side of accept, and I follow their recommendation.

    Note that the authors promised a number of additions to the paper in the rebuttal, which may be difficult to deliver. In particular, I suggest that the authors clarify the motivation and innovation of the proposed work compared to prior work, and clearly define that “task”-aware refers to the target prediction task, to remove confusion with task-fMRI.




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Post rebuttal, all reviewers recommend accepting this article. The concerns raised during the review process seem to have been adequately addressed, therefore I lean towards accepting this work.

    For the final version, I would recommend that the authors pay careful attention to the point raised by R6 and make their training code and instructions available to aid reproducibility. The authors should also make appropriate clarifications in the final version to avoid overloading the term ‘task-aware’ in the context of the fMRI domain vs. machine learning.



