Abstract

The dynamic variation in the spatio-temporal organizational patterns of brain functional modules (BFMs) associated with brain disorders remains unclear. To solve this issue, we propose an end-to-end transformer-based framework for sufficiently learning the spatio-temporal characteristics of BFMs and exploring the interpretable variation related to brain disorders. Specifically, the proposed model incorporates a supervisory guidance spatio-temporal clustering strategy for automatically identifying the BFMs with the dynamic temporal-varying weights and a multi-channel self-attention mechanism with topology-aware projection for sufficiently exploring the temporal variation and spatio-temporal representation. The experimental results on the diagnosis of Major Depressive Disorder (MDD) and Bipolar Disorder (BD) indicate that our model achieves state-of-the-art performance. Moreover, our model is capable of identifying the spatio-temporal patterns of brain activity and providing evidence associated with brain disorders. Our code is available at https://github.com/llt1836/BISTformer.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1874_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/llt1836/BISTformer

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_Exploring_MICCAI2024,
        author = { Li, Lanting and Zhang, Liuzeng and Cao, Peng and Yang, Jinzhu and Wang, Fei and Zaiane, Osmar R.},
        title = { { Exploring Spatio-Temporal Interpretable Dynamic Brain Function with Transformer for Brain Disorder Diagnosis } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes an end-to-end spatio-temporal interpretable transformer-based framework for dynamic brain activity analysis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well organized and presented. Extensive experiments are presented to show the superiority of the proposed method. The analysis of major depressive disorder and bipolar disorder may be of interest to the related neuroscience communities.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are a few weaknesses.

    1. Please provide additional details about the dataset, as it is not publicly accessible. Could you specify the age range, gender distribution, and the number of ROIs involved?
    2. Equation (2) is somewhat confusing. When you state ‘X^t is the BOLD signals of all ROIs at time point t,’ does this imply that X^t represents the signals from only one time point for each ROI?
    3. The paper presented at MICCAI-2023, titled ‘Predicting Spatio-Temporal Human Brain Response Using fMRI,’ should be discussed in the introduction. This study is also focused on the interpretable spatio-temporal modeling of brain networks using fMRI data and is relevant to your work.
    4. The p-values for the proposed BISTformer should be included in the last row of Table 1, as these are among the most critical results.
    5. In the interpretability section, the paper explores the significance of identified BFMs related to different diseases by removing various BFMs. While I understand this approach, it’s important to consider that the brain network is a systemic model that represents complex interactions among brain ROIs. Removing parts of the BFMs could lead to complex outcomes that might not accurately reflect the importance of these BFMs in predicting brain diseases.
    6. According to MICCAI rules, authors may not be allowed to include any discussions in their supplementary file.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The method in this paper is validated on a private dataset. It can be reproduced if the data is released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It may be helpful to address these issues.

    1. Please provide additional details about the dataset, as it is not publicly accessible. Could you specify the age range, gender distribution, and the number of ROIs involved?
    2. Equation (2) is somewhat confusing. When you state ‘X^t is the BOLD signals of all ROIs at time point t,’ does this imply that X^t represents the signals from only one time point for each ROI?
    3. The paper presented at MICCAI-2023, titled ‘Predicting Spatio-Temporal Human Brain Response Using fMRI,’ should be discussed in the introduction. This study is also focused on the interpretable spatio-temporal modeling of brain networks using fMRI data and is relevant to your work.
    4. The p-values for the proposed BISTformer should be included in the last row of Table 1, as these are among the most critical results.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. Some important information, such as the details addressed in questions 1, 3, and 4 from box 6, is missing.
    2. I have some reservations about the interpretability section as outlined in question 5 from box 6.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    After reviewing the authors’ feedback, I find that some of my previous concerns have been addressed, particularly regarding the authors’ explanation of the model interpretation part. Therefore, I change my recommendation to weak accept. However, the authors should modify their manuscript in the final version by:

    1. Providing more details about the dataset.
    2. Showing p-values for their own methods.
    3. Removing the short discussion part in their supplementary materials.



Review #2

  • Please describe the contribution of the paper

    This paper proposes an architecture called BISTformer, which applies spatio-temporal clustering to dynamic brain networks and introduces a multi-channel self-attention mechanism with topology-aware projection. Evaluation was performed on datasets of depression and bipolar disorder patients. Important spatio-temporal patterns were identified.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Experiments and analyses are quite robust with a good balance between performance and interpretability
    • Interesting findings wrt active/stable states, backed by experimental results
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Ablations need to be more robust, considering that the architecture is quite complicated (there are at least 6 parts: (i) clustering/tensor decomposition, (ii) self-attention, (iii) reconstruction loss, (iv) topology-aware projection layer, (v) temporal-varying weights, (vi) joint learning, but the ablation study only removes 2 of them). You could cut down on comparisons with some existing models, especially since there is little space to explain how they were actually implemented in the context of your problem (and thus some of the comparisons are not very meaningful, e.g. it is not very useful to compare against population GCN since BISTformer is based on brain graphs, not population graphs).
    • Some equations are not very clearly explained (see comments below)
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    -

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    As mentioned above, ablations need to be done more carefully. Also, joint learning involves many hyperparameters (\lambda_1, \lambda_2, etc.) and it would be necessary to see how sensitive the model performance is wrt these hyperparameters. It is understandable that space might be too limited in a conference paper, but there is still plenty of space in the appendix, so the rebuttal should at least address one of these two points (more detailed ablation, or results with different values of \lambda).

    Weak/Questionable claims (good to remove/rephrase them)

    • Abstract: there are enough papers out there highlighting spatio-temporal patterns in brain disorders to make the claim that “existing methods based on fMRI primarily focus on… spatial perspective” less relevant now.
    • Introduction, key contribution: this is not the first attempt of “spatio-temporal interpretability of brain dynamics associated with brain disorders”. Search for spatio-temporal interpretability brain disorder, there are many papers out there highlighting both spatial and temporal patterns for brain disorders. See [1] for a specific example for MDD.

    Equations

    • Eqn 2: What does it mean when there is a comma in the ReLU function? Do you mean that Q_r = ReLU( \theta_r^Q) and K_r = ReLU( \theta_r^K) and so on ?
    • LN in Eqn 4 does not seem to be defined

    Minor points

    • What version of AAL is being used? How many ROIs?
    • Figure 1 and Figure 3 contain too much information to the extent that some details (e.g. the legend in the subplot with grey/blue bars) cannot be seen even after zooming all the way in. But it is good that another legend is provided at the bottom of Figure 3. Perhaps the legend in the subplots could be removed (in both Fig 1 and 3) since they can’t be seen anyway. The bottom row of Fig 1 is quite redundant as (i) it is difficult to understand them at that point of reading the paper, (ii) similar figures are already shown in Fig 2 and 3 in a much clearer way, (iii) they are not actually part of the architecture.

    [1] Fang, Yuqi, et al. “Unsupervised cross-domain functional MRI adaptation for automated major depressive disorder identification.” Medical image analysis 84 (2023): 102707.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the paper is pretty well written except for a couple of questionable claims (but they are not directly related to the conduct of the experiments, so they only require minor editing). The score could be improved depending on the results from more detailed ablation studies.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I think the paper, in present state, is strong enough for acceptance but since the evaluation should be based on the submitted work (and not additional results), I’d have to keep my score unchanged.

    Nevertheless, if the submission is accepted, (i) sensitivity to lambda, (ii) additional ablations should be shown in the updated paper as much as possible since they would enhance clarity (wrt which part of the novel additions contributed to model performance) and should not be seen as substantial changes since the final model is likely already the best combination.

    To clarify, the purpose of mentioning [1] is not to argue that [1] diminishes the novelty of this submission in any way; there was no need to discuss that in the rebuttal. It was only mentioned to make it clear that it is not accurate to claim, as part of the key contributions, that “To the best of our knowledge, we are the first attempt to provide spatio-temporal interpretability of brain dynamics associated with brain disorders.”. That line should be removed or modified to narrow the scope of the claim.



Review #3

  • Please describe the contribution of the paper
    1. This paper proposed an end-to-end spatio-temporal interpretable transformer-based framework for dynamic brain activity analysis associated with brain disorders.
    2. The authors proposed a classification-guided clustering with tensor decomposition for capturing disorder-related spatio-temporal organization.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors proposed an end-to-end transformer-based framework which can automatically identify the brain functional modules with the dynamic temporal-varying weights and explore the temporal variation and spatio-temporal representation.
    2. They provided comprehensive experiments and a detailed interpretation of the results.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. There are four trade-off lambdas in the final loss functions which makes it harder to tune these parameters and the authors didn’t mention how they tuned these four parameters.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The author might share more about how they tuned the four trade-off parameters.
    2. The author didn’t mention if they did the cross validation and how they split the data.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The structure and framework of this article are well-organized, and the flowchart is easy to understand.
    2. The proposed BISTformer framework can concurrently build spatio-temporal clustering and capture the latent spatio-temporal representation of brain dynamics.
    3. Statistical tests were performed when comparing the proposed method with competing methods, which makes the final results more convincing.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank reviewers(R1,R3,R4) for their positive comments: quite robust(R1,R4), well organized(R1,R3,R4), and interested by the related neuroscience communities(R1,R3). We clarify the main points:

  1. Dataset and validation(R1,R3,R4) We choose AAL-116 as the brain atlas. The gender (F/M) distribution and age (±) are as follows. HC: F152/M94, 26.7±6.1; MDD: F108/M43, 17.0±5.0; BD: F83/M4, 17.2±4.0. (R4)We have partly described the validation in Section3.1. Specifically, we use 10-fold CV and, in each fold, further split the training data into a training subset (90%) and a validation subset (10%) to select the hyperparameters.
  2. Interpretation(R3) As shown in Section3.3, the BFMs removal is based on the selected active state. Some studies have demonstrated that brain function in different states is mainly controlled by some distinct subnetworks, as illustrated in Ref[Diego,PANS]-Fig.1. Therefore, we aim to identify the disease-related BFMs by investigating which BFM dominates the active state. We hypothesize that if performance significantly degrades after its removal, the BFM is related to the disorder; otherwise, it is considered irrelevant. Meanwhile, we quantified the importance score of each BFM as shown in our paper. Both conclusions are consistent with prior research. We can add more rigorous explanations in the final paper.
  3. Discussion(R1,R3,R4) (R3)We will show the specific p-values in the final paper. (R1)Six parts of ablations are provided in the review. The parts of (i,iii,v) are collaborative. We decompose dFCs(i) into BFMs and corresponding temporal-varying weights(v) by optimizing the reconstruction loss(iii), which constituted the STC module for capturing the spatio-temporal patterns of brain dynamics. The STC and TWP(iv) ablation results are shown in Table1, demonstrating their effectiveness. Moreover, we have conducted experiments by removing self-attention(ii) and optimized our model with the alternating training scheme(vi). The two ablations demonstrated the effectiveness of self-attention(ii) and joint learning scheme(vi). (R1,R4)Lambdas are tuned through the validation set from a fixed range. As R1 suggested, we can leave space by removing the redundant and unclear parts in the figures and discuss the sensitivity to the Lambdas in the final version.
  4. Additional references(R1:Fang,MIA[1], R3:Zhao,MICCAI[2]) We will include all the suggested references in the final paper. [New paradigm] [1] proposed an alternating spatio-temporal features learning framework. The temporal features are merely treated as auxiliary to spatial. [2] enhanced the neuroimaging resolution by learning the temporal features of brain signals (fMRI and MEG) while ignoring the spatial connections of brain function. In contrast, we directly model the temporal variation of brain function to learn comprehensive spatio-temporal features in a collaborative learning manner. Moreover, we proposed a supervision-guided spatio-temporal clustering module to decompose dFCs into BFMs and corresponding temporal-varying weights obtained via the multi-channel self-attention module. [New interpretation] [1] localized discriminative ROIs and analyzed their group-level feature differences at each time point. [2] validated their effectiveness by matching subnetwork spatial maps and time series between fMRI and MEG. Our key contribution lies in identifying 1) the disease-related state and 2) biomarkers (BFMs) based on the collaboration of BFMs from a spatio-temporal variation perspective.
  5. Confusing descriptions(R1,R3). We will fix the confusing description and the missing explanation in the final paper. (R1)We could split Eq2 if space allows. ‘LN’ is LayerNormalization. (R3)In Eq2, we aim to aggregate signals based on BFMs at each time step and concatenate the features across time. Finally, we would be delighted if our paper would be considered for acceptance at MICCAI 2024.
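
The validation protocol described in point 1 (10-fold CV, with each fold's training portion further split 90%/10% for hyperparameter selection) can be sketched as follows. This is only an illustrative sketch: the function name `ten_fold_splits`, the subject count of 100, the random seed, and the use of unstratified shuffling are assumptions for the example and are not taken from the paper or its code repository.

```python
import random


def ten_fold_splits(n_subjects, n_folds=10, val_fraction=0.1, seed=0):
    """Yield (train, val, test) index lists, one triple per fold.

    Hypothetical helper: the paper may stratify by class or split differently.
    """
    rng = random.Random(seed)
    indices = list(range(n_subjects))
    rng.shuffle(indices)
    # Fold k takes every n_folds-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        # Everything outside the held-out fold is available for training and validation.
        rest = [i for j, fold in enumerate(folds) if j != k for i in fold]
        rng.shuffle(rest)
        # Reserve val_fraction of the remaining subjects for hyperparameter selection.
        n_val = max(1, round(val_fraction * len(rest)))
        val, train = rest[:n_val], rest[n_val:]
        yield train, val, test


# Example with an assumed cohort of 100 subjects.
splits = list(ten_fold_splits(n_subjects=100))
for fold_id, (train, val, test) in enumerate(splits):
    print(fold_id, len(train), len(val), len(test))
```

In such a setup, hyperparameters such as the trade-off lambdas would be chosen on `val` only, and `test` would be touched once per fold for the reported metric.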




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Post rebuttal, all reviewers lean towards accepting the paper and I am in agreement as well. All major concerns seem to be addressed. In particular, they note the following salient points about the paper: (a) the methodology is well motivated, (b) the arguments are clearly presented, and (c) appropriate baseline comparisons and ablations were performed.

    Please pay attention to the following if accepted:

    (1) Removal of text from supplementary (in violation of MICCAI rules) (2) Ensure that links to references inline render properly in the camera ready




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The rebuttal does a good job in clarifying the data and model ambiguities in the paper. While the comment by R1 about a more methodical ablation study remains a valid (and unresolved) critique, the reviewers and I agree that the paper merits acceptance to MICCAI in its current form.
