Abstract

The diagnosis of Autism Spectrum Disorder (ASD) using resting-state functional Magnetic Resonance Imaging (rs-fMRI) is commonly analyzed through functional connectivity (FC) between Regions of Interest (ROIs) in the time domain. However, the time domain has limitations in capturing global information. To overcome this problem, we propose a wavelet-based Transformer, BrainWaveNet, that leverages the frequency domain and learns spatial-temporal information for rs-fMRI brain diagnosis. Specifically, BrainWaveNet learns inter-relations between two different frequency-based features (real and imaginary parts) by cross-attention mechanisms, which allows for a deeper exploration of ASD. In our experiments using the ABIDE dataset, we validated the superiority of BrainWaveNet by comparing it with competing deep learning methods. Furthermore, we analyzed significant regions of ASD for neurological interpretation.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1241_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/ku-milab/BrainWaveNet

Link to the Dataset(s)

https://fcon_1000.projects.nitrc.org/indi/abide/abide_I.html

BibTex

@InProceedings{Jeo_BrainWaveNet_MICCAI2024,
        author = { Jeong, Ah-Yeong and Heo, Da-Woon and Kang, Eunsong and Suk, Heung-Il},
        title = { { BrainWaveNet: Wavelet-based Transformer for Autism Spectrum Disorder Diagnosis } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This work proposed a frequency-based model that can capture brain activation levels and temporal dynamics from fMRI data through continuous wavelet transform.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This work is well-organized with clear methods and source code released. The comparison and ablation study demonstrate the effectiveness of the framework. The source code makes it easy to reproduce.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The model’s performance improvement over existing baselines like BrainNetTF appears slight, with likely insignificant differences, particularly for AUC and ACC. This hints that integrating modular information may not be as beneficial as anticipated. Considering this, experimenting with other datasets before exploring model interpretability would have been beneficial.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Table 1 illustrates that BrainNetTF and BrainNetCNN outperformed the models P w/o Image, P w/o Real, and P w/o Cross-atten. Why not consider using BrainNetTF or BrainNetCNN as the backbone?
    2. Neuroscientific analysis should be more specific, as it is difficult to summarize from Fig. 2. More discussion should be included for better interpretation.
    3. The authors should provide more discussion on the clinical implications of the proposed method and its potential use in real-world ASD diagnosis and treatment.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is well written, and the related code is released. However, the qualitative results and quantitative analysis are not so convincing. More discussion on clinical treatment or potential use is needed.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Good paper, solving interesting problem.



Review #2

  • Please describe the contribution of the paper

    This paper proposed a wavelet-based Transformer for ASD diagnosis from rs-fMRI data. The rs-fMRI data are transformed using continuous wavelet transformation (CWT). The model consists of two independent Temporal Transformer Encoders to learn temporal relations in the real and imaginary parts of the CWT features, followed by a Spatial Transformer Encoder with cross-attention to learn the spatial relationships between brain ROIs. Experiments on the ABIDE dataset suggested that the proposed model outperformed existing deep learning-based approaches in ASD diagnosis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The use of both the real and imaginary parts of the time-frequency features from CWT is novel, which could allow the model to learn both magnitude and phase information of the brain activity.
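The reviewer's point that real and imaginary parts together carry magnitude and phase can be illustrated with a minimal sketch. The coefficient values below are hypothetical and are not taken from the paper; the snippet only shows the standard polar decomposition of complex numbers.

```python
import numpy as np

# Hypothetical complex time-frequency coefficients (illustrative values only).
coeffs = np.array([3 + 4j, -1 + 0j, 0 - 2j])

real_part = coeffs.real   # input a "real" branch would see
imag_part = coeffs.imag   # input an "imaginary" branch would see

magnitude = np.hypot(real_part, imag_part)   # sqrt(re^2 + im^2)
phase = np.arctan2(imag_part, real_part)     # angle in radians

# (magnitude, phase) is an equivalent description of (real, imag).
reconstructed = magnitude * np.exp(1j * phase)
```

Neither part alone determines magnitude or phase, which is one way to read the motivation for learning from both.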

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Certain parts of the methodology need more justification. For example, in Section 2.1, the authors mentioned that a complex Morlet wavelet was selected, but did not justify why.
    2. There is no interpretation of Figure 2. Do the top-10 regions make sense clinically? Are the patterns generalizable across participants?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. As mentioned above, please provide more justification in methods: a. Why is complex Morlet wavelet selected and what’s its advantage over other wavelets? b. In Equation 8, why is cross-attention chosen to be applied to the imaginary part of the query and real parts of key and value, and not the other way around (real part of query and imaginary parts of key and value) or a combination of both?
    2. Please provide some interpretations of Figure 2. Please also provide more examples across multiple participants (can be in Supplement).
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Some aspects of the methodology are novel, and empirical results are promising. However, certain parts of the methods need clarification.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors proposed BrainWaveNet, which learns real and imaginary representations from the complex Morlet wavelet transform for autism spectrum disorder diagnosis, and the proposed method outperforms other SOTA methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well organized, and the figures and text are clear, which makes it easy for the reader to follow.
    • The authors proposed a method that can analyze the frequency, temporal, and spatial information of BOLD signals to provide a dynamic, comprehensive analysis.
    • The authors conducted experiments and ablation studies to show the improvement from the proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Since the authors use the wavelet transform to obtain the network’s input, it is also important to do ablation studies on different mother wavelet types and wavelet levels.
    • Also, many studies use FFT or STFT to obtain a frequency representation; is the wavelet transform better than those two?
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No. The author provides the anonymized link to code for replication, which is impressive.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See the weaknesses above; the authors should do an ablation study comparing with different spectral feature extraction methods such as FFT/STFT.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is overall nicely written, but there are still some major weaknesses that need to be addressed.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The author addressed my concerns, so I raised my score accordingly.




Author Feedback

We would like to thank the reviewers for their helpful comments.

#3, 6) Rationale for selecting the complex Morlet wavelet: The Morlet wavelet is preferred due to its Gaussian shape in the frequency domain, which minimizes ripple effects that could be misinterpreted as oscillations (Cohen, 2019). Its effectiveness has already been demonstrated in data requiring critical temporal resolution, such as EEG studies (Herrmann et al., 2005). Additionally, the Morlet wavelet has an optimal ratio between the Fourier period and wavelet scale, facilitating interpretation in the frequency domain (Torrence and Compo, 1998). This is why Chang and Glover (2010) used the Morlet wavelet to measure rs-fMRI functional connectivity. Given the characteristics of the rs-fMRI BOLD signal, which includes rapid fluctuations where temporal resolution is critical, we selected the Morlet wavelet. We will add and clarify this rationale for selecting the complex Morlet wavelet in our revised paper.
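To make the Morlet rationale concrete, here is a minimal sketch of a complex Morlet CWT by direct convolution. It is not the authors' implementation: the function names, the choice omega0 = 6, and the sqrt(dt/scale) normalisation (following the convention in Torrence and Compo, 1998) are assumptions made for illustration. With omega0 = 6, the Fourier period is about 1.03 times the wavelet scale, which is the near one-to-one relation the rebuttal alludes to.

```python
import numpy as np

def morlet_cwt(signal, scales, dt=1.0, omega0=6.0):
    """Complex Morlet CWT by direct convolution; returns (n_scales, n_times)."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    coeffs = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        # Sample the wavelet over +/- 4 widths of its Gaussian envelope.
        half = int(np.ceil(4.0 * s / dt))
        t = np.arange(-half, half + 1) * dt
        psi = np.pi ** -0.25 * np.exp(1j * omega0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
        psi = psi * np.sqrt(dt / s)  # assumed normalisation
        # Correlate with conj(psi): convolve with the reversed conjugate, then crop.
        full = np.convolve(signal, np.conj(psi[::-1]), mode="full")
        coeffs[i] = full[half:half + n]
    return coeffs

def fourier_period(scale, omega0=6.0):
    """Fourier period equivalent to a Morlet scale."""
    return 4.0 * np.pi * scale / (omega0 + np.sqrt(2.0 + omega0 ** 2))

def scale_for_period(period, omega0=6.0):
    """Inverse of fourier_period."""
    return period * (omega0 + np.sqrt(2.0 + omega0 ** 2)) / (4.0 * np.pi)

# Toy check: a cosine with a period of 20 samples should peak at the matching scale.
t = np.arange(400)
x = np.cos(2 * np.pi * t / 20.0)
periods = np.array([10.0, 20.0, 40.0])
W = morlet_cwt(x, scale_for_period(periods))
power = (np.abs(W) ** 2).mean(axis=1)
best_period = periods[np.argmax(power)]
real_part, imag_part = W.real, W.imag  # the two feature types discussed in the paper
```

Direct convolution is chosen here for readability; an FFT-based implementation would be faster for long signals.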

#3, 6) Rationale for selecting the CWT: We chose CWT due to the limitations of FFT and STFT. FFT cannot observe how the frequency content changes over time, making it unsuitable for analyzing non-stationary signals. STFT imposes a trade-off between frequency and temporal resolution that is fixed by the chosen window size. In contrast, the CWT captures both frequency and temporal information simultaneously using scalable wavelets, providing a multi-resolution analysis that captures various scales and frequencies of signals. This makes it more suitable for non-stationary signals and better at capturing transient features in biological signals than FFT-based methods.
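The two limitations named above can be checked on a toy signal. The sketch below is an illustration under assumed values (sampling rate, frequencies, and window lengths are hypothetical): it shows that the FFT magnitude spectrum is identical for a signal and its time-reversed copy, and that a fixed STFT window ties frequency resolution to time span.

```python
import numpy as np

fs = 1.0                      # hypothetical sampling rate
t = np.arange(256) / fs
# Non-stationary toy signal: slow oscillation first, faster oscillation afterwards.
x = np.where(t < 128, np.sin(2 * np.pi * 0.05 * t), np.sin(2 * np.pi * 0.20 * t))
x_reversed = x[::-1]          # same frequency content, opposite order in time

# The FFT magnitude spectrum cannot tell the two signals apart.
same_spectrum = np.allclose(np.abs(np.fft.fft(x)), np.abs(np.fft.fft(x_reversed)))

def stft_resolution(window_len, fs=1.0):
    """Frequency-bin spacing and time span of one fixed STFT window."""
    return fs / window_len, window_len / fs

# A shorter window localises better in time but has coarser frequency bins.
for n in (16, 64):
    df, span = stft_resolution(n, fs)
    print(f"window={n}: bin spacing={df:.4f}, time span={span:.0f}")
```

The product of bin spacing and time span is constant for any window length, whereas a wavelet transform varies the effective window with scale.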

#6) Comparison with Fourier transform: To our knowledge, there is little research on applying FFT or STFT to rs-fMRI with deep learning, particularly in separating the real and imaginary parts. Also, a fair comparison is complicated because FFT is unsuitable due to the lack of temporal information, and in STFT, the real or imaginary feature can be zero in a real-valued signal depending on the frequency band (Sorensen et al., 1987). This requires additional research to find an appropriate representation. It should be addressed in future work.
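One well-known instance of a vanishing imaginary part is the discrete Fourier transform of a real-valued frame, whose DC and Nyquist bins are purely real. Whether this is exactly the case the authors have in mind is an assumption; the sketch below uses a random, hypothetical frame only to show the property.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = rng.standard_normal(64)      # one hypothetical real-valued STFT frame (even length)
spectrum = np.fft.rfft(frame)        # bins 0 .. N/2

dc_imag = spectrum[0].imag           # bin 0 (DC)
nyquist_imag = spectrum[-1].imag     # bin N/2 (Nyquist)
interior_imag = spectrum[1:-1].imag  # all other bins
```

For these two bins an "imaginary" feature would carry no information, which hints at why a direct real/imaginary split of STFT features needs extra care.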

#3, 5) Fig. 2, interpretation and clinical implications: The ROIs in Fig. 2 are extracted by averaging well-classified subjects across groups and selecting 5% of the 200 ROIs (i.e., 10 ROIs) to highlight the most significant regions, which are identified as meaningful biomarkers based on preliminary ASD research. Thus, our method extracts significant brain signal features, enabling the discovery of important biomarkers and enhancing diagnosis. However, as the reviewers mentioned, detailed explanations are needed. We will add (1) an individual-wise analysis and (2) the existing related studies in our revised version, including the generalized ROIs from our analysis and their relation to symptoms of ASD.
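The selection step described above (average over well-classified subjects, then keep 5% of 200 ROIs) can be sketched as follows. The importance scores and the mask of well-classified subjects are random placeholders, and the variable names are hypothetical; how the authors actually derive per-ROI scores is not stated here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_rois = 30, 200
# Hypothetical per-subject ROI importance scores (random placeholders).
scores = rng.random((n_subjects, n_rois))
# Hypothetical mask marking which subjects were classified correctly.
well_classified = rng.random(n_subjects) > 0.3

mean_scores = scores[well_classified].mean(axis=0)   # average over well-classified subjects
k = int(round(0.05 * n_rois))                        # 5% of 200 ROIs -> 10
top_rois = np.argsort(mean_scores)[::-1][:k]         # indices of the k highest-scoring ROIs
```

Running the same selection per subject instead of on the average would give the individual-wise analysis the reviewers asked for.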

#5) Performance comparison: While the performance improvements were marginal, we believe that our novel approach of using CWT-based real and imaginary values as input and designing a Transformer-based network architecture is original and is comparable to traditional approaches such as ALFF and FFT-based fMRI analysis methods. In the meantime, to assess the model's efficacy, we conducted experiments on the ADHD-200 dataset and achieved meaningful improvements compared to the existing methods. However, we do not think it is appropriate to add these results to the revised paper because the main focus of our current work is to develop a method for ASD diagnosis. Therefore, we will publish the results on our GitHub page together with the code. Regarding the use of the mentioned SOTA network architectures with CWT as input, this is not possible because those networks fundamentally require functional connectivity as input.

#3) Eq. 8: We also used imaginary parts as a query and real parts as a key and value, combining these features for the classification. We will clarify the relevant part in our revised version.
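The two cross-attention directions under discussion can be sketched with single-head scaled dot-product attention. This is a generic illustration, not the paper's Eq. 8: the feature sizes, the shared random projection matrices, and the concatenation used to combine the two outputs are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, kv_feats, w_q, w_k, w_v):
    """Single-head attention with queries from one branch and keys/values from the other."""
    q = query_feats @ w_q
    k = kv_feats @ w_k
    v = kv_feats @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights

rng = np.random.default_rng(0)
n_rois, d_model = 200, 16                      # hypothetical sizes
real_feats = rng.standard_normal((n_rois, d_model))
imag_feats = rng.standard_normal((n_rois, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model) for _ in range(3))

# Imaginary features as query, real features as key and value.
imag_to_real, attn = cross_attention(imag_feats, real_feats, w_q, w_k, w_v)
# The opposite direction: real features as query, imaginary features as key and value.
real_to_imag, _ = cross_attention(real_feats, imag_feats, w_q, w_k, w_v)
# One possible way to combine both directions before a classifier (an assumption).
combined = np.concatenate([imag_to_real, real_to_imag], axis=-1)
```

Each row of `attn` is a distribution over ROIs, so it can be inspected to see which regions of one branch inform a given region of the other.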




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Post rebuttal, all reviewers are leaning towards accepting this paper. Upon reading the responses, the authors seem to have addressed all major concerns, and the reviewers are largely impressed by the novelty of the proposed methodology.

    In the rebuttal, the authors have indicated that additional experiments have been run on a different dataset that will not be a part of the final paper (per MICCAI guidelines) but made available through the GitHub repository. I hope to see this independent validation made available, given the marginal improvements for the ASD application (note in particular the higher standard deviations for the specificity and sensitivity measures and the lack of statistical significance measures).

    An additional point that has been glossed over in the paper is the relative sizes of the model (in terms of parameters) in comparison with reported baselines. Perhaps this can be determined based on the repository that has been released during review.

    Finally, please note that, as indicated in the response to clinical interpretation, adding new results to the main manuscript and making substantial changes to the current version may be in violation of the conference policies.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


