Abstract

Neuroscientific literature faces reliability challenges due to limited statistical power, reproducibility issues, and inconsistent terminology. To address these challenges, we introduce NeuroConText, the first brain meta-analysis model that uses a contrastive approach to enhance the association between textual data and brain activation coordinates reported in 20K neuroscientific articles from PubMed Central. NeuroConText leverages recent advances in large language models (LLMs) such as Mistral-7B, instead of traditional bag-of-words methods, to better capture textual semantics and improve the association with brain activations. Our method can process neuroscientific text regardless of length and generalizes well across various textual content (titles, abstracts, and full-body text). Our experiments show that NeuroConText significantly outperforms state-of-the-art methods, achieving a threefold increase in recall@10 when linking text to brain activations. NeuroConText also allows decoding brain images from text latent representations, maintaining a quality of brain image reconstruction comparable to the state-of-the-art.
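The contrastive text-to-brain objective described in the abstract can be sketched as a CLIP-style symmetric InfoNCE loss over paired text and brain-map embeddings. The sketch below is a minimal NumPy illustration under assumed settings: the temperature, batch size, and embedding dimension are placeholders, not the paper's actual hyperparameters.

```python
import numpy as np

def contrastive_loss(text_emb, brain_emb, temperature=0.07):
    """Symmetric InfoNCE loss pairing text embeddings with brain-map embeddings.

    text_emb, brain_emb: (batch, dim) arrays from the two encoders.
    Matching article/activation pairs sit on the diagonal of the
    similarity matrix; all other entries act as in-batch negatives.
    """
    # L2-normalize both embedding sets so logits are scaled cosine similarities
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    b = brain_emb / np.linalg.norm(brain_emb, axis=1, keepdims=True)
    logits = t @ b.T / temperature  # (batch, batch)

    def xent_diag(lg):
        # cross-entropy where the correct class for row i is column i
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # average the text-to-brain and brain-to-text directions
    return (xent_diag(logits) + xent_diag(logits.T)) / 2

# toy check: matched pairs should yield a much lower loss than random pairs
rng = np.random.default_rng(0)
paired = rng.normal(size=(4, 16))
loss_matched = contrastive_loss(paired, paired)
```

With identical embeddings on both sides, the diagonal dominates the similarity matrix and the loss approaches zero; with unrelated embeddings it hovers near log(batch size).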

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3550_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3550_supp.pdf

Link to the Code Repository

https://github.com/ghayem/NeuroConText

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Meu_NeuroConText_MICCAI2024,
        author = { Meudec, Raphaël and Ghayem, Fateme and Dockès, Jérôme and Wassermann, Demian and Thirion, Bertrand},
        title = { { NeuroConText: Contrastive Text-to-Brain Mapping for Neuroscientific Literature } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a new brain meta-analysis framework utilizing a contrastive learning approach to improve the association between neuroscientific text and brain activation maps. In the experiments, the authors illustrate that NeuroConText outperforms existing models.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. NeuroConText proposes a new contrastive learning method for enhancing text-brain association.
    2. The paper includes extensive experiments and comparative analysis.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper is not well organized, contains typos, and is somewhat hard to follow.
    2. Implementation details, such as the computational resources and hyperparameters, are not provided.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Are the results shown in Tables 1 & 2 based on titles, abstracts, or full-body text?
    2. The authors mention that they divide the texts into chunks; can you provide more details on how long texts are handled? Does chunking affect the semantic meaning of the sentences/context?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper needs to be better organized and written.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    The authors addressed my concerns and answered my questions, so I increased my rating to weak reject. I still think this paper needs substantial revision, such as more detailed explanations of its terms and notions.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a contrastive learning approach for mapping scientific neuroscience articles to the brain activations described in them. The goal is to aggregate the large number of articles into statistically significant brain maps.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is the first to investigate contrastive learning and the use of LLMs for this task.
    • The results in brain activation retrieval outperform SOTA significantly.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Details on the decoder training and architecture are completely missing.
    • The technical novelty is limited, as the paper mainly combines known methods and approaches (CLIP, LLMs, DiFuMo).
    • The results in estimating brain maps are not convincing.
    • The authors claim the main issue of Text2Brain is its ineffectiveness in processing long texts; however, the proposed handling of long text is based only on averaging, which could also lead to information loss.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Please add more details on how the decoder is built and trained.
    • A discussion of why the method's results on generating brain maps are not better would be interesting.
    • Make the differences from Text2Brain clear.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper has merits, but some parts need to be clearer for it to be accepted to MICCAI

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The method and application is interesting and the authors promised to include additional details and improve the clarity of the writing.



Review #3

  • Please describe the contribution of the paper

    This study introduces NeuroConText, a brain meta-analysis model that employs contrastive learning to enhance the association between text data and brain activation coordinates reported in 20K neuroscientific articles. The authors demonstrate that, by leveraging an LLM, the proposed approach better captures textual semantics and improves the association with brain activations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) Novelty: introduces CLIP for the first time and leverages an LLM for coordinate-based brain meta-analysis. This is interesting for both neuroscientists and psychologists, with potential applications in brain encoding/decoding and neuroscience. 2) Links fMRI to 20K neuroscientific texts with higher Recall@10 accuracy (but limited gain in Dice score; see weaknesses).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) For fMRI dimension reduction, the authors used DiFuMo representation coefficients with dictionary sizes of 256 and 512 without justification. This is especially concerning since the authors also showed that 512 outperformed 256 by 3%. 2) What about alternative ways of characterizing brain modes/networks? Do they work as well? What is the impact on the CLIP results? 3) In Fig. 3, what does the color bar represent? It appears the proposed method did not outperform Text2Brain/NeuroQuery in terms of Dice coefficient (bottom figure), in contrast to its higher recall accuracy. It would be good to discuss why and examine this further; an ablation study on this part is also missing. 4) Lack of baseline model comparisons (e.g., other CLIP strategies).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    None

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The authors applied CLIP and an LLM to a novel neuroscience application: brain meta-analysis. It would be great to clarify a few methodological decisions and baseline models, and to discuss the contrast between retrieval accuracy and Dice scores. Also, the authors state that this method could be helpful for brain decoding, but no data was shown.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) Novelty 2) Clear presentation 3) Lack of clarity/discussion (see above) 4) Potential implications

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank the reviewers for their valuable comments.

(R3, R5) Model architecture and reproducibility: The code and architecture details will be publicly released upon acceptance of the paper. We will also revise the paper to add the missing information about the architecture and hyperparameters.

(R1, R5) Dice score improvement: Improving brain map reconstruction is beyond this paper’s scope; our focus is on improving text/brain association via a contrastive loss. We verified a posteriori that we maintain brain map reconstruction scores comparable to the baselines. Also, NeuroConText benefits from long descriptive texts, but IBC contrast definitions are typically only 100 characters, making performance differences with Text2Brain less evident. This contributes to the equivalent Dice scores.

(R3, R5) Semantics and information loss in averaging: Our experiments showed that averaging chunk embeddings consistently provides better results. We tested other methods, such as selecting the top p% quantile of correlated chunks and spline quantization with MLP aggregation, yet averaging performed better. Fig. 2 shows that full-body text improves performance compared to titles, abstracts, or random text parts, despite potential semantic changes and information loss from averaging. To handle long texts, the best approach so far is thus to average chunk embeddings. We have detailed this in Section 2.1.
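The chunk-averaging strategy described in this rebuttal point could be sketched as follows. The chunk size and the `embed_chunk` encoder are placeholders (the paper's actual chunking parameters and encoder are not given here); a toy statistics-based "encoder" stands in for the real text embedder.

```python
import numpy as np

CHUNK_SIZE = 512  # illustrative token budget per chunk, not the paper's value

def embed_long_text(tokens, embed_chunk, chunk_size=CHUNK_SIZE):
    """Embed a document of arbitrary length.

    Split the token sequence into fixed-size chunks, embed each chunk
    independently with `embed_chunk`, then average the chunk embeddings
    into a single document-level vector.
    """
    chunks = [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
    embeddings = np.stack([embed_chunk(c) for c in chunks])
    return embeddings.mean(axis=0)

# toy usage: a fake "encoder" mapping a chunk to (length, mean token id)
fake_encoder = lambda chunk: np.array([len(chunk), float(sum(chunk)) / len(chunk)])
doc_vec = embed_long_text(list(range(1200)), fake_encoder)
```

Averaging keeps the document representation a fixed size regardless of text length, at the cost of the potential information loss the reviewers point out.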

(R5) Technical novelty: While our approach builds upon existing methods, the innovation lies in integrating and adapting these techniques for automated literature analysis (in our case, brain meta-analysis). By combining advanced language models with a contrastive learning framework, we improve the poor association of text and brain maps in existing meta-analysis methods. Also, we addressed the challenge of processing diverse textual input lengths using an efficient chunk averaging approach. We also showed the importance of leveraging full body text in training our model. Our experiments show that NeuroConText improves association scores by up to threefold compared to existing baselines, marking a significant advancement in brain meta-analysis. It offers a more effective tool for the neuroscience community than current meta-analytic methods.

(R1) DiFuMo dictionary sizes and alternatives: The choice of DiFuMo sizes is based on the reference paper [4]. While alternatives to DiFuMo exist, comparing them is beyond this paper’s scope. We chose DiFuMo for its well-established nature, its pretraining on a large dataset (2.4 TB), and the detailed, continuous atlases it provides for brain pattern extraction, with clear reference points [4]. DiFuMo atlases range from 64 to 1024 networks. Larger dictionaries capture more detailed brain activation structure, which explains why the 512-component dictionary outperformed 256 by 3% (Table 2). However, 1024 components led to overfitting due to our small dataset. This trade-off between detail and overfitting explains why 512 performed best.
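Reducing a voxel-level activation map to dictionary coefficients, as described in this rebuttal point, can be sketched as a least-squares projection onto the component maps. The dimensions below are toy values; in practice the actual DiFuMo atlases can be fetched, e.g., via nilearn's `fetch_atlas_difumo`.

```python
import numpy as np

def project_onto_dictionary(brain_map, dictionary):
    """Reduce a voxel-level activation map to dictionary coefficients.

    brain_map:  (n_voxels,) activation values.
    dictionary: (n_components, n_voxels) spatial component maps
                (e.g., 256 or 512 DiFuMo components).
    Returns the least-squares coefficients, shape (n_components,).
    """
    coeffs, *_ = np.linalg.lstsq(dictionary.T, brain_map, rcond=None)
    return coeffs

# toy example: a map that lies exactly in the span of the dictionary,
# so the projection recovers the generating coefficients
rng = np.random.default_rng(0)
D = rng.normal(size=(5, 100))          # 5 components over 100 "voxels"
true_c = np.array([1.0, 0.0, -2.0, 0.5, 3.0])
coeffs = project_onto_dictionary(true_c @ D, D)
```

This is the generic dimension-reduction step; the trade-off discussed above is simply how many components (rows of `D`) to use.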

(R1) Baseline model comparisons (other CLIP strategies): In this study, we focused on showing the validity and strength of our approach using the original CLIP model. While we acknowledge the importance of comparing with other CLIP strategies, our current scope is focused on this proof of concept. Future research should indeed perform systematic comparisons.

(R5) Text2Brain differences: Differences between NeuroConText and Text2Brain are addressed in various sections of the paper: random text selection vs. long-text processing, SciBERT vs. Mistral-7B, high-dimensional brain maps vs. DiFuMo coefficients, and regression via a 3D CNN vs. a shared latent space with text/image encoders and a contrastive loss. We will make the presentation of these points more consistent.

(R3) Results in Tables 1 and 2: They are based on the final NeuroConText model trained on full-body text.

(R1) Color bar in Fig. 3: It represents the statistical values for the IBC dataset. We will include this in the caption.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper presents NeuroConText, which combines an LLM and contrastive learning to learn the association between brain activation coordinates and text from 20K neuroscientific articles in PubMed Central. The application is novel and interesting, with potential scientific value. On the other hand, as Reviewer #3 pointed out, the presentation of this paper needs to be improved, with clarifications of technical details.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    While the reviews are mixed, the reviewers agree that a main strength of the paper is the introduction of contrastive learning for brain meta-analyses from map coordinates/scientific papers. However, there were major concerns regarding clarity, missing implementation details, limited methodological novelty (as it uses existing CLIP/LLM methods), and some missing experiments. After reading through the response and paper, I think a number of these concerns can be resolved. Thus, I recommend accepting this paper, due to the interesting application of CLIP to the brain data meta-analysis problem and its potential utility/implications for the research community as a tool. The authors should be sure to address clarification requests from reviewers, e.g., implementation details for the decoder.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


