Abstract

Glioma remains one of the most lethal malignancies, making accurate prognosis crucial for personalized treatment and improved patient outcomes. Existing models based on non-invasive magnetic resonance imaging (MRI) offer convenience, but their performance and generalizability are poorer than those of genomic biomarkers, which limits their clinical adoption. Genomic biomarkers, such as IDH mutation and 1p/19q co-deletion, provide superior prognostic value but are restricted by their reliance on invasive surgical sampling. In this study, we hypothesize that these genomic biomarkers can guide the development of more robust MRI-based prognostic models, and propose a genomics-guided prompt learning framework that leverages both MRI and transcriptomic data to enhance survival prediction. Specifically, we introduce a novel visual modeling strategy for comprehensive glioma MRI representation and a Prompt-bridged Attention mechanism that can fuse multiple modalities during training and enhance visual representations during inference. Experimental results demonstrate that our proposed method achieves c-indices of 0.6709 and 0.6904 on the UCSF-PDGM and TCGA-GBM datasets, respectively, with highly significant p-values of 5.27e-14 and 6.72e-7. These results substantially outperform existing methods, presenting a promising step toward reliable and non-invasive glioma prognosis prediction.
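
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of a concordance index, assuming the reported c-index is the usual Harrell formulation (the page itself does not say which variant was used). The function name, argument names, and toy values are illustrative and do not come from the paper.

```python
def concordance_index(times, events, risks):
    """Harrell-style c-index: share of comparable pairs that are ordered correctly.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier event received the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (hypothetical values): the third patient is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
toy_cindex = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported 0.6709 and 0.6904 should be read.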

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3661_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{ZhoYi_Thread_MICCAI2025,
        author = { Zhong, Yi and Zheng, Xubin and Shen, Xiongri and Wang, Jiaqi and Zhao, Leilei and Song, Zhenxi and Zhang, Zhiguo},
        title = { { Thread the Needle: Genomics-guided Prompt-bridged Attention Model for Survival Prediction of Glioma based on MRI Images } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15966},
        month = {September},
        pages = {627 -- 637}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The manuscript addresses survival prediction of glioma from MRI imaging. The proposed method uses genetic biomarkers to enhance prediction capability.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The method employed genetic biomarkers with MRI imaging to improve the survival prediction.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The major concern is that the genetic data and MRI used are not registered, i.e., they do not belong to the same patient.
    2. The genetic biomarkers are not used in the testing phase. Are they really required only for the training phase?
    3. The authors must present the side effects or biases involved in using unregistered MRI-genomic data pairs.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. Why are low-resolution features derived and used in Eq. 1?
    2. For completeness, the authors should provide a brief discussion of the scFoundation method (Ref. 7).
    3. In Eqs. 3 and 4, how were the learnable parameters estimated? How do they affect the performance? What does LN stand for?
    4. There is a typo in Eq. 5. Arrange the brackets accordingly.
    5. How were the results in Table 2 generated without genomic data?
    6. The authors should also compare the proposed method with other available methods, such as Refs. 4, 5, 6, and 7.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors must explain how the method works with unregistered MRI-genomic data pairs and what biases are involved. Also, the genomic data prompts are not used in the testing phase at all.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    Still not convinced, given the authors' inability to compare with state-of-the-art methods.



Review #2

  • Please describe the contribution of the paper

    This paper proposes using a ‘bridge token’ within a co-attention mechanism to integrate MRI image features with genomic feature representations for glioma, aiming to improve the accuracy of patient survival prediction.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A key strength of the proposed bridge token co-attention mechanism is that it allows the model to leverage learned multimodal relationships during training using only imaging data at inference time, enhancing clinical utility where paired genomic data may be unavailable.
    2. The study demonstrates substantial improvements in survival prediction accuracy compared to the reported baseline methods.
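
To make the first strength above concrete, here is a minimal sketch of cross-attention routed through a small set of bridge tokens. It is an illustration under stated assumptions, not the authors' implementation: this page does not give the paper's equations, so the update rule, the residual connections, the absence of projection matrices and layer normalization, and every name below (`attend`, `bridged_forward`, `bridge_tokens`) are hypothetical. The assumed idea is that bridge tokens absorb genomic features when they are available (training), while MRI tokens always read from the bridge tokens, so the same forward pass still runs with imaging alone (inference).

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-width vectors."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    outputs = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        outputs.append([sum(w * v[d] for w, v in zip(weights, values)) for d in range(width)])
    return outputs


def add(xs, ys):
    """Element-wise residual sum of two lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def bridged_forward(mri_tokens, bridge_tokens, gene_tokens=None):
    """Assumed two-step routing: genes -> bridge (optional), then bridge -> MRI."""
    bridge = bridge_tokens
    if gene_tokens is not None:
        # Training-time path: bridge tokens query the genomic features.
        bridge = add(bridge, attend(bridge, gene_tokens, gene_tokens))
    # Shared path: MRI tokens query the (possibly gene-informed) bridge tokens.
    return add(mri_tokens, attend(mri_tokens, bridge, bridge))


# Toy inputs (hypothetical values, feature width 4).
mri = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
bridge = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
genes = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]

train_out = bridged_forward(mri, bridge, genes)   # genomics available
infer_out = bridged_forward(mri, bridge)          # MRI only
```

In a trained model the bridge tokens would be learned parameters, which is how genomic information seen during training could still influence an MRI-only prediction; the fixed values here only demonstrate the data flow.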
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Lack of Clarity in Methodology and Presentation: The manuscript’s writing is often convoluted and lacks coherence, making it difficult to fully grasp the proposed methodology and data flow. Several critical aspects require clarification:
       a. Is paired genomic and MRI data necessary during the training stage, or will unpaired datasets suffice? If unpaired data can be utilized, please clarify the mini-batch sampling strategy used for unpaired datasets.
       b. The role and utilization of language model embeddings need detailed explanation: How are they generated and incorporated during training? How, if at all, are they used during the testing/inference phase?
       c. An ablation study is needed to understand the performance impact of these language model embeddings when they are masked or omitted.
       d. (Presentation Issue): The panels in Figure 1 appear to be reversed (panel (b) seems to precede panel (a)).
    2. Missing Control Experiments and Baselines: The study lacks crucial experiments to contextualize the method’s performance:
       a. A direct comparison is needed showing the model’s performance when trained/tested with explicitly paired imaging/genomic data versus its performance using an unpaired setup. This would help quantify the model’s effectiveness in handling unpaired data.
       b. When fully paired data (MRI, genomics, clinical) is available, a comparison against standard co-attention mechanisms or established multi-task learning frameworks designed for paired data is necessary to understand its relative performance in that optimal scenario.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. MRI Image segmentation quality: Please comment on the potential impact of MRI segmentation quality on downstream results

    2. Potential Data Source Mismatch: The potential mismatch between bulk transcriptomics data (often from a specific biopsy/tissue block) and MRI data (capturing the entire tumor volume) requires explicit acknowledgment. Please comment on the potential impact of this inherent data limitation on the model’s ability to learn meaningful multimodal interactions.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the bridge token concept proposed in this paper is interesting, the convoluted description of the methodology and the lack of adequate control/ablation experiments make it impossible to rigorously evaluate the paper’s contributions, thus precluding acceptance pending substantial clarification and validation in the rebuttal.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Overall, the authors’ proposed bridge token methodology presents an interesting and novel approach towards multimodal data integration that has significant implications for improving the interpretability and performance of prognostic models in precision oncology. Hence I recommend accept.



Review #3

  • Please describe the contribution of the paper

    The main contribution of this paper is the proposal of a genomics-guided prompt learning framework that integrates MRI and transcriptomic data for survival curve prediction for glioma patients.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Incorporating transcriptomic data represents a promising strategy. The use of an open dataset helps mitigate the risk of overfitting. Although achieving high c-indices is inherently challenging, the study reports meaningful and competitive results.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    A practical strategy for translating the model into clinical use is lacking—for example, its application in adaptive therapy remains unexplored. Additionally, a robust approach for handling missing data is necessary to enhance the model’s reliability and generalizability.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Incorporating transcriptomic data presents a promising strategy, and the use of an open dataset helps mitigate concerns of overfitting. While achieving high c-indices is inherently challenging, the study demonstrates meaningful and competitive results. However, a practical strategy for translating the model into clinical settings—for instance, in the context of adaptive therapy—is currently lacking. Additionally, a robust approach to handling missing data is needed to enhance the model’s reliability and generalizability. Finally, the study does not provide sufficient information to ensure reproducibility, such as access to source code or implementation details.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors appropriately address all of my concerns.




Author Feedback

Thanks to all the reviewers (R1, R2, R3) for acknowledging our methodological contribution and for their constructive comments, which call for further clarification.

Q1: Question about used datasets & why used unpaired MRI & Genomic data? (R1 & R2 & R3)

A1: Because the MRI images and bulk transcriptome data come from different sources, pairing them is difficult. Furthermore, collecting multimodal data is very expensive in real clinical scenarios. After integrating the collected data, we found that very few samples had both MRI images and bulk transcriptome data (about one-tenth of the amount of image data), and the vast majority of those samples belonged to our test dataset (TCGA-GBM), so we were unable to use paired multimodal data during training. Instead, we designed a method that pairs MRI images and bulk transcriptome data from different samples and different sources according to their survival-state associations, as far as possible. We have described the pairing method in Section 2.1 of our paper. Based on previous clinical studies and the 2021 WHO classification of tumors of the central nervous system, we believe that if two samples have similar key clinical information, especially molecular pathology characteristics, they have similar survival risks. Thus, we organized each patient’s clinical information into a textual description that included gender, age, survival time, and events, as well as molecular pathology information obtained through invasive methods. We then used BioClinicalBERT to encode the caption into a representation vector for each sample, and computed the cosine similarity Sim<i,j> between sample i from the MRI image group and sample j from the bulk transcriptome group. This method is only used to form our training data; the clinical text captions and features are not added to our models.

Regarding tumor segmentation quality, we directly used the manually corrected segmentation masks in our method, so we did not evaluate the impact of segmentation quality on performance, but it is worth discussing in our future work.

Q2: Further clarify our proposed method. (R1 & R3, R2 Recommendation)

A2: Our proposed method was originally designed to enable non-invasive prediction during the preoperative period, so it is essentially a unimodal approach: only MRI image data are used as input during inference. Prediction using only MRI images allows the method to serve as a clinical reference before surgical treatment. Once surgical treatment is complete, however, information such as pathology and molecular sequencing is obtained and will serve as the gold standard for subsequent treatment and survival prediction. Therefore, MRI images are important only for preoperative survival prediction. Why integrate MRI images and bulk transcriptome data during model training? First, we are trying to learn the correlations between image features and gene features, and then enhance the representation of image features. Second, the correlated genetic features are incorporated into the visual features through prompt-bridged attention in the model to improve the accuracy and significance of the survival analysis. As a by-product, our method can be further extended to the construction of image-genetic biomarkers. To better present the details of the methodology, we will add a GitHub link to the source code in the camera-ready version of this paper (if finally accepted).

Q3: Experiments and baseline method. (R1 & R3)

A3: The limitations of the dataset prevented us from conducting more comparative experiments, including multi-task designs and image-genomics multimodal fusion for survival prediction. We selected recent high-quality work as the baseline methods.
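
The pairing step described in A1 can be sketched as follows. This is a minimal illustration, assuming the caption embeddings have already been produced by a text encoder such as BioClinicalBERT; the short vectors, sample identifiers, and the rule of keeping the single most similar transcriptome sample per MRI sample are hypothetical, since the rebuttal does not state how the similarity scores are turned into pairs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity Sim<i,j> between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pair_by_similarity(mri_embeddings, rna_embeddings):
    """Map each MRI sample i to (j, Sim<i,j>) for its most similar transcriptome sample j.

    Taking the argmax is an assumption made for this sketch.
    """
    pairs = {}
    for i, u in mri_embeddings.items():
        best_j, best_sim = max(
            ((j, cosine_similarity(u, v)) for j, v in rna_embeddings.items()),
            key=lambda item: item[1],
        )
        pairs[i] = (best_j, best_sim)
    return pairs


# Hypothetical 3-dimensional stand-ins for caption embeddings.
mri_group = {"mri_A": [1.0, 0.0, 0.2], "mri_B": [0.1, 1.0, 0.0]}
rna_group = {"rna_X": [0.9, 0.1, 0.1], "rna_Y": [0.0, 0.8, 0.1]}
pairs = pair_by_similarity(mri_group, rna_group)
```

Because the captions include survival time and event status, pairs formed this way share similar outcomes by construction, which is consistent with the statement that the captions are used only to assemble training data and are not fed to the model.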




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper proposes a Prompt-bridged Attention mechanism to fuse MRI and genomics data for survival prediction and demonstrates that the model can still perform well when only MRI is available at test time. The idea is interesting and addresses a practical challenge in multimodal learning with missing modalities. However, the main limitation lies in the lack of comprehensive comparisons with state-of-the-art methods. While there are relatively few existing approaches specifically focused on MRI and genomics fusion, there are numerous multimodal fusion methods developed for pathology and genomics, many of which are general and applicable to this task. Without benchmarking against these broader multimodal fusion baselines, it is difficult to assess the significance of the proposed method. Given this limitation, I recommend rejection.


