Abstract
Accurate, noninvasive detection of isocitrate dehydrogenase (IDH) mutation is essential for effective glioma management. Traditional methods rely on invasive tissue sampling, which may fail to capture a tumor’s spatial heterogeneity. While deep learning models have shown promise in molecular profiling, their performance is often limited by scarce annotated data. In contrast, foundation deep learning models offer a more generalizable approach for glioma imaging biomarkers. We propose a Foundation-based Biomarker Network (FoundBioNet) that utilizes a SWIN-UNETR-based architecture to noninvasively predict IDH mutation status from multi-parametric MRI. Two key modules are incorporated: Tumor-Aware Feature Encoding (TAFE) for extracting multi-scale, tumor-focused features, and Cross-Modality Differential (CMD) for highlighting subtle T2–FLAIR mismatch signals associated with IDH mutation. The model was trained and validated on a diverse, multi-center cohort of 1,705 glioma patients from six public datasets. Our model achieved AUCs of 90.58% ± 1.25, 88.08% ± 3.08, 65.41% ± 3.35, and 80.31% ± 1.09 on independent test sets from EGD, TCGA, Ivy GAP, RHUH, and UPenn, consistently outperforming baseline approaches (p ≤ 0.05). Ablation studies confirmed that both the TAFE and CMD modules are essential for improving predictive accuracy. By integrating large-scale pretraining and task-specific fine-tuning, FoundBioNet enables generalizable glioma characterization. This approach enhances diagnostic accuracy and interpretability, with the potential to enable more personalized patient care.
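For intuition, the following minimal PyTorch sketch illustrates the CMD idea as described in the abstract and the authors' rebuttal: soft-gating the T2 and FLAIR inputs by a predicted tumor-probability map, encoding each modality, and pooling their differential features. All module names, shapes, and the channel-attention step are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the Cross-Modality Differential (CMD) idea:
# gate T2/FLAIR by tumor probability, encode each modality, take their
# difference, re-weight, and pool. Not the authors' code.
import torch
import torch.nn as nn

class CMDSketch(nn.Module):
    def __init__(self, channels: int = 16):
        super().__init__()
        # Lightweight per-modality encoders (assumed; the paper describes a
        # "convolution-difference-pooling-attention pipeline").
        self.enc_t2 = nn.Conv3d(1, channels, kernel_size=3, padding=1)
        self.enc_flair = nn.Conv3d(1, channels, kernel_size=3, padding=1)
        self.attn = nn.Sequential(  # simple channel attention (assumption)
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, t2, flair, tumor_prob):
        # Soft gating: keep at least 10% of the signal everywhere so global
        # context survives poor segmentations (the rebuttal mentions ">=10%"
        # probability gating).
        gate = tumor_prob.clamp(min=0.1)
        f_t2 = self.enc_t2(t2 * gate)
        f_flair = self.enc_flair(flair * gate)
        diff = f_t2 - f_flair            # differential features: T2-FLAIR mismatch cue
        diff = diff * self.attn(diff)    # re-weight channels
        # Global average pooling yields a compact mismatch descriptor per case.
        return diff.abs().mean(dim=(2, 3, 4))

# Usage (placeholder shapes): t2, flair, tumor_prob are (B, 1, 96, 96, 96) tensors.
```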
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4377_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/SomayyehF/Glioma_Biomarkers.git
Link to the Dataset(s)
TCGA-LGG: https://www.cancerimagingarchive.net/collection/tcga-lgg/
TCGA-GBM: https://www.cancerimagingarchive.net/collection/tcga-gbm/
UCSF-PDGM: https://www.cancerimagingarchive.net/collection/ucsf-pdgm/
EGD: https://xnat.bmia.nl/REST/projects/egd/
Ivy GAP: https://www.cancerimagingarchive.net/collection/ivygap/
RHUH-GBM: https://www.cancerimagingarchive.net/collection/rhuh-gbm/
UPenn-GBM: https://www.cancerimagingarchive.net/collection/upenn-gbm/
BibTex
@InProceedings{FarSom_FoundBioNet_MICCAI2025,
author = { Farahani, Somayeh and Hejazi, Marjaneh and Di Ieva, Antonio and Liu, Sidong},
title = { { FoundBioNet: A Foundation-Based Model for IDH Genotyping of Glioma from Multi-Parametric MRI } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15966},
month = {September},
}
Reviews
Review #1
- Please describe the contribution of the paper
The presented work introduces FoundBioNet, a foundation model built on the SWIN-UNETR architecture, designed for noninvasive IDH genotyping of gliomas using multi-parametric MRI. Key contributions include the development of novel modules: the TAFE, which enables multi-scale tumor-focused feature extraction, and the CMD, which enhances T2-FLAIR mismatch signals that are clinically linked to IDH mutations. The model undergoes large-scale validation on a diverse, multi-center cohort of 1,705 glioma patients from six public datasets, showcasing its strong generalizability. FoundBioNet demonstrates superior performance, achieving AUCs as high as 90.58% and outperforming baseline methods, with ablation studies confirming the crucial roles of the TAFE and CMD modules.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The CMD module explicitly targets the T2-FLAIR mismatch sign, a known imaging biomarker for IDH-mutant gliomas, enhancing both interpretability and clinical relevance by integrating clinical insights.
- Extensive experiments across multiple datasets, along with statistical validation and ablation studies, rigorously evaluate the model. Its robustness is further demonstrated by strong performance on imbalanced external datasets.
- By jointly optimizing segmentation and IDH classification, the model leverages tumor localization to enhance feature extraction, aligning with recent trends in glioma analysis through multi-task learning.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The model’s performance is highly dependent on accurate tumor segmentation, which could limit its applicability in cases where segmentation quality is poor. Although the authors acknowledge this issue, they do not provide concrete solutions.
- Although the model addresses class imbalance through augmentation, performance significantly drops on highly skewed datasets. Exploring techniques such as focal loss or synthetic oversampling could help improve results.
- Please redraw your figure, as the text is not clear when zoomed in.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents a promising approach, but key limitations, including reliance on accurate segmentation and insufficient handling of class imbalance, undermine its practical robustness and clarity, leading to a weak reject.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
All of my concerns have been well addressed.
Review #2
- Please describe the contribution of the paper
The manuscript presents FoundBioNet, a foundation-based deep learning model specifically tailored to predict IDH mutation status noninvasively from multi-parametric MRI. The primary innovations are the integration of Tumor-Aware Feature Encoding (TAFE) and Cross-Modality Differential (CMD) modules within a SWIN-UNETR architecture.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The introduction of foundation model principles to glioma IDH classification represents an important step forward in the application of advanced AI methods to clinical neuroimaging.
- The study reports superior performance compared to multiple baselines, demonstrating strong generalization across diverse, multi-center datasets.
- Rigorous statistical validation using ANOVA with appropriate post-hoc testing strengthens the robustness of the presented results.
- Occlusion sensitivity maps in Figure 2 provide valuable insights into model interpretability and attention mechanisms, enhancing the clinical relevance of the findings.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Lack of Clarity on Novelty of Modules: Although the authors introduce TAFE and CMD modules with distinct naming, similar strategies likely exist within previous architectures such as Swin-UNETR. It would significantly strengthen the manuscript to clarify explicitly how these modules differ from existing methods in the literature.
- Data Imbalance Handling: The authors acknowledge the challenges posed by highly imbalanced datasets (e.g., Ivy GAP, RHUH-GBM, and UPenn-GBM datasets). However, the manuscript lacks a clear description of the specific methods or strategies used to mitigate this imbalance during model training or evaluation.
- Baseline Model Reporting: Table 2 lacks results from the baseline BrainSegFounder-Tiny model, which would provide clearer evidence of improvement gained specifically from the introduced modules.
- Incomplete Interpretability Analysis: Although the occlusion sensitivity maps are commendable, additional details regarding the generation of these interpretability maps and their quantitative evaluation are missing.
- Limited Foundation Model Comparisons: The manuscript claims the development of a foundation model but does not provide direct comparisons against other foundation models or large-scale pretrained approaches, limiting the context of the model’s significance.
- Reporting on Additional Datasets: The authors should report comprehensive performance across all used datasets, providing deeper insights into generalizability and dataset-specific challenges.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- Please clarify explicitly how the TAFE and CMD modules differ from existing approaches in the Swin-UNETR literature.
- Authors should describe and justify clearly the strategies adopted to address class imbalance during model training.
- It is advisable to include results from the baseline BrainSegFounder-Tiny model explicitly in Table 2 for clearer comparative analysis.
- Provide more detailed methodology and quantitative evaluation metrics for the interpretability analysis, particularly the occlusion sensitivity maps.
- Consider benchmarking your proposed model against other foundation models from recent literature, to robustly contextualize your findings.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The manuscript presents a strong methodological approach, significant generalization performance across multiple independent datasets, and robust statistical validation. Despite some limitations, especially regarding novelty clarifications and class imbalance handling, the paper makes a valuable contribution worthy of acceptance, pending minor revisions.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have adequately addressed the major concerns raised during the initial review. Specifically:
- The authors provided a clear explanation of how the TAFE and CMD modules differ from prior Swin-UNETR-based architectures, emphasizing their tailored design for multi-task learning with minimal overhead while preserving pretrained knowledge. This strengthens the novelty claim.
- They clarified the augmentation strategies used and acknowledged performance limitations on highly imbalanced datasets such as UPenn-GBM. They also proposed future directions like MixUp and CutMix, which is appropriate given conference restrictions on adding new experiments.
- While direct comparisons to newer foundation models are not yet included in the manuscript, the authors justified this by noting that those models were released as preprints after their study began and that they plan to benchmark them in future work. Their comparison with ViT variants remains valid.
- The response clarified the methodology behind occlusion sensitivity maps, aligning with standard practices. Although quantitative interpretability metrics could not be added due to space constraints, the methodological clarification was sufficient.
- The rebuttal offered reasonable mitigation strategies (e.g., soft gating, residual amplification, loss weighting) to address segmentation dependency, along with potential future improvements.
- A summary table will be added to clarify dataset usage across the two experimental setups, improving transparency.
Considering these points and the fact that the core scientific contribution, FoundBioNet as a foundation model for IDH genotyping using multi-parametric MRI, is strong, methodologically sound, and well-validated, I believe the paper has been sufficiently improved and merits acceptance.
Review #3
- Please describe the contribution of the paper
The manuscript builds upon a segmentation foundation model (FM) to introduce a dMTL-based model for MRI-based IDH mutation status identification.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The methodology is novel and relies on SOTA elements.
- Although the pipelines are focused on ML technical concepts, clinical rationale was also taken into account.
- The results are promising, and the proposed model consistently outperforms the baselines.
- The ablation study was well designed, and its results validated the efficacy of the design ideas.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
There is no major weakness.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
As a reviewer, my number one goal is to help you improve your manuscript rather than judge if it should be published or not. I am aware of how difficult it can be to prepare a manuscript, and I wish you good luck with this publication. Please feel free to disagree with any comment that I add, as no one knows your research better than you. If you find my style of review helpful, please use it when you serve as a reviewer.
Section 2.1 “To optimize dataset utilization, we established two experimental scenarios”. Please clarify how your approach would optimize dataset utilization. I recommend providing relevant details of the datasets (e.g., as a table) to support.
Section 2.1 “… cropped to dimensions of 96 × 96 × 96 voxels”. Please clarify why you chose this setting and how it limited your results. For example, Figure 2 shows that the images did not include the whole brain. Did you miss any tumors (partially or completely)? You can highlight this in the limitations section of the manuscript.
[minor] 2.2, CMD Module: “… yielding features F_T2 and F_FFLAIR.” typo in F_FFLAIR
Please provide the full form of MCC. You can also add one sentence about each metric to highlight its importance in your multitask framework.
Please provide more details about the occlusion sensitivity maps and how you derive them. Also, please cite relevant references.
When releasing your code, dockerizing the pipeline would be helpful if possible.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Given that the manuscript conducts comprehensive research, relies on SOTA architectures, and is novel in extending the base FMs, it deserves acceptance.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The manuscript was already well structured and was beyond the acceptance threshold. The authors addressed reviewer concerns as much as possible in the given time. Thus, I vote for acceptance.
Author Feedback
We thank the reviewers for their insightful comments and are pleased that they recognize the methodological novelty and strong validation of FoundBioNet. Below, we address the major concerns first, then respond to minor points.

Q1 (R1): Novelty of TAFE/CMD. FoundBioNet is, to our knowledge, the first to apply the SWIN-UNETR architecture to biomarker classification; previous works used it only for segmentation. To retain the pretrained knowledge from 42k brain MRIs and keep overhead minimal, we preserve the 62M-parameter backbone and introduce two lightweight modules for efficient multi-task learning in a single forward pass. The TAFE module adds a parallel classification head on multi-scale encoder features and uses a joint segmentation-classification loss to guide tumor-focused feature learning, while the decoder remains dedicated to segmentation. The CMD module uses the decoder's tumor probability map to softly gate T2 and FLAIR inputs, followed by a streamlined convolution-difference-pooling-attention pipeline to capture the clinically important T2-FLAIR mismatch.

Q2 (R1, R3): Class-Imbalance Handling. We applied targeted online augmentations (e.g., random flips, rotations, intensity scaling) to IDH-mutant cases (a minimal sketch of this strategy appears at the end of this rebuttal). This yielded robust performance on moderately imbalanced cohorts (TCGA: 42% vs. 58%; EGD: 33% vs. 67%) but a drop on highly skewed datasets such as UPenn-GBM (16 vs. 498). We will clarify this limitation and discuss potential advanced oversampling strategies (e.g., MixUp, CutMix) in the Limitations section.

Q3 (R1): Model Comparisons. BrainSegFounder-Tiny is segmentation-only and cannot perform classification. Although FoundBioNet improved segmentation over standalone SWIN-UNETR, MICCAI policy prevents us from adding new quantitative metrics. For IDH prediction, the only specialized foundation model available during our experiments was ViT (4- and 8-block), both variants of which were outperformed by our model (Table 1). Since then, three preprints (BrainIAC, BrainMRIFM, and SSM 3D MAE) have emerged; only SSM 3D MAE has released weights, which we plan to benchmark in future work.

Q4 (R1, R2): Occlusion-Sensitivity Analysis. We will add to the Methods section that occlusion sensitivity maps were generated with MONAI's OcclusionSensitivity utility by sliding a 16×16×16-voxel mask at 50% overlap over the 3D input, measuring the drop in ground-truth class probability, then applying Gaussian smoothing, inversion, and normalization to highlight key regions. Although standard quantitative metrics (Area Over the Perturbation Curve, sufficiency, comprehensiveness) exist, conference guidelines prohibit adding new results.
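For concreteness, the following plain-PyTorch loop mirrors the occlusion procedure just described (16×16×16 mask, 50% overlap, drop in ground-truth class probability). The authors used MONAI's OcclusionSensitivity utility, so this is an illustrative re-implementation with placeholder names, not their code.

```python
# Sketch of occlusion sensitivity over a 3D volume. Computing the probability
# drop directly plays the role of the "inversion" step (high = important).
import torch

@torch.no_grad()
def occlusion_map(model, x, target_class, mask_size=16, stride=8, fill=0.0):
    """x: (1, C, D, H, W). stride=8 gives 50% overlap for a 16-voxel mask."""
    base = torch.softmax(model(x), dim=1)[0, target_class]
    heat = torch.zeros_like(x[0, 0])
    counts = torch.zeros_like(heat)
    _, _, D, H, W = x.shape
    for d in range(0, max(D - mask_size, 0) + 1, stride):
        for h in range(0, max(H - mask_size, 0) + 1, stride):
            for w in range(0, max(W - mask_size, 0) + 1, stride):
                occluded = x.clone()
                occluded[..., d:d+mask_size, h:h+mask_size, w:w+mask_size] = fill
                p = torch.softmax(model(occluded), dim=1)[0, target_class]
                drop = (base - p).clamp(min=0)  # probability drop = importance
                heat[d:d+mask_size, h:h+mask_size, w:w+mask_size] += drop
                counts[d:d+mask_size, h:h+mask_size, w:w+mask_size] += 1
    heat = heat / counts.clamp(min=1)
    # Gaussian smoothing would follow here (e.g., a conv3d with a Gaussian
    # kernel); finally normalize to [0, 1] so high values mark key regions.
    return (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)
```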
Q5 (R3): Robustness to Segmentation Quality. We will add to the Limitations section that we mitigated segmentation dependency through fine-tuned weight initialization, soft probability gating (≥10%) to retain global context, residual amplification via CMD, global average pooling, and classification-prioritized training (loss weighting and model selection). To further improve robustness under poor segmentation, we propose future directions including uncertainty-gated feature fusion, curriculum-based decoupling, and semi-supervised mask refinement.

Q6 (R1, R2): Dataset Reporting. We will add a table summarizing dataset characteristics. All six cohorts were used either for training/internal validation or external testing, as reported in Table 1. To optimize dataset utilization, we tested performance on the less skewed cohorts (TCGA, EGD) by designing two setups: (1) TCGA+UCSF and (2) EGD+UCSF for training/internal validation, with the remaining datasets used for testing in each scenario. Table 1 metrics (except TCGA) reflect setup 1, which yielded slightly better performance.

Minor Points (R2, R3). We acknowledge that cropping may exclude some brain regions, though all scans were visually verified for tumor coverage. We will expand metric descriptions, enhance figure resolution, and note our plan to share the code upon acceptance.
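As referenced in Q2 above, here is a minimal sketch of class-targeted online augmentation using MONAI transforms. The transform list follows the rebuttal's examples (flips, rotations, intensity scaling); the dataset wrapper and field names are assumptions, not the authors' code.

```python
# Sketch: apply extra online augmentation only to minority (IDH-mutant) cases,
# so new augmented variants are drawn every epoch without duplicating data.
from torch.utils.data import Dataset
from monai.transforms import Compose, RandFlip, RandRotate90, RandScaleIntensity

minority_aug = Compose([
    RandFlip(prob=0.5, spatial_axis=0),          # random left-right flip
    RandRotate90(prob=0.5, max_k=3),             # random 90-degree rotations
    RandScaleIntensity(factors=0.1, prob=0.5),   # +/-10% intensity scaling
])

class TargetedAugDataset(Dataset):
    """Wraps (image, label) pairs; augments only label==1 (IDH-mutant) samples."""
    def __init__(self, samples, minority_label=1):
        self.samples = samples            # list of (channel-first image, label)
        self.minority_label = minority_label

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        image, label = self.samples[idx]
        if label == self.minority_label:
            image = minority_aug(image)   # re-sampled on every access
        return image, label
```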
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A