Abstract

Tumor grading and Isocitrate Dehydrogenase (IDH) status are key prognostic biomarkers. Transformer-based methods are widely applied in glioma segmentation and diagnosis, but challenges still exist due to the tumor’s heterogeneity and the computational burden of Transformers. We propose a multi-task network called MTamba for glioma segmentation, IDH genotyping, and grading. We design Tetra-oriented Mamba to perform global information interaction from different orientations in MRIs for segmentation. We design a T2-FLAIR mismatch feature extraction module to explore the mismatch features between T2 and FLAIR images at different depths to enhance diagnosis. We propose a channel-space Siamese Mamba fusion module to fuse T2-FLAIR mismatch features with multi-modal MRI features from the segmentation encoder for diagnosis. Finally, we apply an uncertainty loss optimization method to jointly optimize glioma segmentation, IDH genotyping, and grading. We validate MTamba on the publicly available UCSF-PDGM and BraTS2020 datasets, and experimental results show that MTamba outperforms existing multi-task learning methods. The code for MTamba is available at https://github.com/xhwv/MTamba.
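
The joint optimization mentioned above can be illustrated with a minimal sketch of uncertainty-based loss weighting. It assumes a commonly used simplified form in which each task i has a learnable log-variance s_i and the total loss is the sum of exp(-s_i) * L_i + s_i. The abstract does not give the exact formulation used in MTamba, so the function name, the formula, and the loss values below are illustrative assumptions only.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses with log-variances s_i = log(sigma_i^2).

    Assumed form (not confirmed by the paper):
        total = sum_i( exp(-s_i) * L_i + s_i )
    A larger s_i down-weights task i, while the "+ s_i" term keeps
    s_i from growing without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("one log-variance per task is required")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

# Hypothetical loss values for segmentation, IDH genotyping, and grading.
losses = [0.40, 0.70, 0.55]

# With every log-variance at 0, the combination reduces to a plain sum.
total_equal = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0])

# Raising the log-variance of the second task shrinks its contribution.
total_reweighted = uncertainty_weighted_loss(losses, [0.0, 1.0, 0.0])
```

In a training loop the log-variances would be trainable parameters updated together with the network weights; here they are fixed numbers so that the arithmetic is easy to follow.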

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1717_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/xhwv/MTamba

Link to the Dataset(s)

https://www.cancerimagingarchive.net/collection/ucsf-pdgm

https://www.med.upenn.edu/cbica/brats2020/data.html

BibTex

@InProceedings{LiXin_Tetraorientated_MICCAI2025,
        author = { Li, Xinyu and Liu, Jin and Kuang, Hulin and Wang, Yuanzhuo and Wang, Jianxin},
        title = { { Tetra-orientated Mamba with T2-FLAIR Mismatch Features for Glioma Segmentation, IDH Genotyping, and Grading } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15969},
        month = {September},
        pages = {553 -- 563}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces MTamba, a novel multi-task learning framework that simultaneously performs glioma segmentation, IDH genotyping, and tumor grading using multi-modal MRI. The main technical contributions are as follows:

    1. Tetra-oriented Mamba (TeoM): The authors propose an extension of the existing Tri-oriented Mamba by introducing a Tetra-oriented variant that captures global dependencies across four orientations. This is designed to improve long-range context modeling in 3D MRI volumes.

    2. T2-FLAIR Mismatch Feature Extraction Module (MFEM): A dual-depth mismatch modeling strategy is introduced to extract both shallow and deep features from T2 and FLAIR sequences. This module aims to capture inter-modality discrepancies (i.e., mismatch signals) known to be indicative of IDH mutation and glioma grade.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Clinically Grounded Multi-task Design: The overall problem formulation—joint learning of glioma segmentation, IDH genotyping, and tumor grading—is clinically meaningful. The paper reflects current neuro-oncology practice, where IDH status and tumor grade are jointly used to determine prognosis.

    2. Attention to Domain-specific Biomarkers: The model incorporates a targeted mechanism (SMEM/DMEM) to exploit the T2-FLAIR mismatch, a known imaging biomarker of IDH-mutant gliomas.

    3. Exploration of State-Space Models in Volumetric MRI Analysis: The paper continues the exploration of Structured State Space Models (Mamba family) in the medical imaging domain. This represents a shift from transformer-heavy architectures, with potential computational benefits. The introduction of additional directional interactions (e.g., cross-slice) suggests a deliberate attempt to model 3D spatial dependencies in new ways.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Lack of Novelty: The proposed Tetra-orientated Mamba (TeoM) is essentially a naive extension of the Tri-orientated Mamba (ToM) used in SegMamba. It lacks substantial structural innovation or theoretical advancement.

    2. Missing Comparison with SegMamba: The proposed network is built upon SegMamba, but the authors do not provide any performance comparison with it. Furthermore, to validate the effectiveness of the proposed improvements, comparisons such as “ToM vs. TeoM” and “FUE + Residual block vs. LGCB” are essential. A comprehensive ablation study demonstrating improvements over SegMamba is needed for the reader to be convinced.

    3. Justification for SMEM

      • The motivation to extract T2-FLAIR mismatch features is valid, as it is a known imaging signature for IDH-mutant gliomas, especially IDH-mutant astrocytoma. However, the authors argue that “Directly subtracting T2 and FLAIR images will treat all regions equally, missing key mismatch information.” I disagree with this claim. In fact, the T2-FLAIR mismatch region tends to have strong intensity when directly subtracted, so the information is not necessarily lost.
      • The SMEM’s advantage over simple subtraction is not clearly demonstrated. While the overall model (MT) performs better than S1, it is unclear whether this is due to SMEM or simply due to additional network (DMEM). I suggest an additional experiment where T2 and FLAIR images are directly subtracted without SMEM and passed through DMEM, and the results are compared to justify SMEM’s necessity.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    < Minor Weaknesses and Suggestions >

    1. Figure 2: The architectural difference between channel-reverse interaction and reverse interaction is not visually clear. A clearer depiction or explanation is needed.

    2. Table 1: It would be more appropriate to follow the updated WHO 2021 classification [1]. For instance, tumor grades should be written as “Grade 2” instead of “Grade II”, and IDH status should be denoted as “IDH-wildtype” rather than “wild”. The authors might consider consulting recent works [2, 3] as useful references for future updates or revisions.

    3. Figure 4: Since the authors propose a module for detecting T2-FLAIR mismatch, it would strengthen the paper to visually confirm (e.g., via Grad-CAM) whether the model focuses on T2-FLAIR regions. See [2] for an example of such interpretability visualization.

    < References >

    1. Louis, David N., et al. “The 2021 WHO classification of tumors of the central nervous system: a summary.” Neuro-oncology 23.8 (2021): 1231–1251.

    2. Byeon, Yunsu, et al. “Interpretable multimodal transformer for prediction of molecular subtypes and grades in adult-type diffuse gliomas.” npj Digital Medicine 8.1 (2025): 140.

    3. Moon, Hye Hyeon, et al. “Generative AI in glioma: ensuring diversity in training image phenotypes to improve diagnostic performance for IDH mutation prediction.” Neuro-oncology 26.6 (2024): 1124–1135.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The decision to reject this paper is primarily driven by concerns regarding lack of novelty, insufficient experimental validation, and weak justification for key architectural components.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Although the architectural novelty is limited, the model delivers clear, significant improvements in Dice and AUC over state-of-the-art baselines. The authors have systematically addressed all reviewer concerns.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a new model to perform glioma segmentation, IDH genotyping, and grading simultaneously. The model is based on a Tetra-oriented Mamba to effectively explore channel-wise and slice-wise interactions. A T2-FLAIR mismatch feature extraction module is designed to capture inter-modality differences, and a channel-spatial siamese mamba fusion module is proposed to fuse the T2-FLAIR mismatch and multi-modal MRI features. The main contribution of the paper lies in the improvement of the model from these perspectives to enhance the performance on the three tasks.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The multi-task learning framework is effective. (2) The proposed network improvements are reasonable. (3) Extensive experiments have been conducted to validate the effectiveness of the proposed method.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    (1) The methodological novelty of the paper is limited. (2) The standard deviations of the Dice values are very large. The marginal improvements in the mean values are not significant, I presume. (3) As far as I know, BraTS2020 provides 369 training cases and 125 validation cases. It is confusing that the authors used only part of the dataset. (4) The datasets are relatively small. Why not use the recently published BraTS2023 or BraTS2024 datasets?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    One of the major advantages of Mamba is its efficiency. The authors should evaluate the efficiency of the method after introducing the various modules.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the novelty of the paper lies in the design of specific network modules, which only bring marginal performance enhancement while possibly sacrificing the computational efficiency.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    Thanks to the authors for trying to address my initial concerns. However, after reviewing the rebuttal, I remain unconvinced about the methodological novelty and the performance improvement.



Review #3

  • Please describe the contribution of the paper

    Summary and Contributions: This paper presents MTamba, a multi-task learning framework based on structured state-space models (SSMs) for joint glioma segmentation, IDH genotyping, and tumor grading. The architecture introduces a Tetra-oriented Mamba (TeoM) for efficient global information modeling, a T2-FLAIR mismatch module for genotype-related features, and a channel-spatial fusion strategy. The model is evaluated on BraTS2020 and UCSF-PDGM datasets and shows improved performance over existing multi-task methods.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Major Strengths:

    1. Multi-task Design: Integrates segmentation, IDH prediction, and tumor grading into a unified pipeline.
    2. Novel Modules: Introduces unique components like TeoM and mismatch feature extraction to enhance interpretability.
    3. Benchmarking: Outperforms several baselines across multiple tasks and datasets.
    4. Availability: Code is made publicly available, increasing reproducibility.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Major Weaknesses:

    1. Complexity and Scalability: The architecture is sophisticated and may pose practical challenges in real-time or resource-constrained clinical environments. The reliance on multi-component modules (TeoM, mismatch extractors, fusion) increases the difficulty of deployment and debugging.
    2. Interpretability and Transparency: Despite its novel modules, the paper lacks visualizations or attention maps that would illustrate what the model is focusing on. This undermines trust in multi-task predictions for sensitive decisions like tumor grading or genotyping.
    3. Limited Component-Level Analysis: Although the architecture is modular, there are no detailed ablations showing the impact of each block (e.g., T2-FLAIR mismatch vs. segmentation-only encoder). This makes it unclear how essential each component is to performance.
    4. Segmentation Results Underexplored: While classification results are detailed, the segmentation component of the multi-task model is under-analyzed. Metrics like per-class Dice or comparison to established baselines (e.g., nnUNet) are missing.
    5. Relatively Small Dataset: Although BraTS and UCSF-PDGM are used, the effective number of training subjects per task is limited, especially given the multi-task nature of the framework. This makes the model more prone to overfitting.
    6. Potential Institutional Bias and Data Leakage: It is not clear if stratified subject-level splits were used to prevent leakage. Without proper controls, performance may reflect site-specific MRI patterns rather than true generalization.
    7. Lack of Comparison to High-Performing Benchmarks: Several multi-task architectures in recent literature have demonstrated higher segmentation Dice scores and classification AUCs. These should be cited and used as points of reference for evaluating the presented method.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Clarity of Presentation: Well-presented with detailed diagrams. However, some abbreviations and technical terms could use better definitions for clarity.

    Comments on Reproducibility: High reproducibility due to publicly available code and datasets. Modular breakdown of architecture supports transparency.

    Constructive Comments:

    1. Provide ablation studies on the T2-FLAIR module and TeoM contributions.
    2. Include per-class segmentation metrics to assess tumor structure performance.
    3. Add runtime/performance benchmarks to guide future adopters.
    4. Discuss potential for clinical integration and use cases.

    Comments on Experiments: Thorough benchmarking on multiple datasets and tasks. However, results would benefit from deeper component-level analysis.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Justification for Recommendation: This is a well-executed, innovative study with strong performance and open-source support. It balances novelty with utility, making it a valuable contribution to the field.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Addressed my comments.




Author Feedback

We thank the reviewers for their constructive comments. We address their concerns below.

Novelty of MTamba (R1Q7, R2Q1, R3Q1): MTamba is an improvement over SegMamba. Through the Cross-slice and Channel Interaction in TeoM, it mitigates the limitations of ToM in exploring channel and long-range dependencies in feature sequences. The comparison of w/o TeoM vs. TeoM in Fig. 4 shows that TeoM enhances segmentation compared to ToM. Tab. 6 shows the effectiveness of each module. Shared-weight MSCBs explore complementary information between T2 and FLAIR at multiple scales for weighted subtraction of T2 and FLAIR, enhancing the T2-FLAIR mismatch signals. S2 shows that SMEM improves on the direct subtraction of T2 and FLAIR used in Ref. 12 (JBHI 2024). CSSMF explores the complementary information between T2-FLAIR mismatch features and multi-modal MRI features across spatial and channel dimensions, as verified in the Discussion. MTamba achieves effective glioma segmentation, IDH genotyping, and grading. MTamba also shows stronger performance than the multi-task method by Wen et al., published at the end of April in BIOMED SIGNAL PROCES 2025. We will include this comparison in the revised version.

More description for data (R1Q5Q6, R2Q3Q4): Most data in BraTS2020 have segmentation labels, but some lack genetic information. Since 2021, BraTS has focused on segmentation tasks and lacks genetic information. The BraTS2020 subset we used is consistent with Refs. 5, 6, 8 (TMI 2022, BIBM, Neuro-oncology 2023) and includes more than TCGA in Ref. 12 (JBHI 2024). We use UCSF-PDGM and BraTS2020 because they contain both segmentation and genetic information.

Stability for MTamba (R1Q4, R2Q2): Unlike methods that focus on segmentation with large datasets, MTamba achieves high performance when trained on small datasets, with Dice and its variance outperforming other methods on both datasets. Our variance is lower than the segmentation variance reported on BraTS2020 by J Cheng et al., published in Big Data Min Anal 2025, and by Ref. 5 (TMI). M2, based on 2D MRI slices, provides more extensive training but with lower performance. We will collect more data for validation and develop semi-supervised methods in the future.
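
The quantities debated in this point (Dice, its spread across cases, and the standard error requested in the meta-review) can be computed as in the following minimal sketch. It assumes flat binary masks and per-case scores held in plain Python lists; the function names and the example numbers are hypothetical and are not results from the paper.

```python
import math

def dice(pred, target, eps=1e-8):
    """Dice coefficient for two flat binary masks (sequences of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)

def summarize(scores):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed for a sample variance")
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)
    std = math.sqrt(variance)
    return mean, std, std / math.sqrt(n)

# Toy masks: one overlapping voxel, 2 predicted and 1 true foreground voxel.
example_dice = dice([1, 1, 0, 0], [1, 0, 0, 0])

# Hypothetical per-case Dice scores, used only to show the arithmetic.
mean, std, sem = summarize([0.8, 0.9, 1.0])
```

The standard deviation describes how much individual cases vary, while the standard error shrinks with the number of cases and describes the uncertainty of the mean, which is why the two should not be reported interchangeably.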

Efficiency of MTamba (R1Q1, R2): SegMamba is known for its efficiency, and MTamba shows comparable efficiency. LGCBs, based on grouped convolutions, are more efficient than the convolution-based FUE. TeoM does not alter the parameters of ToM, and the increased computation primarily comes from efficient channel-reverse operations. SMEM is based on efficient low-channel shared-weight convolutions. The DMEM is the same as the encoder of SegMamba using TeoM, which is efficient. CSSMF uses depthwise separable convolutions, known for their efficiency, along with a shared-weight TeoM. The public code shows that MTamba increases the parameter count by only half relative to SegMamba while performing well in 3 tasks, achieving a trade-off.
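
The parameter savings of grouped and depthwise separable convolutions mentioned here follow from a simple count, sketched below. The channel width (64), kernel size (3), and group count (8) are hypothetical values chosen for illustration and are not taken from the MTamba code.

```python
def conv3d_params(c_in, c_out, kernel, groups=1, bias=True):
    """Parameter count of a 3D convolution with a cubic kernel.

    Each output channel only sees c_in / groups input channels, so the
    weight count is (c_in / groups) * c_out * kernel^3.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    weights = (c_in // groups) * c_out * kernel ** 3
    return weights + (c_out if bias else 0)

# Standard 3x3x3 convolution, 64 -> 64 channels.
standard = conv3d_params(64, 64, 3)

# Grouped convolution with 8 groups: roughly 8x fewer weights.
grouped = conv3d_params(64, 64, 3, groups=8)

# Depthwise separable: a depthwise 3x3x3 (groups == channels)
# followed by a pointwise 1x1x1 convolution.
separable = conv3d_params(64, 64, 3, groups=64) + conv3d_params(64, 64, 1)
```

With these assumed sizes the counts are 110656 for the standard layer, 13888 for the grouped layer, and 5952 for the depthwise separable pair. Parameter count is only one aspect of efficiency; runtime and memory would need separate measurement.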

Comparison with SegMamba (R1Q3Q7, R3Q2): SegMamba is primarily composed of ToM, FUE, and GSC (unchanged). Tab. 6 S6 and Fig. 4 show ToM vs. TeoM, S4 shows the FUE + Residual block vs. LGCB, and S7 shows cross-slice interaction in TeoM vs. inter-slice interaction in ToM, showing the superiority of MTamba. We outperform SegMamba in segmentation and will add this comparison to the revised version.

Justification for SMEM (R1Q3, R3Q3): Tab. 6 S2 compares SMEM with directly subtracting T2 and FLAIR. We placed the visualization of SMEM in the anonymous link. We verify that SMEM, which explores the complementary information between T2 and FLAIR, weights them separately, and then subtracts them, is effective.
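
The distinction between direct and weighted subtraction can be shown with a toy sketch. It assumes per-voxel weights are already available as plain lists; in SMEM these weights are described as coming from shared-weight multi-scale blocks, which this sketch does not model. All names and values are hypothetical.

```python
def direct_subtraction(t2, flair):
    """Voxel-wise T2 minus FLAIR, treating every voxel equally."""
    return [a - b for a, b in zip(t2, flair)]

def weighted_subtraction(t2, flair, w_t2, w_flair):
    """Voxel-wise w_t2 * T2 minus w_flair * FLAIR.

    The weights stand in for learned attention-like maps (an assumption);
    with all weights equal to 1 this reduces to direct subtraction.
    """
    return [wa * a - wb * b for a, b, wa, wb in zip(t2, flair, w_t2, w_flair)]

# Toy intensities for three voxels (normalized, hypothetical).
t2 = [0.9, 0.8, 0.5]
flair = [0.2, 0.7, 0.5]

direct = direct_subtraction(t2, flair)

# Down-weighting the second voxel in both modalities suppresses its difference.
weighted = weighted_subtraction(t2, flair, [1.0, 0.5, 1.0], [1.0, 0.5, 1.0])
```

Whether learned weights add information beyond the direct difference is the empirical question raised by Reviewer #1; the sketch only shows that the weighted form contains direct subtraction as a special case.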

Tumor grading (R3): We followed page 4 of Ref. 6 (Neuro-oncology 2023) to classify LGG and HGG, and wrote the grade as Grade II. We will make the requested changes.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    The paper introduces MTamba, a multi-task learning framework for glioma progression modeling, with a clinically grounded design that incorporates survival prediction and segmentation. Reviewers commend the paper’s motivation and integration of clinically relevant tasks, as well as its clear presentation. However, significant concerns were raised regarding the limited methodological novelty, lack of justification for architectural design choices, and questions around the scalability and generalizability of the approach. These issues affect the strength of the technical contribution and the clarity of its innovation over prior work. I recommend inviting the authors to submit a rebuttal to address these key concerns.

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Post rebuttal, this paper has received two accepts and one reject. Despite some limitations in terms of architecture, the study seems to produce some improvements at the level of integrating different components together.

    Before this paper can be accepted, the authors must fix the standard errors everywhere in the tables and text.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    BraTS2020 provides 369 training cases and 125 validation cases. It is confusing that the authors used only part of the dataset and did not use the recently published BraTS2023 or BraTS2024 datasets.


