Abstract

The Transformer architecture has demonstrated remarkable results in 3D medical image segmentation due to its capability of modeling global relationships. However, it poses a significant computational burden when processing high-dimensional medical images. Mamba, as a State Space Model (SSM), has recently emerged as a notable approach for modeling long-range dependencies in sequential data, and has excelled in the field of natural language processing with its remarkable memory efficiency and computational speed. Inspired by this, we devise SegMamba, a novel 3D medical image Segmentation Mamba model, to effectively capture long-range dependencies within whole-volume features at every scale. Our SegMamba outperforms Transformer-based methods in whole-volume feature modeling, maintaining high efficiency even at a resolution of 64 × 64 × 64, where the sequential length is approximately 260k. Moreover, we collect and annotate a novel large-scale dataset (named CRC-500) to facilitate benchmarking evaluation in 3D colorectal cancer (CRC) segmentation. Experimental results on our CRC-500 and two public benchmark datasets further demonstrate the effectiveness and universality of our method.
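As a quick check on the figure quoted in the abstract, flattening a 64 × 64 × 64 volume into a token sequence gives 64³ = 262,144 positions, which is where "approximately 260k" comes from. The sketch below is illustrative only: it assumes one token per voxel, and the pairwise-interaction count is a generic property of full self-attention, not a measurement from the paper. The function names are hypothetical.

```python
def sequence_length(depth: int, height: int, width: int) -> int:
    """Number of tokens when every voxel of a D x H x W volume becomes one token."""
    return depth * height * width


def attention_pairs(n_tokens: int) -> int:
    """Pairwise interactions scored by full self-attention (quadratic in n_tokens)."""
    return n_tokens * n_tokens


n = sequence_length(64, 64, 64)
print(n)                    # 262144
print(attention_pairs(n))   # 68719476736
```

The quadratic count is the reason whole-volume self-attention becomes costly at this resolution, whereas a sequential scan touches each of the 262,144 positions once per pass.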

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0663_paper.pdf

SharedIt Link: https://rdcu.be/dZxek

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72111-3_54

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Xin_SegMamba_MICCAI2024,
        author = { Xing, Zhaohu and Ye, Tian and Yang, Yijun and Liu, Guang and Zhu, Lei},
        title = { { SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15008},
        month = {October},
        pages = {578--588}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces SegMamba, which combines a U-shaped structure with Mamba and is designed to capture long-range dependencies within whole-volume features at various scales. This approach addresses the challenge of modeling global relationships in high-dimensional medical images while maintaining superior processing speed compared to Transformer-based methods. Additionally, the paper contributes a new large-scale dataset, CRC-500, for 3D colorectal cancer segmentation, filling a gap in the availability of comprehensive datasets in this domain. Furthermore, SegMamba achieves state-of-the-art performance on multiple datasets, which is of practical value to the community.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A 3D Mamba model is developed for efficient feature representation.
    2. A new dataset is collected along with annotation masks for 3D colorectal cancer (CRC) segmentation.
    3. The tri-orientated Mamba (ToM) module is introduced for feature representation, though the motivation behind it is unclear.
    4. The gated spatial convolution (GSC) module and feature-level uncertainty estimation (FUE) module are implemented to improve feature representation.
    5. An extended ablation study is conducted.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The motivation behind the ToM module lacks comprehensive support and good reasons.
    2. No statistical tests were conducted to validate its effectiveness.
    3. Additionally, the impact of the uncertainty module remains unclear to me.
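One way the missing statistical test could be approached, sketched under assumptions: given per-case Dice scores for two methods on the same test cases, an exact paired sign-flip permutation test asks how often a mean difference at least as large as the observed one would arise if the two method labels were exchangeable. The function name and the scores below are hypothetical and are not taken from the paper.

```python
from itertools import product


def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided paired permutation test on the summed difference.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if flipped >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Hypothetical per-case Dice scores for two methods on the same six cases.
method_a = [0.82, 0.79, 0.91, 0.85, 0.77, 0.88]
method_b = [0.80, 0.75, 0.90, 0.81, 0.76, 0.84]
print(paired_sign_flip_p_value(method_a, method_b))
```

For test sets with hundreds of cases, a random sample of sign assignments or a Wilcoxon signed-rank test would be the more practical choice; the exact enumeration is used here only to keep the example deterministic.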
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The ToM module simply applies the Mamba module to three sequences. It’s unclear to me why all three sequences are necessary or crucial for medical tasks. The Mamba module should be able to capture long-range dependencies from the sequence, and processing three sequences seems unnecessary and could add extra computational burden.

    2. In 3D medical images, uncertainty usually occurs in the representation of object boundaries. How can the FUE enhance the inherent uncertainty of the representation? No feature visualizations are provided to support this claim.

    3. The performance gain on the CRC-500 dataset is much higher than on two other public datasets when compared with state-of-the-art approaches. What is the reason for this? Did the authors use any pre-training weights? Were all methods evaluated under the same settings?

    4. Since uncertainty is mentioned in the method, I would also expect to see statistical tests of the model’s performance.
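To make the first comment concrete, the sketch below shows how a single 3D volume can be flattened into three different 1D sequences. The specific orderings used here (forward, reverse, and a slice-axis-first order) are an assumption for illustration; the reviews only state that three sequences are used, and the function name is hypothetical.

```python
def flatten_orders(volume):
    """Return three 1D orderings of a nested-list volume indexed as volume[d][h][w]."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    # Forward: scan each slice row by row, slice after slice.
    forward = [volume[d][h][w]
               for d in range(depth) for h in range(height) for w in range(width)]
    # Reverse: the same path walked backwards.
    reverse = forward[::-1]
    # Inter-slice: for each in-plane position, walk through the slices first.
    inter_slice = [volume[d][h][w]
                   for h in range(height) for w in range(width) for d in range(depth)]
    return forward, reverse, inter_slice


# Tiny 2 x 2 x 2 volume whose voxel values are their forward-scan positions.
volume = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
forward, reverse, inter_slice = flatten_orders(volume)
print(forward)      # [0, 1, 2, 3, 4, 5, 6, 7]
print(reverse)      # [7, 6, 5, 4, 3, 2, 1, 0]
print(inter_slice)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

One possible reading of why more than one ordering might matter: voxels 0 and 4 are neighbours across slices but sit four positions apart in the forward scan, while the slice-axis-first order places them next to each other. Whether this benefit justifies the extra computation is exactly the question the reviewer raises.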

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The motivation and necessity of the proposed modules are not clear to me, and no visualization support or acceptable reasons are provided for them. See the detailed comment section.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents a novel medical image segmentation method based on a Mamba model, enhanced with spatial convolutions so as not to lose spatial context and with feature-level uncertainty estimation modules. It achieves competitive to improved segmentation results on a colon cancer dataset, BraTS, and AIIB.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A novel deep learning segmentation architecture is constructed with a range of interesting ideas. It demonstrates very good performance across a range of medical image segmentation datasets (colon/brain cancer and airway segmentation). The architecture is clearly described and should be reproducible. Contributing a new dataset is great, assuming it will become publicly available.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The resource requirements for training the model (time, memory, number of parameters) could be stated more clearly, also in the context of the ablation study (1000 epochs sounds like a lot, is it needed?).

    A discussion of how practically relevant the improvement in the metrics is for the different segmentation targets could be useful. Clarifying some statements in the visual comparison under 3.3 could be useful: what is meant by better consistency? How was it determined that SegMamba detects a greater number of branches in AIIB?

    It is not clear if the CRC-500 dataset will be publicly available.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I’m not quite sure if the CRC-500 dataset in the paper will be publicly available or not, or if it is at all possible to gain access. I’d expect it will be.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Clarify availability/accessibility of the CRC-500 dataset. It may also help if the paper refers to BraTS and AIIB from the start, as their use was not clear to me until much later in the paper.

    Add a discussion on the resources required by model training and inference, including for the variations in the ablation study.

    Add a discussion on how practically relevant the improvement in the metrics is, maybe specifically for CRC-500, as much as this is possible.

    Under visual inspection in 3.3, clarify what is meant by better consistency and how it was judged that the model can detect a greater number of branches for AIIB. The better boundary detection for BraTS can probably be justified by the HD95 scores.
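For readers unfamiliar with the metric mentioned above, HD95 is commonly computed as the 95th percentile of surface distances between prediction and ground truth, taken in both directions. The sketch below is a simplified version on small point sets. The nearest-rank percentile and the max-of-two-directions convention are assumptions, since implementations differ and the paper's exact variant is not stated here; all names and coordinates are hypothetical.

```python
import math


def directed_distances(points_a, points_b):
    """For each point in points_a, the distance to its nearest point in points_b."""
    return [min(math.dist(p, q) for q in points_b) for p in points_a]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def hd95(surface_pred, surface_gt):
    """Symmetric 95th-percentile Hausdorff distance between two surface point sets."""
    forward = percentile_nearest_rank(directed_distances(surface_pred, surface_gt), 95)
    backward = percentile_nearest_rank(directed_distances(surface_gt, surface_pred), 95)
    return max(forward, backward)


# Hypothetical surface voxels (z, y, x) for a prediction and a ground truth.
pred = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
gt = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 3, 1)]
print(hd95(pred, gt))
```

Because the metric is driven by the largest boundary deviations, a lower HD95 is reasonable quantitative support for a claim of better boundary delineation, which is the point the reviewer makes.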

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents strong segmentation results with a new model architecture that is well explained and demonstrated. There are only minor weaknesses (see above, if addressed well these would of course strengthen the paper).

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper proposes a new adaptation of the long-sequence state space model Mamba for 3D medical image segmentation, with a core module that models multi-directional and spatial dependencies and a specialized block that estimates channel-wise feature uncertainty. The model is tested on the BraTS2023 glioma segmentation and AIIB2023 airway segmentation benchmarks, and demonstrates superior performance compared to selected baselines, as well as improved memory and training/inference time costs compared to convolutions, Swin and quadratic self-attention. Moreover, the authors launch a new dataset benchmark for colorectal cancer, CRC-500.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Structure: The paper is clearly written and has relevant references within the field.

    • Improved results: The authors present solid results and significantly improved performance both in terms of segmentation and memory/time footprint across multiple modalities and segmentation objectives. Comparing the results reported in the paper with those of non-cited works seems to confirm the improvement introduced by this method.

    • Method innovations: The introduction of the feature-level uncertainty estimation block is not integral to the proposed methodology, but still contributes a notable amount to the results of the ablation. The authors also address 3D spatial modelling with Mamba using three views. The paper performs exhaustive ablations of their model additions, and while just using Mamba as a backbone still outperforms the baselines, their contributions significantly boost this performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • References: The paper could have benefited from more references to recent general segmentation literature. It is also not obvious why the cited, and highly relevant, reference U-mamba was not included in the comparisons, but it may be suspected that U-mamba coincides with method M1 in the paper.

    • Lacks discussion on components: The paper lacks discussion of what the implemented channel-wise FUE block means and what impact it has on the subsequent feature representation in the model. The same lack of discussion applies to the ToM blocks.

    • Visual comparisons: The paper claims better continuity for branches in AIIB2023, but provides no quantitative measures to support that claim.

    • Statistical analysis: When using smaller datasets it is generally appropriate to perform some sort of k-fold cross-validation, to ensure that the specific data split is not especially favourable.
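The k-fold cross-validation suggested in the last point can be sketched as follows. This is a minimal illustration with a hypothetical helper name and made-up case counts, not the authors' evaluation protocol: it partitions case indices into k near-equal folds so that every case is used for validation exactly once.

```python
def k_fold_splits(n_cases, k):
    """Yield (train_indices, val_indices) for k contiguous, near-equal folds."""
    base, extra = divmod(n_cases, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional case each.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = [i for i in range(n_cases) if i < start or i >= start + size]
        yield train, val
        start += size


# Hypothetical: 10 cases, 5 folds. In practice the case order would be shuffled first.
for train, val in k_fold_splits(10, 5):
    print(val, len(train))
```

Reporting the mean and spread of the metric across folds would show whether a single favourable split explains the reported gains.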

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper has also promised to release a new dataset on colorectal cancer, CRC-500.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Small figures: Fig. 4 is too small to read comfortably. Given the strict page limit, and since it is passable to read online, this is acceptable. However, I encourage the authors to add arrows to the images and text to the captions to tell the reader what to focus on.

    • FUE discussion: It would seem that the formulation of uncertainty as an entropy would lead to the homogenization of feature channels. While it is hard to say if this is beneficial, it would be an interesting dissection of the block and its functionality.

    • Clarification on the method names: There are inconsistencies between the method names “MX” in the tables and the text.
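As background for the FUE discussion point above, the sketch below shows how an entropy-style uncertainty behaves on a vector of channel activations. It is a generic illustration and not the paper's FUE formulation, which is not given in these reviews: the softmax normalization, the function names, and the activation values are all assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs):
    """Entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical activations of one feature vector across four channels.
flat = [1.0, 1.0, 1.0, 1.0]      # identical channels
peaked = [8.0, 0.0, 0.0, 0.0]    # one dominant channel
print(shannon_entropy(softmax(flat)))    # log(4), the maximum for four channels
print(shannon_entropy(softmax(peaked)))  # close to zero
```

Entropy is largest when the normalized values are uniform and smallest when one value dominates, so how a block uses this quantity determines whether it pushes features toward or away from uniformity. That is the behaviour the reviewer asks the authors to examine.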

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    • Improved results: The improved results over the baselines are consistent and quite large. The lowered memory footprint is a key selling point in the medical segmentation community.

    • Translatable method innovations: The paper makes contributions on how to handle 3D data with Mamba and makes proper ablations to prove their effectiveness.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

N/A




Meta-Review

Meta-review not available, early accepted paper.


