Abstract

Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, existing methods struggle to model long-range global information: convolutional neural networks are constrained by their local receptive fields, and vision transformers suffer from the quadratic complexity of their attention mechanism. Recently, Mamba-based models have gained great attention for their impressive ability in long sequence modeling. Several studies have demonstrated that these models can outperform popular vision models in various tasks, offering higher accuracy, lower memory consumption, and less computational burden. However, existing Mamba-based models are mostly trained from scratch and do not explore the power of pretraining, which has been proven to be quite effective for data-efficient medical image analysis. This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks, leveraging the advantages of ImageNet-based pretraining. Our experimental results reveal the vital role of ImageNet-based training in enhancing the performance of Mamba-based models. Swin-UMamba demonstrates superior performance by a large margin compared to CNNs, ViTs, and the latest Mamba-based models. Notably, on the AbdomenMRI, Endoscopy, and Microscopy datasets, Swin-UMamba outperforms its closest counterpart U-Mamba by an average score of 2.72%. The code and models of Swin-UMamba are publicly available at: https://github.com/JiarunLiu/Swin-UMamba.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1627_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1627_supp.pdf

Link to the Code Repository

https://github.com/JiarunLiu/Swin-UMamba

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Liu_SwinUMamba_MICCAI2024,
        author = { Liu, Jiarun and Yang, Hao and Zhou, Hong-Yu and Xi, Yan and Yu, Lequan and Li, Cheng and Liang, Yong and Shi, Guangming and Yu, Yizhou and Zhang, Shaoting and Zheng, Hairong and Wang, Shanshan},
        title = { { Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents the integration of the Mamba module with Swin-UNet, resulting in the formation of Swin-UMamba and its variant. The study explores the potential of pre-training by initializing the VSS blocks and patch merging layers with parameters that were pre-trained on the ImageNet dataset. To evaluate the effectiveness of the proposed method, three 2D datasets were employed.
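    The partial initialization described above (pretrained parameters for the VSS blocks and patch merging layers, everything else trained from scratch) can be sketched as a key-selection step. This is a minimal illustration, not the authors' code: the function name `plan_partial_init`, the parameter names, and the shapes are all hypothetical, and plain shape tuples stand in for real tensors.

```python
def plan_partial_init(pretrained_shapes, model_shapes, prefixes):
    """Split model parameters into those copied from a checkpoint and those left at random init.

    pretrained_shapes / model_shapes: dict mapping parameter name -> shape tuple.
    prefixes: name prefixes of the modules that should receive pretrained weights.
    All names used with this function are hypothetical placeholders.
    """
    load, scratch = [], []
    for name, shape in model_shapes.items():
        wanted = name.startswith(tuple(prefixes))
        # Copy only when the parameter is in a selected module AND the shapes agree.
        if wanted and pretrained_shapes.get(name) == shape:
            load.append(name)
        else:
            scratch.append(name)
    return load, scratch
```

    Checking shapes before copying is a defensive choice: a layer whose input resolution or channel count differs from the checkpoint is left at its random initialization instead of failing at load time.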

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The Mamba module is a new and promising addition to the field. Its exploration, improvement, and application within the medical imaging domain should be encouraged.
    2. The proposed model demonstrates superior segmentation performance compared to other advanced segmentation models.
    3. This paper is well-written and organized.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. My primary concern lies in the limited novelty of the method. This approach simply replaces the Transformer block with the VSS block, without any further modifications or improvements. Although this paper may be the first to utilize pre-training in Mamba-based methods, this technique is commonly used and its effectiveness is well-documented.

    2. The paper does not comprehensively demonstrate the effectiveness and efficiency of the Mamba module. It claims that the Mamba module offers higher accuracy, lower memory consumption, and reduced computational burden compared to traditional Transformers. However, it lacks the necessary ablation studies to substantiate these claims.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Refer to weaknesses.
    2. In Table 2, the reason behind setting up an experiment without deep supervision and adjusting the epoch number to 200 needs clarification.
    3. In Table 2, it appears that a higher NSD value indicates better performance. Could you please clarify why this is the case?
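    On the NSD question above: NSD (normalized surface Dice) is, like DSC, an agreement score between 0 and 1, so higher values indicate better segmentation. The sketch below is a simplified 2D illustration under stated assumptions, not the evaluation code used in the paper: masks are lists of 0/1 rows, the boundary is taken as foreground pixels with a background or off-grid 4-neighbour, and the tolerance `tol` (in pixels) is an assumed parameter.

```python
from math import hypot

def dice(pred, gt):
    """Dice similarity coefficient: 2 * |A and B| / (|A| + |B|)."""
    inter = sum(p & g for pr, gr in zip(pred, gt) for p, g in zip(pr, gr))
    total = sum(map(sum, pred)) + sum(map(sum, gt))
    return 1.0 if total == 0 else 2.0 * inter / total

def boundary(mask):
    """Foreground pixels with at least one 4-neighbour that is background or off-grid."""
    h, w = len(mask), len(mask[0])
    pts = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    pts.append((y, x))
                    break
    return pts

def nsd(pred, gt, tol=1.0):
    """Fraction of boundary pixels (both masks) lying within `tol` of the other boundary."""
    bp, bg = boundary(pred), boundary(gt)
    if not bp and not bg:
        return 1.0
    if not bp or not bg:
        return 0.0

    def near(src, dst):
        return sum(1 for (y, x) in src
                   if min(hypot(y - v, x - u) for (v, u) in dst) <= tol)

    return (near(bp, bg) + near(bg, bp)) / (len(bp) + len(bg))
```

    For a 4x4 square in a 6x6 grid compared with the same square shifted one pixel sideways, this sketch gives a DSC of 0.75 but an NSD of 1.0 at `tol=1.0`: NSD forgives boundary errors within the tolerance, while DSC penalizes every non-overlapping pixel.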
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Given the limited novelty of the method and the insufficient experimental results provided, I recommend a ‘Weak Reject’ for this paper.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes the Swin-UMamba model with ImageNet-based pretraining. Experimental evaluation shows that it outperforms the other compared methods, including U-Mamba.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. It proposes a new Mamba variant for medical image segmentation

    2. Using pretraining for medical image segmentation is a simple yet effective idea

    3. Experimental evaluation is comprehensive

    4. The authors state that the implementation code will be made publicly available

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. All comparison methods for abdominal MRI were 2D, is that correct? Can you clarify?

    2. Please double-check reference [18] for the AbdomenMRI dataset

    3. I suggest reporting the standard deviation for the DSC and NSD in Table 2

    4. It would be helpful to improve the presentation of Fig. 2(a) and provide more visualization examples in the supplementary materials

    5. It would be interesting to explore the corresponding 3D based architecture
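    The standard-deviation suggestion in point 3 amounts to summarizing per-case scores as mean ± standard deviation. The following is a minimal sketch of that reporting step; the helper name `summarize` and any scores passed to it are illustrative assumptions, not values from the paper.

```python
from statistics import mean, stdev

def summarize(scores):
    """Format per-case metric values (e.g. DSC or NSD) as 'mean ± sample std'."""
    m = mean(scores)
    # The sample standard deviation needs at least two cases; report 0 otherwise.
    s = stdev(scores) if len(scores) > 1 else 0.0
    return f"{m:.4f} ± {s:.4f}"
```

    The sample standard deviation (n - 1 in the denominator) is used here because test cases are typically treated as a sample from a larger population; the population form would be `statistics.pstdev`.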

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please see the weaknesses above

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is a nice paper with a novel structure and comprehensive evaluations.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation tasks, leveraging the advantages of ImageNet-based pretraining. Swin-UMamba demonstrates superior performance compared to CNNs, ViTs, and the latest Mamba-based models on the AbdomenMRI, Endoscopy, and Microscopy datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Proposal of two Mamba-based networks Swin-UMamba and Swin-UMamba† for medical image segmentation, which are designed to unify the power of pretrained models with computation requirements towards real-world deployment.
    • First attempt to discover the impact of pretrained Mamba-based networks in medical image segmentation.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Implementation specifics for other models from the literature are not clearly defined. For instance, it’s unclear whether the same number of epochs was employed as with the pre-trained models.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Add the implementation details of the models considered for comparisons (i.e., are they the same used for the method proposed?)

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors introduce a novel architecture and showcase its effectiveness across three distinct datasets. They also reaffirm the significance of pre-training.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank the area chair and all reviewers for their thorough evaluation and insightful comments. We will update our camera-ready version according to the reviewers’ comments.




Meta-Review

Meta-review not available, early accepted paper.


