Abstract

Accurate classification of muscle invasion in bladder cancer using computer-aided diagnosis (CAD) is crucial for timely intervention and improved prognosis. Despite advances in deep learning for medical image analysis, muscle invasion classification remains limited by the scarcity of publicly available annotated datasets. To address this, we introduce T2WI-BCMIC, the first expert-annotated dataset for bladder cancer muscle invasion classification. T2WI-BCMIC contains non-fat saturated T2-weighted magnetic resonance imaging (MRI) images with five-class annotations, covering various invasion depths. We establish a benchmark using several popular deep learning architectures, providing a solid foundation for future comparisons. However, achieving further performance improvements remains challenging due to the small dataset size. Therefore, we propose a novel search-based data augmentation algorithm that increases data diversity by maximizing the divergence from the class-specific manifold, while preserving the class distribution to maintain class identity. Experimental results on T2WI-BCMIC show that our algorithm outperforms existing methods, achieving significant performance improvements. The T2WI-BCMIC dataset and benchmark are available at: https://github.com/T2-MI/T2WI-BCMIC for further research.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3675_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{HuaHan_T2WIBCMIC_MICCAI2025,
        author = { Huang, Han and Chen, Weiyi and Wu, Qiuxia and Wang, Huanjun and Cai, Qian and Guo, Yan},
        title = { { T2WI-BCMIC: Non-Fat Saturated T2-Weighted Imaging Dataset for Bladder Cancer Muscle Invasion Classification } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {573 -- 583}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a new non-fat saturated T2-weighted MRI dataset for bladder cancer muscle invasion classification, annotated using the VI-RADS system. The dataset includes 353 (or 335) images across five risk categories. Along with the dataset, the authors also propose a novel data augmentation algorithm that optimizes augmented samples by means of a genetic algorithm.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Introduction of a publicly available dataset for muscle invasion classification provided with expert annotations
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The augmentation method is not clearly grounded in existing literature.
    • The Genetic Algorithm implementation is under-specified: What is the fitness function? How are individuals (chromosomes) encoded? What are the evolutionary parameters (e.g., population size, crossover/mutation rate)? What type of evolution strategy is used?
    • The comparative approaches used are not explained, nor is the rationale for selecting these specific techniques provided
    • The results show only modest improvements, and no statistical analysis is provided.
    • The rationale for selecting the three specific models used in the augmentation experiments is not clearly explained
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The dataset contribution is valuable but the proposed augmentation method lacks sufficient contextual grounding in related work, and the experimental rigour is not yet strong enough to fully support the claims.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces the first expert-annotated dataset for bladder cancer muscle invasion classification.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper introduces the first expert-annotated dataset for bladder cancer muscle invasion classification, and proposes a search-based data augmentation algorithm that maximizes the divergence from the class-specific manifold while preserving the class distribution to maintain class identity.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The novelty of the paper is limited. The Keypoint Detection Module simply filters contours based on predefined pixel intensity thresholds and ranks the selected contours based on area and positional bias. The Augmented Data Search Module (ADSM) is intended to be the main contribution of the paper. However, the Augmented Data Search Module primarily relies on a Genetic Algorithm, which is a well-established technique and not inherently novel in this study.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of the paper is limited. Please see the major weaknesses section.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper introduces a dataset for bladder cancer muscle invasion classification, builds benchmark models, and proposes an augmentation method that can be useful for improving model performance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper introduces a new dataset for bladder cancer muscle invasion classification, which can be useful to accelerate research on this problem. The benchmarks are well established and the experiments demonstrate that the proposed augmentation method can significantly improve the performance.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The dataset is not very large and was collected at a single center, which carries a risk of bias.
    2. As foundation models are developing fast, including some benchmarks based on foundation models could be beneficial, especially when the dataset is not very large.
    3. The augmentation method is not as straightforward as other augmentation methods. It would be beneficial to further analyze the cost of applying this augmentation method, such as its time complexity.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The dataset proposed in this paper can be useful for bladder cancer muscle invasion classification research. Some benchmark models are established and the proposed augmentation method can improve the performance, as demonstrated by the experiments.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors addressed some of my concerns in their feedback. I also agree with the comments from other reviewers, particularly regarding the limited novelty. However, I prefer to view this paper as one that introduces new questions and datasets. From that perspective, I find it acceptable, if the dataset and benchmarks will be open-sourced and reproducible. Therefore, I will keep my score unchanged.




Author Feedback

We thank all reviewers (Reviewer #1, Reviewer #4, Reviewer #5) for their insightful comments and helpful suggestions.

R1.1: Lack of literature grounding for augmentation. The method is grounded in deformation-based techniques, widely adopted for their simplicity and strong support for anatomical plausibility and clinical interpretability [1]. This point will be clarified in the Introduction of the revised manuscript.

R1.2: GA Implementation.

Fitness Function: The Constrained Optimization Model (COM, Sec. 3.1) serves as the fitness function, assessing augmentation quality. It is our main technical contribution; further details are provided in R5.1.

Encoding: Individuals encode affine transformations as lists of transformed keypoints, e.g., [[x1, y1], [x2, y2], [x3, y3]].

Strategy & Parameters: A standard GA is employed, following PlatEMO [2] and our code implementation. Parameters include population size 100, crossover probability 1.0, and mutation rate 1/16 per coordinate (16 coordinates in total). A more detailed explanation will be added in the revised manuscript.
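For readers who want a concrete picture of the setup described in R1.2, the following is a minimal sketch of a genetic algorithm over keypoint lists, not the authors' implementation. The population size, crossover probability, and per-coordinate mutation rate come from the rebuttal. Everything else is an assumption made for illustration: `placeholder_fitness` is a hypothetical stand-in for COM (whose definition is in Sec. 3.1 of the paper, not on this page), and the tournament selection, elitism, uniform crossover, Gaussian mutation, `MAX_SHIFT`, and `SIGMA` are choices of this sketch.

```python
import random

# Values stated in the rebuttal.
POP_SIZE = 100
P_CROSSOVER = 1.0
N_COORDS = 16                 # 16 coordinates in total, i.e. 8 (x, y) keypoints
P_MUTATION = 1.0 / N_COORDS   # per-coordinate mutation rate

# Assumptions made for this sketch only.
MAX_SHIFT = 10.0              # hypothetical bound on keypoint displacement
SIGMA = 2.0                   # hypothetical mutation step size


def placeholder_fitness(individual, reference):
    """Hypothetical stand-in for COM: reward displacement, reject out-of-bound moves."""
    total = 0.0
    for (x, y), (rx, ry) in zip(individual, reference):
        dx, dy = x - rx, y - ry
        if abs(dx) > MAX_SHIFT or abs(dy) > MAX_SHIFT:
            return -1.0       # constraint violated
        total += dx * dx + dy * dy
    return total


def random_individual(reference, rng):
    """Perturb each reference keypoint by at most half the allowed shift."""
    half = MAX_SHIFT / 2
    return [[x + rng.uniform(-half, half), y + rng.uniform(-half, half)]
            for x, y in reference]


def crossover(a, b, rng):
    """Uniform crossover at keypoint level, applied with probability P_CROSSOVER."""
    if rng.random() >= P_CROSSOVER:
        return [list(p) for p in a]
    return [list(pa) if rng.random() < 0.5 else list(pb) for pa, pb in zip(a, b)]


def mutate(individual, rng):
    """Gaussian mutation applied independently to each coordinate."""
    return [[c + rng.gauss(0.0, SIGMA) if rng.random() < P_MUTATION else c
             for c in point] for point in individual]


def tournament(pop, scores, rng):
    """Binary tournament selection (an assumption of this sketch)."""
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[i] if scores[i] >= scores[j] else pop[j]


def evolve(reference, generations=20, seed=0):
    """Run the sketch GA and return the best keypoint list and its fitness."""
    rng = random.Random(seed)
    pop = [random_individual(reference, rng) for _ in range(POP_SIZE)]
    for _ in range(generations):
        scores = [placeholder_fitness(ind, reference) for ind in pop]
        children = [pop[scores.index(max(scores))]]   # keep the current best
        while len(children) < POP_SIZE:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    scores = [placeholder_fitness(ind, reference) for ind in pop]
    best_score = max(scores)
    return pop[scores.index(best_score)], best_score
```

Calling `evolve` with eight reference keypoints returns one candidate set of transformed keypoints. In the authors' pipeline these would presumably parameterise the image deformation, a step this sketch does not cover.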

R1.3: Missing rationale for comparative methods. Traditional augmentations (e.g., rotation, Cutout) are chosen because they serve as widely used, data-independent baseline methods [3]. SOTA approaches are selected due to their strong performance and prominence in recent literature, ensuring a comprehensive and fair comparison.

R1.4: Modest gains, no statistical analysis. The performance gain is substantial given the small-sample nature of medical data. Our main contribution lies in the novel problem modeling that enables these improvements.

R1.5: Model selection rationale. GoogLeNet was chosen for its lightweight architecture, while Res2Net50 and ResNeSt50 offer advanced design and proven performance in medical imaging tasks [4, 5]. Their pre-trained versions enable effective transfer learning on limited data.

R1.6: No mention of open access. The code and dataset are being organized for public release at: https://github.com/T2-MI/T2WI-BCMIC

R4.1: Small, single-center dataset. This limitation is acknowledged. To reduce bias, demographic diversity was considered during data collection (see supplementary). Multi-center validation will be prioritized in future work.

R4.2: Foundation model benchmarks. Several foundation models were evaluated (VGG, ResNet, ViT, Swin). Transformer-based models such as ViT and Swin yielded poor accuracy (<50%) and were thus excluded from final comparisons due to limited relevance.

R4.3: Complexity and cost. The computational cost can be estimated as O(T × N × (C_image + C_eval)), with T as iterations and N as population size. While more complex than one-shot methods, the search process adaptively discovers more effective augmentations, making the cost worthwhile.
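As a rough illustration of the cost expression in R4.3, the snippet below evaluates T × N × (C_image + C_eval) for one search run. Only the population size N = 100 comes from the rebuttal. The iteration count and the per-individual costs are hypothetical numbers chosen to make the arithmetic easy to follow, and the function name is likewise an assumption.

```python
def estimated_search_cost(iterations, population_size, cost_image, cost_eval):
    """Total cost of one search run: T * N * (C_image + C_eval)."""
    return iterations * population_size * (cost_image + cost_eval)


# Hypothetical figures in milliseconds; only N = 100 is taken from the rebuttal.
total_ms = estimated_search_cost(iterations=50, population_size=100,
                                 cost_image=2, cost_eval=8)
print(total_ms / 1000, "seconds")  # 50.0 seconds
```

The estimate scales linearly in both T and N, so halving either the number of iterations or the population size halves the cost under this model.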

R5.1: Limited novelty. Though KPDM and ADSM use established tools, the key innovation lies in the fitness model—our Constrained Optimization Model (COM, Sec. 3.1)—which directs the search process. The GA and deformation type are replaceable components; the real novelty is the adaptive, mathematically guided augmentation strategy.

[1] Goceri, Evgin. “Medical image data augmentation: techniques, comparisons and interpretations.” Artificial Intelligence Review 56.11 (2023): 12561-12605.

[2] Tian, Ye, et al. “PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum].” IEEE Computational Intelligence Magazine 12.4 (2017): 73-87.

[3] Wang, Yulin, et al. “Regularizing deep networks with semantic data augmentation.” IEEE Transactions on Pattern Analysis and Machine Intelligence 44.7 (2021): 3733-3748.

[4] Xu, Wanni, You-Lei Fu, and Dongmei Zhu. “ResNet and its application to medical image processing: Research progress and challenges.” Computer Methods and Programs in Biomedicine 240 (2023): 107660.

[5] Yao, Xujing, et al. “Glomerulus classification via an improved GoogLeNet.” IEEE Access 8 (2020): 176916-176923.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Although creating a novel expert-annotated dataset is relevant for further research in the field, the limitations in the proposed solution and some high-level motivations of the implementation limit the consideration of the work for publication. The reviewers agree on the interest related to creating a new dataset that can open new studies, but they have also identified limitations in the proposed augmentation strategy and the motivation behind model choices and proposed comparison.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The rebuttal addressed several concerns, e.g., the availability of the dataset to the public. However, the responses to key concerns regarding novelty and scientific quality, although discussed in the rebuttal, were unconvincing.


