Abstract

Active Learning (AL) is a promising solution in medical image segmentation to reduce annotation costs by selecting the most informative training samples. However, traditional warm-start AL methods rely on iterative querying and fail to address the cold-start dilemma. While Cold-Start Active Learning (CSAL) attempts to resolve this, current methods are limited to 2D images and neglect Self-Supervised Learning (SSL)’s potential for uncertainty estimation in AL. Moreover, while hybrid uncertainty-diversity sampling has been discussed in the warm-start setting, the efficacy of this combined approach has not been explored in CSAL. In this paper, we present CSAL-3D: a novel Cold-Start Active Learning framework for 3D medical image segmentation. Firstly, a CSAL-adapted SSL pipeline for ensemble-based uncertainty estimation and 3D-oriented feature extraction is proposed. Secondly, a novel Uncertainty-Reinforced Diversity Sampling (URDS) strategy is introduced, which synthesizes cluster representativeness and sample-level uncertainty in a hierarchical process. It can select samples that are both uncertain and representative in one shot. Experiments on Brain Tumor, Heart and Spleen organ segmentation tasks from CT or MRI 3D images show that CSAL-3D outperforms other state-of-the-art CSAL counterparts. The source code is available at https://github.com/HiLab-git/CSAL-3D.
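
The hierarchical idea described in the abstract (cluster for representativeness, then use sample-level uncertainty within each cluster) can be illustrated with a minimal sketch. This is not the authors' URDS implementation, which lives in the linked repository; the function names `kmeans` and `select_one_shot`, the `top_frac` parameter, and the rule "keep the members closest to the centroid, then take the most uncertain one" are all assumptions made for illustration. Features and uncertainty scores are assumed to be pre-computed, one per unlabeled volume.

```python
import math


def kmeans(features, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [list(features[0])]
    while len(centroids) < k:
        # Next centroid: the sample farthest from its nearest existing centroid.
        far = max(features, key=lambda f: min(math.dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(features)
    for _ in range(iters):
        # Assign every sample to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(f, centroids[c]))
                  for f in features]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def select_one_shot(features, uncertainty, budget, top_frac=0.5):
    """Hypothetical one-shot selection: one sample per cluster.

    Within each cluster, keep the `top_frac` members closest to the centroid
    (representativeness), then pick the most uncertain of those.
    """
    centroids, labels = kmeans(features, budget)
    selected = []
    for c in range(budget):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue
        members.sort(key=lambda i: math.dist(features[i], centroids[c]))
        keep = members[:max(1, math.ceil(len(members) * top_frac))]
        selected.append(max(keep, key=lambda i: uncertainty[i]))
    return selected


# Toy data: two well-separated groups of "volume features" (assumed values).
features = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0),
            (10.0, 0.0), (10.2, 0.0), (11.0, 0.0)]
uncertainty = [0.9, 0.3, 1.0, 0.2, 0.7, 0.95]
print(select_one_shot(features, uncertainty, budget=2))
```

In this toy example the most uncertain sample of each group is also the one farthest from its centroid, so the representativeness filter excludes it and a more central sample is chosen instead. That is the trade-off a hybrid uncertainty-diversity rule is meant to make; the actual weighting used in the paper may differ.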

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2315_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/HiLab-git/CSAL-3D

Link to the Dataset(s)

Medical Segmentation Decathlon dataset: http://medicaldecathlon.com/

BibTeX

@InProceedings{ZhuNin_CSAL3D_MICCAI2025,
        author = { Zhu, Ning and Ye, Ping and Zhong, Lanfeng and Yue, Qiang and Zhang, Shaoting and Wang, Guotai},
        title = { { CSAL-3D: Cold-start Active Learning for 3D Medical Image Segmentation via SSL-driven Uncertainty-Reinforced Diversity Sampling } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {120--130}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The main contribution of the paper is the introduction of CSAL-3D, a novel Cold-Start Active Learning framework for 3D medical image segmentation that integrates Self-Supervised Learning (SSL) for ensemble-based uncertainty estimation. It proposes an Uncertainty-Reinforced Diversity Sampling (URDS) strategy to select both diverse and uncertain samples. This method aims at addressing the cold-start dilemma in active learning, particularly focusing on 3D images which have been less explored compared to 2D images.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The logical structure of the paper makes it easy to follow the authors’ arguments and understand their contributions. (1) A clear methodology that combines SSL-driven feature extraction with uncertainty estimation, specifically tailored for 3D medical images. (2) The URDS strategy, which effectively merges diversity and uncertainty sampling to improve model performance under low annotation budgets. (3) Comprehensive experiments demonstrating superior performance over other state-of-the-art methods on various datasets.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    (1) The abstract could be improved for clarity and focus, making the key contributions more accessible. (2) Results from Figure 2 indicate that the proposed method performs well under moderate annotation budgets but may not excel in very low or very high budget scenarios. Authors should provide insights into why this phenomenon occurs. (3) Formulas are presented without sufficient contextual integration, potentially hindering readability. (4) In Figure 3, the Heart dataset results show minimal difference between the proposed method and others, suggesting limited advantage in certain scenarios. (5) It is unnecessary and not ideal to start a new paragraph immediately after Equation (7).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a significant methodological advancement in the field of medical image segmentation by proposing a novel framework that addresses the cold-start problem in active learning. Despite some minor issues with presentation and explanation, the technical quality and experimental validation support its acceptance.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, a new approach called CSAL-3D is introduced, which is designed for Cold-Start Active Learning (CSAL) in 3D medical image segmentation. The key ideas are:

    A CSAL-adapted Self-Supervised Learning (SSL) framework is developed. This framework is tailored to extract meaningful 3D features from the images and, importantly, to estimate the uncertainty associated with those features.

    A novel strategy called Uncertainty-Reinforced Diversity Sampling (URDS) is introduced. This strategy guides the selection of which samples should be annotated, by choosing samples that are both informative (uncertain) and representative of the data as a whole.

    A core aspect of the framework is that it tightly integrates SSL into both feature extraction and uncertainty estimation. This is a departure from previous methods that handled these two aspects separately.
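
As a rough illustration of what ensemble-based uncertainty from an SSL task could look like, the sketch below scores one volume by the disagreement among several reconstructions of it. The exact formulation used in CSAL-3D is not given on this page, so everything here is an assumption: the function name `ensemble_uncertainty`, the use of per-voxel population variance, and the averaging into a single sample-level score.

```python
from statistics import pvariance


def ensemble_uncertainty(reconstructions):
    """Sample-level uncertainty from M reconstructions of one volume.

    `reconstructions` is a list of M flattened outputs of equal length
    (e.g. one per augmented view). The score is the per-voxel variance
    across the M outputs, averaged over all voxels.
    """
    per_voxel = [pvariance(values) for values in zip(*reconstructions)]
    return sum(per_voxel) / len(per_voxel)


# Two reconstructions that agree everywhere give zero uncertainty.
print(ensemble_uncertainty([[0.0, 1.0], [0.0, 1.0]]))
# Disagreement on the first voxel raises the score.
print(ensemble_uncertainty([[0.0, 0.0], [2.0, 0.0]]))
```

A score of this kind needs no labels, which is what makes it usable in the cold-start setting where no segmentation model has been trained yet.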

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novel Formulation: The paper introduces a new Cold-Start Active Learning (CSAL) framework (CSAL-3D) specifically designed for 3D medical image segmentation. The extension to 3D is a novel and important contribution.

    Original Way to Use Data (SSL for Uncertainty Estimation): Using SSL for uncertainty estimation in this context is an original and interesting approach.

    Sample selection: The paper introduces a new sample selection strategy called Uncertainty-Reinforced Diversity Sampling (URDS). This strategy combines both uncertainty and diversity sampling in the CSAL setting allowing selection of samples that are both informative and representative.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Uncertainty: The paper does not assess the calibration of the uncertainty estimates, which undermines the claim that the method captures meaningful uncertainty. Further, the paper would benefit from more discussion of uncertainty itself and of prior work on uncertainty estimation. See [1,2] for calibrated uncertainty assessment.

    Validation: The scope of validation is narrow: evaluation is restricted to three segmentation tasks from the Medical Segmentation Decathlon, without exploring broader anatomical or modality diversity.

    Claim: The following claim is contradictory: “However, existing CSAL for medical image segmentation is limited to 2D images. Liu et al. presented a CSAL benchmark for 3D medical image segmentation…”.

    Computational Complexity: The method introduces significant computational complexity—requiring multi-view data augmentation, ensemble reconstructions, and multi-kernel clustering—yet offers no discussion of runtime, scalability, or resource requirements, which are especially critical in medical imaging pipelines.

    [1] Majeedi et al. RICA2: Rubric-Informed, Calibrated Assessment of Actions. ECCV 2024.

    [2] Oh et al. Modeling Uncertainty with Hedged Instance Embedding. ICLR 2019.
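
For readers unfamiliar with calibration assessment, one widely used measure is the Expected Calibration Error (ECE), which compares predicted confidence with observed accuracy inside confidence bins. The sketch below is a generic illustration and is not taken from the paper or from the references above; the function name, the equal-width binning, and the default of 10 bins are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins.

    `confidences` are predicted probabilities in [0, 1];
    `correct` holds 1 where the prediction was right and 0 otherwise.
    """
    bins = [[] for _ in range(n_bins)]
    for i, conf in enumerate(confidences):
        # A confidence of exactly 1.0 falls into the last bin.
        bins[min(int(conf * n_bins), n_bins - 1)].append(i)
    n = len(confidences)
    ece = 0.0
    for idx in bins:
        if not idx:
            continue
        accuracy = sum(correct[i] for i in idx) / len(idx)
        mean_conf = sum(confidences[i] for i in idx) / len(idx)
        # Weight each bin's gap by the share of samples it holds.
        ece += len(idx) / n * abs(accuracy - mean_conf)
    return ece


# 50% confidence with 50% accuracy is perfectly calibrated.
print(expected_calibration_error([0.5, 0.5], [1, 0]))
# 100% confidence with 50% accuracy is overconfident.
print(expected_calibration_error([1.0, 1.0], [1, 0]))
```

An ECE near zero means the stated confidence matches how often predictions are right; it says nothing about whether the uncertainty is useful for sample selection, which is the separate question the paper evaluates.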

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See strengths and weaknesses

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper contributes to solving the Cold Start Dilemma in Active Learning by proposing a method that enables data sampling based on information content, even in the absence of prior knowledge. Through experiments that balance annotation cost and model performance, the proposed approach demonstrates its effectiveness as an Active Learning strategy in Cold Start scenarios.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    One of the major strengths of this paper lies in its methodological novelty and its ambitious attempt to tackle a challenging problem. It not only extends the analysis from 2D to 3D imaging to extract trainable information for Active Learning, but also demonstrates that the segmentation encoder can be effectively trained with fewer labeled samples by leveraging uncertainty information from other self-supervised tasks such as rotation prediction and inpainting.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The quantitative analysis of the sampled data is relatively limited. Including such analysis would have strengthened the support for the effectiveness of the proposed method. For instance, tumor size is an important factor in ensuring diversity within the training set, which in turn can significantly impact model performance.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (6) Strong Accept — must be accepted due to excellence

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper clearly defines the challenging problem of the Cold Start Dilemma and proposes an effective solution by leveraging self-supervised learning (SSL) to optimize data utilization. Moreover, it addresses a critical real-world need by aiming to reduce annotation burden, which is especially important in clinical practice. Active Learning in the context of medical imaging holds great potential for minimizing unnecessary use of sensitive medical data, reducing reliance on personal information, and lowering computational resource demands. For these reasons, the paper is considered an impactful contribution.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

Dear Area Chair and Reviewers, Thanks very much for your insightful feedback. We have carefully considered the comments and addressed the key points below.

[Response to Reviewer #1]

  1. Abstract improvement. Response: We will revise it for clarity and focus.
  2. The proposed method does not excel in very low or very high budget scenarios. Response: We respectfully clarify that, in Figure 2, our method achieves the best Dice under high annotation budgets (30 for Brain Tumor, 5 for Heart and Spleen), and consistently ranks top-2 under low budgets (10 for Brain Tumor, 3 for Heart and Spleen). We choose Dice as the main metric here, as it better reflects segmentation accuracy than HD95. The strength of our method lies in its robust performance across datasets and budgets. Other methods show more fluctuation across settings.
  3. Formulas without sufficient contextual integration. Response: We will fix this in revision.
  4. Minimal difference for the Heart dataset. Response: We agree that 2D visual differences in Figure 3 are minor for Heart. However, Figure 2’s 3D Dice scores better reflect model performance, as 2D views show only one slice and can miss key differences. The visualizations are illustrative, not conclusive.
  5. It is unnecessary to start a new paragraph immediately after Equation (7). Response: We will fix it in revision.
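
For reference, the Dice score discussed in items 2 and 4 measures volumetric overlap between a predicted and a reference mask over all voxels, which is why it can differ from the impression given by a single 2D slice. The sketch below is a generic definition for binary masks, not the authors' evaluation code; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred, target):
    """Dice coefficient for two flattened binary masks (0/1) of equal length.

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# One of the two predicted voxels overlaps the single reference voxel.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

For a 3D volume the masks would be flattened over every slice before calling the function, so a difference confined to slices outside the displayed one still changes the score.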

[Response to Reviewer #2]

  1. Lack of assessment of the calibration of the uncertainty estimates. Response: Our work focuses on the practical utility of uncertainty in guiding sample selection, which is validated by strong performance gains (Fig. 2). While we did not explicitly assess calibration, results suggest our estimates are effective in practice. We agree that exploring formal calibration is a meaningful future direction and appreciate the references.
  2. Narrow scope and restricted evaluations. Response: Our selected tasks already cover diverse anatomies (brain, heart, abdomen) and modalities (multi-modal MRI, single-modal MRI, CT). These datasets represent key segmentation challenges and include both single- and multi-class targets. Meanwhile, we acknowledge that evaluating broader anatomical and modality diversity would further strengthen our claims. This is an important direction for future work, and we plan to extend our validation to additional datasets and imaging modalities.
  3. Contradictory sentence writing. Response: We will fix this in revision.
  4. Computational complexity issues. Response: The most time-consuming stage of our workflow is self-supervised pre-training, which takes approximately 1 day on a single NVIDIA 1080Ti GPU. This is a one-time cost and is acceptable considering the improved uncertainty estimation it brings. During inference, sample selection introduces minimal time overhead, as the selection process leverages pre-computed features. Therefore, the method remains practical.

[Response to Reviewer #3]

  1. Limited quantitative analysis (e.g. tumor size). Response: We agree that tumor size can reflect one aspect of data diversity, especially in medical image segmentation. However, our primary goal in active learning is to select the most valuable samples to improve model performance. While diversity contributes to sample quality, our method evaluates sample value in a more comprehensive way. As shown in our quantitative results (Figure 2), our approach consistently improves model performance across datasets, validating its effectiveness. We also acknowledge that incorporating tumor size as an explicit diversity factor is a valuable idea, and we plan to explore this direction in future work to examine whether it brings further performance gains. That said, we would like to point out a practical challenge: in active learning scenarios, all data are initially unlabeled, and precise tumor size information is not available before annotation. This makes it difficult to directly use tumor size as a selection criterion during sampling.




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


