Abstract
Domain adaptive segmentation (DAS) of numerous organelle instances from large-scale electron microscopy (EM) is a promising way to enable annotation-efficient learning. Inspired by SAM, we propose a promptable multitask framework, namely Prompt-DAS, which is flexible enough to utilize any number of point prompts during both the adaptation training and testing stages. Thus, with varying prompt configurations, Prompt-DAS can perform unsupervised domain adaptation (UDA) and weakly supervised domain adaptation (WDA), as well as interactive segmentation during testing. Unlike the foundation model SAM, which requires a prompt for each individual object instance, Prompt-DAS is trained on only a small dataset and can utilize full points on all instances, sparse points on partial instances, or even no points at all, facilitated by the incorporation of an auxiliary center-point detection task. Moreover, a novel prompt-guided contrastive learning is proposed to enhance discriminative feature learning. Comprehensive experiments conducted on challenging benchmarks demonstrate the effectiveness of the proposed approach over existing UDA, WDA, and SAM-based approaches.
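For intuition, the sketch below illustrates the prompt-handling idea the abstract describes: user-supplied points drive the WDA and interactive settings, while the auxiliary center-point detection head supplies prompts when none are given (the UDA setting). This is a minimal, hypothetical sketch; the function name, tensor layout, and threshold are assumptions, not details from the paper.

```python
# Hypothetical illustration of flexible prompting in a Prompt-DAS-style
# model: prefer user clicks, otherwise fall back to self-detected centers.
import torch

def select_prompt_points(user_points: torch.Tensor | None,
                         center_heatmap: torch.Tensor,
                         threshold: float = 0.5) -> torch.Tensor:
    """Return a (K, 2) tensor of prompt coordinates (row, col)."""
    if user_points is not None and user_points.numel() > 0:
        # WDA / interactive: full or sparse points on (some) instances.
        return user_points
    # UDA: no human prompts; use peaks of the predicted center heatmap.
    return (center_heatmap > threshold).nonzero().float()
```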
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1547_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/JiabaoChen1/Prompt-DAS
Link to the Dataset(s)
MitoEM dataset: https://mitoem.grand-challenge.org/
BibTeX
@InProceedings{CheJia_PromptDAS_MICCAI2025,
author = { Chen, Jiabao and Xiong, Shan and Peng, Jialin},
title = { { Prompt-DAS: Annotation-Efficient Prompt Learning for Domain Adaptive Semantic Segmentation of Electron Microscopy Images } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15965},
month = {September},
pages = {521--530}
}
Reviews
Review #1
- Please describe the contribution of the paper
The authors present Prompt-DAS, a domain adaptation model that combines a multi-task architecture—with both image and prompt/point encoders—and prompt-guided contrastive learning to jointly predict binary segmentation masks and object center points for mitochondria detection in electron microscopy images. Pseudo-labels for both tasks are generated using a teacher-student framework.
The model is designed to operate under unsupervised domain adaptation (UDA), weakly supervised domain adaptation (WDA), and interactive domain adaptation settings.
The proposed method is evaluated against state-of-the-art approaches using the two training volumes of the publicly available MitoEM dataset. According to the reported results, Prompt-DAS consistently outperforms competing methods across all tested domain adaptation scenarios.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The authors propose a novel architecture that supports multiple domain adaptation paradigms—unsupervised (UDA), weakly supervised (WDA), and interactive—reducing the reliance on manual annotations and making the approach highly flexible in practical scenarios. The evaluation is conducted on a publicly available dataset (MitoEM), which enhances the reproducibility of the work and facilitates fair comparisons with future methods.
According to the reported results, Prompt-DAS consistently outperforms existing methods across all evaluation metrics and domain adaptation settings. Notably, in the interactive setup, the model achieves performance close to the fully supervised baseline using only 15% of point prompts, highlighting the effectiveness of the proposed prompt-guided strategy.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Key architectural details of the proposed model are missing, which significantly limits its reproducibility. While the image encoder (ViT-S/8 trained with DINO) is specified, there is no information about the architecture of the prompt encoder, the decoder, or the MLP components.
The paper does not provide access to the implementation (e.g., a code repository), nor does it detail how the baseline state-of-the-art methods were trained and optimized, making it difficult to verify the experimental results or replicate the comparisons.
Training and inference times are not reported. This information is particularly relevant in the context of the interactive setting, where runtime efficiency can significantly impact the practicality of the proposed method.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Reproducibility and evaluation practices should be strengthened. Specifically, the paper lacks details regarding the hyperparameter search space for each model (including baselines), the number of training and evaluation runs, and the validation strategy. I encourage the authors to adopt best practices for experimental reporting, such as those outlined by Dodge et al. in “Show Your Work: Improved Reporting of Experimental Results” (2019).
Minor comments:
- The acronym “PCL” is used before it is defined (page 5). Please ensure all acronyms are introduced before use.
- Page 5: The statement “Both datasets consist of 500 images with a resolution of 4096×4096 pixels” is slightly misleading—this refers to the image dimensions, not the resolution. Consider rephrasing for accuracy.
- Page 5: The sentence “The model is trained on one GTX 4090 GPU with 24 GB memory” likely contains a typo—presumably “GTX” should be “RTX”.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
In its current form, the paper lacks fundamental details that critically impact its reproducibility and overall scientific value. Key components of the proposed model architecture—such as the prompt encoder, decoder, and MLP—are not described, and there is no information about the training setup for the alternative methods used in the comparisons. Moreover, the absence of publicly available source code further limits the ability to replicate or build upon this work. Given these omissions, I do not believe the paper meets the standards required for acceptance.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
Considering MICCAI’s policy on not allowing additional experiments after submission, I have evaluated the authors’ rebuttal based on the original submission, the reviewers’ comments, and the proposed clarifications and textual modifications. While all reviewers rightly pointed out the lack of architectural details and the absence of released code—both of which hinder reproducibility—I expect that these issues will be addressed in the camera-ready version.
Despite relying on established building blocks, the proposed method introduces a practical and annotation-efficient approach to prompt-based domain adaptation for semantic segmentation. The method convincingly outperforms competing approaches on the dataset considered, highlighting its potential utility in the field.
Overall, I find this to be a reasonable and worthwhile contribution, and I support its acceptance.
Review #2
- Please describe the contribution of the paper
This manuscript proposes a framework for organelle segmentation in electron microscopy images that utilizes point information of organelle instances as a prior during both the training and testing steps. The framework achieves a minimal annotation burden by using sparse points as prior information and organelle center-point detection as an auxiliary task. This auxiliary task enables the framework to handle any number of prior points, including zero. Furthermore, pseudo-label learning for both the segmentation and center-point detection tasks under the mean-teacher framework (with the teacher model trained on a small dataset and the student model on the domain-adaptation target dataset) is used to handle label scarcity on the target domain. The authors provide quantitative and qualitative evaluations with an ablation study using the publicly available MitoEM dataset.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors tackle domain adaptation problems in electron microscopy images.
- The proposed framework reduces the annotation burden by using sparse points as prior information and organelle center-point detection as an auxiliary task.
- The proposed framework can handle any number of prior points for organelle segmentation in both the training and testing steps, operating in both unsupervised and semi-supervised manners.
- The mean-teacher framework adopted in the proposed method enables domain-adaptive learning; contrastive learning is also used to learn more discriminative feature representations for pseudo-label generation.
- Experimental evaluations are convincing.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Wordy presentation. As currently presented, there is no need to relate the proposed framework to prompt-based approaches such as SAM; the term "prompt learning" looks misleading.
- In Section 2, \mu_k is undefined.
- The descriptions of f_P, f_D, f_S, and f_R are unclear.
- In the proposed framework, the performance of the teacher model is important, yet there is no experimental evaluation of it.
- The authors use only training and testing datasets, and the model selection procedure is unclear. How did they select the best trained model across epochs? This point hinders fair evaluation in the experiments.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This work has the strengths mentioned above, but also several weaknesses, and its technical novelty is incremental: the proposed framework is a well-developed tool, but each technique used in it is an existing method. Overall, I lean toward weak accept for MICCAI presentation.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors' responses to my comments are satisfactory. I hope the authors add explanations of the mathematical notation I commented on to support the reproducibility of this work in the final version.
Review #3
- Please describe the contribution of the paper
The authors present an annotation-efficient domain adaptation method for organelle segmentation in electron microscopy images. At train or test time, an arbitrary number of point prompts can be given to improve the segmentation, also allowing interactive refinement. To perform the domain adaptation (here between EM data from human and rat), a student-teacher setup with a contrastive objective is deployed.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The method is thoroughly evaluated against other point-prompt methods, outperforming all its baselines and almost reaching the upper bound of the fully supervised baseline. The paper is well written and well structured.
- The authors developed a versatile framework, especially considering the challenges of many-object scenes such as those in cell segmentation, which require frameworks that can handle sparse and quick annotations; this makes the approach enticing beyond EM data.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The exemplary task chosen here (EM volumes) is intrinsically 3D, but the proposed method operates in 2D. Evaluating the usability of the method for this showcase end-to-end would also require a matching analysis between different z-planes. A 3D approach might be more fitting to the task at hand. (Alternatively, it would be interesting to see how well the approach performs beyond the world of EM data.)
- The claimed interactive improvement sounds very enticing, but at least the Human-to-Rat adaptation does not support this claim. The Rat-to-Human adaptation displays at least a slight increase in performance with more point prompts. To support this claim, reporting the standard deviation or a more comprehensive qualitative analysis might help.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- Regarding the supervised upper bound: I assume the corresponding model was trained directly on the target domain; this should be made clearer and stated in the table to avoid confusion.
- Footnote: it would also be interesting to evaluate performance when the number of point prompts varies strongly from image to image during adaptation training, since this is a plausible scenario too, i.e., many points on just a few images and only some or none on many.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The proposed method is valuable to the community, still the method is tested on a showcase where 3D segmentation approaches might be more fitting. To assess the method usability to other modalities it would need further benchmarking. Despite that the evaluation is performed nicely.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
I would like to thank the authors for answering my questions; hence, I will keep my original recommendation of weak accept.
Author Feedback
We thank the reviewers for their constructive comments.

R1: Model architectural details. A: We will include architectural details in the main text. Our decoder is similar to that of SAM but adds attention masks in self-attention and cross-attention to prevent information leakage (a generic illustration of this masking is sketched below). In contrast to SAM, our prompt encoder is a standard positional embedding. Our MLP is the same as the MLP of SAM and other standard transformers. More details can be found in our released code.

R1: Training details of baselines. A: In Table 1, † indicates fine-tuning using the source data, and ‡ indicates fine-tuning using the source data and target data with 15% sparse point labels. Concretely, WeSAM and WDA-Net, which can employ weak labels, were trained using the same data setting as our method; Med-SA was fine-tuned using the labeled source data, while SAM and SAM-Med2D did not undergo fine-tuning. We adhered to the default settings to train/fine-tune WeSAM, WDA-Net, and Med-SA. For the other methods, we either used their reported results directly or reproduced them with their default settings.

R1/R2/R3: Code for reproducibility. A: The code repository was not released due to the anonymization requirement and will be released after the review stage.

R1: Inference and training times. A: For interactive segmentation, our model is much more efficient at inference. Given all point prompts, our model predicts all instances in one pass, while other SAM-like methods, including SAM, WeSAM, Med-SA, and SAM-Med2D, segment only one object instance per pass. While all these methods take a similar time (about 0.3 s) for one pass with a 1024×1024 input, the inference time of the other SAM-like methods scales with the number of instances. Moreover, our model can achieve similar performance with only 15% partial points, taking 1/5 of the annotation time of full points. Our model was trained for 6 hours for adaptation on one RTX 4090 GPU, while the fine-tuning times of WeSAM and Med-SA are 9 and 12 hours, respectively. Moreover, all other SAM-like models were trained on billion-scale datasets. More details will be found in our released code.

R3: Effect of more point prompts during testing. A: Since our model already shows a minimal gap (only 1.2% in Dice) with the supervised upper bound for Human-to-Rat adaptation, additional point annotations can be expected to yield only minimal performance gains, as with most interactive methods. However, for the more challenging Rat-to-Human adaptation, more point prompts can improve performance.

R3: 3D vs. 2D model. A: Our model is a 2D model, like many EM segmentation methods. Its benefits are threefold: 1) time efficiency during inference and memory efficiency during training, especially for large-scale EM images and model adaptation; 2) a 2D method suits volumes with either isotropic or anisotropic resolution, while 3D models usually show degraded performance under strong anisotropy; 3) 2D images contain additional features, and our 2D model takes advantage of the object detection task to assist the segmentation task. Moreover, our 2D model can flexibly integrate inter-slice information of a volume, which has been investigated by many studies.

R2: Model selection from epochs. A: Due to the absence of target labels, we use the ensemble-based selection approach proposed by Dapeng Hu et al. (NeurIPS 2024): an ensemble of models from different training epochs serves as a role model for directly assessing candidate models.
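The attention masks mentioned in the first response above can be illustrated generically. The sketch below, with hypothetical token counts and instance assignments, blocks attention between prompt tokens of different instances using a standard boolean attention mask; it is not the authors' decoder, only the masking mechanism the rebuttal alludes to.

```python
# Generic sketch: prevent prompt tokens of one instance from attending to
# tokens of another instance via a boolean attention mask (True = blocked).
import torch
import torch.nn as nn

num_tokens, dim = 6, 64
instance_id = torch.tensor([0, 0, 1, 1, 2, 2])  # hypothetical token-to-instance map

# Block cross-instance pairs; same-instance pairs (incl. diagonal) stay visible.
attn_mask = instance_id.unsqueeze(0) != instance_id.unsqueeze(1)  # (6, 6) bool

mha = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
tokens = torch.randn(1, num_tokens, dim)
out, _ = mha(tokens, tokens, tokens, attn_mask=attn_mask)
```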
R2: Performance of the teacher model. A: Since the teacher model is updated with an exponential moving average (EMA) of the student and is not used for the final prediction, its performance is similar to that of the student model and is not reported, as is typical for mean-teacher methods.

R2: Method. A: Similar to SAM, our model can conduct interactive segmentation with point prompts. Different from SAM-like methods, our method can utilize both partial and full point prompts and conducts one-pass cross-domain segmentation.
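For completeness, the EMA teacher update mentioned in the response above follows the standard mean-teacher recipe. The sketch below is a minimal illustration under assumed names; the decay value is a common default, not the paper's setting.

```python
# Minimal mean-teacher EMA update: the teacher is a slowly moving average
# of the student and is never updated by gradient descent.
import copy
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               decay: float = 0.999) -> None:
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

# Usage: the teacher starts as a frozen copy of the student.
student = torch.nn.Linear(8, 2)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
ema_update(teacher, student)
```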
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors provided satisfactory replies to the reviewers’ comments. The camera-ready version should address the lack of architectural details and the absence of code release.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
After reading the rebuttal, I agree with the consistent acceptance recommendation raised by three reviewers.