Abstract

Cell microscopy data are abundant; however, corresponding segmentation annotations remain scarce. Moreover, variations in cell types, imaging devices, and staining techniques introduce significant domain gaps between datasets. As a result, even large, pretrained segmentation models trained on diverse datasets (source datasets) struggle to generalize to unseen datasets (target datasets). To overcome this generalization problem, we propose CellStyle, which improves the segmentation quality of such models without requiring labels for the target dataset, thereby enabling zero-shot adaptation. CellStyle transfers the attributes of an unannotated target dataset, such as texture, color, and noise, to the annotated source dataset. This transfer is performed while preserving the cell shapes of the source images, ensuring that the existing source annotations can still be used while maintaining the visual characteristics of the target dataset. The styled synthetic images with the existing annotations enable the finetuning of a generalist segmentation model for application to the unannotated target data. We demonstrate that CellStyle significantly improves zero-shot cell segmentation performance across diverse datasets by finetuning multiple segmentation models on the style-transferred data. The source code for CellStyle is publicly available at https://github.com/ruveydayilmaz0/cellStyle.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1040_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/ruveydayilmaz0/cellStyle

Link to the Dataset(s)

MP6843 from Cell Image Library: https://www.cellimagelibrary.org/project/P2043

BV-2, Huh7, and SHSY5Y from Live-Cell: https://sartorius-research.github.io/LIVECell/

DIC-C2DH-HeLa, Fluo-N2DL-HeLa, Fluo-C2DL-MSC, and Fluo-N2DH-GOWT1 from the Cell Tracking Challenge: https://celltrackingchallenge.net/2d-datasets/

Human kidney and cardia datasets from NuInsSeg: https://www.kaggle.com/datasets/ipateam/nuinsseg

BibTeX

@InProceedings{YilRüv_CellStyle_MICCAI2025,
        author = { Yilmaz, Rüveyda and Chen, Zhu and Wu, Yuli and Stegmaier, Johannes},
        title = { { CellStyle: Improved Zero-Shot Cell Segmentation via Style Transfer } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15964},
        month = {September},
        pages = {66 -- 76}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    CellStyle is proposed as a new unsupervised style transfer method for microscopy images that exploits style injection in a Stable Diffusion model [ref 4 from the manuscript] and image rescaling. The method is proposed as a way to avoid retraining DL models by translating the images into a style similar to that of the images used to train the model.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The authors propose adapting a recently published style-transfer approach based on diffusion models to provide a means of domain adaptation.
    • They use a total of 6 experimental cases (6 pairs of datasets from different modalities) to test the proposed method.
    • To demonstrate how style transfer can be used to avoid retraining models (which the authors call zero-shot segmentation), they choose 3 well-known DL methods.
    • CellStyle is compared against 5 other style-transfer approaches and in general yields better downstream segmentation results.
    • The authors provide an ablation study for the adaptive score scaling factor (alpha) and for the cell size ratio (r), showing the benefits of introducing these two factors.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Using generative approaches for cross-modality translation and for creating labelled data has been done before:

    • Ervin Tasnadi, et al., Structure preserving adversarial generation of labeled training samples for single-cell segmentation, Cell Reports 2023.

    Therefore, given the paper as it stands, the novelty of the proposed approach is the adaptation of a diffusion model for performing style transfer with microscopy images, rather than improving zero-shot segmentation. In this sense, I'm not fully confident that the proposed style transfer for retraining can be called zero-shot learning, as in reality the models are retrained using a fully supervised approach with synthetic pairs of data. While this is open for discussion, due to the already confusing terminology in the field and all the ambiguous definitions given to AI terms, I would encourage the authors to revisit the zero-shot definition and avoid using it if it does not fully fit the proposed pipeline.

    StarDist is proposed as an approach to segment star-convex objects, so the usage proposed in this manuscript is not fully recommended.

    There are parts of the manuscript that are unclear:

    • It says "the first work proposing zero-shot domain transfer for cell microscopy imaging without requiring the training of a style model." and then "CellStyle significantly enhances the zero-shot performance of cell segmentation models". Please revisit whether this is correct and whether style transfer and segmentation are indeed being used in a zero-shot manner.
    • Fig. 1 as it is, without legends and with arrows connecting everywhere without further explanation, is confusing. For example, looking at it, it is hard to tell what the input and expected output of the proposal would be. It may make sense to distinguish which parts are trained and which are not, and the order in which each training step is performed. The same applies to the differences between the training and inference modes of the pipeline.
    • Fig. 2: please say what each dataset is.
    • Is the evaluation done on real images and using true manual annotations? Please be very clear about that in the text.
    • It is explained that for pairs 1 and 5, Cellpose is retrained without the data from those pairs. This, as far as I understand, is to mimic the situation in which a model is trained for a specific modality and has never seen images from the other modality that is faked. If so, please explain it clearly.
    • For the comparison, the authors say "Additionally, in Table 3, we compare our method to other generative models on the downstream segmentation task in a non-zero-shot setting since they use labeled data during the training process.". However, it is unclear why the setting is non-zero-shot in this case, and what difference between the other methods and CellStyle prevents using them in the same manner.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The justification is given in the weaknesses of the manuscript and the fact that the novelty with respect to what has already been published is poor.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose CellStyle, a zero-shot domain transfer approach that leverages a pretrained diffusion backbone (Stable Diffusion) to style-transfer labeled source-cell images toward the visual domain of an unlabeled target dataset. By preserving source-cell shapes (so that source annotations remain valid) but adapting texture, color, and noise, CellStyle enables fine-tuning cell segmentation models without requiring any labels from the target dataset. CellStyle thereby enables improved zero-shot segmentation performance on diverse, unlabeled microscopy datasets.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Addresses domain shifts in cell microscopy with no labeled target data, a practical scenario where labeling is often expensive.
    • Uses Stable Diffusion for zero-shot style transfer, removing the need for specialized retraining of a style model.
    • Preserves cell shapes by swapping only keys and values in attention blocks, ensuring source annotations remain valid.
    • Achieves measurable gains in segmentation when fine-tuning on styled vs. pure source images.
    • Thorough ablations (e.g., cell size matching, adaptive scaling) reveal the contribution of each component.
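
    The key/value swap mentioned in the strengths above can be illustrated with a minimal NumPy sketch. This is a generic illustration of attention-based style injection, not the authors' implementation: the function names, the token counts, and the single-head layout are assumptions made for the example, and details such as which layers or timesteps are modified and the adaptive scaling factor alpha are not modeled.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention for a single head.
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))
    return weights @ v

def style_injected_attention(q_content, k_style, v_style):
    # Queries come from the content (annotated source) image and carry
    # its spatial layout; keys and values come from the style
    # (unannotated target) image and carry its appearance.
    return attention(q_content, k_style, v_style)

# Hypothetical feature maps flattened to (tokens, channels).
rng = np.random.default_rng(0)
n_content, n_style, d = 16, 20, 8
q_c = rng.normal(size=(n_content, d))
k_s = rng.normal(size=(n_style, d))
v_s = rng.normal(size=(n_style, d))

out = style_injected_attention(q_c, k_s, v_s)
print(out.shape)  # (16, 8): one output token per content query
```

    Because the output has one token per content query, the spatial arrangement follows the source image, while every output vector is a weighted average of target-image values.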
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    • Experiments focus on similar 2D microscopy domains, providing limited evidence for generalization to very different modalities (e.g., histopathology).
    • Comparisons to other unsupervised style-transfer or domain adaptation methods are limited, making it hard to evaluate relative novelty.
    • Relies on Stable Diffusion's pretrained weights, with no detailed analysis of potential failure modes on more extreme shifts in cell imaging.
    • Requires approximate cell size matching via a pretrained segmenter, which may fail if that model is inaccurate for the target domain.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    CellStyle addresses a significant need, adapting cell segmentation models to new, unlabeled domains, by applying style transfer with Stable Diffusion and preserving source annotations. Its experiments and ablations show clear improvements over baseline segmentation and highlight each component’s contribution. However, evaluations on relatively similar microscopy domains limit certainty about broader generalization, and comparisons to other unsupervised style-transfer methods are sparse.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This manuscript presents a novel method for zero-shot cell segmentation. This method implements an elegant style transfer approach via stable diffusion. The authors address two practical issues, namely the potential size difference between source and target and the mismatch between scales of the corresponding attention rates. The validation is clear and thorough. Overall, I find this is a very good submission and compliment the authors on this work.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Elegant approach to addressing two practical issues that significantly boosts performance.
    • Clear presentation
    • Thorough validation, including several algorithms, various data sets, benchmarking on multiple alternative state-of-the-art approaches, and ablation studies.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Potential breach of anonymity
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. One of the benchmark approaches, namely [29], is cited as being published as part of the ISBI'2025 symposium, which was held (and at the moment of writing this review is actually still being held) from 14 to 17 April. To my understanding, the proceedings are not made public until shortly before the start of the conference. This means that the authors might have had access to the aforementioned work before the proceedings were published. In the best-case scenario, it means that the author team, entirely or partially, authored that paper as well and did not do a good job concealing this fact.

    2. I agree with the authors that OPCSB is an objective measure of the cell segmentation quality. Hence, it would be interesting to see all the results, e.g., as a supplementary table, rather than the selected values reported by the authors.

    3. Whereas the introduced r ratio is very intuitive, the other parameter, α, is much less so. Looking at the results of the ablation study, reported in Table 4, I see that this parameter, in the majority of the cases, has much influence on the final performance. Hence, it would be interesting if the authors could provide a bit more insight into this parameter, e.g., for which cases it is the most beneficial, etc. I would also reformulate the sentence in the caption of this table that explains the "-" entries, to make it clearer that for these entries the values used for the ablation study coincide with the calculated values for these cases.

    4. I would also add a sentence about the reported comparison with the state-of-the-art methods; to elaborate on the results reported in Table 3, which are calculated on different data sets for different benchmark methods, without providing any reasoning for this.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a very strong conference submission, with a novel methodological component and thorough validation, and it is clearly presented.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    My only concern about this submission was a possible breach of anonymity by the authors. Although they failed, in my opinion, to properly answer my concern about this issue, it does not have any impact on the quality of the paper. If the chairs see no problem with the anonymity, I think that this paper should be accepted.




Author Feedback

We thank all the reviewers for their insightful and constructive feedback.

#1:

  1. [29] was already listed on Google Scholar as accepted to ISBI 2025, likely updated by the authors after internal acceptance.
  2. As OPCSB is the average of SEG and DET and can be inferred from Tabs. 2, 3, 4, we had to omit it due to space constraints.
  3. We observed that using the param. alpha is most beneficial when the target and the source features are more dissimilar.
  4. We appreciate the reviewer's insights and will address the remaining points more clearly in the final version.
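
The relation between OPCSB, SEG, and DET stated in the rebuttal above can be written out as a short helper. This is a minimal sketch: the function name and the example scores are hypothetical and are not values from the paper.

```python
def opcsb(seg: float, det: float) -> float:
    """Return the average of the SEG and DET scores (both in [0, 1])."""
    for name, value in (("seg", seg), ("det", det)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return 0.5 * (seg + det)

# Hypothetical scores, chosen only to illustrate the call.
print(opcsb(0.80, 0.90))
```

Any OPCSB value omitted from the tables can be recovered this way from the reported SEG and DET columns.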

#2: Zero-shot (ZS) naming: CellStyle (which uses the pretrained Stable Diff. model without further training) transfers visual features from the target to the source dataset while preserving the spatial structure of the real source labels. These labels and the styled data are used to finetune seg. models previously trained on other cell datasets (but not on the target, see Tab. 2). The finetuned models are then tested on the real target data. As no target labels are used at any stage (except for the final eval.), the setting is considered ZS, although it still uses supervised training of seg. models. This naming aligns with established definitions, e.g., SAM [Kirillov et al., ICCV 2023] is trained with supervision but makes predictions on unseen datasets in a ZS manner. [15] refers to using a small number of target labels combined with a lot of source labels as few-shot. We use no target labels, hence the naming ZS is used. We acknowledge that these distinctions were not clearly stated in the submission and will clarify them in the final version.

Non-ZS setting: In Tab. 3, the other methods use target labels to generate synthetic data, making them non-ZS. This is why, for a fair comparison, we included a non-ZS version of our approach.

Tasnadi et al.: This method synthesizes masks using a GAN and overlays textures via style transfer, requiring target labels and training (for GAN, pix2pix and seg. model), which makes it more similar to [29] than to our work. While both methods leverage style transfer, the implementations differ. Moreover, style transfer has been employed in multiple prior works (as cited in our intro.), and its use alone should not negate the novelty of our approach.

Use of Stardist: We agree that Stardist is less suitable for non-star-convex shapes, yet it remains a widely used generalist method. As noted, its performance varies across test pairs (better for 2, 3, 4, 6; worse for 1 and 5). Importantly, in pair 1, where it performs suboptimally, our method still improves its results (Tab. 2).

Fig. 2 and Evaluation: Pair names are used in Fig. 2 for clarity and to avoid clutter; corresponding datasets are listed in Tab. 1. The evaluation is conducted on real images with manual annotations.

See 4 for Rev. #1.

#3: Generalization to modalities: Our experiments were designed to reflect diversity in imaging characteristics, including different imaging techniques and cell types. Notably, Pair 6 is a histopathology dataset, addressing the concerns from the reviewer.

Comparison to other style transfer models: As mentioned in the intro., a key advantage of our method is that it does not require training the style model. While we agree that quantitative comparisons with additional style-transfer-based domain adaptation methods would strengthen the work, we opted to focus on SOTA generative methods (Tab. 3) in a broader perspective, given space constraints.

Stable Diff. (SD) failure modes: Across all tested datasets, SD led to consistent improvements (Tab. 2, 3). While we acknowledge that failure cases exist, space limitations prevented us from including visual examples.

Pretrained segmenter for size matching: This is a valid concern. However, this step requires only coarse seg., and we use a generalist model [16] to achieve that. Empirically, this approach has led to performance gains for all pairs (Tab. 4), supporting its utility despite potential inaccuracies in the initial seg.

See 4 for Rev. #1.
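
The size-matching step discussed above can be sketched as follows. This is only an illustration under stated assumptions: the rebuttal does not define how the ratio r is computed, so the sketch assumes r is the ratio of median equivalent cell diameters (target over source) and that the source image is rescaled by r. All function names are hypothetical, and the target mask stands in for a coarse prediction from a pretrained generalist segmenter.

```python
import numpy as np

def median_cell_diameter(label_mask):
    # Equivalent diameter of each labelled instance (0 = background),
    # computed from its pixel area as d = 2 * sqrt(area / pi).
    _, counts = np.unique(label_mask[label_mask > 0], return_counts=True)
    if counts.size == 0:
        raise ValueError("mask contains no labelled cells")
    return float(np.median(2.0 * np.sqrt(counts / np.pi)))

def size_ratio(source_mask, target_mask):
    # Assumed definition: target cell size relative to source cell size.
    return median_cell_diameter(target_mask) / median_cell_diameter(source_mask)

def rescale_nearest(image, r):
    # Nearest-neighbour rescaling, which keeps integer labels intact.
    h, w = image.shape[:2]
    new_h, new_w = max(1, round(h * r)), max(1, round(w * r))
    rows = np.minimum((np.arange(new_h) / r).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / r).astype(int), w - 1)
    return image[rows][:, cols]

# Toy masks: one 4x4 cell in the source, one 8x8 cell in the target.
source_mask = np.zeros((16, 16), dtype=int)
source_mask[2:6, 2:6] = 1
target_mask = np.zeros((32, 32), dtype=int)
target_mask[4:12, 4:12] = 1

r = size_ratio(source_mask, target_mask)
rescaled = rescale_nearest(source_mask, r)
print(round(r, 6), rescaled.shape)
```

Because only a median over instances is used, a coarse segmentation of the target data is enough for this estimate, which is consistent with the argument in the rebuttal.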




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This manuscript presents a diffusion model-based style transfer method for cell segmentation in microscopy images. It transforms labeled source images to target-style images, which are then used to fine-tune pre-trained segmentation models for cell segmentation in the target domain. The experimental results are promising. The rebuttal addresses the reviewers’ concerns regarding technical novelty, clarity of the presentation, and generalization to other modalities. It would be helpful to clearly explain the zero-shot setting in the (revised) manuscript.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


