Abstract

Tumor segmentation plays a critical role in histopathology, but it requires costly, fine-grained image-mask pairs annotated by pathologists. Thus, synthesizing histopathology data to expand the dataset is highly desirable. Previous works suffer from inaccuracies and limited diversity in image-mask pairs, both of which degrade segmentation training, particularly given small-scale datasets and the inherently complex nature of histopathology images. To address this challenge, we propose PathoPainter, which reformulates image-mask pair generation as a tumor inpainting task. Specifically, our approach preserves the background while inpainting the tumor region, ensuring precise alignment between the generated image and its corresponding mask. To enhance dataset diversity while maintaining biological plausibility, we incorporate a sampling mechanism that conditions tumor inpainting on regional embeddings from a different image. Additionally, we introduce a filtering strategy to exclude uncertain synthetic regions, further improving the quality of the generated data. Our comprehensive evaluation spans multiple datasets featuring diverse tumor types and various training data scales. As a result, segmentation improved significantly with our synthetic data, surpassing existing segmentation data synthesis approaches, e.g., from 75.69% to 77.69% on CAMELYON16. The code is available at https://github.com/HongLiuuuuu/PathoPainter.
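
To illustrate why inpainting only the tumor region keeps the image aligned with its mask, here is a toy sketch in pixel space. It is not the authors' implementation: the abstract describes a generative inpainting model, whereas this example simply composites hypothetical "generated" values into the masked region. The function name and the tiny arrays are assumptions made for illustration.

```python
def composite_inpaint(background, generated, mask):
    """Keep background pixels where mask == 0, take generated pixels where mask == 1."""
    return [
        [g if m else b for b, g, m in zip(b_row, g_row, m_row)]
        for b_row, g_row, m_row in zip(background, generated, mask)
    ]


# Toy 2x3 single-channel example (values are arbitrary placeholders).
background = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 98, 97], [96, 95, 94]]
mask = [[0, 1, 1], [0, 0, 1]]  # 1 marks the tumor region to inpaint

pair_image = composite_inpaint(background, generated, mask)
# The pair (pair_image, mask) is aligned by construction:
# every pixel that changed lies inside the mask.
```

Because the mask decides which pixels may change, the mask used for generation can be reused unchanged as the segmentation label.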

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1254_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/HongLiuuuuu/PathoPainter

Link to the Dataset(s)

DCIS dataset: https://doi.org/10.1038/s41374-021-00540-6

CATCH dataset: https://doi.org/10.1038/s41597-022-01692-w

CAMELYON16 dataset: https://doi.org/10.1001/jama.2017.14585

BibTex

@InProceedings{LiuHon_PathoPainter_MICCAI2025,
        author = { Liu, Hong and Yang, Haosen and Huijben, Evi M. C. and Schuiveling, Mark and Su, Ruisheng and Pluim, Josien P. W. and Veta, Mitko},
        title = { { PathoPainter: Augmenting Histopathology Segmentation via Tumor-aware Inpainting } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15975},
        month = {September},
        pages = {410--420}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents PathoPainter, a conditional latent diffusion-based approach for generating synthetic histopathology image–mask pairs to augment tumor segmentation datasets. The method inpaints tumor regions while preserving background, using regional embeddings from other images as content conditioning. It further proposes an uncertainty-aware filtering strategy that excludes poorly aligned regions, aiming to improve the utility of synthetic samples. The approach is evaluated across three datasets (DCIS, CATCH, CAMELYON16), showing modest improvements in segmentation performance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper tackles an important problem — the high cost of detailed tumor annotations in histopathology.
    • Use of regional tumor embeddings from different images to drive diversity in synthetic tumor inpainting is a potentially useful idea.
    • The uncertainty-based filtering of synthetic regions is intuitive and empirically shown to boost performance.
    • Experiments span multiple public datasets, and synthetic data demonstrates modest improvements in segmentation IoU.
    • Visualizations (Fig. 3) illustrate that PathoPainter achieves better alignment between synthetic tumors and masks compared to prior methods.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Limited novelty: The core idea of conditioning a latent diffusion model on tumor structure has already been explored in prior work (e.g., DiffTumor [3]). The main novelty here, using embeddings from a different WSI, is an incremental extension rather than a fundamentally new idea. Embedding mixing and semantic conditioning are already explored concepts in the generative modeling literature.
    • Effect of filtering dominates improvement: Ablation results suggest that most performance gain stems from the filtering strategy (Table 2c), not the embedding-based inpainting. When filtering is not used, performance (IoU = 61.62%) drops below DiffTumor’s performance (62.43% in Table 1).
    • Limited ablations: Ablations are conducted on only one dataset (DCIS) with 8 WSIs. The absence of broader ablations undermines the strength of the conclusions drawn about the contribution of each component. Ablations on CATCH or CAMELYON16 could have strengthened these conclusions, especially on performance scalability.
    • Ambiguous WSI-level evaluation: Training and generation are done at patch level (256x256), yet results are reported at WSI level without clear explanation of how predictions are aggregated.
    • Superficial scale analysis: Comparisons across “scale” (e.g., 10 vs. 20 WSIs) show small differences and lack motivation. It is unclear what scaling effect is being analyzed or how it relates to data efficiency.
    • Unclear filtering mechanism: The procedure for excluding uncertain regions is vaguely described. It seems pixels with low agreement with a pretrained segmentation model are masked out from the loss, but this is never formalized or illustrated.
    • Insufficient evaluation: Results are based solely on IoU. Other metrics (e.g. Hausdorff distance) and analyses comparing the distribution of real vs. synthetic data (e.g., using learned embeddings or statistical distances) are missing.
    • No clinical validation: There is no evidence from pathologists or domain experts evaluating the realism or utility of the generated samples.
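
The filtering step, as read in the "Unclear filtering mechanism" point above, can be sketched as follows. This is one possible formalization under assumptions, not the paper's definition: the agreement rule, the threshold `tau`, and both function names are hypothetical. A pixel is kept only when a pretrained segmentation model confidently agrees with the synthetic label, and the training loss is averaged over kept pixels.

```python
import math


def keep_mask(labels, teacher_probs, tau=0.7):
    """Return 1 where the pretrained model agrees with the synthetic label
    with confidence >= tau (hypothetical threshold), else 0."""
    keep = []
    for y, p in zip(labels, teacher_probs):
        agreement = p if y == 1 else 1.0 - p
        keep.append(1 if agreement >= tau else 0)
    return keep


def filtered_bce(labels, student_probs, keep, eps=1e-7):
    """Mean binary cross-entropy computed over kept pixels only."""
    total, count = 0.0, 0
    for y, q, k in zip(labels, student_probs, keep):
        if not k:
            continue  # uncertain pixel: excluded from the loss
        q = min(max(q, eps), 1.0 - eps)
        total += -(y * math.log(q) + (1 - y) * math.log(1.0 - q))
        count += 1
    return total / count if count else 0.0


# Flattened toy patch: synthetic mask labels, pretrained-model tumor
# probabilities, and the probabilities of the model being trained.
labels = [1, 1, 0, 0]
teacher_probs = [0.9, 0.4, 0.1, 0.6]
student_probs = [0.8, 0.5, 0.2, 0.5]

keep = keep_mask(labels, teacher_probs)
loss = filtered_bce(labels, student_probs, keep)
```

In this toy patch the second and fourth pixels are dropped, so only the two pixels on which the pretrained model and the synthetic mask agree contribute to the loss.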
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • Please clarify how evaluation is done at the WSI level when training and inference are performed on patches.
    • Consider performing ablations across more datasets and include a deeper analysis of how synthetic samples influence generalization.
    • The filtering mechanism is promising, but should be better explained and potentially visualized.
    • Additional evaluation metrics and a distributional analysis comparing real and synthetic data would significantly strengthen the work.
    • Releasing code or pretrained models would greatly enhance the impact and usability of the method.
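
The aggregation question raised in the first comment above matters because two common ways of summarizing patch predictions give different IoU values. The sketch below contrasts them; neither is claimed to be what the paper uses, and the function names are illustrative.

```python
def intersection_union(pred, target):
    """Count intersection and union pixels for two flattened binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter, union


def mean_patch_iou(preds, targets):
    """Average of per-patch IoU; patches with an empty union are skipped."""
    scores = []
    for pred, target in zip(preds, targets):
        inter, union = intersection_union(pred, target)
        if union:
            scores.append(inter / union)
    return sum(scores) / len(scores) if scores else 0.0


def pooled_iou(preds, targets):
    """IoU over all pixels of all patches, as if stitched into one slide."""
    inter_total, union_total = 0, 0
    for pred, target in zip(preds, targets):
        inter, union = intersection_union(pred, target)
        inter_total += inter
        union_total += union
    return inter_total / union_total if union_total else 0.0


# Two toy patches: one perfect prediction, one complete miss on a small tumor.
preds = [[1, 1, 1, 1], [1, 0, 0, 0]]
targets = [[1, 1, 1, 1], [0, 1, 0, 0]]

patch_level = mean_patch_iou(preds, targets)
slide_level = pooled_iou(preds, targets)
```

Here the per-patch average is 0.5 while the pooled value is 4/6, so stating which variant is reported changes how the numbers should be read.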
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper addresses an important problem and includes some promising ideas, the contribution is relatively minor, and experimental evidence is insufficient to support a strong case for acceptance at MICCAI.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    While the authors have clarified several implementation details and reiterated the practical relevance of their approach, the rebuttal does not fully address the primary concerns about the paper’s limited novelty, unclear attribution of performance gains, and insufficient experimental depth.



Review #2

  • Please describe the contribution of the paper

    The paper addresses the challenge of generating accurate and diverse histopathology image-mask pairs for tumor segmentation. It proposes to reformulate the task as a tumor-aware inpainting problem, where only the tumor region is generated and the background is preserved. The generation is conditioned on regional tumor embeddings from other images to increase diversity while maintaining biological plausibility. It also introduces a filtering strategy to remove uncertain synthetic regions using a pretrained segmentation model. The method improves segmentation performance across datasets, especially in low-data settings.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. It proposes a method to generate synthetic tumor images that preserves the background and inpaints only the tumor areas, using embeddings from other images to increase diversity.
    2. The paper introduces an effective approach to exclude unreliable synthetic regions by comparing them against predictions from a pretrained segmentation model.
    3. The method is evaluated on three histopathology datasets with varying tumor types and scales. It consistently improves segmentation IoU, especially in low-data regimes, and outperforms other synthetic data generation baselines.
    4. The method is practically applicable to different use cases in medical image field where data annotation is costly.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. While the formulation is effective, most components (latent diffusion, VQ-VAE, cross-image embedding) are adaptations of existing methods and similar methods are seen in non-medical image field. The main contribution lies in system integration rather than algorithmic innovation.
    2. Limited direct evaluation and comparison on quality of synthetic images.
    3. Limited evaluation on quality of diversity of synthetic images.
    4. It would be good to see how the method can be generalized to different domains.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a practical approach to generating synthetic data via tumor-aware inpainting. The formulation is sensible and results show consistent gains in segmentation. However, the methodological novelty is limited, and the evaluation lacks analysis of image quality and generalization. Overall, it’s a solid application paper with some merit but falls short in novelty.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors introduce PathoPainter, a method aimed at augmenting histopathology datasets for tumor segmentation. To generate image-mask pairs, they utilize an inpainting approach, where images are produced using a denoising diffusion probabilistic model, conditioned on both masks and regional tumor embeddings.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The exposition is clear and well-structured.
    • The combination of input selection and conditioning on the diffusion model is a novel approach for augmenting datasets for tumour segmentation.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • In Figure 1, the DiffTumor and PathoPainter images appear identical. If the tumor region is meant to differ, this is not noticeable due to the low transparency of the target region.
    • Only two state-of-the-art approaches are mentioned in the introduction. The authors should improve the contextualization of their work by discussing a broader range of relevant methods, in particular which conditioning mechanisms have been used.
    • The authors should specify which datasets the pre-trained models were trained on, whether they were trained on general-purpose datasets or histology-specific datasets. Since some pre-trained models seem to be trained on general-purpose datasets, the authors should address the limitations of this choice and discuss potential impacts on their method’s performance.
    • In Figure 2(b) (inference), it is unclear why the first part of the architecture is not removed. Specifically, the encoding of the image, masked image, and mask should only be input during training. During inference, only noise and the condition should be fed into the diffusion model (at least according to the text, unless the input image, masked image, and mask are used as a condition for the diffusion process).
    • The meaning of “8WSI” and “20WSI” in Table 1 is unclear. Are these the synthetic datasets used? If so, why are the results changing even when no augmentation is applied?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is clear, well-structured, and presents a novel approach, although the novelty is somewhat limited. Some parts of the paper could be enhanced to improve clarity and provide more detailed explanations (see weaknesses).

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We sincerely thank the reviewers for their positive feedback, recognizing its promising idea (R1), novel approach (R3), consistent improvements (R1, R2), clear and well-structured presentation (R3), and solid and effective methodology (R2). We appreciate the suggestions and will address them in the revised version, as space allows.

R1&R2 - Limited Novelty. Regarding our contribution and novelty, we emphasize that this is the first work to successfully apply diffusion-based inpainting for histopathology image-mask pair generation. This is non-trivial due to the structural complexity and heterogeneity of histology images. As shown in Table 1, directly applying DiffTumor to the histology domain yields little to no performance gain from synthetic data. While introducing semantic conditioning seems intuitive, it presents challenges: although it increases diversity, it also introduces uncertainty (e.g., noisy regions), which can degrade performance (Table 2c). To mitigate this, we introduce a filtering mechanism to exclude uncertain regions, ensuring the model learns from reliable synthetic data. Our contribution goes beyond technical innovations for enhancing dataset diversity while maintaining biological plausibility. As noted by reviewers, our work “tackles an important problem - the high cost of detailed tumor annotations in histopathology” (R1) and is “practically applicable to different use cases in medical image field where data annotation is costly” (R2). We firmly believe this makes a meaningful advancement in histopathology, both through the development of PathoPainter and the validation on diverse datasets.

R1 - Effect of filtering. We clarify that filtering helps by removing uncertain regions from the synthetic data, but it is not the sole reason for the performance gain. When the same filtering is applied, DiffTumor and STEDM still underperform, achieving 62.51% and 60.37%, compared to ours at 63.67% on the 8 DCIS WSIs. This is because DiffTumor generates images nearly identical to the originals, making filtering ineffective, while STEDM often breaks the original layout, resulting in most synthetic images being discarded. In contrast, our tumor-aware inpainting generates diverse yet structurally valid images. Although some noise is introduced, filtering effectively reduces its impact.

R1 - Limited ablations. While our ablation is limited by space, additional results follow the same trend. For example, applying filtering on 10 CAMELYON16 WSIs improved performance from 70.72% to 72.42%, further confirming its effectiveness.

R1 - Evaluation level. We apply patch-level processing during both training and testing.

R1&R3 - Scale analysis. For DCIS and CAMELYON16, we sample ~10% of training slides to simulate low-data settings; for CATCH, we follow STEDM and use 7 slides. We use ~2× scaling to assess the impact of additional annotations.

R1&R2 - Metric and clinical validation. We primarily use segmentation metrics, as our goal is to reduce annotation cost for segmentation tasks. However, adding more evaluation protocols would strengthen the paper.

R2 - Generalization. Our method is generalizable, but due to page limit, we focused on histology. Broader extensions will be explored in future work.

R3 - Figure 1. We present different synthesis methods with varying conditions and target regions on the same reference image. The grey area denotes the generation region, and the green arrow indicates the conditioning source.

R3 - More related work. We will add as space allows.

R3 - Pre-trained models. The VQ-VAE is pretrained on ImageNet, and HIPT is pretrained on TCGA. The filtering model is trained on real annotated dataset. While VQ-VAE may introduce some reconstruction issues due to domain mismatch, this can be addressed by retraining it on domain-specific data.

R3 - Figure 2b. The masked image and corresponding mask are concatenated with noise to ensure that only the tumor area is inpainted.

R1&R2&R3 - We will provide the code upon acceptance.
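
The Figure 2b reply above describes concatenating the masked image and its mask with noise as the denoiser input. A minimal toy sketch of that channel-wise concatenation follows. It works on small nested lists in pixel space, whereas the paper operates on VQ-VAE latents; the channel counts, helper names, and values here are assumptions for illustration only.

```python
import random


def apply_mask(image, mask):
    """Zero out the tumor region (mask == 1) in every channel, keeping background."""
    return [
        [[v * (1 - m) for v, m in zip(row, m_row)] for row, m_row in zip(channel, mask)]
        for channel in image
    ]


def concat_channels(*tensors):
    """Concatenate channel-first tensors (lists of HxW planes) along the channel axis."""
    out = []
    for tensor in tensors:
        out.extend(tensor)
    return out


H, W, C = 2, 2, 3  # toy sizes, not the paper's
rng = random.Random(0)

image = [[[0.5, 0.6], [0.7, 0.8]] for _ in range(C)]
mask = [[0, 1], [0, 0]]  # 1 marks the region to inpaint
noise = [[[rng.gauss(0.0, 1.0) for _ in range(W)] for _ in range(H)] for _ in range(C)]

masked_image = apply_mask(image, mask)
# Denoiser input: noise channels + masked-image channels + one mask channel.
model_input = concat_channels(noise, masked_image, [mask])
```

With this layout the network sees the untouched background and the mask at every denoising step, which is consistent with the statement that only the tumor area is inpainted.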




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This manuscript introduces a diffusion model-based image synthesis method to enhance histopathological image segmentation. It formulates image-mask pair generation as an image inpainting problem, with conditioning on background masks and regional tumor embeddings. The experiments show that the method is able to improve image segmentation performance in three histopathological image datasets. The rebuttal addresses the reviewers’ concerns about technical/implementation details. It will be helpful if the manuscript can clearly explain what the technical novelty of the proposed method is.


