Abstract

Polyp detection is crucial for colorectal cancer screening, yet existing models are limited by the scale and diversity of available data. While generative models show promise for data augmentation, current methods mainly focus on enhancing polyp diversity, often overlooking the critical issue of false positives. In this paper, we address this gap by proposing an adversarial diffusion framework to synthesize high-value false positives. The extensive variability of negative backgrounds presents a significant challenge in false positive synthesis. To overcome this, we introduce two key innovations. First, we design a regional noise matching strategy to construct a negative synthesis space using polyp detection datasets. This strategy trains a negative-centric diffusion model by masking polyp regions, ensuring the model focuses exclusively on learning diverse background patterns. Second, we introduce the Detector-guided Adversarial Diffusion Attacker (DADA) module, which perturbs the negative synthesis process to disrupt a pre-trained detector’s decision, guiding the negative-centric diffusion model to generate high-value, detector-confusing false positives instead of low-value, ordinary backgrounds. Our approach is the first to apply adversarial diffusion to lesion detection, establishing a new paradigm for targeted false positive synthesis and paving the way for more reliable clinical applications in colorectal cancer screening. Extensive results on public and in-house datasets verify the superiority of our method over the current state of the art, with our synthesized data improving the detector by at least 2.6% and 2.7% in F1-score, respectively, over the baselines. Code is available at https://github.com/Huster-Hq/DADA.
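
The regional noise matching idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the loss is a mean squared error between predicted and true noise, averaged over background pixels only, and the function name `regional_noise_matching_loss` and the flat-list image layout are hypothetical choices made for illustration.

```python
def regional_noise_matching_loss(pred_noise, true_noise, polyp_mask):
    """Mean squared noise error over background pixels only (assumed form).

    All three arguments are flat lists of equal length; polyp_mask[i] is 1
    inside an annotated polyp region and 0 in the background.
    """
    if not (len(pred_noise) == len(true_noise) == len(polyp_mask)):
        raise ValueError("inputs must have the same length")
    # Invert the polyp mask so that only background pixels contribute.
    background = [1 - m for m in polyp_mask]
    n_background = sum(background)
    if n_background == 0:
        return 0.0  # image fully covered by the polyp mask: no background signal
    squared_error = sum(
        b * (p - t) ** 2 for b, p, t in zip(background, pred_noise, true_noise)
    )
    return squared_error / n_background


# Toy example: 4 "pixels", the last two belong to a polyp.
true_noise = [0.5, -1.0, 2.0, 0.0]
pred_noise = [0.5, -0.5, 9.0, 9.0]  # large errors inside the polyp are ignored
polyp_mask = [0, 0, 1, 1]
loss = regional_noise_matching_loss(pred_noise, true_noise, polyp_mask)
print(loss)  # 0.125
```

Because the masked pixels never enter the loss, a denoiser trained this way receives no gradient signal about polyp appearance, which is one plausible reading of "focuses exclusively on learning diverse background patterns".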

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3018_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/Huster-Hq/DADA

Link to the Dataset(s)

kvasir-SEG: https://datasets.simula.no/kvasir-seg/

BibTex

@InProceedings{ZhoQua_Targeted_MICCAI2025,
        author = { Zhou, Quan and Luo, Gan and Hu, Qiang and Zhang, Qingyong and Zhang, Jinhua and Tian, Yinjiao and Li, Qiang and Wang, Zhiwei},
        title = { { Targeted False Positive Synthesis via Detector-guided Adversarial Diffusion Attacker for Robust Polyp Detection } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15970},
        month = {September},
        pages = {599--609}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    1. Builds upon a background-only diffusion model to learn purely negative (non-polyp) patterns, then perturbatively guides the generation process (through the Detector-guided Adversarial Diffusion Attacker, DADA) to produce “polyp-like” interference that deliberately confuses a trained detector.

    2. Rather than randomly creating new backgrounds or standard polyp images, it generates challenging negative examples—false positives that resemble polyps yet are actually background—so detectors learn to avoid these misclassifications more effectively.

    3. Demonstrates substantial performance gains (≥2.6% and ≥2.7% in F1-score on the public Kvasir and in-house datasets, respectively) over baseline methods, thus broadening the training data’s diversity and robustness for polyp detection tasks.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper addresses an overlooked yet critical issue in polyp detection: false positives. By synthesizing challenging negative samples that resemble polyps, the proposed method targets the precise weakness of current models, thereby reducing false alarms more effectively.

    2. It combines a background-only diffusion model with the Detector-guided Adversarial Diffusion Attacker (DADA). This integration ensures that generated samples focus on negatively coded regions while still providing realistic, high-value training data to confuse and improve the detector.

    3. The proposed inpainting strategy maintains anatomical consistency outside the user-defined region, ensuring realistic contextual integrity. Consequently, the synthesized images appear natural while introducing critical and difficult-to-detect false positives.

    4. Extensive experiments on both open-source (Kvasir) and in-house datasets demonstrate consistent performance gains (≥2.6% and ≥2.7% in F1-score) over baselines, validating the framework’s effectiveness across diverse clinical scenarios.

    5. The paper is well-structured, presenting a coherent progression from the introduction of polyp detection challenges, through the proposed method’s design and implementation details, to comprehensive experimental validation and discussion. This clear logical flow aids readers in following and understanding the research contributions.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Although the proposed adversarial diffusion framework yields high-value false positives, it involves iterative denoising steps and adversarial perturbations, potentially increasing computational overhead. The paper provides little discussion on real-time feasibility for clinical settings or scalability to larger datasets.

    2. While the method is specifically tailored for colorectal polyp detection, the paper does not extensively examine whether the approach generalizes well to other lesion-detection scenarios or medical imaging domains with different anatomical and imaging characteristics.

    3. The paper does not thoroughly analyze scenarios where the approach might underperform—such as cases with extremely low image quality, rare polyp appearances, or very small training sets. This leaves unanswered how robust the method remains under challenging real-world conditions.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Refer to the Strengths and Weaknesses section.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper
    • The paper proposes a method for targeted false positive synthesis in polyp detection. By combining diffusion models with adversarial attacks, it successfully addresses the key issue that existing data augmentation methods neglect the generation of false positive samples.
    • The paper designs a regional noise-matching strategy to train a background-only denoiser. This strategy masks the polyp regions to force the model to focus on learning background patterns.
    • During the denoising process, the paper innovatively injects adversarial perturbations, enabling the model to generate misleading high-value false positive samples instead of ordinary background samples.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper approaches the problem from a novel perspective and innovatively solves the pain point of false positives in polyp detection. Most current methods focus on positive sample generation, while this paper, for the first time, targets the frequent false positive problem in polyp detection. By using the adversarial diffusion framework, it generates high-value negative samples with polyp-like features, filling the gap in the optimization of false positive samples in data augmentation.
    • The paper draws on the principles of adversarial attacks and integrates the gradient feedback of the detector into the denoising process, guiding the generated samples to break through the decision boundary of the detector. This achieves a qualitative leap from “ordinary backgrounds” to “high-value false positives”, significantly enhancing the challenge of the synthetic data to the detector.
    • The paper verifies the effectiveness of the model in the detection task, and the improvement is remarkable.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • There are doubts about the perturbation optimization in Section 2.2. DADA aims to generate false positive samples to mislead the detector. Its purpose is to make the detector focus on the false positive regions rather than the true positive regions. That is, the more deceptive the false positive samples are, the lower the detection loss L_{det} is, and the greater the perturbation is. The more unrealistic the false positive samples are, the higher the detection loss is, and the smaller the perturbation is. Then, will those unrealistic false positive samples never be further optimized?
    • The paper lacks a comparison with more intuitive methods. For example, compared with simple and rough false positive sample generation methods such as “copy-paste” and “light dot generation using OpenCV”, how effective are the false positive samples sampled by the generation method proposed in this paper?
    • In Table 1 of the main experiments, it is unclear whether the training dataset is all generated data or a mixture of real and generated data.
    • The model’s performance and DADA’s effectiveness depend on the pre-trained detector, and there may be risks in generalization. If the architecture of the target detector changes, the guiding direction of the adversarial perturbation may fail.
    • The paper does not verify the sensitivity differences of different detectors (such as models based on Transformer and CNN) to synthetic samples.
    • The paper lacks a sufficient theoretical analysis of how DADA perturbations affect the decision boundary of the detector.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a very interesting research work with a novel research perspective. It improves the model’s ability to resist false positives by providing difficult samples for the model. I look forward to the author’s answers to some of the current questions in the rebuttal. Overall, this is a quite excellent paper.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper tackles an important problem setting - generating high-quality false-positive polyp detection synthetic images. A novel diffusion model-based method is proposed, which uses the polyp detector to guide the adversarial diffusion attack (DADA module) and inpaints false-positive polyp-like structures in the desired negative locations. The background-only denoiser (BG-De) learns only background (non-polyp) information. It is then integrated with a detector-guided adversarial attack to generate high-value negative samples. Quantitative results comparing against other adversarial attack and inpainting methods are presented.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The problem setting is less explored but important - generating synthetic false positive polyp detection images.
    • The method is novel and creative. They train a background-only diffusion model and use a detector to guide the inpainting of a false positive structure.
    • The experimental design is comprehensive: an ablation study, and comparisons against adversarial attack and inpainting methods. The quantitative results show strong improvement.
    • The paper is well written.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Major

    • How effective are well-trained polyp detectors on the noisy images in intermediate denoising steps?

    Minor

    • Does the L_cls evaluate the presence of polyp, and not the polyp class like adenoma vs hyperplastic?
    • Why is the training fold split into two?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Less explored yet important problem setting, novel architecture, good quantitative results, well-written paper, code is provided.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

  1. Computational cost. (R#2) Our method increases GPU memory use from 1.34 GB to 3.84 GB and generation time from 35 s to 247 s compared to standard DDPM. Since generation is offline for detector fine-tuning, real-time speed is not required. Future work will explore Latent Diffusion Models and fewer steps to improve efficiency.

  2. Generalizability. (R#2) False positive generation is a novel and often overlooked area. We chose colorectal polyp detection as a representative task due to its critical false positive issue, validating our core idea and framework. In theory, the method is not specific to polyps and can extend to similar tasks by adaptively generating hard negatives. Due to MICCAI space limits, we focus on one domain and plan broader tests in future work.

  3. Underperformed scenarios. (R#2) We agree that very low image quality, rare polyp types, or limited data may degrade performance. However, our work pioneers adversarial false positive generation, demonstrating feasibility and potential. Not all challenges can be solved in one study, but we hope to inspire future research to address them stepwise.

  4. Optimization of unrealistic false positives. (R#3) Perturbation magnitude is not determined by the value of the detection loss. As shown in Eq. (5), the perturbation direction is guided by the gradient of the detection loss via the sign function, but the magnitude remains fixed. In other words, regardless of whether the detection loss is high or low, we apply a small, constant perturbation (\alpha) at each step. This ensures that all samples, including initially unrealistic ones, are gradually optimized to elicit a detection response at the targeted false positive location.

  5. Comparison with intuitive methods. (R#3) We plan to compare with simple false positive generation methods and will update results on our anonymous GitHub. We expect better performance because our adversarial approach generates task-specific hard negatives, simulating challenging false positives more effectively.

  6. Training data detail. (R#3) As in Sec. 3.3, for all methods, each original image is augmented with one synthetic false positive image, doubling training data. The detector trains on both original and augmented images, while baselines use only original data.

  7. Detector generalization. (R#3) We validated our method using two detectors, i.e., YOLO (CNN-based) and DETR (Transformer-based), representing mainstream architectures. This supports the claim that generated hard negatives remain valuable across different detector types.

  8. Sensitivity differences to synthesis. (R#3) Our framework targets false positives within the detector’s sensitivity range, generating confusing samples resembling true positives. However, it cannot generate false positives similar to missed true positives, which is a limitation we plan to address in future work.

  9. Theoretical analysis of perturbations. (R#3) Samples near the detector’s decision boundary are those easily misclassified, especially false positives. DADA uses adversarial optimization guided by detection loss gradients to generate samples close to this boundary, exposing the detector to challenging cases and refining its decision boundary, consistent with adversarial learning theory.

10. Effectiveness of well-trained detectors on noisy images. (R#4) In early denoising steps, noisy images are indeed hard to recognize. Inspired by Classifier Guidance Generation, we do not require accurate detection at each step. Instead, we use gradient feedback from the detector to guide the denoising toward regions that the detector is sensitive to, progressively shaping realistic false positives near the decision boundary.

11. What does L_{cls} evaluate? Why split the training set into two? (R#4) Because our task is polyp detection, L_{cls} is set to evaluate the presence of a polyp. The data split is designed to ensure data independence between the BG-De and the detector, preventing data leakage.
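
Item 4 above says that the perturbation direction comes from the sign of the detection-loss gradient while the magnitude stays fixed at \alpha. The sketch below illustrates that property only. It is not the paper's Eq. (5): the helper names `sign` and `sign_step` are hypothetical, the gradients are made-up numbers rather than outputs of a real detector, and stepping against the gradient (to lower a loss defined on the targeted false positive location) is an assumed sign convention.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def sign_step(x, grad, alpha):
    """Move each element of x by exactly alpha against the gradient sign.

    Only the direction of grad is used; its magnitude is discarded.
    """
    return [xi - alpha * sign(gi) for xi, gi in zip(x, grad)]


alpha = 0.01
x = [0.0, 0.0]

# Hypothetical gradients for two samples: one where the loss surface is
# nearly flat, one where it is very steep.
grad_small = [1e-6, -1e-6]
grad_large = [1e3, -1e3]

step_small = sign_step(x, grad_small, alpha)
step_large = sign_step(x, grad_large, alpha)
print(step_small)  # [-0.01, 0.01]
print(step_large)  # [-0.01, 0.01]
```

Both samples move by the same amount per step, which matches the rebuttal's point that a sample with a high detection loss is perturbed no less than one with a low detection loss.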




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


