Abstract

In skin lesion segmentation, models based on convolutional neural networks (CNNs) and vision transformers (ViTs) have been extensively explored but struggle to capture fine details near boundaries. The advent of the Diffusion Probabilistic Model (DPM) offers significant promise for this task, which demands precise boundary segmentation. In this study, we propose BGDiffSeg, a novel skin lesion segmentation model that uses a wavelet-transform-based diffusion approach to speed up training and denoising, along with a specially designed Diffusion Boundary Enhancement Module (DBEM) and Interactive Bidirectional Attention Module (IBAM) to enhance segmentation accuracy. DBEM enhances boundary features in the diffusion process by integrating extracted boundary information into the decoder. Concurrently, IBAM facilitates dynamic interactions between conditional and generated images at the feature level, thus enhancing the global recognition of target area boundaries. Comprehensive experiments on the ISIC 2016, ISIC 2017, and ISIC 2018 datasets demonstrate BGDiffSeg’s superiority in precision and clarity under limited computational resources and inference time, outperforming existing state-of-the-art methods. Our code will be available at https://github.com/erlingzz/BGDiffSeg.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3253_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/erlingzz/BGDiffSeg

Link to the Dataset(s)

https://challenge.isic-archive.com/data/#2016

https://challenge.isic-archive.com/data/#2017

https://challenge.isic-archive.com/data/#2018

BibTex

@InProceedings{Guo_BGDiffSeg_MICCAI2024,
        author = { Guo, Yilin and Cai, Qingling},
        title = { { BGDiffSeg: a Fast Diffusion Model for Skin Lesion Segmentation via Boundary Enhancement and Global Recognition Guidance } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors introduced a novel model for the segmentation of skin lesions, using a wavelet-transform based diffusion model. Compared to previous work using diffusion probabilistic models, their architecture is more accurate and faster in inference for skin lesion segmentation tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    An ablation study demonstrates the utility of the two novel modules (DBEM and IBAM). Performance on skin lesion segmentation tasks is better than that of other SOTA methods. In terms of computational efficiency, sampling is more than 300 times faster than with MedSegDiff, which would allow the model to be used in real-time cases. The authors started from one recurrent, known issue in skin lesion segmentation, namely boundary detection, and developed a specific architecture to overcome it and improve on previous work in this specific direction.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The overall architecture of the modules shown in Figure 1 deserves more explanation. The figure alone is not fully understandable; a caption with more comments would be appreciated. The Diffusion Boundary Enhancement Module lacks some explanation, both for understandability and for full reproducibility. Where the authors claim that the results demonstrate generalisability, it would be interesting to test the model on different types of datasets: for now, the authors only provide results on the ISIC data, with different configurations (2016, 2017 and 2018 versions) but the same type of images. The problem of boundary detection may also be relevant in other types of medical imaging, such as MRI. Furthermore, the methods compared with BGDiffSeg are not specific to skin lesion segmentation but address any type of medical image segmentation task. An introduction to the different datasets would be appreciated, such as the number of images, statistics on the different types of skin lesions, etc. No details of possible applications for clinicians or real-life use cases are provided, but the paper would benefit from such information.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Except for the DBEM part, which does not seem clear enough to be easily reproducible, the authors provide enough information to reproduce the paper: public datasets, loss function, and good implementation details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please see weakness part

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The model presented in this paper is interesting and contains novel ideas, and it outperforms SOTA methods on skin lesion segmentation tasks. The results could be reproduced based on the material and details included in the paper. To make the paper stronger, more details would be appreciated to fully understand the proposed architecture (a detailed caption for Figure 1, the DBEM part), but the overall ideas are clearly stated and explained. A paragraph discussing the clinical applicability of the model would also be very helpful.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces a skin lesion segmentation model employing a wavelet transform-based diffusion approach to expedite training. It incorporates a Diffusion Boundary Enhancement Module to integrate boundary information into the decoder, and a Bidirectional Attention Module to enhance global recognition of target area boundaries and segmentation accuracy. Evaluation is conducted on three ISIC datasets. While the paper is well-written, the experimental section lacks robustness. Notably, the absence of reported standard deviations obscures result stability, and the incomplete ablation study across datasets undermines the thoroughness of the evaluation. Furthermore, lacking information on computational complexity relative to other models hampers a comprehensive assessment.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Implementation of a wavelet transform-based diffusion approach to expedite training
    • Clear explanation of the mathematical model
    • Readability of the paper.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Lack of reported result stability
    • Incomplete analysis across all three datasets, including ablation study and computational complexity
    • Absence of investigation under realistic conditions such as occlusion and intensity variation.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Stability of Results: Reporting variations (+/-) in results in Table 1 is essential for understanding model stability and the significance of results.

    2. Incomplete Analysis: The ablation study should be extended to the other two datasets, with reported standard deviations of results. Similarly, computational time should be compared to other models shown in Figure 2, as computational complexity (speed and memory) is crucial for assessment.

    3. Sensitivity to Noise: Investigating realistic conditions like occlusion due to hair presence or non-uniform intensity distribution is necessary. While mentioned briefly in section 3.1, using low-frequency components may not adequately address these conditions. Utilizing skin lesion datasets can provide insights into how the model performs under such conditions.

    4. Equations Clarity: Clarifying variables such as I in equation 1, setting integration limits in equation 2, defining E_q in equation 9, and specifying y and y_hat in equation 10 would enhance readability and understanding.
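    The stability reporting requested in items 1 and 2 could be produced along the following lines. This is only a minimal sketch, not the authors' evaluation code: it assumes binary masks, the common set-based definitions of DSC and IoU, and per-run scores collected from repeated trainings with different seeds. The function names `dice_and_iou` and `summarize` are hypothetical.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Return (DSC, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    # eps keeps both ratios defined when the masks are empty (assumption).
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


def summarize(scores):
    """Return (mean, sample standard deviation) of per-run scores."""
    scores = np.asarray(scores, dtype=float)
    # Sample std (ddof=1) needs at least two runs; fall back to 0 spread otherwise.
    ddof = 1 if scores.size > 1 else 0
    return float(scores.mean()), float(scores.std(ddof=ddof))


if __name__ == "__main__":
    # Hypothetical DSC values from three runs, for illustration only.
    mean, std = summarize([0.90, 0.92, 0.94])
    print(f"DSC: {mean:.3f} +/- {std:.3f}")
```

    A table cell would then carry the mean together with its +/- spread instead of a single number, which is what makes small gaps between methods interpretable.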

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Clear writing and explanation of mathematical formulas, but the results and evaluation part needs more work. In fact, there is space in the tables to show the stability of results (standard deviations), to include the missing ablation study for the remaining two datasets, and to report computational complexity for the other models shown in the qualitative evaluation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes a new diffusion model for skin lesion segmentation. The two novel modules in the model are named DBEM and IBAM. DBEM is used to amplify an image’s edges by taking the element-wise product of an image with the output of the Sobel operator, applied to the low-pass image of the wavelet transform. IBAM is used to combine information from the conditional and generative encoders. This ensures that the appropriate image features are used (those that have been learnt to be relevant) and that the appropriate generated features are used (those that match the image). The model computes a predicted segmentation map. The method is evaluated on the ISIC 2016, 2017 and 2018 datasets, giving the best results for both mIoU and DSC in each case against the compared models. Ablation studies show that the IBAM and DBEM modules both provide improved performance.
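    The DBEM description in this summary (Sobel operator on the wavelet low-pass image, then an element-wise product) can be illustrated with a small NumPy sketch. It is a reading of the reviewer's summary under stated assumptions, not the authors' implementation: a single-level Haar transform is assumed (the review itself asks the authors to confirm the wavelet), the edge map is normalised to [0, 1] before the product, and all function names are hypothetical.

```python
import numpy as np


def haar_low_pass(image):
    """Single-level 2D Haar approximation (LL) band of a grayscale image."""
    image = np.asarray(image, dtype=float)
    if image.shape[0] % 2 or image.shape[1] % 2:
        raise ValueError("height and width must be even")
    # Sum each 2x2 block; dividing by 2 gives the orthonormal Haar LL band.
    return (image[0::2, 0::2] + image[0::2, 1::2]
            + image[1::2, 0::2] + image[1::2, 1::2]) / 2.0


def sobel_magnitude(image):
    """Gradient magnitude from 3x3 Sobel kernels with edge-replicated padding."""
    image = np.asarray(image, dtype=float)
    kx = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    ky = kx.T
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    # Accumulate the 3x3 correlation one kernel tap at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    return np.hypot(gx, gy)


def boundary_enhance(features, low_pass):
    """Element-wise product of a feature map with a normalised edge map."""
    features = np.asarray(features, dtype=float)
    edges = sobel_magnitude(low_pass)
    if features.shape != edges.shape:
        raise ValueError("features and low-pass band must share a shape")
    peak = edges.max()
    if peak > 0:
        edges = edges / peak  # normalisation to [0, 1] is an assumption
    return features * edges


if __name__ == "__main__":
    # Toy image: dark left half, bright right half (a vertical boundary).
    img = np.zeros((8, 8))
    img[:, 4:] = 1.0
    ll = haar_low_pass(img)
    print(boundary_enhance(np.ones_like(ll), ll))
```

    On the toy image, only the columns next to the vertical boundary keep a non-zero response, which is the intended effect of weighting features by an edge map.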

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper presents two novel modules that improve performance over state of the art in each of the tested datasets.
    • The modules are clearly explained and have a clear intuitive basis.
    • The model is tested across three datasets and a wide variety of recent models.
    • The ablation results demonstrate that the novel modules are necessary to improve performance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    In the comparative results, it is stated that the same experimental protocol is followed for each dataset for fairness. It is not shown exactly how the results for other models are obtained - in some cases, e.g. Fat-Net, the results differ from the original paper so it is assumed that the models are re-trained. It is explained that MedSegDiff is re-trained, but details on the implementation of the model training are missing.

    In the comparative results, Fat-Net gives performance very close to the proposed model in ISIC 2017 and ISIC 2018 - but it has been omitted from the ISIC 2016 results. Given that it is so close, it should be confirmed that the proposed model also outperforms in this case.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors intend to make their code available. This will be quite important given the complexity of the BGDiffSeg architecture.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Preliminaries: DDPM is used as an acronym without explanation, though DPM is explained. The spacing in some of the equations makes them difficult to read, for example around G-theta in Eq. 2 and between the two equations in Equation 1. If an input image is 1xHxW, how are color channels dealt with? The input in Figure 1 is a colored image. Please confirm whether Haar wavelets are used.

    Figure 1: In the IBAM box (top right), x1 and x2 could be labeled as mI and mX as in Equations 7 and 8. Is there a difference between q, k, v in the IBAM module and Q, K, V in Equation 5? If not, the notation could be unified.

    Equation 5: Given the residual connection, is V (or Vi) an argument to the similarity function, rather than an additional multiplier?

    Section 4: The work to retrain MedSegDiff is appreciated. Could additional information about the training parameters for this, and other models, be provided? Some models are not tested on ISIC data in their original paper, so we presume they were retrained.

    In the training resources results (Table 2), could inference time be given on more realistic hardware? A Titan X is unlikely to be available in the clinic. MedSegDiff has x8 the FLOPs but x400 the inference time; this could be better explained in the discussion. Given that the model is tested on a GPU with 12GB memory and MedSegDiff requires 25GB, this could be the reason for such a large difference.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a novel model for segmentation of skin lesions with SOTA results on the ISIC datasets. There are some weaknesses, specifically the unexplained missing comparisons in the comparative results. However, if - for example - Fat-Net outperformed the proposed model on ISIC2016 - the paper would still have scientific value.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Dear Reviewers and Meta-Reviewers,

I am writing to express my sincere gratitude for the early acceptance of my paper #3253 entitled “BGDiffSeg: a Fast Diffusion Model for Skin Lesion Segmentation via Boundary Enhancement and Global Recognition Guidance”. I deeply appreciate the time and effort you have invested in reviewing my work and providing valuable feedback.

I have carefully reviewed all the comments and suggestions provided. I am committed to incorporating these insights to enhance the clarity, quality, and overall impact of my manuscript. I will ensure that the necessary revisions are made promptly and thoroughly.

Thank you once again for your meticulous review and constructive feedback.




Meta-Review

Meta-review not available, early accepted paper.


