Abstract

Development of artificial intelligence (AI) techniques in medical imaging requires access to large-scale and diverse datasets for training and evaluation. In dermatology, obtaining such datasets remains challenging due to significant variations in patient populations, illumination conditions, and acquisition system characteristics. In this work, we propose S-SYNTH, the first knowledge-based, adaptable open-source skin simulation framework to rapidly generate synthetic skin, 3D models and digitally rendered images, using an anatomically inspired multi-layer, multi-component skin and growing lesion model. The skin model allows for controlled variation in skin appearance, such as skin color, presence of hair, lesion shape, and blood fraction among other parameters. We use this framework to study the effect of possible variations on the development and evaluation of AI models for skin lesion segmentation, and show that results obtained using synthetic data follow similar comparative trends as real dermatologic images, while mitigating biases and limitations from existing datasets including small dataset size, lack of diversity, and underrepresentation.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1426_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1426_supp.pdf

Link to the Code Repository

https://github.com/DIDSR/ssynth-release

Link to the Dataset(s)

https://huggingface.co/datasets/didsr/ssynth_data

BibTex

@InProceedings{Kim_SSYNTH_MICCAI2024,
        author = { Kim, Andrea and Saharkhiz, Niloufar and Sizikova, Elena and Lago, Miguel and Sahiner, Berkman and Delfino, Jana and Badano, Aldo},
        title = { { S-SYNTH: Knowledge-Based, Synthetic Generation of Skin Images } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors present a synthetic data generation framework based on synthetic 3D skin models that use physical properties of skin. The model allows controlled variation in skin appearance and in the underlying properties. The authors generated a synthetic dataset by taking different 2D renderings of the model and tested how such a dataset can be used to improve lesion segmentation performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Novelty: Image generation from a physical model is a very good idea. The potential is very high if it can be achieved successfully. The authors demonstrate good feasibility for this very challenging task.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Although the idea of using a physics-based model is very good, the authors’ model is very simplistic. The experimental design is also simplistic, so the results are not very impactful. The performance increase provided by including the synthetic data is not substantial.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Authors provided a link at the end of page 2 but it is not clear if this is a link to the source code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The idea is to generate a physics-based model in Houdini and use it to generate synthetic data. If achieved successfully, this can open new doors for synthetic data generation in dermatology. However, generating such a model is a huge task, and unfortunately the model provided by the authors is a bit too simplistic. Likewise, the lesion growth model is very simplistic and is described as a probabilistic growth process. However, lesion growth is not a linear process, and growth changes not only the melanocytic portion of the lesion but also other parts (including the background; the areas close to the border also change as the lesion grows). The interactions between different lesion components are not taken into consideration.

    Section 3.2 is incomplete. Perhaps the authors intended to provide more information in this section?

    The authors show how the presented model can be used to increase lesion segmentation performance; however, the success is limited to melanocytic lesions, which are not very challenging. Moreover, in many clinical cases, the clinical border of a lesion is not defined by the border of the melanocytic area, so the use of the trained model can be limited in this sense.

    Overall, the reviewer thinks this is a really good start to show feasibility.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The provided model is simplistic, but it is a good start to show feasibility. The generated images do not look realistic (Figure 3). The reviewer thinks the paper can be considered for a poster presentation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This work introduces S-SYNTH, a novel knowledge-based, adaptable skin simulation framework for generating synthetic skin. The framework utilizes an anatomically inspired multi-layer, multi-component skin and growing lesion model, allowing for controlled variations in skin appearance. S-SYNTH is used to study the impact of these variations on AI models for skin lesion segmentation, demonstrating that synthetic data can mitigate biases and limitations present in existing datasets, such as small size, lack of diversity, and underrepresentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed physics-based synthetic framework is highly novel and technically sound. Compared to recent learning-based frameworks such as diffusion models, the physical model can control variations in skin appearance, such as skin color, presence of hair, lesion shape, and blood fraction, among other parameters, improving interpretability and broadening application scenarios.

    2. The performance in under-represented groups is clearly improved when adding these synthetic images.

    3. According to Fig. 4c, the synthetic data seems especially helpful when the real dataset size is small, which appears to be a very practical real-world application, as there are often only a few training samples in a specific institution.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. More comparison discussions and experiments with learning-based generative models, such as diffusion models, should be added. At least one diffusion-based method should be included as a baseline, which could be [3], [4], or another reasonable diffusion model the author considers suitable.

    2. One main concern is the lack of technical details, which makes it difficult to follow the methodology behind the proposed framework. Furthermore, a detailed comparison with existing physics-based 3D image generation methods should be included. The differences between the proposed framework and previous models should also be described.

    3. Mislabelling and skin tone fairness are common and important issues in the dermatology field; some related works [1, 2] should be discussed more in the introduction to further emphasize the necessity and advantages of the synthetic framework.

    [1] Towards Reliable Dermatology Evaluation Benchmarks
    [2] Towards Trustable Skin Cancer Diagnosis via Rewriting Model’s Decision
    [3] Improving dermatology classifiers across populations using images generated by large diffusion models
    [4] Augmenting medical image classifiers with synthetic data from latent diffusion models

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors do not mention whether they will make the code public, making it difficult for others to reproduce the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The image resolution is only 128x128, which is uncommon in the skin segmentation field. What is the reason for resizing images to such a small resolution?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a physics-based generative framework for skin segmentation applications. The idea is very novel, and the technique appears quite practical. The main concerns that might reduce the impact of this paper are: (1) The paper lacks technical details, making it difficult for others to reproduce the results. (2) The paper does not compare against any diffusion- or GAN-based generative models for synthesizing images, making it hard to determine whether the physics-based method is better than these commonly chosen techniques. However, the paper’s advantages and novelty outweigh its weaknesses. I believe the paper is worth an oral presentation if the authors can address the two concerns well.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper presents a simulation framework based on 3D models to generate synthetic dermoscopic skin images for segmentation tasks. The proposed framework can control the variation of diverse skin features, such as skin color, presence of hair, lesion shape, etc., when generating synthetic images. The paper provides evidence of improvement based on Dice scores. The idea is interesting and should be presented.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper has the following strengths:

    1- The S-SYNTH framework can control the variation of skin features when generating synthetic dermoscopic images. This is significant because it is challenging to capture such diverse features and generate them accordingly.

    2- The framework has shown efficacy in improving Dice scores when real images are replaced or augmented with synthetic images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper does not provide any comparison to state-of-the-art approaches. It is unclear how it differs from existing approaches and what gain it achieves in comparison.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In this paper, a simulation framework to generate dermoscopic skin images based on 3D models for segmentation tasks is presented. The proposed framework allows controlling the variation of diverse skin features, such as skin color, presence of hair, lesion shape, etc., when generating synthetic images. The paper provides evidence of improvement based on Dice scores. A comparison of this approach with the existing literature is missing from the paper. Comparing against existing methods, even merely related ones, would strengthen the support for the proposed methodology. The idea is interesting and recommended for presentation.
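For readers unfamiliar with the metric cited above, the Dice score between a predicted and a reference segmentation mask is 2|A∩B| / (|A| + |B|). The following is a minimal sketch in plain Python; the function name `dice_score`, the toy 3x3 masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration and are not taken from the paper or its code release.

```python
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks of equal shape."""
    intersection = 0
    pred_sum = 0
    truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    if pred_sum + truth_sum == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / (pred_sum + truth_sum)


# Toy masks (hypothetical, 1 = lesion pixel, 0 = background).
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
reference = [[0, 1, 0],
             [0, 1, 1],
             [0, 0, 0]]

score = dice_score(predicted, reference)
print(f"Dice = {score:.3f}")
```

Here each mask has three lesion pixels and two of them overlap, so the score is 2·2 / (3 + 3).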

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is strong in terms of novelty, presentation, evaluation of methods, and comprehensiveness of experiments and results.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank all reviewers for their insightful comments and provide our responses below.

Reproducibility/technical details: all code/data will be made publicly available, and we have already requested internal clearance for releasing the code. We included sample rendering code within the manuscript and will add further details to facilitate full reproducibility in the final code release. Due to manuscript length limitations, we provided only a brief overview of the rendering in Sec. 3.2; the code release will describe it in more detail.

Comparison to state of the art: existing real skin datasets are prone to inaccurate annotations [1], such as labelling errors, near duplicates, and incorrect labels, as well as confounders [2], such as rulers, dark borders, dense hairs, and air pockets. Both may negatively affect AI-based diagnostic tools and any data-driven methods (e.g., generative diffusion models [3,4]) used to generate synthetic training augmentation data. Similar to [2], we take into account known confounders to systematically simulate synthetic samples and analyze performance across subgroups. In contrast to [3,4], our synthetic data generation model is not data-driven and therefore does not propagate confounder behavior. To our knowledge, our model is the first to use knowledge-based models to produce skin images with and without lesions.

Additionally, a known limitation of existing real skin datasets used as the original training data for data-driven techniques is the low prevalence of examples with dark skin tones. A significant contribution of our work is the ability to generate images across a wide variety of skin tones, thus creating the possibility to supplement existing real skin datasets with currently unrepresented samples. We are hopeful that such supplementation can help to remove some of the known biases that result from such significant underrepresentation within the existing real data.
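Analyzing performance across subgroups, as mentioned above, amounts to grouping per-image scores by an attribute such as skin tone and summarizing each group separately. The sketch below is only an illustration of that bookkeeping: the function `mean_dice_by_group`, the group labels, and the score values are hypothetical placeholders and are not results from the paper.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_dice_by_group(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average per-image Dice scores separately for each subgroup label."""
    grouped = defaultdict(list)
    for group, dice in records:
        grouped[group].append(dice)
    return {group: mean(scores) for group, scores in grouped.items()}


# Arbitrary placeholder values chosen for easy arithmetic; NOT measured results.
toy_records = [
    ("light", 0.5),
    ("light", 1.0),
    ("dark", 0.25),
    ("dark", 0.75),
]

summary = mean_dice_by_group(toy_records)
for group, value in summary.items():
    print(f"{group}: mean Dice = {value:.2f}")
```

Reporting one number per subgroup, rather than a single pooled average, is what makes a gap between well-represented and underrepresented groups visible.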

Model is simplistic/limited to melanocytic lesions: we agree that our lesion growth model is to some degree simplistic, particularly compared to real anatomical and pathological skin characteristics. However, our model is also more sophisticated than other models that have produced lesions based on simple geometrical shapes (see Vasudev, Varun, et al. “Simulation pipeline for virtual clinical trials of dermatology images”, Medical Imaging 2019: Physics of Medical Imaging. SPIE, 2019). As correctly pointed out, one area for model improvement is to consider interaction between the lesion and non-lesion areas. In our current model, however, the lesion growth can be affected by material changes that are lesion-dependent and that can vary along the timeline of lesion growth. Moreover, the level of realism needed depends on the potential utility of the model within the target application (e.g., increasing training dataset size, testing, calibration), and not only on visual comparison to real lesion examples.
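To make the notion of a probabilistic growth process concrete, the following toy sketch grows a binary lesion mask on a 2D grid from a single seed pixel. It is not the S-SYNTH lesion model, which operates on 3D skin models; the function `grow_lesion`, the parameter `spread_prob`, and the 4-neighbour spreading rule are all assumptions chosen for illustration.

```python
import random
from typing import List


def grow_lesion(size: int, steps: int, spread_prob: float, seed: int = 0) -> List[List[int]]:
    """Grow a binary mask from the grid centre.

    At each step, every background pixel adjacent (4-neighbourhood) to the
    current lesion joins it with probability `spread_prob`.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    mask = [[0] * size for _ in range(size)]
    center = size // 2
    mask[center][center] = 1
    for _ in range(steps):
        frontier = set()
        for r in range(size):
            for c in range(size):
                if not mask[r][c]:
                    continue
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < size and 0 <= nc < size and not mask[nr][nc]:
                        frontier.add((nr, nc))
        # Sort so the order of random draws does not depend on set ordering.
        for r, c in sorted(frontier):
            if rng.random() < spread_prob:
                mask[r][c] = 1
    return mask


lesion = grow_lesion(size=9, steps=3, spread_prob=0.5, seed=42)
area = sum(map(sum, lesion))
for row in lesion:
    print("".join("#" if v else "." for v in row))
print("lesion area (pixels):", area)
```

With `spread_prob=1.0` the mask grows as a regular diamond; lower values produce irregular borders, and varying the probability over steps would be one way to mimic growth that changes along the timeline.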




Meta-Review

Meta-review not available, early accepted paper.


