Abstract

Fairness is an important topic for medical image analysis, driven by the challenge of unbalanced training data among diverse target groups and the societal demand for equitable medical quality. In response to this issue, our research adopts a data-driven strategy: enhancing data balance by integrating synthetic images. However, in terms of generating synthetic images, previous works either lack paired labels or fail to precisely control the boundaries of synthetic images to be aligned with those labels. To address this, we formulate the problem in a joint optimization manner, in which three networks are optimized towards the goal of empirical risk minimization and fairness maximization. On the implementation side, our solution features an innovative Point-Image Diffusion architecture, which leverages 3D point clouds for improved control over mask boundaries through a point-mask-image synthesis pipeline. This method significantly outperforms existing techniques in synthesizing scanning laser ophthalmoscopy (SLO) fundus images. By combining synthetic data with real data during the training phase using a proposed Equal Scale approach, our model achieves superior fairness segmentation performance compared to the state-of-the-art fairness learning models. Code is available at https://github.com/wenyi-li/FairDiff.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3105_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3105_supp.pdf

Link to the Code Repository

https://github.com/wenyi-li/FairDiff

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_FairDiff_MICCAI2024,
        author = { Li, Wenyi and Xu, Haoran and Zhang, Guiyu and Gao, Huan-ang and Gao, Mingju and Wang, Mengyu and Zhao, Hao},
        title = { { FairDiff: Fair Segmentation with Point-Image Diffusion } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    Unbalanced training data among diverse target groups is a common problem in clinical datasets. The authors present a technique to enhance data balance through the integration of synthetic images. A joint optimization approach, in which three networks are optimized to maximize fairness, is introduced using a Point-Image Diffusion architecture.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Tackles an important problem which is highly prevalent in clinical datasets

    2. Uses an innovative approach that involves modern technology

    3. Shows superiority over previous approaches

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Code is not shared, which makes it difficult to reproduce

    2. Is only applied to SLO images. This technology has, however, lost clinical importance in recent years.

    3. No results for other more relevant procedures such as fundus photography or OCT

    4. No evidence that the approach can translate into improved diagnostic performance

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Impossible to reproduce without additional information.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Provide code

    2. Show applicability in datasets from different modalities such as OCT and fundus photography

    3. Apply in a real-world setting and prove that fairness maximization can result in improved performance

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The main reason is that SLO imaging is no longer very relevant; vendors have stopped production.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors propose a controllable image synthesis method to improve fairness learning in fundus image segmentation. The experimental results show the effectiveness of the proposed approach.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper has sufficient novelty. Several efforts employ synthetic images to train segmentation models in both the medical imaging and computer vision fields, but few works focus on fairness learning, which is pivotal in medical segmentation.
    2. The literature review and references on image synthesis are abundant.
    3. The paper is well organized.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The authors train a separate diffusion model for each sensitive attribute, but how do they ensure that the diffusion models capture the differences between sensitive groups? This issue is critical for the downstream task: it is unclear why the segmentation models improve. Is it because of the extra data for the minority groups specifically, or simply because more data is available? The experiments do not include a relevant discussion.
    2. The compared methods in Sec 3.2 are somewhat out of date. Recent progress in conditional image generation, e.g. ControlNet, should be taken into account. In addition, there is no ablation study or discussion of the proposed point-mask generation and how it improves image quality and segmentation fairness.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The authors should provide open access to the generated data for reproducibility.
    2. Please refer to the weaknesses above for more details on the experiments.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper demonstrates a commendable level of innovation and is well organized. However, it lacks verification and discussion of critical issues. Thus my score is weak accept; if the authors address my concerns, I will raise my rating.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors provided a great response that resolved my concerns, so I raise my rating to Accept.



Review #3

  • Please describe the contribution of the paper

    The authors propose a method that enhances data balance for fairness using a Point-Image Diffusion architecture, which leverages 3D point clouds for improved control over mask boundaries. These serve as conditions for ControlNet in a point-mask-image synthesis pipeline.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Good paper structure, with a clear abstract and Figures 1 and 2 conveying the main idea.
    2. The authors provide a thorough comparison of methods in a table.
    3. The use of a 3D point cloud condition is interesting.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The network is restricted to TransUNet. More comparisons may be needed to show generalisability.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The comparison of synthesis quality is good. Is it possible to compare against a 2D mask condition with the same architecture as an ablation?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is a standard MICCAI paper and is good overall.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors justified their choices and answered my questions, and I am satisfied with their responses.




Author Feedback

We thank all reviewers for their insightful feedback. We appreciate positive comments on the importance of fairness in clinical datasets (@R3, @R4), the novelty of our method (@R3, @R5), and the superiority of our experimental results (@R3, @R5).

  • [@R3-(1),@R4-(C1)] Code release.

To help the community better reproduce and build upon FairDiff, here is an anonymous codespace link: anonymous[dot]4open[dot]science/r/FairDiff-FB40. We will release the code and data after the paper is accepted.

  • [@R3-(2)] SLO’s clinical importance.

Although OCT provides detailed visualization of retinal microstructures, the high cost of OCT machines makes it less prevalent in primary care. Meanwhile, SLO offers high-resolution images (5-10 μm) that are more detailed compared to color fundus images (15-20 μm).

Additionally, as far as we know, SLO technology has not lost its relevance. On the contrary, it is continuously advancing, with innovations such as adaptive optics SLO [1], rapid and extensive imaging [2], and ultra-wide-field SLO images [3]. Also, SLO products are still available from vendors like OPTOS and YESight.

[1] Alignment, calibration, and validation of an adaptive optics scanning laser ophthalmoscope for high-resolution human foveal imaging. Applied Optics (2024).

[2] Scanning Laser Ophthalmoscopy Demonstrates Pediatric Optic Disc and Peripapillary Strain During Horizontal Eye Rotation. Current Eye Research (2024).

[3] Algorithm of automatic identification of diabetic retinopathy foci based on ultra-widefield scanning laser ophthalmoscopy. International Journal of Ophthalmology (2024).

  • [@R3-(3)] Applicability in other modalities.

The existing public OCT and fundus datasets lack the demographic information needed for fairness studies. Therefore, we chose to conduct our experiments on the Harvard-FairSeg dataset, which is the first fairness-focused dataset for medical segmentation.

  • [@R3-(4)] Maximizing fairness to enhance clinical performance.

According to studies [4], fairness-improved CNNs on radiology images can address the issues of underdiagnosis and misdiagnosis in underserved groups, ensuring more equitable diagnostic accuracy across diverse patient populations. Similarly, enhancing fairness in our SLO fundus segmentation models can lead to improved clinical performance.

[4] CheXclusion: Fairness gaps in deep chest X-ray classifiers. (PSB 2021)

  • [@R4-(1)] Extra data issue.

We kept the total number of training samples the same across settings to rule out the effect of extra data. Take Table 2 as an example: in the baseline, the TransUNet model is trained on the full real dataset, for a total of 8000 training samples (752 Asian, 1161 Black, and 6087 White). Our training set is composed as follows: Asian 2667 (184 real + 2483 syn), Black 2667 (295 real + 2372 syn), White 2666 (1521 real + 1145 syn).
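As an illustrative check of the composition quoted above, the short Python sketch below recomputes the per-group totals, the overall budget, and the share of real samples. Only the counts come from the rebuttal; the dictionary layout and the `summarize` helper are hypothetical names introduced here and are not taken from the FairDiff code.

```python
# Group counts quoted in the rebuttal. The data layout and function name
# are illustrative assumptions, not part of the FairDiff repository.
BASELINE_REAL = {"Asian": 752, "Black": 1161, "White": 6087}
EQUAL_SCALE = {
    "Asian": {"real": 184, "syn": 2483},
    "Black": {"real": 295, "syn": 2372},
    "White": {"real": 1521, "syn": 1145},
}


def summarize(mix):
    """Return per-group totals, the overall total, and the real-data share."""
    per_group = {group: c["real"] + c["syn"] for group, c in mix.items()}
    total = sum(per_group.values())
    real = sum(c["real"] for c in mix.values())
    return per_group, total, real / total


per_group, total, real_share = summarize(EQUAL_SCALE)
print(per_group)                           # {'Asian': 2667, 'Black': 2667, 'White': 2666}
print(total, sum(BASELINE_REAL.values()))  # 8000 8000
print(real_share)                          # 0.25
```

Under these counts each group contributes roughly one third of the 8000 samples, and one quarter of the mixed training set is real data.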

Moreover, the metrics show that our method may not significantly improve segmentation performance (mIoU), but it does enhance the fairness of segmentation (ES-mIoU).
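The rebuttal does not define ES-mIoU. The sketch below assumes the equity-scaled form associated with the Harvard-FairSeg benchmark: the overall score divided by one plus the summed absolute gaps between the overall score and each group's score. The function name and all numbers are made up for illustration and are not results from the paper.

```python
def equity_scaled(overall, per_group):
    """Equity-scaled score (assumed form): overall / (1 + sum of |overall - group|)."""
    gap = sum(abs(overall - score) for score in per_group.values())
    return overall / (1.0 + gap)


# Hypothetical scores: same overall mIoU, different spread across groups.
balanced = equity_scaled(0.80, {"A": 0.80, "B": 0.79, "C": 0.81})
skewed = equity_scaled(0.80, {"A": 0.70, "B": 0.80, "C": 0.90})
print(balanced > skewed)  # True
```

Under this assumed definition, two models with the same overall mIoU can differ in ES-mIoU, which is consistent with a method that improves fairness without a large change in mIoU.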

  • [@R4-(2),@R5-(C1)] Comparison with ControlNet.

Thank you for your suggestion. We provide a comparison with ControlNet (one-stage label-to-image synthesis) in the following table, against our two-stage pipeline, in which we first sample labels and then synthesize images. The results are as follows:

Method                        FID↓    MMD↓   COV↑
ControlNet (w/o Point-Mask)   67.29   23.6    9.45
Ours (w/ Point-Mask)          60.51   20.1   10.83

Our Point-Mask method generates more diverse images, as reflected by its higher COV (Coverage) score, and also achieves lower FID and MMD.

  • [@R5-(1)] Another Backbone.

In fact, we have conducted experiments with another segmentation backbone, SAMed. Due to page constraints, we placed them in the supplementary material (please refer to Tab. 1-4).




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper presents an innovative approach to addressing the issue of data imbalance in clinical datasets by integrating synthetic images through a Point-Image Diffusion architecture. Initial criticisms largely centered on the reproducibility and relevance of the application to SLO imaging. The authors have addressed these concerns effectively in their rebuttal. While I tend to recommend acceptance of the paper, I agree with R3 that it would be strengthened further if validated on other modalities, such as fundus photography or OCT. Authors should include all details mentioned and promised in the rebuttal in the camera-ready version.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



