Abstract

Survival prediction for cancer patients is critical for optimal treatment selection and patient management. Current survival prediction methods typically extract survival information from patients’ clinical records or from biological and imaging data. In practice, experienced clinicians can form a preliminary assessment of a patient’s health status based on observable physical appearance, chiefly facial features; however, such assessment is highly subjective. In this work, the efficacy of objectively capturing and using the prognostic information contained in conventional portrait photographs for survival prediction via deep learning is investigated for the first time. A pre-trained StyleGAN2 model is fine-tuned on a custom dataset of our cancer patients’ photos to equip its generator with generative ability suitable for patients’ photos. The StyleGAN2 model is then used to embed the photographs into its highly expressive latent space. Utilizing state-of-the-art survival analysis models and StyleGAN’s latent-space embeddings, this approach predicts overall survival for single cancer types as well as pan-cancer, achieving a C-index of 0.680 in the pan-cancer analysis and showcasing the prognostic value embedded in simple 2D facial images. In addition, thanks to StyleGAN’s interpretable latent space, our survival prediction model can be validated to rely on essential facial features, ruling out biases from extraneous information such as clothing or background. Moreover, our approach yields a novel health attribute derived from StyleGAN’s extracted features, allowing face photographs to be modified toward a healthier or more severely ill appearance, which has significant prognostic value for patient care and societal perception, underscoring the method’s potential clinical value.
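The C-index reported in the abstract is Harrell's concordance index: the fraction of comparable patient pairs whose predicted risks are ordered consistently with their observed survival times, accounting for censoring. A minimal self-contained sketch of the metric (illustrative only, not the authors' implementation; it skips tied event times and assumes at least one comparable pair):

```python
from itertools import combinations

def concordance_index(times, events, risks):
    """Harrell's C-index.

    times  -- observed survival or censoring times
    events -- 1 if death observed, 0 if censored
    risks  -- predicted risk scores (higher = shorter expected survival)
    """
    concordant, permissible = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # order the pair so that i has the shorter observed time
        if times[j] < times[i]:
            i, j = j, i
        if not events[i]:
            continue  # not comparable: the earlier time is censored
        if times[i] == times[j] and events[j]:
            continue  # tied event times are skipped in this simple variant
        permissible += 1
        if risks[i] > risks[j]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risks[i] == risks[j]:
            concordant += 0.5   # tied risk scores count half
    return concordant / permissible
```

A perfectly risk-ordered cohort gives C = 1.0, random scores give roughly 0.5, so the paper's pan-cancer value of 0.680 from photos alone indicates genuine prognostic signal.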

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0658_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0658_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Hag_Deep_MICCAI2024,
        author = { Hagag, Amr and Gomaa, Ahmed and Kornek, Dominik and Maier, Andreas and Fietkau, Rainer and Bert, Christoph and Huang, Yixing and Putz, Florian},
        title = { { Deep Learning for Cancer Prognosis Prediction Using Portrait Photos by StyleGAN Embedding } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    1) This paper is the first to explore using deep learning to predict cancer patient survival based on 2D facial photographs, marking a novel approach in the field of medical prognosis prediction. 2) The research also provides a way to adjust facial photographs to reflect a healthier or more severe illness appearance, which could have significant implications for patient care and societal perceptions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is clearly written, and this is an innovative way of implementing StyleGAN.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) This work reminds me of disentangled representation learning; I suggest that the authors compare their work with existing papers on disentangled representation learning. 2) The paper does not address the potential variability and quality of images that could affect the model’s accuracy. This variability includes differences in lighting, photo quality, and patient positioning, which are not discussed in detail. 3) I am not sure how this method could be integrated into current medical practices, or how it would be received among medical professionals.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) The methodology employed in this study bears resemblance to principles of disentangled representation learning. I recommend that the authors conduct a comprehensive comparison with existing literature in this area. Such a comparison would not only contextualize the novelty of the current approach but also potentially highlight the unique contributions and advantages it offers over previous models. 2) The robustness of the model could be significantly influenced by the variability inherent in the images it processes. Factors such as lighting conditions, the quality of the photographs, and patient positioning could play pivotal roles in model accuracy. I advise the authors to address these variables in their study. Further, a detailed discussion regarding the mitigation of such variabilities and their effects on the model’s performance would be beneficial. This could involve analyzing the model’s sensitivity to these factors and suggesting potential pre-processing steps or model adjustments to enhance consistency and accuracy. 3) Finally, while the technical merits of the study are evident, its practical implications warrant further exploration. I suggest the authors provide a more explicit examination of how this method could be integrated into existing medical workflows. Consideration should be given to the model’s compatibility with current practices, potential barriers to adoption, and the reception of such a technological advancement among medical professionals. A feasibility study or expert opinions could offer valuable insights into the practical application and acceptance of this method in clinical environments.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and innovative. Although I have some concerns about the model’s generalization, this is an interesting paper.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes to perform cancer prognosis prediction from portrait photographs by extracting latent features with StyleGAN. The image features are then combined with clinical features and fed into state-of-the-art survival prediction models, demonstrating high performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper demonstrates the feasibility of using generative models to assess cancer survival probability from patients’ face images. The proposed method is based on a simple idea, grounds its design choices in the needs of the data modalities, and is supported by comprehensive comparisons with other state-of-the-art methods. The paper is also clearly written, ensuring the clarity of the method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The reviewer has one major concern: the StyleGAN is fine-tuned on the patient data, but with diverse backgrounds and non-facial objects (mask/cap etc.). Equipping the model with the ability to represent these irrelevant features does not make much sense, since the objective is to focus on the face, and could potentially result in “cheating”, since patients in more serious condition tend to need more supportive/protective measures. The performance gain could be a result of more accurate prediction on patient cases with accessories on them. Another minor concern regards the generalizability of the method to different ethnicities.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The authors could add some discussion of the non-facial objects present in the data, as mentioned above.

    The subgroup analysis was only conducted for the portrait method and does not reveal enough information regarding the performance difference relative to clinical features. Adding other methods for a side-by-side comparison would be more informative.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work represents the first attempt to understand facial attributes related to cancer survival. The proposed GAN-based method is simple yet effective, providing good explainability. The paper has good writing quality and scientific soundness.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper adopts StyleGAN for cancer patient survival prediction based on facial photographs, where the StyleGAN latent space feature representations are demonstrated to outperform the deep features extracted via common CNNs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. This study is novel as it is the first attempt to use deep learning for cancer patient survival prediction based on facial photographs.
    2. By combining photographs and clinical data, a C-index of 0.787 was achieved on a large portrait-photograph dataset of 13,503 photographs of patients with different cancers, which demonstrates the effectiveness of this study.
    3. A “health” attribute can be extracted to adjust face photographs, which provides interpretability and validation for the model.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The technical novelty of this study is limited. A common StyleGAN was adopted to extract deep features, and then the extracted features were modeled by the widely used DeepSurv and CoxPH.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It seems that the compared methods (ResNet-18/VGG-16 + DeepSurv) were also trained separately in a two-stage manner (first a CNN extracts features, which are then fed into DeepSurv)? If so, the CNNs might not extract prognosis-related information, which could explain why ResNet-18/VGG-16 + DeepSurv achieved such a low C-index. I suggest including some state-of-the-art end-to-end deep survival models in the comparison.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Despite the somewhat limited technical contribution, this paper presents the first attempt to use deep learning for cancer patient survival prediction based on facial photographs and achieves a clinically meaningful C-index (0.787 when combined with clinical data), which is very interesting and places it above the acceptance borderline. The “health” attribute used to adjust face photographs is also very interesting and insightful, and worth sharing with the community.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We appreciate the reviewers’ positive feedback and insightful comments, which have provided valuable directions for enhancing our work. Due to space constraints, we were unable to fully address all comments in detail. However, we will incorporate concise responses to address your concerns in the revised, camera-ready version of our paper. Please find our feedback below.

R1:

  1. Disentangled representation learning: StyleGAN training does not explicitly use disentanglement learning in the way that methods like β-VAE (beta-Variational Autoencoders) or InfoGAN do. However, it does incorporate several techniques that result in a form of disentanglement within the latent space, leading to more intuitive and controllable image manipulation. Such techniques include the two-stage mapping network, adaptive instance normalization (AdaIN), and progressive growing. In contrast, disentangled representation learning methods typically utilize specific terms in their objective/loss functions to encourage disentanglement. The primary goal of disentanglement learning is to achieve representations where each latent variable corresponds to a distinct, interpretable factor of variation. Disentanglement learning methods often use supervised or semi-supervised approaches with labeled data to achieve disentanglement, whereas StyleGAN can achieve a level of disentanglement without explicit labels, relying on its architectural design and training process.
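For concreteness, the explicit-objective approach referenced above can be written down. The standard β-VAE loss is reproduced here only to make the contrast explicit: disentanglement is encouraged by an explicit weighted penalty term, rather than arising from architectural design as in StyleGAN:

```latex
% beta-VAE objective: a reconstruction term minus a weighted KL penalty;
% beta > 1 strengthens the pressure toward disentangled latents z
\mathcal{L}(\theta,\phi;x) =
  \mathbb{E}_{q_\phi(z|x)}\!\left[\log p_\theta(x|z)\right]
  - \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z|x) \,\|\, p(z)\right)
```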

  2. Influence of lighting and patient position: Thank you for highlighting this important concern. In our work, all portrait photos were taken with the face facing the camera, ensuring standardized patient positioning and minimizing its effect. In future work, we plan to investigate the influence of lighting conditions and other image quality factors to further enhance the model’s robustness and accuracy.

R2: Ethnicities: We acknowledge the limitation of our work regarding ethnic diversity. Our study primarily utilizes patient portrait photos from our department (Department of Radiation Oncology, University Hospital Erlangen, Germany). Due to data privacy concerns, obtaining photos of more diverse ethnicities is challenging at this stage. In the future, we aim to pursue multicenter collaborations and apply our method to photos from a broader range of ethnic backgrounds.

R3: Comparison methods: In our work, only the StyleGAN-based methods are trained in a two-stage manner, which allows us to fully exploit the editability and explainability of the StyleGAN latent space. However, we want to emphasize that the other comparison methods, such as ResNet-18/VGG-16 + DeepSurv, were all trained in an end-to-end manner.
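The two-stage setup described in this response can be caricatured as follows; `embed` and `risk` are hypothetical placeholders standing in for StyleGAN2 inversion and the CoxPH/DeepSurv head respectively, not the authors' code:

```python
# Stage 1: a frozen embedder maps a photo to a latent vector. In the paper
# this is StyleGAN2 inversion into its latent space; here it is a toy
# function so the separation between the two stages stays visible.
def embed(photo):
    return [sum(photo) / len(photo), max(photo) - min(photo)]

# Stage 2: a separately fitted survival head scores the frozen latent
# (standing in for CoxPH/DeepSurv trained on the embeddings).
def risk(latent, weights=(0.8, 0.2)):
    return sum(w * z for w, z in zip(weights, latent))

# Only `risk` is fitted on survival labels; `embed` never sees them,
# which is what preserves the editability of the latent space.
score = risk(embed([1.0, 3.0]))
```

The design trade-off is explicit here: an end-to-end model could adapt its features to the survival target, but only the frozen, generatively trained latent space supports the photo editing and interpretability analyses.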




Meta-Review

Meta-review not available, early accepted paper.


