Abstract

Predicting future disease progression risk from medical images is challenging due to patient heterogeneity and subtle or unknown imaging biomarkers. Moreover, deep learning (DL) methods for survival analysis are susceptible to image domain shifts across scanners. We tackle these issues in the task of predicting late dry Age-related Macular Degeneration (dAMD) onset from retinal OCT scans. We propose a novel DL method for survival prediction that jointly predicts, from the current scan, a risk score inversely related to time-to-conversion and the probability of conversion within a time interval t. It uses a family of parallel hyperplanes generated by parameterizing the bias term as a function of t. In addition, we develop unsupervised losses based on intra-subject image pairs to ensure that risk scores increase over time and that future conversion predictions are consistent with AMD stage prediction using actual scans of future visits. Such losses enable data-efficient fine-tuning of the trained model on new unlabeled datasets acquired with a different scanner. Extensive evaluation on two large datasets acquired with different scanners yielded mean AUROCs of 0.82 for Dataset-1 and 0.83 for Dataset-2 across prediction intervals of 6, 12, and 24 months.
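The parallel-hyperplane idea in the abstract can be illustrated with a minimal sketch. It assumes a feature vector already produced by an encoder, a shared weight vector w, and a bias that grows with the horizon t. The linear form of `bias_shift`, its `slope` value, and all function names are hypothetical choices for illustration; the paper's actual parameterization of the bias may differ.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def risk_score(features, w, b0):
    """Signed offset of the feature vector from the base hyperplane w.x + b0 = 0."""
    return sum(wi * xi for wi, xi in zip(w, features)) + b0


def bias_shift(t_months, slope=0.05):
    """Hypothetical monotone bias shift b(t); assumed linear here for illustration only."""
    return slope * t_months


def conversion_probability(features, w, b0, t_months):
    """Probability of conversion within t_months.

    Every horizon t reuses the same w, so the decision boundaries
    w.x + b0 + b(t) = 0 form a family of parallel hyperplanes.
    """
    return sigmoid(risk_score(features, w, b0) + bias_shift(t_months))


# Toy usage with made-up numbers (not from the paper).
w = [0.8, -0.3]
features = [1.0, 2.0]
for t in (6, 12, 24):
    print(t, round(conversion_probability(features, w, -1.0, t), 3))
```

Because the bias shift is non-decreasing in t, the predicted probability of conversion cannot decrease as the horizon lengthens, which is the property a cumulative conversion probability needs.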

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3334_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3334_supp.pdf

Link to the Code Repository

https://github.com/arunava555/Forecast_parallel_hyperplanes

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Cha_Forecasting_MICCAI2024,
        author = { Chakravarty, Arunava and Emre, Taha and Lachinov, Dmitrii and Rivail, Antoine and Scholl, Hendrik P. N. and Fritsche, Lars and Sivaprasad, Sobha and Rueckert, Daniel and Lotery, Andrew and Schmidt-Erfurth, Ursula and Bogunović, Hrvoje},
        title = { { Forecasting Disease Progression with Parallel Hyperplanes in Longitudinal Retinal OCT } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces a novel method for forecasting the progression of AMD. Notably, the method is capable of utilizing unsupervised data for training, demonstrating robustness even in the presence of data exhibiting domain shift.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper presents a valuable approach for fine-tuning models in new domains using unsupervised data, offering practical utility in real-world scenarios.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The datasets used in this paper are private. 2) The code is not released.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It is suggested to add experimental results of the method on some other open-sourced data.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Reproducibility

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors promise to release the code of this paper.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a novel deep learning method for predicting late dry Age-related Macular Degeneration (dAMD) onset from retinal OCT scans, overcoming challenges like patient heterogeneity and scanner-induced image domain shifts. It jointly predicts risk scores and conversion probabilities, using parallel hyperplanes and time-based bias parameterization.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The proposed prediction method in the field of dAMD onset is novel and addresses the issue of image domain shifts. Additionally, it holds clinical value in predicting macular degeneration. 2) The introduction of a new dataset on macular degeneration would be highly beneficial if made publicly available. 3) The method introduces a novel domain alignment approach in the context of macular degeneration tasks. 4) It demonstrates good performance in both unsupervised and supervised tasks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) In the comparative experiments, fewer and less representative methods were compared, and there is a desire for comparisons with state-of-the-art methods. 2) There is a desire for publicly available datasets to validate the model, although it’s uncertain whether such datasets exist in this field.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The request is for both open-source datasets and code to be made available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Provide open-source datasets and code. 2) Include comparisons with state-of-the-art methods in the experimental methodology.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) It introduces a new dataset of macular degeneration fundus images. 2) It proposes a method for macular degeneration prediction that is applicable to both supervised and unsupervised learning, making significant contributions in the field.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a new method to predict the risk of age-related macular degeneration (AMD) progression from retinal OCT scans. Their method predicts both the risk score and the probability of conversion within a time interval. It also incorporates unsupervised losses to handle data-efficient fine-tuning across different scanners. The proposed method achieves good performance on two large datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes a novel method for disease progression prediction that combines two approaches: estimating the time-to-conversion probability and classifying patients into risk groups based on a risk score. This allows for both individual risk assessment and population stratification for better disease management.

    • Combines two approaches: The method leverages both risk scoring and survival likelihood estimation for disease progression. This offers a more comprehensive analysis compared to a single approach.
    • Novel risk score definition: The distance from the decision hyperplane (H) is used as a risk score, providing an intuitive interpretation for disease conversion risk.
    • Temporal ordering: Ranking iAMD samples based on conversion time allows for a structured learning process and potentially improves generalization.
    • Continuous-time modeling: The proposed method goes beyond a single risk score by introducing a family of hyperplanes for continuous time-based conversion probability estimation.
    • Leverages feature space: The paper utilizes a CNN encoder to extract meaningful features from the scans, potentially improving the accuracy of predictions.

    The method incorporates unsupervised losses to address challenges associated with limited labeled data and scanner variability. By leveraging temporal consistency within patients’ scans and imposing a ranking loss, the model learns a meaningful feature space and ensures risk scores increase over time as the disease progresses. This is particularly useful for real-world scenarios where data might be scarce.
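The ranking constraint on intra-subject pairs mentioned above can be sketched as follows. This is an illustration only: it assumes a hinge-style margin loss, which may not be the exact loss used in the paper, and the function names and the `margin` value are hypothetical.

```python
def pairwise_ranking_loss(r_earlier, r_later, margin=1.0):
    """Hinge penalty that is zero when the later visit's risk score
    exceeds the earlier visit's score by at least `margin`."""
    return max(0.0, margin - (r_later - r_earlier))


def mean_ranking_loss(pairs, margin=1.0):
    """Average the penalty over (earlier, later) risk-score pairs,
    where each pair comes from two visits of the same eye."""
    if not pairs:
        return 0.0
    return sum(pairwise_ranking_loss(a, b, margin) for a, b in pairs) / len(pairs)


# Toy usage: the first pair is correctly ordered, the second is not.
pairs = [(0.2, 1.5), (1.0, 0.5)]
print(mean_ranking_loss(pairs))
```

No conversion label is needed to compute this penalty, only the visit order, which is why such a term can be used on unlabeled scans from a new scanner.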

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • The method relies on a CNN encoder to map OCT scans into a feature embedding space. While the ranking loss encourages the model to learn informative features, the overall performance might still be limited by the quality of the initial feature representation learned by the encoder.
    • The paper mentions a calibration step using the validation set to transform risk scores into probabilities between 0 and 1. This post-processing step might introduce bias if the validation set is not representative of the target population.
    • Validation details lacking: The description of the calibration step using percentiles from the validation set is brief. More information about the validation process and its impact would be helpful.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    While the paper provides a kind of hybrid deep survival method, the authors could have compared their approach with continuous-time methods, which are typically based on neural ODEs, such as [1, 2]. [1] Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression. [2] LMT: Longitudinal Mixing Training, a Framework to Predict Disease Progression from a Single Image.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written. The authors propose a framework that is trained in two steps: the first step integrates self-supervised learning, and the second integrates a hybrid method related to deep survival methods. The proposed components, such as the temporal consistency and the risk score definition, are novel.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Prior to the author’s rebuttal, the paper was already strong enough for recommendation. The authors have successfully addressed most of my concerns. Therefore, I recommend acceptance of this manuscript.




Author Feedback

We thank all the reviewers for their valuable comments and feedback. The reviewers acknowledged several strengths of our paper:

  1. Our novel approach for unsupervised inter-scanner domain alignment with unlabeled target data (R1) is particularly useful in real-world scenarios where labeled data is often scarce (R3, R4).
  2. The novelty of our proposed survival analysis method, which combines survival likelihood estimation and risk scoring by introducing a family of hyperplanes for continuous time-based conversion probability estimation (R4).
  3. The novelty of the task being addressed, namely, forecasting the future risk of dry AMD onset (R1).

While the reviewers recognized the novelty and utility of our work, concerns were raised regarding reproducibility and comparison with additional state-of-the-art methods:

(i) Releasing Code (R1): While external links are not permitted during the rebuttal, we commit to releasing our PyTorch code and will provide a GitHub link in the camera-ready version.

(ii) Releasing Dataset (R1): Our datasets were collected from multiple hospitals through multi-institutional collaboration. Unfortunately, we do not have permission to share the dataset publicly.

(iii) Evaluation on additional public datasets (R3): Our method has been extensively evaluated on 2 private datasets collected from multiple sites using different OCT scanners with large inter-scanner variations. Existing public retinal OCT datasets focus on predicting the current disease stage or segmentation tasks. To the best of our knowledge, there are no public longitudinal retinal OCT datasets with future conversion time-point labels to evaluate our method. We will discuss this as a limitation of the current work in the Conclusion section in the camera-ready version, if accepted.

(iv) Additional SOTA comparisons (R1, R4): We evaluated our method against several popular survival analysis methods, including an ODE-based continuous-time model (SODEN [18]), a semi-parametric model (DeepSurv [4]), and two multi-class classification methods modeling the CDF (Censored Cross-Entropy loss [20]) and hazard function (logistic hazard loss [12]). While more comparisons could strengthen our paper, MICCAI guidelines prohibit additional experimental results at this stage. “LMT: Longitudinal Mixing Training” suggested by R4 is an ODE model similar to SODEN [18] which we compared against but additionally incorporates a temporal mixup augmentation for longitudinal data that can also be employed while training our method. If accepted, we will discuss LMT in the related work section and future research directions in the conclusion. We also aim to provide additional comparison results with LMT in our GitHub repository.

(v) Clarification on risk calibration (“weakness Sec.”, R4): We predict a risk score r for each OCT scan, inversely related to its time-to-conversion. We calibrate these unbounded risk scores to a [0,1] scale for better interpretability. R4 suggested elaborating on this calibration step and raised concerns about potential bias if the validation set isn’t representative of the target population. Details: After training, r is obtained for all scans in the validation set. We learn a bicubic interpolation that maps the k-th percentile of the r values to k/100 in increments of 10 percentiles (the 0th, 10th, 20th, …, 90th, and 100th percentile values are mapped to 0, 0.1, 0.2, …, 0.9, and 1.0). Calibration is part of training, so the test set isn’t used. Instead, validation set quantiles are used to learn the interpolation parameters, similar to any hyper-parameter tuning in Machine Learning. This interpolation is then applied during inference to normalize the risk scores. We will expand the calibration step description in the implementation details and discuss the potential for bias if the validation set distribution isn’t representative of the target population (as noted by R4) in the conclusion section.
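The percentile-based calibration described in (v) can be sketched as follows. This is not the authors' implementation: it swaps the bicubic interpolation mentioned in the rebuttal for piecewise-linear interpolation to keep the example short, and the function names and the toy validation scores are hypothetical.

```python
def percentile(sorted_vals, q):
    """q-th percentile (q in [0, 100]) using linear interpolation
    between neighbouring order statistics."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def fit_calibration(val_scores):
    """Build knots mapping the 0th, 10th, ..., 100th percentile of the
    validation risk scores to 0.0, 0.1, ..., 1.0."""
    s = sorted(val_scores)
    return [(percentile(s, q), q / 100.0) for q in range(0, 101, 10)]


def calibrate(score, knots):
    """Map a raw risk score to [0, 1]; piecewise-linear between knots
    (assumed simplification) and clamped outside the validation range."""
    if score <= knots[0][0]:
        return 0.0
    if score >= knots[-1][0]:
        return 1.0
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= score <= x1:
            if x1 == x0:
                return y1
            return y0 + (y1 - y0) * (score - x0) / (x1 - x0)
    return 1.0


# Toy usage: knots are fitted on validation scores only, then reused at inference.
knots = fit_calibration([float(i) for i in range(101)])
print(calibrate(25.0, knots))
```

The knots depend only on the validation scores, so a validation set that is not representative of the target population shifts every calibrated value, which is the bias concern raised by the reviewer.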




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    After rebuttal, the paper received all Accept.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    After rebuttal, the paper received all Accept.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


