Abstract

Medical imaging data and electronic health records are an integral part of clinical routine and research for prognostication of patient survival and thus directly inform patient management. However, standard regression models used to derive patient prognoses are ill-equipped to handle such non-tabular data directly. Several neural network architectures based on classification or the Cox model have been proposed. Here, we present deep conditional transformation models (DCTMs) for survival applications with medical imaging data. DCTMs include the Cox model as a special case, but parameterize the log cumulative baseline hazards via Bernstein polynomials and allow the specification of non-linear and non-proportional hazards for both tabular and non-tabular data and extend to all types of uninformative censoring. DCTMs yield moderate to large performance gains over state-of-the-art deep learning approaches to survival analysis on a multitude of publicly available datasets featuring tabular or imaging data from radiology and pathology.
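The construction named in the abstract can be illustrated with a minimal sketch. This is a reader's illustration, not the authors' implementation. It assumes a Cox-type model of the form S(t | x) = exp(-exp(h(t) + shift)), where h is a Bernstein polynomial in rescaled time and `shift` stands in for a network output. The function names, the softplus reparameterization used to keep h monotone, and the `t_max` rescaling are all assumptions made for this example.

```python
import math


def bernstein_basis(u, order):
    """Bernstein basis polynomials b_{k,order}(u) for u in [0, 1]."""
    return [math.comb(order, k) * u**k * (1.0 - u) ** (order - k)
            for k in range(order + 1)]


def monotone_coefficients(raw):
    """Map unconstrained values to strictly increasing coefficients.

    The first value is free; each later coefficient adds a positive
    softplus increment (an assumed reparameterization).
    """
    theta = [raw[0]]
    for g in raw[1:]:
        theta.append(theta[-1] + math.log1p(math.exp(g)))
    return theta


def survival(t, raw, shift=0.0, t_max=1.0):
    """S(t | x) = exp(-exp(h(t) + shift)) with a Bernstein-polynomial h.

    Increasing coefficients make h increasing in t, so S is a valid,
    decreasing survival function.
    """
    u = min(max(t / t_max, 0.0), 1.0)  # rescale time to [0, 1]
    theta = monotone_coefficients(raw)
    basis = bernstein_basis(u, len(theta) - 1)
    h = sum(c * b for c, b in zip(theta, basis))
    return math.exp(-math.exp(h + shift))


raw = [-3.0, 0.5, 0.5, 0.5, 0.5]  # order-4 polynomial, hypothetical values
curve = [survival(t, raw) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
```

A constant `shift` per patient corresponds to proportional hazards. Letting the coefficients themselves depend on the input is one way such a model could express non-proportional hazards.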

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1381_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/sinai-computational-pathology/DCTM

Link to the Dataset(s)

TCGA: https://www.cancer.gov/ccg/research/genome-sequencing/tcga

Radcure: https://cancerimagingarchive.net/collection/radcure

Tabular data (METABRIC, SUPPORT, CBSG): https://github.com/jaredleekatzman/DeepSurv

BibTex

@InProceedings{CamGab_Aflexible_MICCAI2025,
        author = { Campanella, Gabriele and Häggström, Ida and Kook, Lucas and Hothorn, Torsten and Fuchs, Thomas J.},
        title = { { A flexible deep learning framework for survival analysis with medical data } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15974},
        month = {September},
        pages = {2 -- 12}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a novel Deep Conditional Transformation Model (DCTM) for survival analysis, capable of handling both tabular and non-tabular data, as well as all types of censoring. The proposed method is evaluated on publicly available datasets comprising tabular data and imaging data from radiology and pathology.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper proposes a flexible deep learning-based survival model that relaxes the proportional hazards assumption of the Cox model.
    2. The survival head is modular and can be combined with various feature extractors, making it adaptable to different data modalities such as tabular, radiology, and pathology inputs.
    3. The model supports all types of censoring, enhancing its applicability to real-world clinical datasets with complex censoring mechanisms.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The literature review primarily cites older works, and would benefit from including more recent developments in deep survival analysis.
    2. The advantages of the proposed method over existing approaches—such as Cox-based deep learning models and classification-based survival methods—are not clearly articulated, which weakens the motivation for the work.
    3. The methodology section is a bit difficult to follow in places and could benefit from clearer explanations and structure.
    4. There are several typos and unclear expressions in the methods section. For example, 1) on page 3: “In order for F_T to be a valid CDF” — F_T is undefined. The phrase “x ∈ [11]” is confusing and appears to be a typo. 2) On page 4, in Equation 3, the transpose symbol (⊤) for ϕ(x) should be written as a superscript for clarity.
    5. Only the concordance index (C-index) is reported. Calibration metrics such as the Brier score should also be included to provide a more complete evaluation of model performance.
    6. Cross-validation results report only the mean C-index. Standard deviations should be included, and significance tests should be reported where possible to assess statistical robustness.
    7. In Table 3, only DeepHit is included as a baseline. It is unclear why DeepSurv is excluded, especially since it is mentioned in the caption. Including DeepSurv in the comparison would strengthen the evaluation.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper addresses an important problem in survival analysis and proposes a flexible deep learning framework that can handle various data modalities and types of censoring, there are several concerns that limit its current impact. First, the motivation for the proposed method is not clearly articulated, particularly in terms of how it advances beyond existing Cox-inspired or classification-based survival models. Second, the methodological section lacks clarity and contains several typographical and notational issues that hinder readability. Third, the evaluation is somewhat limited—only the C-index is reported, without calibration metrics such as the Brier score, and cross-validation results lack variance or statistical significance testing. Additionally, comparisons with relevant baselines (e.g., DeepSurv) are incomplete.

    These weaknesses suggest that the paper requires revision and clarification before it can be considered for acceptance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have adequately addressed all of my concerns raised in the initial review.



Review #2

  • Please describe the contribution of the paper

    The main contribution of this paper is the development of Deep Conditional Transformation Models (DCTMs) for survival analysis with medical data. The proposed framework acts as a flexible survival modeling “head” that can be integrated with arbitrary feature extractors. The authors demonstrate that DCTMs consistently outperform state-of-the-art survival analysis methods across various medical datasets, including radiology and pathology images.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The paper introduces a flexible and generalizable survival modeling framework that extends conditional transformation models with deep neural networks. (2) DCTMs serve as a survival analysis “head” that can be attached to any neural network feature extractor, allowing researchers to leverage existing powerful models (e.g., CNNs for histopathology, transformers for clinical text) without redesigning the survival component. (3) The framework shows superior predictive accuracy in terms of time-dependent C-index across multiple publicly available datasets in radiology and pathology. This strong evaluation demonstrates its practical effectiveness and robustness.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    (1) Outdated Baselines: The choice of baselines is insufficient. The main comparisons are against Cox Proportional Hazards (1972) and DeepSurv (2018), which are significantly outdated given recent developments in survival analysis and deep learning-based models. The authors should compare with more recent methods. (3) Lack of Clarity in Tables: Table 1 contains multiple notations (e.g., DCTM^S, DCTM^SS, DCTM^G) that are not clearly defined or explained in the main text. The subscripts 1 and 10 are also unexplained or only vaguely mentioned. These need to be explicitly annotated either in the table caption or within the text, ideally both.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a novel methodological contribution to survival analysis. The idea of deep conditional transformation is promising and has potential. However, the paper is weakened by insufficient comparisons to recent state-of-the-art and unclear notation in important tables. If the authors address these concerns in the rebuttal, particularly by adding stronger baselines and clarifying the variants, the work would merit acceptance.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have adequately addressed the major concerns raised during the review process. While the initial choice of baselines was a valid weakness, the authors provided a reasonable justification for their selection, emphasizing the focus on modality-specific feature extraction using modern foundation models while retaining well-established survival heads. This modular approach is both practical and relevant. Additionally, the authors committed to improving clarity, notation, and table readability, which are essential for effective communication of their method and results.



Review #3

  • Please describe the contribution of the paper

    The paper proposes a flexible deep learning framework for survival analysis, which allows the specification of non-proportional hazards and non-linear effects for both tabular and imaging data. The framework is based on deep conditional transformation models and enables various parameterizations of the transformation function via Bernstein Polynomials.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper proposes a welcome alternative to traditional Cox Proportional Hazards (CPH) models.
    2. The state of the art is clearly and concisely summarised.
    3. The paper is well-structured and easy to follow.
    4. The method is evaluated on diverse datasets and across multiple data modalities.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Explaining the motivation for addressing non-proportional hazards and non-linear effects would be beneficial for readers from other backgrounds.
    2. The rationale for exploring different parameterizations of the transformation function is not well-articulated. When might these be necessary in practice? Real-world examples would be helpful.
    3. The results section does not include confidence intervals or standard deviations, which limits the assessment of statistical robustness.
    4. Different baselines are used across experiments without explanation. The reasoning behind these non-uniform choices should be clarified for better comprehension.
    5. The authors could discuss why certain datasets favour specific parameterizations or polynomial orders — this could offer deeper insight into model behaviour.
    6. Plotting survival curves and comparing them to Kaplan-Meier estimates would make the evaluation more compelling.
    7. For a fair comparison with Cox and DeepSurv models, the time-independent concordance index (c-index) should be reported in addition to time-dependent metrics.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. Clearly define the outcome of interest in the experiments section. Was this overall survival?
    2. Review the mathematical formulations for typographic errors. For example, in Section 2 on Conditional Transformation Models — it is unclear what set x belongs to.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper introduces a deep learning-based alternative to the traditional Cox Proportional Hazards model. The proposed method appears novel and is validated on a variety of datasets spanning multiple modalities. However, some minor adjustments are required and could add value to the paper.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors make a strong case for their paper in the rebuttal. My expectations from initial review were minor adjustments and justifications in the text. These have been promised by the authors.




Author Feedback

We sincerely thank the reviewers for their insightful feedback, which has helped us improve the manuscript and clarify the contributions of our work.

Motivation and advantages of DCTM [R2]: We clarify in the text that the key advantage of DCTMs lies in their flexibility: they generalize Cox-based models, allow modeling of non-proportional hazards, support parametric and non-parametric survival functions, and can incorporate non-tabular data (e.g., images) using any feature extractor. DCTMs serve as a unified framework for survival prediction that performs competitively across diverse modalities and datasets.

Novelty and significance [R2]: We respectfully clarify that, while DCTMs have been introduced in prior work for other outcome types, this work is, to our knowledge, the first to extend DCTMs to survival analysis with medical imaging and to evaluate them across multiple clinical modalities. This contributes a new direction in combining statistical modeling foundations with modern deep learning for survival outcomes.

Comparisons to older baselines [R1, R2]: We appreciate the concern regarding baselines. Our goal was to compare across a wide range of data modalities (tabular, radiology, histopathology) using representative, state-of-the-art survival models. Survival analysis is typically approached through Cox models, classification methods, or Bayesian techniques. We focus on frequentist methods, using established models like DeepHit and DeepSurv—both widely cited—and enhance them with modern foundation models for feature extraction. This keeps our comparators aligned with the current state of the art, despite relying on older modeling paradigms. Given space constraints, we prioritized clarity and consistency across datasets. We now clarify this choice in the manuscript and emphasize that our DCTM framework is modular and compatible with other feature extractors and survival heads.

No DeepSurv for pathology [R2]:
In most computational pathology works, frozen features from pretrained encoders (foundation models) are the input to a slide aggregator such as MIL attention. This means each sample input has a different size (different number of tiles) with the effect that optimization is done one sample at a time (batch size of 1). DeepSurv requires a larger batch size where samples can be ranked. Because of this, DeepSurv has not been extensively used and we see DeepHit being the standard survival loss in computational pathology. This has been addressed in the revised manuscript.
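The batch-size argument above can be made concrete with a small sketch of the Cox negative log partial likelihood, the kind of loss DeepSurv is built on. This is an illustration under simplifying assumptions (no tie correction, plain Python lists, a hypothetical function name), not code from the paper or from DeepSurv. It shows that with a single sample the risk set contains only that sample, so the loss is zero for any predicted score and carries no training signal.

```python
import math


def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative log partial likelihood, without tie correction.

    risk:  predicted log-risk scores, one per sample
    time:  observed follow-up times
    event: 1 if the event was observed, 0 if censored
    """
    loss = 0.0
    for i in range(len(risk)):
        if not event[i]:
            continue  # censored samples only appear inside risk sets
        # Risk set: every sample still under observation at time[i].
        at_risk = [risk[j] for j in range(len(risk)) if time[j] >= time[i]]
        loss -= risk[i] - math.log(sum(math.exp(r) for r in at_risk))
    return loss


# Batch of one: the score cancels against its own risk set.
single = cox_neg_log_partial_likelihood([2.7], [5.0], [1])

# Batch of two: the loss depends on how the two samples are ranked.
well_ranked = cox_neg_log_partial_likelihood([1.0, 0.0], [1.0, 2.0], [1, 1])
badly_ranked = cox_neg_log_partial_likelihood([0.0, 1.0], [1.0, 2.0], [1, 1])
```

Here `single` is zero up to floating-point error whatever score is supplied, while `well_ranked` is smaller than `badly_ranked` because the earlier event received the higher risk score.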

Evaluation metrics [R2, R3]: We appreciate the suggestion to include additional evaluation metrics. While we originally focused on the time-dependent C-index due to its widespread use, we agree that the Brier score provides complementary insights into calibration and have now included it in the revised version. We observe consistent results across datasets, supporting our original claims. We have also made sure to add the time-independent C-index for our experiments.
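For readers less familiar with the two metrics discussed here, the following is a minimal sketch of what each one measures. It is not the evaluation code used in the paper. The function names are hypothetical, the C-index shown is the time-independent (Harrell) version, and the Brier score deliberately assumes no censoring; practical implementations usually reweight for censoring.

```python
def harrell_c_index(risk, time, event):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when i has an observed event before
    time[j]. Ties in risk count as half. Raises ZeroDivisionError if
    no pair is comparable.
    """
    concordant, comparable = 0.0, 0
    n = len(risk)
    for i in range(n):
        for j in range(n):
            if event[i] and time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


def brier_score_uncensored(surv_prob, time, horizon):
    """Mean squared error between predicted survival probability at
    `horizon` and the observed status, assuming NO censoring."""
    errors = [(float(t > horizon) - s) ** 2 for s, t in zip(surv_prob, time)]
    return sum(errors) / len(errors)


times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
c_perfect = harrell_c_index([3.0, 2.0, 1.0], times, events)
c_reversed = harrell_c_index([1.0, 2.0, 3.0], times, events)
brier_perfect = brier_score_uncensored([0.0, 1.0, 1.0], times, horizon=1.5)
```

The C-index depends only on the ranking of the predictions, whereas the Brier score also penalizes miscalibrated probabilities, which is why the two are complementary.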

Uncertainty reporting [R3]: We thank the reviewer for pointing out the lack of variability estimates. We now report means and standard deviations over cross-validation folds for all reported metrics for consistency and to provide a clearer picture of model stability and robustness.

Optimal parametrization analysis [R3]:
We have addressed this excellent point by emphasizing that the model parameterization can be treated and tuned as a hyperparameter and relates to over-/underfitting.

Kaplan Meier [R3]: We agree that including a Kaplan–Meier plot would be beneficial, and we will add one in the final version if space permits.

Clarity, notation, and formatting [R1, R2, R3]: We have thoroughly revised the manuscript to improve clarity and readability. This includes: correcting typos, unifying notation, improving table readability, and expanding explanations of the model structure and loss functions. We thank the reviewers for identifying these issues.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The method is novel and the authors successfully addressed most concerns of the reviewers.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


