Abstract

Survival analysis stands as a pivotal process in cancer treatment research, crucial for accurately predicting patient survival rates. Recent advancements in data collection techniques have paved the way for enhancing survival predictions by integrating information from multiple modalities. However, real-world scenarios often present challenges with incomplete data, particularly when dealing with censored survival labels. Prior works have addressed missing modalities but have overlooked incomplete labels, which can introduce bias and limit model efficacy. To bridge this gap, we introduce a novel framework that simultaneously handles incomplete data across modalities and censored survival labels. Our approach employs advanced foundation models to encode individual modalities and align them into a universal representation space for seamless fusion. By generating pseudo labels and incorporating uncertainty, we significantly enhance predictive accuracy. The proposed method demonstrates outstanding prediction accuracy in two survival analysis tasks on both datasets employed. This innovative approach overcomes limitations associated with disparate modalities and improves the feasibility of comprehensive survival analysis using multiple large foundation models.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0334_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Qu_Multimodal_MICCAI2024,
        author = { Qu, Linhao and Huang, Dan and Zhang, Shaoting and Wang, Xiaosong},
        title = { { Multi-modal Data Binding for Survival Analysis Modeling with Incomplete Data and Annotations } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposed a framework to handle incomplete modalities and incomplete labels for survival analysis. The model encodes each modality independently and then aligns them in the representation space. Pseudo labels and uncertainty are added to improve prediction accuracy. The method performs well on two datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper proposed a multi-modal classification method that considers both incomplete data and labels.
    2. The model aligns different modalities in the feature space to allow better data fusion.
    3. The method performs well on two different datasets and allows the interpretability of each modality’s importance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. It’s not clear how the data is being preprocessed, e.g., how images are being preprocessed or what the keywords for segment reports are.
    2. Section 2.4 is hard to understand. It’s not clear how the losses and lambda_cen are calculated.
    3. There’s no information for the in-house dataset and no experiments on any public dataset.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The font in Fig. 2 is very small and hard to read.
    2. InfoNCE loss does not have either a reference or formula.
    3. Since results in Table 1 and Table 2 are five-fold results, it’ll be great to report standard deviation as well.
    4. It’ll be more informative to report attention in Fig. 3 on a population-based instead of three individuals.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The proposed idea is interesting and contains various modalities.
    2. The paper lacks details for reproducibility, e.g., missing loss explanation and data preprocessing steps.
    3. The paper does not have details on the dataset or run on any publicly available datasets.
    4. The attention score is only shown on three individual patients without content. This does not provide any information.
  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    Some of the concerns have been addressed. However, the in-house datasets are still not well explained, and the attention would be more meaningful if somehow reported by population.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a multimodal deep framework to predict overall survival with incomplete data and censored survival labels. The manuscript adopts several foundation models to encode different modalities. The experiments were done on two different clinical datasets with variable numbers of subjects and modalities. The paper comes with a certain level of novelty and promising results.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The manuscript is generally well-written, well-organized, and easy to follow.
    • The methodology is clear and carries a certain level of novelty.
    • Experiments were conducted on two different clinical multimodal datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The authors neither share a link to their code nor promise to publicly release it.
    • Some typos and language mistakes.
    • No limitations and future work are given.
    • There are some missing details about some parameters in the proposed method.
    • No given statistical analysis.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • What are “w” and “V” represented in Eq. (1)?
    • What is “InfoNCE Loss”? There are neither details nor references for it.
    • Conclusion is too short. It needs to be rewritten to include some limitations and potential future works.
    • Some abbreviations are defined multiple times, e.g., “OS”.
    • The results presented on Table 1 show only slight improvement, compared with ShaSpec.
    • There are some writing typos. For instance, there are commas before the words “accurately” in the Abstract and “separately” in the Method. Also, “a array” in Page 4.
    • References are not cited in ascending order throughout the manuscript.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See my comments in the “constructive comments for the authors” section.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    Although the authors mostly responded to the reviewers’ comments, it is still unconvincing. In addition, some important details are missing and cannot be included in the final version.



Review #3

  • Please describe the contribution of the paper

    This paper presents a novel multi-modal survival analysis framework tailored to address critical challenges in cancer treatment research, including incomplete data and censored survival labels. It also demonstrates a joint framework that revolutionizes survival analysis by handling incomplete data across modalities and censored survival labels together.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • A variety of foundation models were used to encode each modality and bind them into aligned representations for a more generalizable means of multi-modal data fusion.
    • Superior results were illustrated in two survival tasks across two real clinical datasets.
    • The paper is relatively easy to follow, and the ideas are well-presented. The authors have made a certain effort to ensure that the paper is understandable to a wide audience.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Would be better to also include the ablation studies on radiological images and reports.
    • Even though the authors offered a flexible and efficient solution by unifying the aggregation of intra-modal and inter-modal features into a multi-instance aggregation problem based on the attention mechanism, the proposed method may not explicitly consider the uneven and complementary semantic information contained in the various modalities.
    • Lack of description of results with central tendency (e.g., mean) and variation (e.g., error bars), as performance in certain scenarios was quite close.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N.A.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please see the Main Weakness Section for more detailed comments.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    An interesting approach of tackling the incompleteness of input data in multi-modal survival analysis. However, conventional contrastive loss-based approaches may not have the capabilities to close the semantic gap and assess the similarity of fine-grained image details, making it less convincing to the audience.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    Most of the concerns have been addressed, and the rest will be incorporated into future work.




Author Feedback

We appreciate all the comments and recognition of our novelty and superior results. All corrections will be made to the camera-ready.

(Q1) Code release. (R1, R4) The code will be released upon acceptance.

(Q2) InfoNCE loss (R1, R4) InfoNCE Loss [R1] is common for contrastive learning. It compares the similarity of samples and encourages the model to identify positive samples among the negatives. [R1] Representation Learning with Contrastive Predictive Coding.
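For readers unfamiliar with the loss the rebuttal cites, here is a minimal NumPy sketch of the standard InfoNCE formulation from Representation Learning with Contrastive Predictive Coding. This illustrates the general loss only; the batch pairing and temperature value are assumptions, not details from the paper.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.07):
    """Standard InfoNCE: each anchor's positive is the matching row of
    `positives`; all other rows in the batch act as negatives."""
    # L2-normalize so the dot product is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B); positives on the diagonal
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # cross-entropy with diagonal targets
```

In the multi-modal binding setting described by the paper, `anchors` and `positives` would presumably be embeddings of the same patient from two different modalities, pulled together in the shared representation space.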

(Q3) Add STD (R4, R5) Due to space limits, we reported the mean of 5-fold validation. On average, the STD across all folds is below 0.5%. Detailed STDs will be added.

R1 (Q1) Short limitations and future work. Due to space limits, we briefly mentioned them in the conclusion. The main limitation is the lack of validation on larger multi-center medical datasets, which is planned as future work.

(Q2) What are w and V represented in Eq.(1)? They are learnable parameters used to obtain the attention weights for each instance.
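The description of w and V matches the learnable parameters of the widely used attention-based multi-instance pooling of Ilse et al. (2018); whether Eq. (1) uses exactly this form is an assumption. A self-contained sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_pool(H, V, w):
    """Attention-based multi-instance pooling (Ilse et al., 2018).
    H: (K, D) instance features; V: (L, D) and w: (L,) are the learnable
    parameters referred to as V and w."""
    scores = np.tanh(H @ V.T) @ w                   # (K,) unnormalized attention
    scores -= scores.max()                          # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()   # softmax attention weights
    return alpha @ H, alpha                         # (D,) bag embedding, (K,) weights

# Toy example: 5 instances with 8-dim features, 4-dim attention space
K, D, L = 5, 8, 4
H = rng.normal(size=(K, D))
V = rng.normal(size=(L, D))
w = rng.normal(size=L)
z, alpha = attention_pool(H, V, w)
```

The attention weights `alpha` are what Fig. 3 of the paper visualizes per modality; they sum to one over the instances in a bag.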

(Q3) No statistical analysis. We provided p-values for the 2 datasets in the lower left corners of Fig. 3 A and B. They are both less than 1e-7, indicating the efficacy of our method.

(Q4) Slight gain over ShaSpec. Evaluating prognosis using CI and BS metrics is challenging. Compared to ShaSpec, our method shows clear average gains of 0.9% in CI and 0.8% in BS across 4 tasks on 2 datasets.

R4 (Q1) Unclear data preprocessing, e.g., images and keywords to segment reports. In Sec 2.1, we explain the preprocessing and encoding process for all modalities. Due to space limits, we did not detail the keywords used to segment the reports. They were provided by pathologists through structured text report data, including tumor location, size, grading, etc.

(Q2) Sec 2.4 is hard to understand; how are the losses and lambda_cen calculated? Survival analysis predicts the probability of patient survival at a series of time points. We divide survival times into intervals and construct a maximum likelihood loss from these discrete labels. A prediction layer is used to regress death hazards and survival probabilities. For uncensored patients, the model can be optimized with accurate labels. For censored patients, we propose estimating hazards beyond the censoring time by predicting risks, assigning hazards, normalizing with softmax, and forming soft labels. A time-dependent Gaussian function is used to weight the soft labels in training. lambda_cen is a hyperparameter set empirically to balance the losses.
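A hypothetical sketch of the soft-label construction for censored patients described in this answer. The exact normalization, Gaussian placement, and parameter names here are assumptions for illustration, not the paper's equations:

```python
import numpy as np

def censored_soft_label(hazard_logits, censor_interval, sigma=1.0):
    """Build a soft label over discrete time intervals for a patient censored
    in `censor_interval`: intervals up to censoring get zero mass (the patient
    is known to be alive), the predicted risks for later intervals are
    softmax-normalized, then weighted by a time-dependent Gaussian centered
    just after the censoring time and renormalized."""
    T = len(hazard_logits)
    soft = np.zeros(T)
    later = hazard_logits[censor_interval + 1:]
    if later.size == 0:                      # censored in the last interval
        soft[-1] = 1.0
        return soft
    e = np.exp(later - later.max())
    probs = e / e.sum()                      # softmax over post-censoring intervals
    t = np.arange(censor_interval + 1, T)
    gauss = np.exp(-0.5 * ((t - (censor_interval + 1)) / sigma) ** 2)
    soft[censor_interval + 1:] = probs * gauss
    return soft / soft.sum()                 # renormalize to a distribution
```

The resulting distribution can then be plugged into the discrete-time maximum likelihood loss in place of a hard event-interval label, with lambda_cen balancing the censored and uncensored terms.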

(Q3) No information for the in-house dataset and no public dataset is used. In Sec 3.1, we detail the composition and quantity of the 2 datasets. Due to the shortage of public datasets with sufficient cases and complete multi-modality data (images, reports, and clinical notes), we instead experiment on 2 large in-house datasets with 367 and 193 cases.

(Q4) Reporting attention in Fig.3 for a population would be more informative than individuals. In Fig. 3, we selected 3 typical patients with 4, 3, and 2 complete modalities for illustration. Overall, the weights will be averaged out across the population. It also may not be statistically meaningful to compute the mean when many of the cases are missing certain modalities.

R5 (Q1) Ablation studies on radiology images and reports. We consider pathology images and reports as essential modalities (all patients have these 2 modalities). But radiology images and reports are often missing, making it infeasible to conduct ablations with only radiology images and reports.

(Q2) Not explicitly considering the uneven and complementary semantic information in various modalities. Yes, we only learn the weights for the different sources of information (features) via an attention mechanism, which relies on the training data and functions implicitly. Thank you for the valuable comment, and we will try to integrate the semantic information (perhaps as a form of prior knowledge) using more data in future work.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    After the rebuttal, there are still considerable concerns from two reviewers. In addition, the proposed model shows only very small improvements over related work, with no statistical analysis to show whether they are significant. It is also questionable whether it is valid to generate pseudo labels for censored data (what if the pseudo labels are wrong for certain patients?).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Both the paper and the rebuttal read clearly. The work addresses an interesting and clinically valuable problem by considering issues of incomplete modalities and labels in multimodal survival prediction. The authors’ rebuttal effectively resolved most of the reviewers’ concerns, demonstrating the robustness and applicability of their method. Given the innovative nature and clinical significance of the work, along with the satisfactory responses in the rebuttal, I recommend acceptance.




Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper proposes a multimodal deep framework to predict overall survival with incomplete data and censored survival labels. Reviewer concerns include missing parameters, no statistical analysis, and a lack of information on pre-processing and the dataset description. The majority of these concerns have been reasonably addressed in the rebuttal.

    This leaves two points. Performance gains are relatively minor (~0.8-0.9%), and the lack of statistical analysis is a concern. A bigger point is whether it makes methodological sense to “guesstimate” hazards in the fashion explained for censored data (as pointed out by MR1).

    I think between MR3 and the 2 reviewer weak accepts, I would vote for a “weak accept”. The paper clearly has some merit to it, and a MICCAI discussion would probably benefit it overall.



