Abstract

The early detection and precise diagnosis of liver tumors are tasks of critical clinical value, yet they pose significant challenges due to the high heterogeneity and variability of liver tumors. In this work, a precise LIver tumor DIAgnosis network on multi-phase contrast-enhanced CT, named LIDIA, is proposed for real-world scenarios. To fully utilize all available phases in contrast-enhanced CT, LIDIA first employs the iterative fusion module to aggregate variable numbers of image phases, thereby capturing the features of lesions at different phases for better tumor diagnosis. To effectively mitigate the high heterogeneity problem of liver tumors, LIDIA incorporates asymmetric contrastive learning to enhance the discriminability between different classes. To evaluate our method, we constructed a large-scale dataset comprising 1,921 patients and 8,138 lesions. LIDIA has achieved an average AUC of 93.6% across eight different types of lesions, demonstrating its effectiveness.
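The iterative aggregation of a variable number of phases described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' module: `fuse_pair` is a hypothetical stand-in that replaces the learned fusion layers with a fixed linear mix plus ReLU, and the feature vectors are toy values.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def fuse_pair(fused: Vector, phase: Vector,
              w_fused: float = 0.5, w_phase: float = 0.5) -> Vector:
    """Hypothetical stand-in for a learned fusion layer:
    a fixed linear mix of two feature vectors followed by ReLU."""
    return [max(0.0, w_fused * a + w_phase * b) for a, b in zip(fused, phase)]


def iterative_fusion(first_phase: Vector,
                     other_phases: Sequence[Optional[Vector]]) -> Vector:
    """Start from the first phase and fold in each available phase in turn.
    A phase given as None is treated as missing and skipped."""
    fused = list(first_phase)
    for phase in other_phases:
        if phase is None:
            continue
        fused = fuse_pair(fused, phase)
    return fused


# Toy features for one lesion location (values are made up).
nc = [1.0, 2.0]
arterial = [3.0, 0.0]
venous = [1.0, 4.0]
delay = [2.0, 2.0]

with_delay = iterative_fusion(nc, [arterial, venous, delay])
without_delay = iterative_fusion(nc, [arterial, venous, None])
print(with_delay, without_delay)
```

Because the same fusion step is reused for every phase, the loop runs unchanged whether three or four phases are present, which is the property the abstract emphasizes.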

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1629_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1629_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Hua_LIDIA_MICCAI2024,
        author = { Huang, Wei and Liu, Wei and Zhang, Xiaoming and Yin, Xiaoli and Han, Xu and Li, Chunli and Gao, Yuan and Shi, Yu and Lu, Le and Zhang, Ling and Zhang, Lei and Yan, Ke},
        title = { { LIDIA: Precise Liver Tumor Diagnosis on Multi-Phase Contrast-Enhanced CT via Iterative Fusion and Asymmetric Contrastive Learning } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a framework to perform liver diagnosis and liver segmentation from multiple contrast CTs. The paper utilizes a recently proposed transformer-based architecture to perform multi-class prediction and an encoder-decoder-based method to segment the liver. The authors evaluated on an in-house dataset and achieved comparable or better results than the SOTA methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The authors address a heterogeneous problem in liver data where there are multiple contrast CT scans.
    2. The authors utilize recently proposed transformer-based methods to boost their performance.
    3. The authors have constructed an in-house dataset to evaluate their proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The novelty is limited. The authors utilize the Mask2Former architecture to
    2. Lack of ablation studies. There is no ablation study comparing the iterative fusion method against a standard fusion method.
    3. I feel that if we carefully fine-tune nnUNet or use Attention UNet, we can achieve better results than the proposed method, as the difference between the SOTA methods and the proposed method is very small.
    4. To make the performance analysis clearer, the method should also be compared with several transformer-based architectures, e.g., Swin UNETR.
    5. How does the model perform on data from a different domain, such as LiTS?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    It is difficult to reproduce the architecture of the paper because of lack of data and lack of architectural details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The paper addresses a very important problem in CT, namely heterogeneous data. However, the paper lacks novelty and needs more experimental results comparing it with several SOTA segmentation + classification architectures. The authors might need to do an ablation study in order to show the necessity of the proposed components.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Lack of Novelty, Lack of ablation studies, Lack of comparison with other SOTA methods.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    My main concern was the comparison with SOTA methods, which the authors failed to show in the paper. The authors have promised to add it in the revised version, and they claim that the proposed method is superior to the SOTA methods. Even though the paper has limited novelty, I feel the paper is worthy of MICCAI due to its clinical importance.



Review #2

  • Please describe the contribution of the paper

    Contributions of the paper: The paper proposes LIDIA, a method to segment and classify various types of liver tumors from multiple Contrast-Enhanced CT scans. The method combines registered CT scan phases (non-contrast, arterial, venous, and an optional delayed) through a proposed Iterative Fusion Module and encodes the combined CT features with a nnUnet-like encoder. Features from a Feature Pyramid Network-like decoder are then extracted and utilized in a Mask2Former module from which tumor classification scores are derived. Along with the standard classification and segmentation training objectives, an asymmetric contrastive loss is employed to further improve distinction between different liver tumor types. The paper investigates and compares methods to derive a patient-level classification score, proposing LiverMax (softmax output of the semantic segmentation) as the best method. The proposed LIDIA method outperforms various strong baselines on both an internal and external cohort.
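The patient-level score mentioned above can be illustrated with a small sketch. The review only says that LiverMax uses the softmax output of the semantic segmentation; the rule below (take, for each class, the maximum softmax probability over voxels inside the liver) is an assumption made for illustration, and all names and values are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one voxel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def liver_max_scores(voxel_logits: Sequence[Sequence[float]],
                     liver_mask: Sequence[bool]) -> List[float]:
    """Assumed aggregation rule: per class, the maximum softmax
    probability over voxels that fall inside the liver mask."""
    n_classes = len(voxel_logits[0])
    scores = [0.0] * n_classes
    for logits, in_liver in zip(voxel_logits, liver_mask):
        if not in_liver:
            continue  # voxels outside the liver do not contribute
        probs = softmax(logits)
        scores = [max(s, p) for s, p in zip(scores, probs)]
    return scores


# Three voxels, three classes (toy values); the last voxel lies outside the liver.
logits = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
mask = [True, True, False]
scores = liver_max_scores(logits, mask)
predicted_class = scores.index(max(scores))
print(scores, predicted_class)
```

Restricting the maximum to the liver region keeps confident but irrelevant predictions elsewhere in the scan from driving the patient-level score.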

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Relevance of Problem: The paper addresses the significant and relevant problem of liver tumor classification. Strengths of the proposed model:

    1. Comprehensive Use of CT Data: Utilizes all available CT scan data to enhance diagnostic accuracy.
    2. Employing nnUnet Preprocessing.
    3. nnUnet Encoder: Employs nnUnet’s CNN encoder for robust feature extraction. When training from scratch, this remains effective in comparison with more SOTA encoders trained with SSL methods.
    4. Mask2Former Usage: Incorporates Mask2Former for diagnosing small lesions, enhancing the model’s diagnostic capability.
    5. Training Objective: Includes an Asymmetric Contrastive Learning approach, which is well-suited for distinguishing between different types of liver tumors.

    Patient-Level Tumor Score: Focuses on deriving a patient-level tumor score, directly addressing clinical needs and enhancing practical utility.

    CT Scan Phase Combination: The method for combining various CT scan phases, including handling optional phases, is innovative and can potentially be adapted for other CT-based diagnostic methods.

    3D Adaptation: Adapting the complete pipeline to 3D is non-trivial and promises significant value, especially if the methodology is open-sourced.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    From the diagram provided, it seems that only the NC phase (or only 1 phase) is mandatory and all other phases are treated in the same manner, iteratively concatenated and combined through convolutions. Clarification on this component would be helpful.

    The work only focuses on tumor-positive cases. Is it assumed that the model will only be deployed when it is already known that there is an abnormality? Or is detection also of interest? Clarification on its real-world utility would be useful.

    In the Patient-wise accuracy LIDIA convincingly outperforms the other methods, while it underperforms on the pixel-wise accuracy. Since the results are rather close, it would be good to determine the statistical significance.

    Although not mandatory, it would be good for the authors to mention the limitations of their work.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    With the code for mask2former and nnunet being publicly available, combined with a clear description of the proposed implementation, it should be possible to reproduce the entire model. It would be great if the complete source code could be made publicly available since many other abdominal CT applications can benefit from this architecture. I do not think data will be shared.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    A more detailed explanation on how each CT scan phase contributes to the model performance, particularly focusing on the necessity and impact of optional phases beyond the mandatory NC phase would be useful.

    Discuss LIDIA’s capabilities for detecting tumors in non-preselected cases. Can it be used as a detection method or is that outside the scope of this work?

    Statistical significance of results: Perform statistical tests to confirm the significance of the differences in patient-wise and pixel-wise accuracy, offering a clearer assessment of the method’s performance.

    Extension to other domains: Discuss how LIDIA might be adapted for other types of tumors.

    Discussion on Limitations and Future Work: Please provide some analysis of the limitations of your current study and suggest future research directions.

    Is all the data used and annotated, including the external cohort, collected during this study?
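The significance check requested in the comments above could take the form of a paired t-test. The sketch below computes only the test statistic with the standard library; it assumes that two methods are scored on the same cases so that scores can be paired, which is an assumption, and the data are made up. The resulting statistic would be compared against Student's t distribution with n - 1 degrees of freedom.

```python
import math
from typing import Sequence


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic for the mean of paired differences a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Unbiased sample variance of the differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt(var / n)


# Toy per-case scores for two hypothetical methods.
method_a = [2.0, 4.0, 6.0]
method_b = [1.0, 2.0, 3.0]
t_stat = paired_t_statistic(method_a, method_b)
print(round(t_stat, 4))
```

A positive statistic favors the first method and a negative one favors the second, so the same routine covers both the patient-wise and the pixel-wise comparison.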

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper presents LIDIA, an effective approach to segmenting and classifying liver tumors from Contrast-Enhanced CT scans. It demonstrates clear strengths in utilizing multiple phase CT data, employing robust architectural elements such as components from the nnUnet and Mask2Former, and proposing an effective asymmetric contrastive learning component in the training phase. Additionally, the focus on deriving a patient-level tumor score directly addresses clinical needs, significantly enhancing the method’s relevance and applicability in real-world settings. The method could potentially also be adapted to diagnostic areas other than liver tumors, making it more broadly interesting. It could be highly beneficial if made publicly accessible to the wider research community.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The paper addresses liver tumor classification in Contrast-Enhanced CT scans using the newly proposed LIDIA method. LIDIA builds on top of the mask2former, proposes the Iterative Fusion Module (to deal with incomplete phase issues) and effectively applies the approach to address a relevant problem. Sufficient evidence is provided to support the effectiveness of the proposed approach over prior work. While not every aspect is novel (as suggested by Reviewer 1), utilizing effective components (mask2former) and introducing the necessary novel elements (IFM and ACL) to advance the application field warrants publication and acceptance of the work. Additionally, I do agree with Reviewer 1 that further implementation details and, ideally, released code, would strengthen the work.



Review #3

  • Please describe the contribution of the paper

    The paper presents a novel network named LIDIA, designed to improve the diagnosis of liver tumors using multi-phase contrast-enhanced CT scans. LIDIA addresses two primary challenges: the often omitted delayed phase in scans and the differentiation of rare tumor types from more common ones. It achieves this through an iterative fusion module that integrates all available CT phases and an asymmetric contrastive learning approach to handle the diversity of tumor types.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1- Diagnostic Network (LIDIA): The introduction of LIDIA, with its iterative fusion of all available CT phases, enhances the analysis of lesion characteristics across different time points. This approach is important because it addresses the common issue of incomplete phase scanning by effectively utilizing whatever phase data is available, thereby not limiting the diagnostic potential when the delayed phase is missing.

    2- Asymmetric Contrastive Learning: The implementation of asymmetric contrastive learning for handling the heterogeneity of tumor types, especially rare ones. This method improves the network’s ability to differentiate between common and rare tumor types by enhancing intra-class compactness and inter-class discriminability.

    3- Evaluation and Validation: The paper conducts a benchmark evaluation of the LIDIA network in comparison with other state-of-the-art networks including nn-UNet, Mask2Former, and PLAN.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1- Generalization to Rare Lesion Types: The paper discusses using asymmetric contrastive learning to distinguish between common and rare tumor types. However, it may not fully address the challenge of significant diversity within the ‘others’ class. This could lead to misclassification or inadequate clustering of rare tumor types, affecting the model’s diagnostic accuracy in real-world scenarios.

    2- Scalability and Computational Efficiency: The paper does not explore the computational demands of the LIDIA network, especially given the complex operations involved in multi-phase iterative fusion and asymmetric contrastive learning.

    3- Dataset and Annotation Limitations: The model’s performance is validated on a dataset where two-thirds of the cases include the delayed phase. This might not accurately represent real-world situations where the delayed phase is often missing. Additionally, the quality and consistency of lesion annotations across such a large dataset can vary, potentially impacting the training and evaluation of the model.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1- Please provide access to the code and dataset to ensure the reproducibility of the work.

    2- Please consider a computational comparison between your proposed network and other state-of-the-art networks, such as the number of parameters and training time, etc.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In recommending this paper, I focused on a couple of key features: the use of the Diagnostic Network (LIDIA) and Asymmetric Contrastive Learning. LIDIA’s approach to integrating all available CT phases is important because it makes the most of the data at hand, ensuring effective diagnosis even when scans are incomplete. This method greatly enhances the usefulness of the network in clinical settings. Also, the introduction of Asymmetric Contrastive Learning helps tackle the challenge of different tumor types, especially rare ones. It improves the network’s ability to tell these tumors apart, boosting diagnostic accuracy. The thorough evaluation and comparison with other advanced networks like nn-UNet and Mask2Former also support the effectiveness of the methods used in this paper, making it a positive contribution to the field.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The Iterative Fusion Module (IFM) and Asymmetric Contrastive Learning (ACL) effectively address the incomplete phase issue and improve discriminability for rare tumour types. The internal and external validations confirm LIDIA’s performance over strong baselines. The authors’ explanations regarding computational resources, the limitations of the LiTS dataset, and the handling of rare tumour types are clear and satisfactory. While LIDIA requires more computational resources, the accuracy gains justify this tradeoff. However, the improvement in Dice accuracy is not substantial compared to nnUNet and Mask2Former, which is my only concern.




Author Feedback

We appreciate all the reviewers (R1, R3, R4) for their thoughtful comments and constructive suggestions. We also appreciate the recognition of our work as a “novel network” and for addressing a very significant problem (R2, R3).

Novelty and Contribution @R1: This work proposes a precise LIver tumor DIAgnosis (LIDIA) network on multi-phase contrast-enhanced CT for real-world scenarios. We design a novel Iterative Fusion Module (IFM) to address the incomplete phase issue and employ Asymmetric Contrastive Learning (ACL) to increase discriminability for rare tumor types. Comprehensive internal and external validations verified LIDIA’s superiority compared to strong baselines.

Experimental Comparison @R1: We have included nnUNet in the comparisons (Table 1). It is generally considered that nnUNet’s pipeline is well established, so further adjustment yields marginal performance gains. We also compared with strong baselines, Mask2Former and PLAN, against which LIDIA shows superior performance. Ablation studies and comparisons with other fusion methods are detailed in Tables 2 and 3. Networks such as nnUNet, Attention UNet, and Swin UNETR were primarily designed to enhance segmentation performance, yet their classification capabilities (tumor diagnosis) may be limited, as evidenced by the results of nnUNet in Table 1. Due to rebuttal policies, we cannot provide new results here, but we will compare with more SOTA methods in our extension paper.

LiTS dataset @R1: LIDIA focuses on a clinically relevant task: differential diagnosis of liver tumor types using multi-phase CT. However, the LiTS dataset has only one lesion type annotated and contains only one CT phase. It is designed for liver and tumor segmentation without diagnosis, and thus may not be suitable for LIDIA, compared with our two datasets with 2749 samples, 8 tumor types, and 4 CT phases.

IFM Design @R3: Yes, IFM starts with the NC phase and iteratively combines other phases. It is inspired by the clinical prior of the multi-phase imaging progress, and it offers the potential to handle cases with arbitrary missing phases. We agree that it is beneficial to study how each phase contributes to the model and plan to add this analysis in the extension paper.

Tumor detection @R3: We evaluated on tumor-positive cases in this paper. However, LIDIA can also handle tumor-free cases without needing to modify its structure, if such samples are given in the training set. We will investigate this situation in our future work.

Statistical Significance @R3: t-test results show LIDIA is significantly better than all baselines in patient-wise diagnosis, but nnUNet and Mask2Former are significantly better than LIDIA for pixel-wise Dice (only 0.6% better in avg Dice). LIDIA’s strength in classification is aligned with our key objective.

Limitations and Future Work @R3: LIDIA still has room for improvement in rare and hard tumor types such as ICC and “others”. Our future work has been noted in the answers above. It is also interesting to apply LIDIA to other organs.

Discussion about rare types @R4: Due to the limited number of rare types, ACL may not fully address the challenge of significant diversity. However, it is foreseeable that ACL helps distinguish between common and rare types, increasing the likelihood of rare cases being classified into “others,” which is beneficial.

Delay Phase Ratio @R4: Liver tumor diagnosis guidelines recommend scanning the delay phase, but whether it is acquired also depends on each center’s routines. The ratio of delay phases is 2/3 in our internal dataset while less than 10% (53/828) in the external dataset. Our method achieved the best AUC in both datasets, confirming its effectiveness despite different ratios of the delay phase.

Efficiency @R4: LIDIA requires more computational resources (~12G, ~110s/epoch vs. nnUNet’s ~8G, ~70s/epoch), a tradeoff we deem justified given its gains in accuracy.
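
The AUC comparisons cited in this rebuttal are averages over lesion types. One common way to compute such an average is sketched below, assuming a one-vs-rest AUC per class followed by an unweighted mean; the evaluation protocol actually used in the paper is not stated on this page, and the function names and toy scores are illustrative.

```python
from typing import List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(score_rows: Sequence[Sequence[float]],
              true_classes: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of one-vs-rest AUCs over all classes."""
    aucs: List[float] = []
    for c in range(n_classes):
        class_scores = [row[c] for row in score_rows]
        class_labels = [t == c for t in true_classes]
        aucs.append(binary_auc(class_scores, class_labels))
    return sum(aucs) / n_classes


# Four toy patients, two classes; each row holds per-class scores.
rows = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.7, 0.3]]
truth = [0, 0, 1, 1]
average_auc = macro_auc(rows, truth, n_classes=2)
print(average_auc)
```

An unweighted mean gives rare lesion types the same influence as common ones, which matters when comparing methods on imbalanced data.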




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The author has successfully addressed the concerns raised by the first review.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The author has successfully addressed the concerns raised by the first review.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


