Abstract

Intracerebral hemorrhage (ICH) is a cerebrovascular disease with high mortality and morbidity rates. Early-stage ICH patients often lack clear surgical indications, which makes treatment decisions challenging for neurosurgeons. Currently, early treatment decisions for ICH rely primarily on the clinical experience of neurosurgeons. Although there have been attempts to combine local CT imaging with clinical data for decision-making, these approaches fail to provide deep semantic analysis and do not fully leverage the synergistic effects between different modalities. To address this issue, this paper introduces a novel multi-modality predictive model that combines CT images and clinical data to provide reliable treatment decisions for ICH patients. Specifically, the model employs a combination of a 3D CNN and a Transformer to analyze patients’ brain CT scans, effectively capturing the 3D spatial information of intracranial hematomas and surrounding brain tissue. In addition, it utilizes a contrastive language-image pre-training (CLIP) module to extract features from demographic and important clinical data, and integrates them with the CT imaging data through a cross-attention mechanism. Furthermore, a novel CNN-based multilayer perceptron (MLP) layer is designed to enhance the understanding of the 3D spatial features. Extensive experiments conducted on real clinical datasets demonstrate that the proposed method significantly improves the accuracy of treatment decisions compared to existing state-of-the-art methods. Code is available at https://github.com/Henry-Xiong/3DCT-ICH.
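As an illustration of the cross-attention fusion mentioned in the abstract, the following is a minimal NumPy sketch of single-head scaled dot-product cross-attention. It is not the authors' implementation (see the linked repository for that). The direction of attention (image tokens as queries, clinical embeddings as keys and values), the token counts, the embedding size, the random weights, and the residual connection are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(image_tokens, clinical_tokens, w_q, w_k, w_v):
    """Single-head cross-attention.

    Assumption: image tokens act as queries and clinical tokens
    provide keys and values. The abstract does not state the direction.
    """
    q = image_tokens @ w_q        # (n_img, d)
    k = clinical_tokens @ w_k     # (n_clin, d)
    v = clinical_tokens @ w_v     # (n_clin, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_img, n_clin)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


# Hypothetical sizes: 8 image tokens, 3 clinical tokens, 16-dim embeddings.
rng = np.random.default_rng(0)
d_model = 16
image_tokens = rng.normal(size=(8, d_model))
clinical_tokens = rng.normal(size=(3, d_model))
w_q, w_k, w_v = (
    rng.normal(size=(d_model, d_model)) / np.sqrt(d_model) for _ in range(3)
)

attended, weights = cross_attention(image_tokens, clinical_tokens, w_q, w_k, w_v)

# Residual connection (an assumption): keep the image features and add
# the clinical context that each image token attended to.
fused = image_tokens + attended
print(fused.shape, weights.shape)
```

In this sketch, each row of `weights` shows how strongly one image token draws on each clinical token, which is one way the two modalities can inform each other before classification.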

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2046_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/Henry-Xiong/3DCT-ICH

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Xio_Multimodality_MICCAI2024,
        author = { Xiong, Zicheng and Zhao, Kai and Ji, Like and Shu, Xujun and Long, Dazhi and Chen, Shengbo and Yang, Fuxing},
        title = { { Multi-modality 3D CNN Transformer for Assisting Clinical Decision in Intracerebral Hemorrhage } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The contribution proposes an architecture for treatment decision-making in intracranial haemorrhage that can work on multimodal data (CT + demographic + clinical). A 2D+3D CNN, followed by a transformer in which textual and numerical data are integrated, and a convolutional MLP produce results comparable with other methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The article is sufficiently clear and easy to read. The method seems quite original and well motivated from a clinical point of view. Some of the results are convincing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Methodological and data information is sometimes minimal, and the paper would have benefited from additional detail. The training of the transformer is unclear and, given the variance of the cross-validation, overfitting may have occurred and should have been discussed more thoroughly. The ablation result is very strange, with an accuracy that stops at 56% for 3 out of 4 configurations and jumps to 85% when both variants are activated. A deeper methodological analysis that follows partial results through the proposed architecture and its variants, together with more ablations and justifications, is suggested.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Given the private nature of the data, the results are difficult to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In addition to what was said above, many acronyms are not expanded, which hinders the readability of the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall an interesting paper, despite some quality/robustness aspects that remain difficult to appreciate given a certain lack of detailed information.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a multi-modality model which combines CT images and clinical data for intracerebral hemorrhage diagnosis. It uses a CNN-Transformer architecture to capture the 3D spatial information from the input image and applies a pre-trained CLIP module to extract demographic features from clinical data. Furthermore, a CNN-based MLP is designed to enhance the 3D spatial features.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The paper is well-written and easy to follow. (2) The design of the network is properly based on the requirements and limitations of the problem. (3) The achieved performance, with 0.903 AUC and 0.846 accuracy in cross-validation, is significantly superior to that of other SOTA methods. (4) The authors have made their code available.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) The overall architecture of the proposed method is somewhat similar to the CLIP model, with modifications made to some of its modules, such as the use of the cross-attention module, which already exists in the literature. (2) The proposed CMLP is not properly motivated, and the design choices are not explained. Moreover, CMLP may not be effective, as evidenced by Table 2: adding CMLP resulted in a lower AUC (0.603 vs 0.608).

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Code will be made available. I encourage the authors to make their dataset available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (1) The effectiveness of CLIP is very significant (AUC improved by ~30%); more discussion of this result would be welcome. (2) Providing more information about the dataset would be helpful.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper designs an appropriate method to address a clinically significant problem. However, there are some weaknesses as mentioned above.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper introduces a multi-modality model to perform treatment prediction for intracerebral hemorrhage. CT-based features are extracted using 3D CNN and clinical data-based features are encoded using a pretrained CLIP module. Both features are integrated and exploited using a Transformer block with cross attention. Finally, a custom CNN-based MLP layer outputs the prediction.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Interesting use of 2D convolutions to reduce the axial slice size and alleviate computation.
    • Novel use of the CNN-based multilayer perceptron to overcome the limitations of the ViT MLP, allowing deeper processing of spatial information.
    • Original concatenation of CNN and Transformer to extract local- and global-level image features.
    • From a clinical perspective, the model follows an analysis procedure similar to that of a clinical diagnostic approach, which gives explainability.
    • The included ablation study demonstrates substantial improvement when both the CLIP and CMLP modules are included.
    • Surpasses SOTA.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Single-center data without external validation.
    • Neither 3D CNNs nor Transformers nor CLIP are novel on their own.
    • The original metrics of the GCS-ICHNet and TOP-GPM models, which are higher than those obtained with the authors' multimodal data, are missing. It is fair to base the comparison on the same data, but the original metrics should be included too.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    It could be interesting to evaluate the model with a couple of external databases to see its behavior when exposed to such variability.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    They propose a solution for intracerebral hemorrhage making use of 3D CNNs, Transformers and CLIP. Although these are well-known technologies separately, they innovatively merge them, adding a novel MLP module and justifying how the performance improves when including both CLIP and the novel MLP. In addition, they outscore the current methods in the field. However, further validation with external datasets is missing.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

N/A




Meta-Review

Meta-review not available, early accepted paper.


