Abstract

Pulmonary embolism (PE) is a life-threatening condition where rapid and accurate diagnosis is imperative yet difficult due to predominantly atypical symptomatology. Computed tomography pulmonary angiography (CTPA) is acknowledged as the gold standard imaging tool in clinics, yet it can be contraindicated for emergency department (ED) patients and represents an onerous procedure, thus necessitating PE identification through non-contrast CT (NCT) scans. In this work, we explore the feasibility of applying a deep-learning approach to NCT scans for PE identification. We propose a novel Cross-Phase Mutual learNing framework (CPMN) that fosters knowledge transfer from CTPA to NCT, while concurrently conducting embolism segmentation and abnormality classification in a multi-task manner. The proposed CPMN leverages the Inter-Feature Alignment (IFA) strategy that enhances spatial contiguity and mutual learning between the dual-pathway network, while the Intra-Feature Discrepancy (IFD) strategy can facilitate precise segmentation of PE against complex backgrounds for single-pathway networks. For a comprehensive assessment of the proposed approach, a large-scale dual-phase dataset containing 334 PE patients and 1,105 normal subjects has been established. Experimental results demonstrate that CPMN achieves the leading identification performance, which is 95.4% and 99.6% in patient-level sensitivity and specificity on NCT scans, indicating the potential of our approach as an economical, accessible, and precise tool for PE identification in clinical practice.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2986_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Bai_CrossPhase_MICCAI2024,
        author = { Bai, Bizhe and Zhou, Yan-Jie and Hu, Yujian and Mok, Tony C. W. and Xiang, Yilang and Lu, Le and Zhang, Hongkun and Xu, Minfeng},
        title = { { Cross-Phase Mutual Learning Framework for Pulmonary Embolism Identification on Non-Contrast CT Scans } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a new learning-based approach for detecting pulmonary embolism (PE) using non-contrast enhanced CT (NCT) images. The method employs a mutual learning strategy, where parallel network training with paired NCT and CT pulmonary angiography (CTPA) is conducted. The extracted features and predicted labels from the CTPA branch are utilized to guide the training of the NCT branch. Additionally, a feature alignment unit between the two branches and modified dense center loss are applied to enhance network training. Validation results using an in-house dataset and two public benchmark datasets demonstrate that the proposed MLS framework improves PE detection performance compared to baselines and earlier methods when only NCT data is available.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The core strengths of this new approach lie in the various techniques employed to guide training with NCT data alongside the parallel CTPA network at different stages. Firstly, a feature alignment graph is utilized in the encoders to encourage synchronization of the extracted features. Secondly, the minimization of the Kullback-Leibler divergence is employed in the classification decoders to align the prediction results. Finally, at the segmentation tail, the modified dense center loss is utilized to further enhance the consistency of the label cluster. The results of the ablation study clearly demonstrate the performance gain from these techniques, and the full framework allows for comparable performance with NCT data when compared to the baselines obtained with CTPA data.
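For readers unfamiliar with the prediction-alignment term mentioned above, the following is a minimal sketch of a KL-based mutual learning loss between two classification heads. It is an illustration only, not the authors' implementation: the function names (`softmax`, `kl_divergence`, `mutual_kl_loss`), the symmetric two-direction form, and the two-class logits are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_kl_loss(logits_ctpa, logits_nct):
    """Hypothetical mutual learning terms for one sample.

    Returns (loss_nct, loss_ctpa): each branch is pulled toward the
    other branch's predicted distribution. Using both directions is an
    assumption of this sketch.
    """
    p_ctpa = softmax(logits_ctpa)
    p_nct = softmax(logits_nct)
    loss_nct = kl_divergence(p_ctpa, p_nct)   # NCT branch mimics CTPA branch
    loss_ctpa = kl_divergence(p_nct, p_ctpa)  # CTPA branch mimics NCT branch
    return loss_nct, loss_ctpa

# Identical predictions give zero loss; disagreeing predictions give a positive loss.
same = mutual_kl_loss([2.0, 0.5], [2.0, 0.5])
diff = mutual_kl_loss([2.0, 0.5], [0.5, 2.0])
```

In a real training loop each term would be added to the corresponding branch's supervised loss, typically with a weighting factor that the review does not specify.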

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the method presented in this paper is that the training strategy requires paired and registered CTPA and NCT image data for each subject. Compared to existing unsupervised domain adaptation (UDA) techniques, this method has stricter data requirements and thus could be more difficult to generalize to a more flexible framework.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. UDA-based methods could be directly compared to this approach, where they have the clear advantage in terms of the flexibility of required training data. Efforts to better address the comparison to those methods will be appreciated by the audience.
    2. Another potential addition to the validation study is to check if the same strategy could achieve further performance gains when a stronger backbone model is used. For example, with parallel nnFormer networks across CTPA and NCT training, can we also deploy MLS+IFA+IFD, and should we expect further performance gains?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the method proposed in this paper is interesting and the results are impressive. However, compared to earlier UDA-based methods which share a similar design philosophy, this technique has stricter training data requirements. The authors should properly address this in the discussion.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a method for pulmonary embolism identification and segmentation on non-contrast CT images by leveraging information from CT pulmonary angiograms in training. The method incorporates multiple modules to achieve this goal, and is evaluated on multiple datasets including >1400 patients.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper addresses an important clinical task and does so in an interesting way by proposing a method which is able to improve performance when only noncontrast CTs are used in inference.

    The different modules for features alignment and segmentation improvement are well-motivated and well-explained in the text. Table 1 provides ablation results that support the value of each module, which seem to particularly improve segmentation but perhaps not classification to as high a degree.

    The paper outperforms previous baselines on the segmentation and classification tasks. The paper also provides an interesting comparison with radiologist classification of images in the test set, which provides further valuable context for the results. Table 2 is compelling, and the provision of statistical significance strengthens the results provided.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Generally, the weaknesses I see with this paper are minor.

    1. There are instances where additional context might be warranted in the results or discussion section regarding the significance of results presented.

    While the results in Table 2 clearly demonstrate superiority on each of the tasks for the proposed method, this is almost certainly due to the baselines being trained on only noncontrast CT rather than having access to any additional information from CTPA. The proposed method is obviously significant because it is able to leverage this information in training without requiring it in inference, which the baselines cannot do, but a discussion of this point should be present. An additional set of results could also be provided to demonstrate the performance of these baselines (nnUNet, nnFormer, etc.) when taking CTPA as input. If the paper’s proposed method is able to come close to those results, it would make the paper even more compelling.

    Similarly, radiologist diagnosis of PE from noncontrast CT is uncommon compared with the use of CTPA, meaning radiologists may be less accustomed to this task. While this is again a strength of this work because, as the paper mentions, not all patients can have CTPA for various reasons, mentioning that radiologists are not necessarily experienced in noncontrast CT diagnosis of PE might provide more context for the radiologist comparison results.

    2. There are portions of the paper that could be made clearer. For instance, the discussion of the IFA module is quite hard to follow.

    3. The FUMPE dataset results are somewhat unclear, I assume due to space limitations. Is only the segmentation task performed on that dataset? If so, is there a reason for this? Furthermore, I’m not sure that the rationale for why the proposed network would improve performance on a single data stream makes sense. While having a publicly available external dataset would certainly be a strength of the work, I think that the use of FUMPE is not particularly well-justified here and I am unsure whether its addition adds anything to the paper.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    It does not appear that code is being provided for this paper. Additionally, it does not appear that the main dataset of interest (the paired NCT and CTPA data) will be provided. Provision of these two would certainly strengthen the work by ensuring reproducibility, but the description of the approach is mostly clear enough to allow reproduction. I think some clarity on the implementation of IFA would improve this front as well.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, this paper is fairly well-organized and written, with a strong premise and clinical motivation. The methodology is well justified through both logical motivation and performance in ablation experiments, and the method outperforms existing approaches.

    I think that this paper could even more strongly benefit from additional discussion of the results in the context of some of the points made in the weaknesses section, and that the inclusion of results on models trained only on CTPA would add to this. Even if the proposed approach does not outperform CTPA-trained and evaluated models, near equal performance would be a strong result.

    Additionally, how is the DeLong test implemented for Table 2? Shouldn’t CPMN be compared with DML since DML is the next highest performing method? The AUC values are close, so I am not sure that there would be statistical significance there.

    I also think that further discussion of the FUMPE dataset is warranted. I’m not sure that the motivation for single-stream overperformance of the proposed method is clear, and the FUMPE results are condensed to the point that they don’t strongly add to the message of the paper.

    Finally, release of the dataset/code would really strengthen the work. If this is not possible, a more in-depth description of the implementation for IFA is necessary to say that this paper is reproducible.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is nearing the borderline in my opinion. The results and motivation are strong as is the task. However, I think that either the release of code or much more clarified description of the method is needed to fully accept the paper. The other discussion points mentioned in the detailed comments would also help, but my primary concern is reproducibility.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper considers the pulmonary embolism identification problem using multiple modalities, and formulates it as a binary classification task and a binary segmentation task in a multi-task learning regime. Several techniques are proposed to improve the identification performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The problem is well motivated, and the idea is pretty novel, and the empirical results are very promising. This paper is well written and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Though the classification performance is very high, the pulmonary embolism segmentation task achieved a Dice score of only 78.5. No analysis of the potential reasons is provided.
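As context for the segmentation figure quoted above, the Dice score measures the overlap between a predicted mask and a reference mask. The snippet below is a minimal sketch under stated assumptions: the function name `dice_score`, the flattened binary masks, and the convention of returning 1.0 when both masks are empty are choices made for this illustration and are not taken from the paper. The value 78.5 in the review is presumably reported on a 0 to 100 scale.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks (lists of 0/1).

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    Returning 1.0 for two empty masks is a convention assumed here.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total

# Toy example: one of two predicted voxels overlaps the single reference voxel.
toy_pred = [1, 1, 0, 0]
toy_target = [1, 0, 0, 0]
toy_dice = dice_score(toy_pred, toy_target)  # 2 * 1 / (2 + 1)
```

Because the denominator sums the sizes of both masks, small structures such as emboli are penalized heavily for a few misplaced voxels, which is one generic reason Dice values on small lesions tend to be lower than classification metrics.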

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Would be better if further analysis of the failure pattern in classification task was performed
    • Would be better if further analysis of the failure pattern in segmentation task was performed and the potential reasons were discussed
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The problem is well motivated, and the idea is pretty novel, and the empirical results are very promising. This paper is well written and easy to follow.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

Thank you for all the valuable feedback and concerns regarding our submitted paper.

[R3 - UDA-related] We acknowledge that our proposed method requires paired and registered CTPA and NCT image data for each subject during the training phase. However, it is important to note that our target is to transfer the knowledge from the CTPA network to the NCT network, which is a different objective compared to unsupervised domain adaptation (UDA) techniques. UDA aims to adapt a model trained on a source domain to perform well on a target domain without requiring labeled data in the target domain. In contrast, our approach leverages the available paired data to transfer knowledge from the CTPA domain to the NCT domain, enabling the NCT network to benefit from the rich information present in the CTPA scans. While UDA techniques can be useful in scenarios where paired data is not available, our method specifically targets the scenario where such paired data exists and aims to exploit it effectively for improved performance on NCT scans.

[R5 - additional context for results and discussion] We appreciate the reviewer’s suggestion to provide additional context and discussion regarding the significance of our results. We agree that the superior performance of our proposed method can be attributed to its ability to leverage information from CTPA scans during training, which the baselines do not have access to. We will expand the discussion section to highlight this point and emphasize the significance of our method in utilizing this additional information during training while not requiring it during inference.

Regarding the suggestion to provide an additional set of results demonstrating the performance of the baselines when taking CTPA as input, we acknowledge that such results would further strengthen the paper. However, due to the page limitations, we were unable to include these additional experiments. Nevertheless, we believe that the current results sufficiently demonstrate the effectiveness of our proposed method in leveraging dual-phase knowledge for improved performance on NCT scans.

[R5 - the discussion of the IFA module is quite hard to follow] We appreciate the reviewer’s feedback on the clarity of certain portions of the paper, particularly the discussion of the IFA module. We will revisit the explanation of the IFA module and strive to improve its clarity and readability. We will provide a more intuitive, step-by-step explanation of how the IFA module captures pair-wise spatial feature similarities and enhances spatial contiguity and mutual learning between the dual-pathway network. We will also consider including additional illustrations or diagrams to aid in the understanding of the IFA module. The code will be released as soon as possible.

[R5 - The FUMPE dataset] We apologize for any confusion regarding the FUMPE dataset results. Due to space limitations, we focused on the segmentation task when evaluating our method on the FUMPE dataset. The primary reason for this is to demonstrate that our proposed framework is robust and effective, even when applied to a single data stream (CTPA only). By comparing the performance of our segmentation framework with public methods on the FUMPE dataset, we aim to show that the advancements brought by CPMN are not solely due to the chosen framework but rather the effectiveness of the CPMN itself. The purpose of including the FUMPE dataset results is to provide additional validation of our method on a publicly available dataset and to showcase its generalizability. We will clarify this rationale in the revised version of the paper to better justify the inclusion of the FUMPE dataset results.

[R6] Thank you for your suggestions. We will include more analysis in the next version.

[All reviewers] We hope that the responses address the concerns raised by the reviewers and provide the necessary clarifications. We will incorporate the suggestions to improve the clarity and strengthen the paper. Thank you once again.




Meta-Review

Meta-review not available, early accepted paper.


