Abstract

Coronary artery disease (CAD) poses a significant challenge to cardiovascular patients worldwide, underscoring the crucial role of automated CAD diagnostic technology in clinical settings. Previous methods for diagnosing CAD using coronary artery CT angiography (CCTA) images have certain limitations in widespread replication and clinical application due to the high demand for annotated medical imaging data. In this work, we introduce the Spatio-temporal Contrast Network (SC-Net) for the first time, designed to tackle the challenges of data-efficient learning in CAD diagnosis based on CCTA. SC-Net utilizes data augmentation to facilitate clinical feature learning and leverages spatio-temporal prediction-contrast based on dual tasks to maximize the effectiveness of limited data, thus providing clinically reliable predictive results. Experimental findings from a dataset comprising 218 CCTA images from diverse patients demonstrate that SC-Net achieves outstanding performance in automated CAD diagnosis with a reduced number of training samples. The introduction of SC-Net presents a practical data-efficient learning strategy, thereby facilitating the implementation and application of automated CAD diagnosis across a broader spectrum of clinical scenarios. The source code is publicly available at the following link (https://github.com/PerceptionComputingLab/SC-Net).

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0006_paper.pdf

SharedIt Link: https://rdcu.be/dV59c

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72120-5_60

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTeX

@InProceedings{Ma_Spatiotemporal_MICCAI2024,
        author = { Ma, Xinghua and Zou, Mingye and Fang, Xinyan and Liu, Yang and Luo, Gongning and Wang, Wei and Wang, Kuanquan and Qiu, Zhaowen and Gao, Xin and Li, Shuo},
        title = { { Spatio-temporal Contrast Network for Data-efficient Learning of Coronary Artery Disease in Coronary CT Angiography } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        pages = {645--655}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a new DNN design for classifying image regions as normal or abnormal in spatiotemporal cardiac MRI. It proposes a new data augmentation scheme specific to the application. It also proposes a transformer-based DNN design. Another contribution is the dual-task learning scheme for object detection and “point”/region classification.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The insights into the data augmentation for the specific applications at hand are interesting. This is better than blind data augmentation. The use of the transformer provides empirical evidence that it is applicable in this context. Strong application focus.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The role of the Hungarian algorithm isn’t clear. The loss function L_od (in both Equations 4 and 5, including the selection of \hat{\sigma} and the indicator function 1_{}) doesn’t seem to be differentiable, seems heuristic because it isn’t tied into the optimization problem, and can add significantly to the difficulty of the optimization.
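
    The differentiability concern can be made concrete with a minimal sketch. It assumes a DETR-style set-prediction setup, which may or may not be what the paper's L_od does: the assignment is chosen by a discrete search and then held fixed, and only the per-pair loss terms depend smoothly on the predictions. The cost function, the 1-D RoI centers, and all names below are hypothetical. For brevity the sketch brute-forces all permutations; the Hungarian algorithm would reach the same minimum cost in polynomial time.

```python
from itertools import permutations


def match_cost(pred, target):
    """Hypothetical pairwise cost: absolute distance between 1-D RoI centers."""
    return abs(pred - target)


def optimal_assignment(preds, targets):
    """Return (sigma, cost), where sigma[i] is the prediction index matched to target i.

    Brute force over permutations, standing in for the Hungarian algorithm.
    Assumes len(preds) == len(targets). This step is a discrete argmin, so it
    has no useful gradient with respect to the predictions.
    """
    best_sigma, best_cost = None, float("inf")
    for sigma in permutations(range(len(preds))):
        cost = sum(match_cost(preds[sigma[i]], targets[i]) for i in range(len(targets)))
        if cost < best_cost:
            best_sigma, best_cost = sigma, cost
    return best_sigma, best_cost


def matched_loss(preds, targets, sigma):
    """Squared-error loss for a FIXED assignment: a sum of smooth per-pair terms."""
    return sum((preds[sigma[i]] - targets[i]) ** 2 for i in range(len(targets)))


# Hypothetical predicted and ground-truth RoI centers.
preds = [0.9, 0.1, 0.5]
targets = [0.0, 0.4, 1.0]

sigma, cost = optimal_assignment(preds, targets)
loss = matched_loss(preds, targets, sigma)
print(sigma, round(loss, 4))
```

    In this reading, the matching only selects which terms enter the loss. Gradients flow through matched_loss with sigma treated as a constant, which is how set-prediction detectors commonly handle the step.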

    Table 1 shows “Ours” as TLA-Net. Is this a typo?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    More details in Section 2.3, addressing the concerns raised above, would be helpful.

    Fonts in Figure 3 (bottom) are too small.

    In Equation 7, is it guaranteed that C is invertible (because it is used as such)?

    There are some missing parentheses in Equations 6 and 7.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    please see the comments above.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    A method claiming to improve data-efficiency for coronary plaque classification based on straightened curved-multiplanar reformations (CMPR) of the coronary is proposed. The contribution consists of three components named A) clinically-credible data augmentation, B) spatio-temporal semantic learning, and C) dual-task contrastive optimization. A) essentially seems to combine CMPR volume parts with and without lesions by random substitution of subvolumes (sect. 2.1, Fig. 2 left). B) “Spatio” refers to an object detection net using 2D CMPRs and 3D CMPR volumes, “temporal” refers to a net classifying presumably fixed sampling points along the coronary based on CMPR cubes centered at the sampling points. C) Combines the two, including a loss term “employing each other’s prediction results as ground truth, thereby enhancing data utilization efficiency through mutual supervision”. The evaluation shows that the method is superior to >5 other methods and includes an ablation study.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • the method is shown to be superior to >5 other methods in an in-depth evaluation, even when trained on only 50% of the data (compared to 100% used for the other methods), and is therefore very data-efficient
    • the importance of the proposed sub-methods is demonstrated in an ablation study (Fig. 4)
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • I found the paper hard to follow. Some examples:
      – Regarding “Spatio-temporal Semantic Learning”: “temporal”, especially when used in combination with “spatio”, usually refers to something relating to time as distinguished from space. The time component in the context of the proposed approach or data is unclear to me. The data does not seem to have a time/temporal component. The L_sc defined in eqn (6) that “optimizes temporal semantics learning” (page 6) is a normal cross-entropy loss. Why is it temporal?
      – Regarding “clinically-credible data augmentation”: Why is the combination of CMPR volume parts with and without lesions by random substitution of subvolumes (if I understood this correctly) “clinically credible”? Are the transitions between substituted/non-substituted parts smooth? If not, why is this not a problem for training a classifier/detector that is supposed to generalize to patients’ coronaries that do not have such transitions?
      – Regarding the “dual-task contrastive loss L_dc”: Why is “employing each other’s prediction results as ground truth” helpful and “enhancing data utilization efficiency through mutual supervision”? (page 6)
      – Page 5: “r_i is a vector that defines RoI center coordinate and the weight in the CPR volume”. What is the weight in the CPR volume? The intensity at that position?
    • no statistical evaluation of results: paired tests would give statistical weight to the argument of “superiority” of the proposed method.
    • the method is quite complex and specific to the addressed task.
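
    One way to run the paired test requested above is McNemar's exact test on per-case correctness of two methods evaluated on the same test cases. This is a minimal sketch, not the authors' evaluation protocol. McNemar's test is only one possible paired test, and the correctness flags below are hypothetical, not results from the paper.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar test on paired per-case correctness flags.

    Returns (only_a, only_b, p_value), where only_a counts cases that method A
    classifies correctly and method B does not, and only_b the reverse.
    """
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    n = only_a + only_b  # only discordant pairs carry information
    if n == 0:
        return only_a, only_b, 1.0
    k = min(only_a, only_b)
    # Under H0 the discordant pairs split like fair coin flips: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * tail)


# Hypothetical per-case outcomes for two methods on the same 20 test cases.
correct_a = [True] * 10 + [True] * 8 + [False] * 1 + [False] * 1
correct_b = [True] * 10 + [False] * 8 + [True] * 1 + [False] * 1

only_a, only_b, p_value = mcnemar_exact(correct_a, correct_b)
print(only_a, only_b, p_value)
```

    The test ignores cases where both methods agree, so it needs the per-case predictions of every compared method rather than only the aggregate scores of Table 1.
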
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?
    • the values chosen for the weights w_cpr, w^i_vw are not given (or are these learned?)
    • the plaque classification task (Table 1, lower part) seems to have three classes (Fig. 3). How are precision/recall/F1/specificity calculated, by macro or micro averaging?
    • Table 1: The underlining of numbers (probably meaning second best result) is not explicitly defined.
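
    The averaging question matters because the two conventions can diverge sharply on imbalanced classes. The sketch below computes both for a three-class problem. The labels 0, 1, 2 and the predictions are hypothetical and are not taken from the paper.

```python
def per_class_counts(y_true, y_pred, classes):
    """Return {class: (tp, fp, fn)} using one-vs-rest counting."""
    counts = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def safe_div(num, den):
    """Treat an undefined ratio (zero denominator) as 0.0."""
    return num / den if den else 0.0


def macro_micro(y_true, y_pred, classes):
    counts = per_class_counts(y_true, y_pred, classes)
    # Macro: average the per-class scores, so every class weighs the same.
    macro_p = sum(safe_div(tp, tp + fp) for tp, fp, _ in counts.values()) / len(classes)
    macro_r = sum(safe_div(tp, tp + fn) for tp, _, fn in counts.values()) / len(classes)
    # Micro: pool the counts first, so every sample weighs the same.
    tp_sum = sum(tp for tp, _, _ in counts.values())
    fp_sum = sum(fp for _, fp, _ in counts.values())
    fn_sum = sum(fn for _, _, fn in counts.values())
    micro_p = safe_div(tp_sum, tp_sum + fp_sum)
    micro_r = safe_div(tp_sum, tp_sum + fn_sum)
    return {"macro_p": macro_p, "macro_r": macro_r, "micro_p": micro_p, "micro_r": micro_r}


# Hypothetical imbalanced three-class example; class 2 is never predicted.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

scores = macro_micro(y_true, y_pred, classes=[0, 1, 2])
print(scores)
```

    For single-label multi-class data, micro-averaged precision and recall both equal accuracy, while the macro average is pulled down by the rare class that is never predicted.
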
  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • t and T in eqn (1) are not defined in the text and not used in the equation.
    • Table 1 and Fig 3 refer to the proposed method as “TLA-Net”, but everywhere else it is called “SC-Net”.
    • I am not convinced by the argument that data efficiency is of particular importance for the addressed task: “Previous methods for diagnosing CAD using coronary artery CT angiography (CCTA) images have faced hurdles in widespread replication and clinical application due to their dependence on large amounts of annotated data” (abstract). Deep learning methods often depend on large amounts of annotated data. Of course, a method that can get more out of an existing amount of annotated data than other methods is advantageous, but this is always the case and nothing special. And I doubt that “widespread replication and clinical application” of the other methods is hindered by “their dependence on large amounts of annotated data”: if someone has a large enough training corpus for a well-performing method, “widespread replication and clinical application” of the trained approach is, in principle, not a problem. Thus, my problem with emphasizing the data efficiency argument is that it might be confusing and misleading. I agree with the interpretation that the method is data-efficient (according to Table 1 it is, in most cases, better than the other methods even if trained on only 50% of the data), which is great. Nevertheless, I suggest not overemphasizing the importance of data-efficiency: the method is simply superior to other methods and data-efficient.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is shown to be superior to >5 other methods in an in-depth evaluation, but, unfortunately, there are several issues that make the paper hard to follow (see 6.)

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose SC-Net to detect coronary artery disease in coronary CT angiography images. The proposed method leverages the multi-view information of the 3D data for data-efficient training. Experiments were conducted on a large clinical dataset, and extensive comparison with state-of-the-art methods shows that the proposed method achieves superior performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Overall, the paper is well-written and the proposed method is well-presented. The proposed dual-task optimization scheme is interesting as it combines the benefits of object detection and sampling-based approaches with a mutual supervision. The dataset involved in this study is larger and more diverse compared to the previous studies, thus evaluation on a single dataset is still reasonable.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The authors claim the proposed method is spatial-temporal, which does not seem correct to me. Temporal usually refers to time-related processing on longitudinal data. The “temporal” semantic learning in the paper basically means sampling-point classification, which is another form of spatial learning but in a different direction, i.e., the z-axis. Thus I strongly recommend that the authors replace the term ‘temporal learning’.
    • In the ablation study, Fig. 4 (a) shows the impact of using the clinically-credible data augmentation. However, it remains unclear to me whether the performance improvement comes from the proposed data augmentation technique or simply from the pre-training. To fully understand the benefits of the proposed augmentation, it is necessary to compare under the same ‘pre-training to fine-tuning’ setting.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • If temporal learning refers to the sampling-point classification along the z-axis, I would recommend that the authors consider rephrasing it.
    • More details about the dataset would be more helpful for readers, e.g., image sizes, resolutions, etc.
    • In Table 1, there are typos regarding the proposed method SC-Net: “TLA-Net (ours)”. Please use consistent naming of the proposed method.
    • From equation (3), the three loss terms have the same weighting, i.e., 1. However, I wonder if it would be better to have a ramping-up weighting for the third loss term L_{dc}, because at the beginning of the training process the mutual supervision loss is not very reliable.
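
    The ramp-up suggested in the last point can be sketched as follows. This is one common schedule from the semi-supervised learning literature (a sigmoid-shaped ramp-up), not something the paper specifies. The composition L = L_od + L_sc + w(step) * L_dc, the ramp-up length, and the function names are assumptions made for illustration.

```python
import math


def rampup_weight(step, rampup_steps, max_weight=1.0):
    """Sigmoid-shaped ramp-up: close to 0 at step 0, max_weight from rampup_steps on."""
    if rampup_steps <= 0 or step >= rampup_steps:
        return max_weight
    phase = 1.0 - max(step, 0) / rampup_steps
    return max_weight * math.exp(-5.0 * phase * phase)


def total_loss(l_od, l_sc, l_dc, step, rampup_steps):
    """Assumed composition: the mutual-supervision term L_dc is down-weighted early on."""
    return l_od + l_sc + rampup_weight(step, rampup_steps) * l_dc


# Hypothetical schedule over a 100-step ramp-up.
weights = [rampup_weight(s, 100) for s in (0, 25, 50, 75, 100)]
print([round(w, 4) for w in weights])
```

    With this schedule the mutual-supervision term contributes almost nothing while both branches are still unreliable, and it reaches the equal weighting of equation (3) once the ramp-up ends.
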
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Overall, the paper is well-written and the proposed method is interesting. The application (coronary artery disease in coronary CT angiography) and the large dataset are also valuable to the MICCAI community. However, as mentioned in the constructive comments, the results of the proposed SC-Net were not included in the table under that name, but the reviewer believes this was an accidental error and thus a minor concern.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

N/A




Meta-Review

Meta-review not available, early accepted paper.


