Abstract

Coronary Artery Calcification (CAC) is a robust indicator of coronary artery disease and a critical determinant of percutaneous coronary intervention outcomes. Our method is inspired by a clinical observation that CAC typically manifests as a sparse distribution of multiple instances. Existing methods focusing solely on spatial correlation overlook the sparse spatial distribution of semantic connections in CAC tasks. Motivated by this, we introduce a novel instance-aware representation method for CAC segmentation, termed IarCAC, which explicitly leverages the sparse connectivity pattern among instances to enhance the model’s instance discrimination capability. The proposed IarCAC first develops an InstanceViT module, which assesses the connection strength between each pair of tokens, enabling the model to learn instance-specific attention patterns. Subsequently, an instance-aware guided module is introduced to learn sparse high-resolution representations over instance-dependent regions in the Fourier domain. To evaluate the effectiveness of the proposed method, we conducted experiments on two challenging CAC datasets and achieved state-of-the-art performance across all datasets. The code is available at https://github.com/WeiliJiang/IarCAC

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2701_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Jia_IarCAC_MICCAI2024,
        author = { Jiang, Weili and Li, Yiming and Yi, Zhang and Wang, Jianyong and Chen, Mao},
        title = { { IarCAC: Instance-aware Representation for Coronary Artery Calcification Segmentation in Cardiac CT angiography } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    Authors propose an instance aware network for CAC segmentation in CCTA. A novel instance ViT and instance-aware guided modules are proposed. A validation in comparison to other state of the art architectures is performed.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Authors propose novel modules to improve instance representation in segmentation
    • A thorough comparison to state of the art architectures is performed in two datasets
    • An ablation study of some of the hyperparameters and modules added is conducted
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The state of the art presented by the authors is somewhat deficient. While it is true that most work in CAC segmentation has focused on CSCT, the examples provided by the authors are limited to kNN, SVMs and decision trees. However, more recent work using complex deep learning architectures should also be featured. It should also be noted that reference 21 does not actually use SVMs and is not for CSCT - the authors probably wanted to cite a different study here(?). Furthermore, there is already significant literature in CAC segmentation in CSCT, for example 10.1016/j.media.2016.04.004, 10.1148/radiol.2021211483, 10.1007/s10554-014-0519-4 and 10.21037/qims-21-775.
    • While 5 different metrics are used for validation, the clinical value in CAC segmentation is Agatston scoring (or total CAC volume) and this is not validated. As such, it is unclear if the improvements in the validation metrics shown lead to a superior clinical CAC quantification.
    • Also regarding validation, annotations of CAC were performed manually on CCTA by experts. However, authors state in the introduction that “In CCTA, it is non-trivial to distinguish between CAC and attenuated lumen,” and even for an expert this is not straightforward (the reason for which CAC quantification is performed on CSCT in clinical practice). As such, it can be expected that the reference annotations are not 100% reliable and the clinical validity of the results can be put into question.
    • It is not clear if the two datasets were used separately or joined for training/testing. It is also not clear what was used to replace the modules removed in the ablation study.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The reproducibility of this study somewhat concerns me, particularly in terms of the encoder-decoder blocks for which only limited information is given as well as some of the experiment settings.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • It would be extremely important to update the literature review to mention more recent studies in CSCT CAC segmentation, as well as previous work in CTCA CAC segmentation. These works should then be mentioned in the discussion for comparison with the obtained results.
    • Please add additional details to improve reproducibility. These are probably too long for the main manuscript but can probably be added as supplementary material.
    • Please improve the description of the experiments as mentioned in the main weaknesses.
    • It is unclear why reference [9] is given for “This upsampling layer elevates the resolution of the input channels by utilizing bilinear interpolation”. This is a fairly standard operation and I don’t see why a reference is needed. I would advise the removal of this reference.
    • It is not clear what is meant by “a threshold-based segmentation method was employed to eliminate the lung trunk from the raw CCTA images.”. What is the lung trunk? What threshold was applied and why? Please add further details to improve reproducibility.
    • On the results tables, bold and underlines are used but the meaning of these should be given.
    • Please review the manuscript for grammar and correctness - some issues: “Using” (pg 1), “we select the two CAC dataset” -> “we select two CAC datasets” (pg 5), “InrCAC” -> “IarCAC” (Table 1), “MISS” -> ? (pg 6), “followed by the prior work” (pg 6)
    • Please review bibliography formatting particularly in terms of capitalization and acronyms: “oct”->”OCT”, “mri”->”MRI”, …
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I believe this manuscript may be worthy of publication as it introduces novel architectures for a significant clinical issue. However, I have significant concerns in terms of reproducibility and the clinical validity of the performance evaluation. I believe these concerns can be at least partially addressed through changes to the manuscript and supplementary material and discussion of limitations and thus recommend this manuscript for weak reject (dependent on rebuttal).

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors have addressed most of my concerns in the rebuttal, particularly in terms of the methodology and reproducibility. I remain sceptical about the clinical validity of CAC segmentation in CTCA, but if the authors add interobserver variability as stated in the rebuttal this becomes a smaller issue. I have thus upgraded my decision to weak accept.



Review #2

  • Please describe the contribution of the paper

    This paper presents a novel instance-aware representation method for CAC segmentation, explicitly leveraging sparse connectivity patterns among instances to enhance the model’s instance discrimination capability.

    Contributions:

    1. An instanceViT is introduced to capture the variable distribution of intra-instance semantic information in input image content.
    2. An instance-aware guidance module is presented to learn sparse high-resolution representations on instance-related regions.
    3. Extensive experiments are performed on two CAC datasets, demonstrating superior performance compared to other methods.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The idea of proposing instanceViT based on the clinical observation that CAC typically manifests as a sparse distribution of multiple instances is quite innovative. Corresponding ablative experiments and comparative results also validate its effectiveness.

    2. The problem modeling is clear, and the explanations of the roles of various modules in the paper are also clear.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In the InstanceViT section, there is no explanation of what the feed-forward network (FFN) is used for afterwards.

    2. How are the interaction weights Kins obtained from the InstanceViT branch?

    3. Is the input of ViT the same as that of InstanceViT?

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. In the Introduction section, there should be further explanation as to why “instance segmentation” is used instead of “semantic segmentation” for the task of CAC segmentation. This implies why you chose to use instance-aware representation.

    2. In Figure 1, the Instance-aware Guided Module box seems like it should include the ViT box? Otherwise, it might cause ambiguity.

    3. In the InstanceViT section, in line 7, change “l-1-th” to “(l-1)-th”.

    4. In the InstanceViT section, there are inconsistent formats in several places for M(A, k).

    5. In the Instance-aware Guided Module section, there is a spelling error in the fourth line from the end, “leverage”.

    6. The format of “instanceViT” is inconsistent throughout the paper; sometimes it is capitalized, and other times it is lowercase.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper focuses on leveraging sparse connectivity patterns among instances to enhance the model’s instance discrimination capability, achieving the segmentation of CAC, which is novel. However, there are still many aspects in the paper that require further explanation. Please explain the issues mentioned in the “weaknesses” section.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    This paper addresses an important clinical issue and the rebuttal has essentially answered my questions.



Review #3

  • Please describe the contribution of the paper

    The authors propose a new CAC segmentation method for CCTA images. The main novelty here is to incorporate the idea of “instance” (based on the fact that calcium is a sparse instance compared to the background) in the self-attention so that only the pixels/elements that belong to the same instance will be used in the following segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • use threshold to select elements in the score matrix that have the highest correlation, so that only the pixels that belong to the same instance will be used in the downstream task, reducing the noise caused by unrelated pixels (e.g., calcium and background).

    • use Fourier transform to further refine the results.

    • Ablation studies showing the effectiveness of each module as well as the most effective top-k elements in the score matrix.
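
The top-k selection over the score matrix mentioned in the strengths above can be illustrated with a small sketch. This is not the authors' InstanceViT implementation: the function name `topk_masked_attention`, the toy scores, and the choice of a row-wise softmax over the kept entries are all assumptions made for illustration.

```python
import math

def topk_masked_attention(scores, k):
    """Row-wise softmax restricted to the k largest scores in each row.

    Hypothetical helper: entries outside the top-k receive weight 0,
    so each token attends only to its k most strongly connected tokens.
    """
    weights = []
    for row in scores:
        # Indices of the k largest scores in this row.
        keep = set(sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k])
        # Subtract the row maximum over kept entries for numerical stability.
        m = max(row[j] for j in keep)
        exps = [math.exp(row[j] - m) if j in keep else 0.0 for j in range(len(row))]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy score matrix (2 query tokens x 4 key tokens), values chosen arbitrarily.
scores = [[2.0, 0.1, 1.9, -1.0],
          [0.0, 3.0, 0.2, 2.8]]
weights = topk_masked_attention(scores, k=2)
for row in weights:
    print([round(w, 3) for w in row])
```

With k=2, each row keeps exactly two non-zero weights, which is the sparsity effect the reviewer describes: weakly related tokens (for example calcium versus background) no longer contribute to the attended feature.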

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    N/A

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, this is a solid study. It begins from a simple observation that calcium is a sparse instance compared to its surroundings. Based on this, the study designs an instance-aware self-attention mechanism and incorporates Fourier transform to refine the results. Additionally, necessary ablation studies are conducted to demonstrate the impact of each module.

    However, there are a few concerns:

    1. If I understand correctly, the use of FFT is intended to better understand feature correlations in the frequency domain. However, the authors need to provide a more detailed explanation of the motivation for this approach, beyond simply stating its effectiveness based on the literature. For example, they should clarify what constitutes low and high-frequency features in CCTA and CAC segmentation, provide examples of these features, and explain how they influence the segmentation process.

    2. Please provide more details on how the instance-wise detection metric is calculated. Specifically, clarify how many pixels identified within a calcium region by DL segmentation are considered sufficient to recognize that calcium instance (i.e., what is the pixel threshold for detection)?
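
The low- versus high-frequency distinction raised in concern 1 can be made concrete with a toy example. The sketch below is an assumption-laden illustration, not the paper's Instance-aware Guided Module: it uses a 1-D signal instead of a CCTA volume, a naive DFT, and a fixed hard cutoff (`cutoff`, a hypothetical parameter) in place of any learned, adaptive frequency selection. The smooth cosine stands in for broad anatomy and the single bright sample stands in for a small calcification.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for a toy signal)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def split_frequencies(signal, cutoff):
    """Split a signal into a low band (|frequency| <= cutoff) and the remainder."""
    n = len(signal)
    spectrum = dft(signal)
    # Bins k and n - k carry the same absolute frequency, so treat them together.
    low = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(spectrum)]
    high = [c - l for c, l in zip(spectrum, low)]
    return idft(low), idft(high)

n = 32
# Smooth background (low frequency) plus one bright sample at index 10.
signal = [100 + 20 * math.cos(2 * math.pi * t / n) for t in range(n)]
signal[10] += 300

low, high = split_frequencies(signal, cutoff=2)
peak = max(range(n), key=lambda t: high[t])
print("strongest high-band response at index", peak)
```

In this toy setting the smooth background falls entirely in the low band, while most of the isolated bright sample survives in the high band, so the high-band peak sits at the sample's location.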

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    solid study

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    keep the original score.




Author Feedback

We thank all reviewers for their valuable and insightful reviews. They described our method as “quite innovative” (R3), “solid study” (R4), and “thorough comparison to state-of-the-art architectures” (R1). Here we address their main concerns:

  • Motivation of Instance-aware Guided Module (R4) Using FFT to analyze feature correlations in the frequency domain can enhance the contrast of CCTA images by distinguishing between broad anatomical structures (low-frequency features) and fine CAC details (high-frequency features). However, not all low- or high-frequency information contributes to CAC segmentation. Therefore, we propose an Instance-aware Guided module that adaptively determines which frequency information should be retained. We agree with R4 on providing feature examples to explain how FFT affects segmentation, which will be explored in an extended version.

  • Recent studies on CAC segmentation in CSCT (R1) 1) We agree with R1 that deep learning methods in CSCT should also be discussed, such as the methods [a, b, c, d]. Specifically, [a] proposed a two-stage CNN pair to identify CAC from coarse to fine; [b] uses Unet to segment CAC; [c] and [d] focus on coronary artery segmentation and then combine coronary artery information and voxel intensity values to identify CAC. We will discuss these methods in the final version; 2) References that are miscited or that cite standard operations will be corrected in the final version.

  • Clarity and notation of the method (R1&3) 1) In clinical settings, CACs are evaluated individually, which implies that CAC segmentation should be formulated as instance segmentation. Additionally, small CAC instances tend to be overlooked in semantic segmentation (R3); 2) The position-wise feed-forward network applies a fully connected network to each position in the sequence, adding non-linearity and enhancing the model’s ability to capture complex patterns (R3); 3) Sorry for the confusion. The ViT box in the instance-aware guided module box should be image embedding. We will revise Fig 1 in the final version; 4) The weight of InstanceViT is obtained by a series of two 1 × 1 convolution layers, BN and GELU (R3); 5) Inputs of ViT and InsViT are the same (R3); 6) The notation errors and inconsistent formats will also be corrected with minor modifications (R1&3).

  • Data annotation and Analysis (R1) 1) Our data were independently annotated on CCTA by a radiologist and a cardiologist via 3D Slicer. The Dice between each annotation and their union measured annotator preference. Inconsistencies were rechecked. We will revise the data annotation in the final version; 2) Accurate total CAC volume estimation correlates with a higher Dice score, which indicates better overlap between predicted segmentation and ground truth. We will revise Tab.1 by adding total CAC volume analysis for clinical CAC quantification in the final version.

  • Experimental details and Reproducibility (R1&4) We will release the code in the final version, ensuring the reproducibility of the work (R1&4). We will revise experimental details in the final version to clarify the following points: 1) The encoder-decoder block of our method is the same as the standard Unet [3] (R1); 2) The “lung trunk” is the major bronchial structure. We used a threshold range from -224HU to 600HU to coarsely segment the lung, then used the seed-filling algorithm to refine the lung segmentation. Subtracting the lung from the original image can eliminate noise to better segment CAC (R1); 3) The pixel threshold for detection is 0.5 (R4); 4) The two datasets are used separately for training/testing. In the ablation study, when validating the InsViT module, we used the combination of Unet and InsViT. We replaced InsViT with ViT when validating the Guided module (R1).

[a] 10.1016/j.media.2016.04.004 [b] 10.1148/radiol.2021211483 [c] 10.1007/s10554-014-0519-4 [d] 10.21037/qims-21-775
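
The preprocessing described in the feedback (a threshold range followed by seed filling, then subtraction from the original image) could look roughly like the sketch below. Only the -224 HU to 600 HU range comes from the feedback; everything else is an assumption for illustration: the 2-D toy image, 4-connectivity, the seed location, the fill value of -1024, and the function name `remove_seeded_region`. It is not the authors' pipeline.

```python
from collections import deque

def remove_seeded_region(image, low, high, seed, fill=-1024):
    """Remove the in-range region connected to `seed` from a 2-D image.

    Hypothetical helper: `fill` (assumed here to be an air-like value)
    replaces every removed pixel; all other pixels are left unchanged.
    """
    rows, cols = len(image), len(image[0])
    # Coarse step: mask of pixels whose value lies inside [low, high].
    in_range = [[low <= v <= high for v in row] for row in image]
    # Fine step: seed filling keeps only in-range pixels connected to the seed.
    region = [[False] * cols for _ in range(rows)]
    queue = deque()
    if in_range[seed[0]][seed[1]]:
        region[seed[0]][seed[1]] = True
        queue.append(seed)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and in_range[nr][nc] and not region[nr][nc]):
                region[nr][nc] = True
                queue.append((nr, nc))
    # Subtraction step: overwrite the filled region, keep everything else.
    return [[fill if region[r][c] else image[r][c] for c in range(cols)]
            for r in range(rows)]

# Toy image with arbitrary values; the column of 900s separates two in-range areas.
image = [
    [100,  120, 900,  50],
    [ 90, -500, 900,  60],
    [110,  130, 900, 700],
]
cleaned = remove_seeded_region(image, low=-224, high=600, seed=(0, 0))
for row in cleaned:
    print(row)
```

Only the in-range pixels connected to the seed are removed; the in-range pixels on the far side of the bright column are kept, which is the difference between plain thresholding and thresholding followed by seed filling.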




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Well-presented paper, thorough comparison to SOTA works, good results.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    Well-presented paper, thorough comparison to SOTA works, good results.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


