Abstract

Accurate vessel segmentation from X-ray Angiography (XA) is essential for various medical applications, including diagnosis, treatment planning, and image-guided interventions. However, learning-based methods face challenges such as inaccurate or insufficient manual annotations, anatomical variability, and data heterogeneity across different medical institutions. In this paper, we propose XA-Sim2Real, a novel adaptive framework for vessel segmentation in XA images. Our approach leverages Digitally Reconstructed Vascular Radiographs (DRVRs) and a two-stage adaptation process to achieve promising segmentation performance on XA images without the need for manual annotations. The first stage involves an XA simulation module for generating realistic simulated XA images from patients’ CT angiography data, providing more accurate vascular shapes and backgrounds than existing curvilinear-structure simulation methods. In the second stage, a novel adaptive representation alignment module addresses data heterogeneity by performing intra-domain adaptation for the complex and diverse nature of XA data in different settings. This module utilizes self-supervised and contrastive learning mechanisms to learn adaptive representations for unlabeled XA images. We extensively evaluate our method on both public and in-house datasets, demonstrating superior performance compared to state-of-the-art self-supervised methods and competitive performance compared to supervised methods.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1441_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zha_XASim2Real_MICCAI2024,
        author = { Zhang, Baochang and Zhang, Zichen and Liu, Shuting and Faghihroohi, Shahrooz and Schunkert, Heribert and Navab, Nassir},
        title = { { XA-Sim2Real: Adaptive Representation Learning for Vessel Segmentation in X-ray Angiography } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15006},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a method for vessel segmentation from X-ray angiography (XA) images using a network trained on digitally reconstructed vascular radiographs (DRVRs). The DRVRs are generated from 3D CT angiography (CTA) and then transformed by a GAN into simulated XA images that look like real XA images. A U-net is then trained on these simulated XA images to segment vessels, using 2D ground truth obtained from the corresponding 3D CTA vessel segmentations. To adapt the segmentation to real XAs without ground truth, the network is further trained to minimize the difference between simulated and real XA image features (from the third-to-last convolutional layer) averaged over the respective vessel regions (using ground truth for simulated XAs and the U-net-generated mask for real XAs) as well as over the background. The training also enforces consistency between predictions on images and their augmentations, and a contrastive loss that makes vessel features similar on average to each other and dissimilar from background features.
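The class-level feature averaging summarized above can be illustrated with a minimal Python sketch. It assumes features are given as one vector per pixel and masks as flat 0/1 lists; the function names and the use of a squared Euclidean distance are illustrative assumptions, not the paper's exact formulation.

```python
from math import fsum


def class_centroid(features, mask, label):
    """Mean feature vector over the pixels whose mask value equals `label`.

    features: list of per-pixel feature vectors (each a list of floats)
    mask:     list of 0/1 labels, one per pixel (1 = vessel, 0 = background)
    """
    selected = [f for f, m in zip(features, mask) if m == label]
    if not selected:
        raise ValueError("no pixels with the requested label")
    dim = len(selected[0])
    return [fsum(f[d] for f in selected) / len(selected) for d in range(dim)]


def centroid_distance(c_a, c_b):
    """Squared Euclidean distance between two centroids (an assumed metric)."""
    return fsum((a - b) ** 2 for a, b in zip(c_a, c_b))


def alignment_loss(feat_sim, mask_sim, feat_real, pseudo_mask_real):
    """Sum of vessel and background centroid distances across the two domains.

    The simulated domain uses its ground-truth mask; the real domain uses the
    mask predicted by the network (a pseudo mask).
    """
    loss = 0.0
    for label in (1, 0):  # 1 = vessel, 0 = background
        c_sim = class_centroid(feat_sim, mask_sim, label)
        c_real = class_centroid(feat_real, pseudo_mask_real, label)
        loss += centroid_distance(c_sim, c_real)
    return loss


# Toy example: three simulated pixels and two real pixels, 2-D features.
feat_sim = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
mask_sim = [1, 1, 0]
feat_real = [[2.0, 0.0], [0.0, 1.0]]
pseudo_mask_real = [1, 0]
loss = alignment_loss(feat_sim, mask_sim, feat_real, pseudo_mask_real)
```

In this toy case the vessel centroids coincide and only the background centroids differ, so the loss comes entirely from the background term.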

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Perhaps some novelty (see weaknesses) in the loss functions for adapting a Unet trained on DRRs to real XAs without ground truth via representation alignment, consistency and contrastive loss.
    2. The results show convincing improvement over other self-supervised methods and comparable performance to a supervised method on one of the data sets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The novelty of the paper is a bit unclear given that there is already a published paper that does something very similar: “Zhang, Z. et al. (2024). Self-supervised Vessel Segmentation from X-ray Images using Digitally Reconstructed Radiographs. In: Maier, A., Deserno, T.M., Handels, H., Maier-Hein, K., Palm, C., Tolxdorff, T. (eds) Bildverarbeitung für die Medizin 2024. BVM 2024. Informatik aktuell. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-44037-4_64”.

    The initial steps of DRR generation and GAN to transform DRR to real XA domain are also straightforward applications of known methods as already noted by the paper.

    Another relevant paper that does domain adaptation and representation alignment for segmentation is J. Kang, B. Zang and W. Cao, “Domain Adaptive Semantic Segmentation via Image Translation and Representation Alignment,” in 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, 2021 pp. 509-516. doi: 10.1109/ISPA-BDCloud-SocialCom-SustainCom52081.2021.00076

    The authors may cite and explain difference in their basic idea from this paper.

    2. The loss function has 5 terms, making it tricky to balance the different terms in the loss function and perform optimization in general.
  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper doesn’t claim to release code or real data. But it mentions that the digitally reconstructed and simulated data and its annotations will be released.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. For initial unet training on simulated XAs, what is the role of consistency given data augmentation is already part of the standard training pipeline?
    2. For both stages of unet training, the motivation for specifically using third-from-last convolutional layer needs to be explained. Were other layers tried?
    3. Why is the class centroid contrastive loss computed only for simulated XAs (f_B) and not for real XAs (f_A)? Was this tried and didn’t work?
    4. A minor point: why is the adversarial loss described only for augmented images (A, B) in the paper and not the actual images (A, B)?
    5. Why doesn’t domain randomization include spatial transforms such as rotation, translation, deformation etc? Was this tried and didn’t work?
    6. Since there are 5 terms in the loss function, it is a bit tricky to find the appropriate weights. How were they determined? Please describe any specific methods that were used for this?
    7. Regarding the comparison with U-Net, the paper should use nnU-Net, as that is the current state-of-the-art U-Net.
    8. A comparison with a popular angiography segmentation method, e.g. AngioNet, would add value.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the problem is of interest to the community and the paper describes the method well and demonstrates interesting results, the novelty of the paper is a bit unclear compared to previously published work.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    The novelty is still not clear given the prior work mentioned in the review.



Review #2

  • Please describe the contribution of the paper

    This paper proposes a vessel segmentation method for X-ray angiography images, which can be built with less annotation by exploiting CT data and domain adaptation techniques. The efficacy was evaluated on the public XCAD dataset and the authors’ private datasets. The results show better performance than self-supervised techniques and performance on par with supervised segmentation (U-net).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper describes a training method with pseudo labels generated from CT.
    • Domain gap issues are decreased by using several known techniques (e.g. contrastive loss).
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Each component is a variant of known techniques and does not seem very novel.
    • In the comparison study with other self-supervised methods (Table 1), the accuracy was equivalent to that of a standard supervised segmentation (U-net). Since the proposed domain adaptation procedure is a bit complicated, the advantage of the method seems low.
    • In the ablation study (Table 2) evaluating the five components of the method, each component’s impact was small, so it is unclear whether all of them are necessary and which elements are worth using.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Because the proposed method is a kind of domain adaptation technique, the experimental comparison with self-supervised training methods does not seem appropriate.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Training AI models from a limited amount of data is a common challenge in medical image recognition. The paper uses several solid techniques to reduce domain gaps. The algorithm configuration is reasonable, and it will be a good reference for engineers.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors clarified the difference from the previous works, and will elaborate the description of experiments.



Review #3

  • Please describe the contribution of the paper

    The authors describe a new method to segment blood vessels from coronary angiography examinations. The method relies on the availability of the annotated 3D CCTA dataset from the ASOCA challenge, and consists of 4 modules: 1) DRR images, and corresponding vessel maps, are generated from 3D CCTA images along 11 classical orientations; 2) domain transfer is applied to make DRR images resemble actual images from the XCAD dataset; 3) a UNet model is pre-trained using the transformed DRR images and corresponding vessel maps, with a Siamese setup that takes an augmented version of the input image for robustness and generalization; 4) this Siamese UNet is refined on pairs of transformed DRR and actual images to adapt to an actual XA image dataset. Experiments demonstrate superior performance to 3 SOTA methods, and slightly lower but competitive performance compared to a fully supervised UNet, on the XCAD public dataset and 2 in-house datasets (70 annotated XA test images).

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Despite the complexity of the method, the paper is very well written and easy to read. Figure 1 is especially informative and clear. I appreciate the attention to detail (e.g. the physics considerations in generating DRR images) and the right balance in the explanations (e.g. the space allocated to section 2.3, although it could probably deserve a lot more discussion). Experiments are conducted against 4 SOTA methods, 3 of which are dedicated to vessel segmentation, on both public and private datasets of reasonable size.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method is sophisticated, but I wonder about its complexity. The development in Section 2.3 for Adaptive Representation Alignment leads to a rather complex cost function in Eq 9. The authors take care to provide an ablation study, but the diversity of weight values makes one wonder how difficult they are to tune and what their impact is. Also, DRRs are often generated without taking all physics aspects into account. Are all the aspects in Eq 1, especially the spectrum, necessary, and couldn’t the subsequent domain transfer encompass them? Why do the authors include noise other than Poisson in domain randomization when X-ray images are known to be tainted with Poisson noise?

    On the experimental side, first, the experiment on the public dataset is somewhat biased since the XCAD dataset was also used for domain transfer. Second, connectivity of the vessel map is often key in blood vessel segmentation. A score to assess this characteristic would have been appreciated. Third, the ability to accurately segment pathological vessels is also clinically important. Comments on this aspect are missing.

    The idea to leverage 3D annotated data is interesting, but the results in Fig. 2 seem to show that interventional devices are segmented with the vessels. Indeed, they are not present in CCTA images. This is not the case for UNet, which relies on XA annotation. Is this indeed typical? Could this be the reason for a slightly reduced performance with respect to UNet? Also, isn’t there a risk of missing out small vessels, hardly visible in CCTA? Only the XCAD dataset is large. The ASOCA dataset contains 40 patients, half of them with pathologies, and may not be able to capture the full diversity of pathological cases or anatomical variations. The experiments are indicative of a potential clinical feasibility, but tests on extended datasets are necessary to assess it.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Page limitation does not allow for a complete description of such a sophisticated method. But details are missing for it to be reproducible, e.g. X-ray beam (pose, spectrum, material masks…) parameters to compute DRR images, extent of variations for domain randomization.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Details are missing on the in-house datasets: proportion of pathological images (pre- post- intervention)? how were they annotated? How many patients were involved?

    Are \pm figures standard deviations in Table 1?

    Typos:

    • p.2, 2nd par, l. 13: Recently, a self-supervised curvilinear object segmentation method is proposed -> was
    • p.2, 3rd par., l. 2: imporve -> improve
    • p.4, section 2.2, line below Eq 2: paramete -> parameter
    • p.5, 2nd par, l. 3: addition two -> two additional? two?
    • p. 5, Eq 6: there are two occurrences of n (external sum over i, and internal sum over j): shouldn’t it be w for the internal sum (similar to Eq 2)?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is shown to be efficient, the idea to leverage 3D annotated CCTA data is interesting, the methodology is always rigorous, even though more page space would be necessary to explain everything in detail. Experiments are conducted against SOTA methods on both public and in-house databases.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    I appreciate the authors’ answer to my major concerns. I was not aware of the work by Zhang et al mentioned by Rev1, but this work is very recent, which makes me believe the authors’ work is original, and proves this common seminal idea to leverage CTA data has potential and could yield interesting discussions at MICCAI.




Author Feedback

We appreciate the reviewers’ thoughtful assessments and valuable insights. They found our work novel (R1,4) and well-organized (R1,3,4) with rigorous evaluation (R4), achieving convincing advancements (R1,3,4).

  1. Clarify the difference from other works (R1,3): Thank you for pointing out previous works [1,2]; we will reference them and clarify how our method differs. The core difference from [1,2], and the success of our work, lie in the rigorous and effective use of the pseudo mask via our proposed adaptive representation alignment. As described in Sec.2.3, a class centroid contrastive loss is proposed to enhance the feature distinctiveness between vessels and backgrounds, and it is computed mainly for the simulated XA domain to reduce the reliance on the pseudo mask. A class centroid feature alignment loss is also proposed to minimize the class-level representation distance between simulated XA and real XA. An improved adversarial learning configuration is proposed to explicitly align the prediction space distribution and contribute to the quality of the pseudo mask. Meanwhile, to obtain a good-quality pseudo mask at the beginning and improve the stability of the training process, the segmentation network is initialized with pretrained weights rather than trained from scratch as in [1,2].

[1] Zhang et al., Self-supervised Vessel Segmentation from X-ray Images using Digitally Reconstructed Radiographs.
[2] Kang et al., Domain Adaptive Semantic Segmentation via Image Translation and Representation Alignment.

  2. Restate experiment results (R3): We will clarify this in the manuscript. Our method and the selected SOTA methods, i.e., SSVS, DARL, and FreeCOS, are all based on domain adaptation and trained in a self-supervised manner for this specific task, so they are appropriate for comparison. In fact, existing SOTA works consider the performance of a supervised Unet as a general upper bound. From Tab.1, our method outperforms these SOTA methods and achieves competitive performance compared to the supervised Unet, highlighting our method’s superiority. Tab.2 shows that even our baseline surpasses the three SOTA methods, indicating the significant role of our proposed XA simulation module. Additionally, our adaptive representation alignment further improves the Dice score from 0.685 to 0.727, and each component contributes positively; the individual improvements are slight because our baseline is already close to the upper bound.

  3. Comments on the weights tuning (R1,4): Due to limited pages, we directly provided the well-tuned weights in Eq.9. In a study conducted before submission, we first experimented with randomly assigned weights for each term to balance their contributions to the overall loss. The weights were then further studied via grid search on the public XCAD dataset.

  4. Q&A (R1): Thanks for your questions. Data augmentation introduces diversity in the training data to improve generalization, while the consistency loss focuses on explicitly enhancing robustness through consistent predictions. We find that feature maps from the third-from-last convolutional layer, which have the same spatial resolution as the images, achieve the best performance. The raw set (A, B) is included in the augmented set (A, B), so the proposed adversarial loss is only applied on the augmented set.

  5. Q&A (R4): Thanks for your comments and questions; we will add them as a discussion or future work. As mentioned in Sec.2.4, only one option is randomly selected for each effect, so each X-ray image will only be tainted with one random noise. The I-XA dataset was collected from 20 patients during the pre-intervention phase, and the II-XA dataset was collected from 15 patients during the post-intervention phase. Both were manually annotated using 3D Slicer segmentation tools. Thanks again for the constructive suggestions on typos.

  6. Reproducibility (R1,3,4): We will release the code, as we agreed in the submission system.
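
As a minimal sketch of the weight tuning described in point 3, scored with the Dice metric quoted in point 2, the following Python snippet runs an exhaustive grid search over candidate loss weights. The term names (`align`, `contrast`), the candidate values, and the toy `evaluate` function are hypothetical placeholders, not the weights or the training procedure from the paper; in practice `evaluate` would train the network with the given weights and return the mean Dice on a validation set.

```python
from itertools import product


def dice_score(pred, target):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total


def grid_search(candidates, evaluate):
    """Try every weight combination and keep the one with the best score.

    candidates: dict mapping a loss-term name to the weight values to try
    evaluate:   callable taking a {name: weight} dict and returning a score
                (hypothetical stand-in for "train, then measure mean Dice")
    """
    names = list(candidates)
    best_weights, best_score = None, float("-inf")
    for values in product(*(candidates[n] for n in names)):
        weights = dict(zip(names, values))
        score = evaluate(weights)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Dice on a toy pair of 4-pixel masks: one overlapping pixel out of 2 + 2.
toy_dice = dice_score([1, 1, 0, 0], [1, 0, 1, 0])


def toy_evaluate(weights):
    """Hypothetical score surface that peaks at align=0.5, contrast=0.1."""
    return -((weights["align"] - 0.5) ** 2) - (weights["contrast"] - 0.1) ** 2


candidates = {"align": [0.1, 0.5, 1.0], "contrast": [0.01, 0.1, 1.0]}
best_weights, best_score = grid_search(candidates, toy_evaluate)
```

With five loss terms the number of combinations grows as the product of the candidate counts per term, which is one reason a coarse exploration before the grid search, as described in point 3, keeps the search affordable.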

Thanks again for your comments and suggestions.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


