Abstract

Segmentation of blood vessels in murine cerebral 3D OCTA images is foundational for in vivo quantitative analysis of the effects of neurovascular disorders, such as stroke or Alzheimer’s, on the vascular network. However, to accurately segment blood vessels with state-of-the-art deep learning methods, a vast amount of voxel-level annotations is required. Since cerebral 3D OCTA images are typically plagued by artifacts and generally have a low signal-to-noise ratio, acquiring manual annotations poses an especially cumbersome and time-consuming task. To alleviate the need for manual annotations, we propose utilizing synthetic data to supervise segmentation algorithms. To this end, we extract patches from vessel graphs and transform them into synthetic cerebral 3D OCTA images paired with their matching ground truth labels by simulating the most dominant 3D OCTA artifacts. In extensive experiments, we demonstrate that our approach achieves competitive results, enabling annotation-free blood vessel segmentation in cerebral 3D OCTA images.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1494_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1494_supp.pdf

Link to the Code Repository

https://github.com/bwittmann/syn-cerebral-octa-seg

Link to the Dataset(s)

https://huggingface.co/datasets/bwittmann/syn-cerebral-octa-seg

BibTex

@InProceedings{Wit_SimulationBased_MICCAI2024,
        author = { Wittmann, Bastian and Glandorf, Lukas and Paetzold, Johannes C. and Amiranashvili, Tamaz and Wälchli, Thomas and Razansky, Daniel and Menze, Bjoern},
        title = { { Simulation-Based Segmentation of Blood Vessels in Cerebral 3D OCTA Images } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15008},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this work, the authors try to improve the segmentation of 3D Optical Coherence Tomography Angiography (OCTA). To do so, they use synthetic data to train a neural network. When generating the synthetic images, they ensure both the vasculature shape and artifacts are faithfully modeled.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Using modeled synthetic images to train a neural network and optimize its efficiency is a good idea. Optimizing the synthetic model by separately tackling the vascular tree shape (volume generation) and the background artifacts is probably the best move.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Unfortunately, we have no means to evaluate the efficacy of the synthetic model by itself. How reliable is it compared to some ground truth?

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    At the least, a statistical comparison of the synthetic images with ground truth patches could help (i.e., compare the number of branches, the density, the average radii, etc., within a cropped area). Even better, some advanced objective quality evaluation would be very nice. I would suggest that the authors try evaluating the similarities between the ground truth OCTA acquisitions and the synthetic images. I am not convinced that the MSE would faithfully estimate the similarities, but maybe some other quality metrics could be used. Might some normalized cross-correlation (template matching) prove useful? I’m not sure I understand why the authors opted for adding Gaussian noise, followed by Gaussian filtering. Why not directly generate noise with the appropriate frequency content? Furthermore, how confident are you that the noise generated from OCTA acquisitions is actually Gaussian? Is this common knowledge? Typically, MRI-generated noise is considered to be Rician. Maybe the authors should consider showing both distributions (actual OCTA acquisition and modeled version). Moreover, if I understand correctly, it seems the training was performed either on the real or on the synthetic data. It would be interesting to evaluate the performance when training the neural network on both, i.e., in a data augmentation scenario. Maybe that is what the authors did and I didn’t get it, but then it should be very clearly stated.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I regret the lack of any evaluation of the synthetic images. I believe it would be a significant asset to the paper.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have clarified a few questions raised by the reviewers. This paper should be accepted, but again, the authors should, in my opinion, at least mention some perspectives on a data augmentation scenario.



Review #2

  • Please describe the contribution of the paper

    This paper uses a segmentation algorithm, 3D U-Net, on blood vessel OCTA images which they have synthetically generated from 6 real sources by simulating 3 types of artifacts. The authors have applied the techniques within 3 scenarios: the entire patch, a small region and a large region.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and clearly presented. It is a strong paper in many aspects, including a relevant and current application (3D segmentation of blood vessels). The problem space has a clear clinical benefit if good results are achieved. The ablation study looking at different additions of simulated artifacts is very interesting, as it overcomes pitfalls in the input quality and shows the promise of the implemented techniques, with performance increasing. Additionally, testing the methods in 3 different scenarios (the entire patch, smaller regions, and larger regions) was a strength of the paper, as it shows the method performing under varying conditions and complexities. The results themselves are empirically strong, and the data being taken from a range of mice increases the inter-strain variation in vessel structure, increasing the potential generalisation ability.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There were a few weaknesses identified within the paper. Firstly, although justification was provided for why patches were discarded, not using the entire vascular corrosion cast is a weakness for a number of reasons, including the limited applicability of the approach to regions covered by the discarding criteria, such as sparsely populated areas and areas with no large vessels. Additionally, in situations where the whole structure requires segmentation, this method isn’t appropriate. It is unclear what the ground truths are for the synthetic data. In one part of the paper it says ‘ground truth labels given by the unmodified voxelized volumes’, indicating they have not been altered, but within the supplementary material, Figure 6 shows a comparison of ground truth labels, implying alterations were made to the ground truth. If no modifications were made to the ground truth, then the model will be disregarding the simulated artifacts in the synthetically generated volumes, as it is using the ground truth as reference. On the other hand, if modifications have been made to the ground truths, this should be done with caution, as it could introduce bias or inaccuracies and produce misleading results that aren’t reflective of the true data. There is no description of the data used to generate the results in Table 1b (Frangi, Otsu). Are the results for these methods (Frangi, Otsu) generated using the same data as the study? If not, the comparison is slightly inappropriate, given that the performance cannot be directly compared, as one task may be more complex than the other. Although the performance achieved with synthetic data is promising, it does not surpass the performance achieved with the original, manual annotations. Given that this work has clinical applications, if a method does not surpass the current gold standard method, it is unlikely to replace it. Although automation of tasks is important, as discussed, for time and cost savings, the highest possible performance is more important still. The final weakness is that no limitations are discussed in the paper.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    no

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    A few additions would strengthen the paper. The paper describes having 3 test sets; however, details on these test sets, such as how many volumes were used in each, are missing. Justification for how the most dominant artifacts were chosen would be good. Additionally, information on the sampling of patches would complete the additions. For instance, although the patch size 250x250x250 was stated, it is unclear where the centre point was for determining whether a patch met the discard criteria, as well as at what stride the grid was scanned. Was there any overlap? Additionally, how much of the vascular corrosion cast was used/unused? It would be good to discuss how this would work in practice when segmenting the whole structure. This is potentially a limitation of the approach, as choosing sections that are more ideal (e.g., by discarding sections) limits generalisability by removing the ability to apply the method to more complex scenarios. ‘Upper bound’ is mentioned a few times with no explanation of what the upper bound refers to.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is very interesting. It has important clinical relevance and the results are promising. Neither the application nor the chosen algorithm is particularly innovative; however, the paper does contribute to a current clinical problem that still requires more progress. There is some novelty in the methodology through the generation of synthetic data and the introduction of artifacts. Finally, the presentation of findings is well explained and appropriately concise.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    1. The paper proposes the use of synthetic cerebral 3D OCTA images for blood vessel segmentation to overcome the lack of available manual annotations. 2. A synthesis pipeline that can be adapted to the data at hand is proposed, which models projection artifacts, angle-dependent signal loss, and artifacts dominated by localized granular noise patterns when synthesizing data. 3. The authors open-source the code, synthetic dataset, and manually annotated OCTA images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. For the first time, the authors present a synthesis pipeline capable of adaptively generating synthetic 3D OCTA images of the brain based on existing data to address the shortcomings of manual annotation. 2. Furthermore, in the supplement, the authors clearly demonstrate the pseudo-code of the synthesis algorithm and make the code publicly available, which allows other researchers to clearly reproduce the work.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The implementation of the third contribution is solely explained from the perspective of the synthetic algorithm, without experimental validation using data from different OCT system designs and acquisition protocols.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The method description is clear, and both the code and data have been made openly available, ensuring high reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1. We suggest revising the citation and referencing of figures in the manuscript, as some references fail to link to their corresponding locations, causing inconvenience during reading. 2. We recommend more experiments to demonstrate that the method can generate OCTA images based on data from different devices to enhance the persuasiveness of the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method proposed in the paper demonstrates a certain level of originality, addressing the limitations of manual annotation for 3D OCTA brain images, laying a foundation for future segmentation work.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors have responded point by point to the questions raised.




Author Feedback

We appreciate that the reviewers consider our work ‘strong in many aspects’ [R3], ‘a foundation for future work’ [R4], ‘a good idea’ [R1], to have important clinical benefits [R3], highly reproducible (we open-source synthetic data, real manually annotated data, and code) [R1,R3,R4], to have empirically strong results [R3], clearly presented [R1,R3,R4], and thoroughly evaluated and ablated [R3].

[R1] Lack of any evaluation of synthetic images: Training on synthetic images leads to strong segmentation results (Tab1, Fig5) achieved on real OCTA volumes (test set), empirically demonstrating their efficacy and quality (Fig3). With the sole purpose of image synthesis being segmentation, this represents the primary evaluation measure. Moreover, given that synthetic images were generated from patches of real vasculature embedded in corrosion casts covering the exact same cortical areas (Fig4a), we found morphological properties of vessels (node degrees, radii, length, etc.) to be extremely similar between synthetic and real images. An additional comparison of intensity histograms revealed similar characteristics in the distribution of background and foreground intensity values. Analysis of PSNR values painted a similar picture. We, therefore, argue that our synthetic images resemble real OCTA images very closely, thank R1 for the thoughtful comment, and will provide more details in the adjusted paper.
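One way such an intensity-histogram comparison could be carried out is sketched below. It is an illustration only and not the analysis performed for the rebuttal: histogram intersection is a swapped-in similarity measure, the function names are hypothetical, and intensities are assumed to be normalized to [0, 1].

```python
import numpy as np


def histogram_intersection(a: np.ndarray, b: np.ndarray, bins: int = 64) -> float:
    """Overlap of two normalized intensity histograms (1.0 = identical, 0.0 = disjoint).

    Assumes intensities lie in [0, 1] and that both inputs are non-empty.
    """
    ha, _ = np.histogram(a, bins=bins, range=(0.0, 1.0))
    hb, _ = np.histogram(b, bins=bins, range=(0.0, 1.0))
    pa = ha / ha.sum()
    pb = hb / hb.sum()
    return float(np.minimum(pa, pb).sum())


def compare_intensities(real, real_mask, synth, synth_mask, bins: int = 64) -> dict:
    """Compare foreground (vessel) and background intensities separately."""
    real_mask = np.asarray(real_mask, dtype=bool)
    synth_mask = np.asarray(synth_mask, dtype=bool)
    return {
        "foreground": histogram_intersection(real[real_mask], synth[synth_mask], bins),
        "background": histogram_intersection(real[~real_mask], synth[~synth_mask], bins),
    }
```

Scores close to 1 for both classes would indicate similar intensity distributions. The two volumes do not need to be spatially aligned, because only the intensity distributions are compared.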

[R1] Why Gaussian noise & smoothing: It is known from literature that background noise in OCTA images can be approximated by a Gaussian distribution [Fig1 in A]. To mimic the OCT’s point-spread function, we convolve the synthetic image with a Gaussian smoothing kernel.
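The two steps named in this answer, additive Gaussian background noise followed by Gaussian smoothing as a stand-in for the point-spread function, can be sketched as follows. This is a minimal illustration rather than the released pipeline: the function names, the default values of `noise_std` and `psf_sigma`, the zero-padded separable convolution, and the final rescaling to [0, 1] are all assumptions.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, truncate: float = 3.0) -> np.ndarray:
    """Normalized 1D Gaussian kernel covering +/- truncate * sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_3d(volume: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian smoothing along every axis (zero-padded at the borders)."""
    kernel = gaussian_kernel_1d(sigma)
    if min(volume.shape) < kernel.size:
        raise ValueError("volume is smaller than the smoothing kernel")
    out = volume.astype(np.float64)
    for axis in range(out.ndim):
        out = np.apply_along_axis(
            lambda line: np.convolve(line, kernel, mode="same"), axis, out
        )
    return out


def synthesize(label: np.ndarray, noise_std: float = 0.2,
               psf_sigma: float = 1.0, seed: int = 0) -> np.ndarray:
    """Binary vessel label -> noisy, blurred intensity volume in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = label.astype(np.float64) + rng.normal(0.0, noise_std, size=label.shape)
    blurred = smooth_3d(noisy, psf_sigma)
    lo, hi = blurred.min(), blurred.max()
    return (blurred - lo) / (hi - lo)
```

In this sketch the label itself stays untouched and serves as the ground truth for the returned volume, in line with the rebuttal's statement that the main experiment uses unmodified labels.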

[R1,R3] Train on both real & synthetic images; high performance is key: Great suggestion! Given that the main motivation of our work is to propose a synthesis pipeline for annotation-free OCTA segmentation, we advise future work to experiment on how to best combine real and synthetic images for potentially increased segmentation performance.

[R3] Details on Frangi filter & Otsu thresholding: Parameters of the Frangi filter were tuned on our validation volume and can be accessed in test.py. Otsu thresholding is a parameter-free approach. Both methods were applied to the same three test volumes.
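For readers unfamiliar with the parameter-free baseline, a NumPy-only sketch of Otsu thresholding and a Dice score is given below. It is not the code in `test.py`: the function names and the 256-bin histogram are assumptions, and the Frangi filter is omitted.

```python
import numpy as np


def otsu_threshold(volume: np.ndarray, bins: int = 256) -> float:
    """Threshold that maximizes the between-class variance of the intensities."""
    hist, edges = np.histogram(volume.ravel(), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist).astype(np.float64)   # voxel count in bins 0..i
    w1 = w0[-1] - w0                          # voxel count in bins i+1..end
    s0 = np.cumsum(hist * centers)            # intensity sum in bins 0..i
    s1 = s0[-1] - s0                          # intensity sum in bins i+1..end
    valid = (w0 > 0) & (w1 > 0)
    mu0 = s0 / np.maximum(w0, 1.0)
    mu1 = s1 / np.maximum(w1, 1.0)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)
    best = int(np.argmax(between))
    # Return the upper edge of the best bin, so that `volume >= t` selects class 1.
    return float(edges[best + 1])


def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)
```

A segmentation is then obtained as `volume >= otsu_threshold(volume)` and scored against the manual annotation with `dice`.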

[R3] Discarding patches limits applicability; more information on patch sampling: We solely discard patches containing vascular structures that will never be observed in real, in vivo OCTA images. In particular, because constrained OCT depth penetration renders whole-structure segmentation impossible, we discard patches that lie out of reach for modern OCT systems. Further, discarding sparsely populated areas with no larger vessels enables us to exclude artifacts of corrosion casts. We, therefore, argue that discarding patches is a necessity to curate a highly representative, synthetic dataset precisely tailored to real-life applications. To provide exact details on patch sampling, we will additionally open-source our volume generation script.
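Until that script is available, the discard logic described here can be pictured with the hypothetical sketch below. Every concrete choice is an assumption made for illustration: axis 0 is treated as depth, the grid is non-overlapping by default, a per-voxel `radius` volume is assumed to exist, and the thresholds `max_depth`, `min_density`, and `min_large_radius` are placeholders rather than the authors' values.

```python
from typing import Iterator, Optional, Tuple

import numpy as np


def sample_patches(
    mask: np.ndarray,
    radius: np.ndarray,
    patch: int = 250,
    stride: int = 250,
    max_depth: Optional[int] = None,
    min_density: float = 0.01,
    min_large_radius: float = 5.0,
) -> Iterator[Tuple[Tuple[int, int, int], np.ndarray]]:
    """Yield (origin, patch_mask) for grid patches that pass all discard criteria."""
    # Criterion 1: skip patches deeper than an OCT system could image.
    depth_limit = mask.shape[0] if max_depth is None else min(max_depth, mask.shape[0])
    for z in range(0, depth_limit - patch + 1, stride):
        for y in range(0, mask.shape[1] - patch + 1, stride):
            for x in range(0, mask.shape[2] - patch + 1, stride):
                window = (slice(z, z + patch), slice(y, y + patch), slice(x, x + patch))
                patch_mask = mask[window]
                # Criterion 2: skip sparsely populated patches.
                if patch_mask.mean() < min_density:
                    continue
                # Criterion 3: skip patches without any larger vessel.
                if radius[window].max() < min_large_radius:
                    continue
                yield (z, y, x), patch_mask
```

Choosing `stride` smaller than `patch` would give overlapping patches. Whether the original sampling overlaps is not stated in the rebuttal.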

[R3] Synthetic ground truth labels: Synthetic ground truth labels used in our main experiment (Tab1c) are given by unmodified, voxelized patches extracted from corrosion casts (Fig2 left). Modifications were solely made in the experiment on curvature (Tab1e, Fig6 LTAC). We thank R3 for the remark and will stress this in the paper.

[R4] More experiments to demonstrate that the synthesis pipeline can be adjusted to data from different devices: We fully agree that this would be an interesting addition to the paper! However, given that we are the first to open-source large amounts of annotated 3D OCTA data, it is currently impossible to properly quantitatively evaluate our synthesis pipeline on data from different devices. Nevertheless, we demonstrate the diversity of our pipeline in Fig7 and thank R4 for the comment.

We will revise the paper to clarify minor points that could not be addressed in the rebuttal.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


