Abstract

Segmentation of fetal brain tissue from magnetic resonance imaging (MRI) plays a crucial role in the study of in-utero neurodevelopment. However, automated tools face substantial domain shift challenges as they must be robust to highly heterogeneous clinical data, often limited in number and lacking annotations. Indeed, high variability of the fetal brain morphology, MRI acquisition parameters, and super-resolution reconstruction (SR) algorithms adversely affects the model’s performance when evaluated out-of-domain. In this work, we introduce FetalSynthSeg, a domain randomization method to segment fetal brain MRI, inspired by SynthSeg. Our results show that models trained solely on synthetic data outperform models trained on real data in out-of-domain settings, validated on a 120-subject cross-domain dataset. Furthermore, we extend our evaluation to 40 subjects acquired using low-field (0.55T) MRI and reconstructed with novel SR models, showcasing robustness across different magnetic field strengths and SR algorithms. Leveraging a generative synthetic approach, we tackle the domain shift problem in fetal brain MRI and offer compelling prospects for applications in fields with limited and highly heterogeneous data.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3487_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3487_supp.pdf

Link to the Code Repository

https://github.com/Medical-Image-Analysis-Laboratory/FetalSynthSeg

Link to the Dataset(s)

https://www.synapse.org/Synapse:syn25649159/wiki/610007

BibTex

@InProceedings{Zal_Improving_MICCAI2024,
        author = { Zalevskyi, Vladyslav and Sanchez, Thomas and Roulet, Margaux and Aviles Verdera, Jordina and Hutter, Jana and Kebiri, Hamza and Bach Cuadra, Meritxell},
        title = { { Improving cross-domain brain tissue segmentation in fetal MRI with synthetic data } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15001},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces FetalSynthSeg, a domain randomization method for segmenting fetal brain MRI that accommodates specific fetal anatomical properties, acquisition artefacts, and heterogeneity arising from fetal brain development and super-resolution reconstruction algorithms. The experimental results show the effectiveness of the proposed model, trained only on synthetic data, when evaluated on out-of-domain data, as well as the robustness of the proposed approach to unseen super-resolution reconstruction algorithms and magnetic field strengths.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    FetalSynthSeg demonstrates robust fetal brain tissue segmentation across datasets with significant domain shifts. The synthetic image generation helps the proposed model overcome discrepancies caused by MRI acquisition variations and super-resolution reconstruction algorithms. The generalization of FetalSynthSeg to low-field MRI is of utmost significance, offering an avenue to enhance fetal MRI accessibility in underserved cohorts and low-income regions by providing a cost-effective diagnostic solution.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    FetalSynthSeg adapts SynthSeg to fetal brain MRI by introducing four fetal-specific meta-labels into the generation model. However, this strategy is similar to the one used in SynthSeg for cross-domain cardiac MRI segmentation. Please clarify the difference between SynthSeg and FetalSynthSeg. Although the experimental results show that FetalSynthSeg overcomes differences arising from super-resolution reconstruction algorithms and MRI acquisition variations, the description and implementation details of how the method overcomes them are vague.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Due to domain shift, it is difficult for automatic segmentation methods to maintain robustness. FetalSynthSeg demonstrates generalizability of fetal brain tissue segmentation across variations in fetal brain morphology, MRI acquisition, and super-resolution reconstruction algorithms. The paper needs to add a methodological description of how these domain shifts are overcome. FetalSynthSeg introduces four meta-labels to improve synthetic image generation, but it lacks a quantitative experiment illustrating the necessity of the four meta-labels. All quantitative results should also be presented in tables for easy and clear comparison.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work is mainly an extension of SynthSeg to fetal brain tissue segmentation and lacks a certain degree of innovation. The quantitative comparison of experimental results does not demonstrate the effectiveness of the proposed network design. The paper needs to add a detailed analysis of how FetalSynthSeg overcomes fetal brain MRI domain shifts.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Authors have provided a satisfactory rebuttal.



Review #2

  • Please describe the contribution of the paper

    The paper presents a novel domain randomization technique for segmenting fetal brain MRI images. This approach is inspired by SynthSeg, leveraging synthetic data to enhance the robustness and accuracy of the segmentation under varying imaging conditions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1. Diverse Dataset Validation: The paper leverages multiple datasets constructed through various super-resolution (SR) methods to validate the proposed segmentation technique.

    2. Including validation on low-field MRI datasets.

    3. The manuscript is well-written, with clear and concise explanations of the methodology, results, and implications.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The paper utilizes only 35 real images for baseline model training compared to 7,000 synthetic images for the domain randomization approach. This discrepancy in dataset sizes may lead to unfair comparisons and could skew the performance metrics in favor of the synthetic model.
    2. Lack of Ablation Studies.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors should add some ablation studies, such as on the network layers, activation functions, and the number of synthetic images used in training.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I recommend that the authors incorporate ablation studies to strengthen the paper’s contributions. Specifically, it would be useful to explore the impact of various network layers, activation functions, and the numbers of synthetic images used in training.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written, and validation is performed on multiple datasets.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Reject — should be rejected, independent of rebuttal (2)

  • [Post rebuttal] Please justify your decision

    The clarity of the technical novelty should be improved. While the potential of the work is evident, its current form falls short of the standards expected for publication.



Review #3

  • Please describe the contribution of the paper

    The paper presents a method for generating synthetic fetal brain MRI data to train robust segmentation models. Models trained solely on the synthetic data outperform those trained on real data when evaluated out-of-domain.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed method was validated on multiple datasets varying in acquisition parameters and SR methods.
    2. The experimental result is sound.
    3. The paper is well-written and easy to follow.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The clarity of the technical novelty should be improved.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I suggest the authors release the weight and code of their model for better reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • I suggest the authors also emphasize their technical novelty in the contribution part. 
    • SynthSeg also introduced a class clustering solution to replace using the segmentation directly (Fig. S3). What is the difference between the proposed one and SynthSeg?
    • Is there any specific reason why fit_nnUnet outperformed the proposed solution?
    • What is the performance of those investigated models on sub-classes (WM, GM, and CSF)?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper conducts robust experiments on a diverse dataset. Despite minor technical novelty, it has the potential for significant clinical translation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors successfully addressed my concerns. I would recommend the author list the difference between FetalSynthSeg and SynthSeg and its motivation as an additional technical contribution to this paper.




Author Feedback

We thank the reviewers for their comments and their time. We address their questions and remarks below.

  • R1, R3. Technical novelty and differences with SynthSeg cardiac MRI. Our study demonstrates the potential of using synthetic data in fetal brain segmentation, focusing on single-source-domain generalization in highly heterogeneous domains. To our knowledge, this is the first study to apply SynthSeg segmentation models to fetal neuroimaging. We highlight the effectiveness of synthetic data in real-world clinical settings and incorporate low-field imaging, which can accelerate the adoption of this technology.

We organized the paper and allocated space to highlight our main contribution in the new application domain, which seemingly resulted in an under-representation of the methodological aspect of the work. The main methodological difference between FetalSynthSeg and SynthSeg lies in the use of meta-labels (merged target segmentation labels) on which we perform intensity-based splitting tailored to deal with fetal super-resolution domain shifts. This differs from cardiac SynthSeg, which splits original segmentation labels into subclasses.

R1. Meta labels link to overcome domain shifts. The meta-label-splitting strategy is aimed at accurately mimicking tissue heterogeneity and super-resolution (SR) reconstruction artifacts and errors, thus mitigating domain shifts caused by these factors. Furthermore, using meta-labels’ subclasses helps overcome the limitations of a small number of generation classes, which can create artificial boundaries between brain regions leading to intensity borders aligned with ground truth segmentation labels.
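To make the meta-label splitting strategy described above concrete, here is a minimal, hypothetical Python sketch (this is not the authors' released code; the function name, intensity ranges, and number of subclasses are illustrative assumptions). Each meta-label is split into intensity-based subclasses, and each subclass is then rendered with a randomly drawn Gaussian appearance, mimicking tissue heterogeneity under domain randomization:

```python
import numpy as np

def synthesize_from_labels(label_map, image, n_subclasses=2, rng=None):
    """Generate one synthetic training image from a (meta-)label map.

    Hypothetical sketch of intensity-based splitting: each meta-label is
    divided into subclasses by quantile-binning the real image's
    intensities under that label; every subclass is then rendered with a
    randomly drawn Gaussian appearance (random contrast).
    """
    rng = rng or np.random.default_rng(0)
    synth = np.zeros(image.shape, dtype=float)
    for lab in np.unique(label_map):
        mask = label_map == lab
        vals = image[mask].astype(float)
        # Intensity-based splitting: quantile edges -> subclass per voxel
        edges = np.quantile(vals, np.linspace(0, 1, n_subclasses + 1))
        sub = np.clip(np.searchsorted(edges[1:-1], vals, side="right"),
                      0, n_subclasses - 1)
        out = np.empty_like(vals)
        for s in range(n_subclasses):
            # Random appearance per subclass (illustrative ranges)
            mu, sigma = rng.uniform(0.0, 255.0), rng.uniform(1.0, 25.0)
            sel = sub == s
            out[sel] = rng.normal(mu, sigma, int(sel.sum()))
        synth[mask] = out
    return synth
```

In a full SynthSeg-style generator this step would typically be followed by random bias fields, blurring, and spatial deformation before the image is used for training.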

  • R1, R3, R4. Ablation Studies and Quantitative Results. Our quantitative results do demonstrate the effectiveness of our model over baseline. Due to space constraints, we did not include direct comparisons between SynthSeg and FetalSynthSeg or detailed metrics for each segmentation class. However, we recognize the importance of this information, as noted by multiple reviewers.

Preliminary analysis comparing FetalSynthSeg to the baseline model and SynthSeg showed a consistent Dice improvement across all tissues and splits, with average increases of 5.16%±3.71 and 4.84%±2.71, respectively. Notably, the only case where FetalSynthSeg’s performance is slightly lower (54.9±15.9 vs. 54.5±17.7) than the baseline’s is GM segmentation on KISPI-MIAL, which is explained by the low quality of the ground-truth segmentations on this split, as mentioned in the paper.

We will shorten the supplementary material section to include a table with these detailed results, as it is important for all reviewers and provides a more detailed view of our experiments.

  • R4. Ablation experiments. Although in the current paper we have used the same model architecture and training schedule to ensure comparability between the experiments, we acknowledge the importance of ablation studies and architecture optimization, which are intended for an extension of this work.

  • R4. Fairness of Comparison (7000 images vs 35):

Both approaches are based on the same 35 real images for training on each split. FetalSynthSeg uses 7000 synthetic images generated offline from the ground-truth segmentation labels of these 35 images, whereas the baseline model directly uses the original 35 T2w intensity images with on-the-fly data augmentation. The same augmentations are used for both models, and both models have the same amount of supervision signal and the same training schedule. This comparison is similar to the one in the SynthSeg paper, which evaluates whether using synthetic data generated from segmentations, rather than the corresponding intensity images, helps achieve domain generalization.

  • R3. Comparison with fit_nn_UNet. The model outperforms FetalSynthSeg because it is trained on three times more data (120 cases) and uses an ensemble of models for prediction (5 nnU-Nets) along with post-processing techniques.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper has sufficient novelty and validation for acceptance. One of the reviewers changed his/her score from 2 to 5 after the rebuttal, as s/he became convinced of the difference from existing approaches.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



