Abstract

Volume Interpolated Breath-Hold Examination (VIBE) MRI generates images suitable for water and fat signal composition estimation. While the two-point VIBE provides rapid water-fat-separated images, the six-point VIBE allows estimation of the effective transversal relaxation rate R2* and the proton density fat fraction (PDFF), which are imaging markers for health and disease. Ambiguity during signal reconstruction can lead to water-fat swaps. This shortcoming challenges the application of VIBE-MRI for automated PDFF analyses of large-scale clinical data and population studies. This study develops an automated pipeline to detect and correct water-fat swaps in non-contrast-enhanced VIBE images. Our three-step pipeline begins with training a segmentation network to classify volumes as “fat-like” or “water-like,” using synthetic water-fat swaps generated by merging fat and water volumes with Perlin noise. Next, a denoising diffusion image-to-image network predicts water volumes as signal priors for correction. Finally, we integrate this prior into a physics-constrained model to recover accurate water and fat signals. Our approach achieves a < 1% error rate in water-fat swap detection for a 6-point VIBE. Notably, swaps disproportionately affect individuals in the Underweight and Class 3 Obesity BMI categories. Our correction algorithm ensures accurate solution selection in chemical phase MRIs, enabling reliable PDFF estimation. This forms a solid technical foundation for automated large-scale population imaging analysis.
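
To illustrate the ambiguity the abstract refers to, the following minimal sketch shows why magnitude-only two-point data admits two solutions per voxel. It assumes the idealized textbook signal model (in-phase = W + F, opposed-phase = |W - F|, no R2* decay, a single fat peak). The function names are hypothetical and are not taken from the MAGO-SP repository.

```python
def two_point_candidates(in_phase, opposed_phase):
    """Return both (water, fat) pairs consistent with magnitude-only data.

    Assumed idealized model: in_phase = W + F, opposed_phase = |W - F|.
    """
    large = (in_phase + opposed_phase) / 2.0
    small = (in_phase - opposed_phase) / 2.0
    # Without phase information, either component may be the water signal.
    return (large, small), (small, large)


def pdff(water, fat):
    """Proton density fat fraction F / (W + F); 0.0 for empty voxels."""
    total = water + fat
    return fat / total if total > 0 else 0.0


# Example voxel with true water = 80 and true fat = 20 (arbitrary units).
ip = 80.0 + 20.0        # in-phase magnitude
op = abs(80.0 - 20.0)   # opposed-phase magnitude
water_dominant, fat_dominant = two_point_candidates(ip, op)
print(pdff(*water_dominant))  # 0.2
print(pdff(*fat_dominant))    # 0.8
```

Both candidates reproduce the measured magnitudes exactly, so a swap turns a PDFF of 0.2 into 0.8. This is why an additional signal prior is needed to select the correct solution.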



Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0269_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/robert-graf/MAGO-SP

Link to the Dataset(s)

https://doi.org/10.1148/radiol.2015142272

BibTex

@InProceedings{GraRob_MAGOSP_MICCAI2025,
        author = { Graf, Robert and Möller, Hendrik and Starck, Sophie and Atad, Matan and Braun, Philipp and Stelter, Jonathan and Peters, Annette and Krist, Lilian and Willich, Stefan N. and Völzke, Henry and Bülow, Robin and Pischon, Tobias and Niendorf, Thoralf and Paetzold, Johannes C. and Karampinos, Dimitrios and Rueckert, Daniel and Kirschke, Jan},
        title = { { MAGO-SP: Detection and Correction of Water-Fat Swaps in Magnitude-Only VIBE MRI } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {327 -- 337}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    A method for correcting water-fat swaps in magnitude-only VIBE MRI is presented and evaluated. The main contribution lies in the presented pipeline and its evaluation on big datasets (NAKO and UKBB imaging). The three-step pipeline first detects water-fat swaps. In the second step, signal priors for the water-fat recovery are predicted (most fat/water reconstruction methods rely on signal priors for the reconstruction). The last step is a simplified physics-constrained model to recover water-fat swaps.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    All steps in the pipeline build on existing methodology. A novel aspect is the segmentation of data with simulated swaps (step 1). The approach seems to work well on the big datasets used in the paper. The method presented is potentially very valuable for large-scale analysis in big datasets such as UKBB imaging and NAKO.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The evaluation relies heavily on the segmentation network, since swaps are detected based on the resulting segmentation. It is not clear if the segmentation network is reliable enough. Details about the image-to-image generators are missing. The methodological advancement is mainly at a high level (assembling the pipeline), and neither the problem statement nor the methods used are new.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is a solid paper, but the methodological advancement is mainly at a high level (assembling the pipeline), and neither the problem statement nor the used methods are new.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    My concerns about method details and segmentation reliability have been taken into consideration. It is a weak accept.



Review #2

  • Please describe the contribution of the paper

    To detect and correct water-fat swaps in 2-point and multi-point DIXON acquisitions, when only magnitude MR images are available.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Train and evaluate a method on several large datasets (NAKO, UK-Biobank). Evaluating the method to the end point of swap-corrected images.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The methods are described a bit too compactly. While the techniques used seem to be referenced, it is not always clear which part of the reference is used nor what the output of the referenced method is. Specifically, it is not entirely clear what the nnU-Net predicts (global, patch, or voxel-wise label) or what the configuration of the diffusion model is during training and inference.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The use of large open datasets is a nice aspect of the paper. The large (relatively open) datasets are beneficial for replication.

    In the reference list: Check capitalization; typesetting of T2*, etc.

    Introduction, just before the ‘water-fat swaps’ section: Obviously the disentanglement of water and fat in the 2-point method requires in and out of phase images. In my opinion the fact that images at specific ti are required should be mentioned.

    The arms of the subjects were removed from the analysis, but the method used for that is not specified.

    How is the classification region provided to the nnU-Net? Is it a pixel-wise classification or patch/image-wise classification?

    What was the distribution of the threshold (with which the Perlin noise was thresholded) to get kappa_Perlin?

    Method; swap detection: please do not put experimental results in the methods (Dice score on training dataset). The method is still quite vague, though references are provided.

    I do not understand when voxels are classified as correct or wrong from the statement: ‘A voxel is counted as correct if the absolute difference between the reconstructed water voxel and the reference water voxel is smaller than that of the reference water voxel.’

    How were the subjects classified in the different obesity classes for table 3? (Is there no risk of circularity there by using an MRI-derived measure?)

    Minor: English writing quality/clarity issues. E.g. in: ‘Due to atomic bonding differences, fat-bound protons experience a shift in their Larmor frequency relative to water, causing them in phase and out of phase states between water and fat signals.’ I would suggest using the verb ‘have’ rather than ‘experience’; ‘them’ refers to fat-bound protons but then the description is on signal phase differences between water and fat protons.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I am sufficiently familiar with the objective and methods to evaluate the paper.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors indicate that they will seriously address the comments. With the promised release of code the method becomes sufficiently reproducible.



Review #3

  • Please describe the contribution of the paper

    This paper presents a novel method for retrospectively correcting a water/fat swapping artifact that is common in Dixon MR imaging. This remains an important problem in clinical The method is applicable to both 2-pt and 6-pt acquisitions and works on magnitude data (complex data not required). The method first detects regions of fat/water swaps with a segmentation model; if any swaps are detected a generative diffusion network is used to estimate a water image, and both the swap maps and generated image are used as input to a physics-based algorithm (MAGO or MAGORINO) to generate corrected images. The method is trained and tested using two large public datasets. In a test set with known swaps, the proposed method outperforms the physics-based methods alone. The swap detection rate component is used to evaluate full datasets, which reveals increased levels of swapping in data at the extrema of the obesity spectrum

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper addresses an important clinical imaging method. The overall strategy is practically and thoughtfully designed. The use of the large public datasets is very good. The training of the segmentation model is novel, using a Perlin noise generator to synthesize swapped regions. The results produced look quite good, and the error rates (0-2%) are low enough for routine use. The analysis of swap frequency by dataset and obesity category (table 3) is a distinct and partially independent analysis, which adds some significance to the work.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Some of the methods are unclear. 1) There must have been a manual review of vendor-processed images performed to identify swap cases prior to data splitting. This step is not described (how many readers, experience, review process, etc.) and the quality of this data preparation would impact the results. Very small swaps of only a few pixels may not be reliably determined with this method. 2) Details of the synthetic patching based on the Perlin noise process are not clear. The size and frequency of the patches are important but not described. 3) In Results, it is not obvious why the analysis of Table 2 only reports results with the proposed method, without comparison to other methods to put the results in context. This makes it harder to evaluate if this is a major or modest improvement over the state of the art. 4) Regarding clarity, this manuscript is generally well written, with clear explanations, good background and problem description. The presentation of results was somewhat difficult to follow, especially the analysis presented in table 3. This is quite distinct from the rest of the work but was not anticipated from the Method section.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The methods could be explained better (or my misunderstandings corrected) in the rebuttal. Two minor language issues that you might correct: “autochthon” is not commonly used in English, and the sentence claiming 879/1003 volumes were “fully inverted” is unclear. Future work (beyond scope of rebuttal) could evaluate accuracy of PDFF measures, which could add clinical value. Furthermore, since this can all be done retrospectively it would be great to make this available to the larger community to improve impact.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is an important topic, with a well-designed overall methodology to address the problem. The novelty of generating synthetic swaps for training was creative and adds general interest.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The rebuttal was clear and addressed minor details as requested. I generally agree with the other reviewers and their comments. The critical question is that of R2 on significance - this is an assembly of established methods used to address a problem. The rebuttal acknowledges the situation correctly. I value such contributions, and do not feel that a methodological novelty is mandatory for an impactful MICCAI paper. The authors and I appreciate and understand the significance critique, but I weigh that factor less, and continue to recommend acceptance.




Author Feedback

We sincerely thank the reviewers for their constructive and insightful feedback. We are pleased that the reviewers found the problem clinically relevant (R1, R2, R3), appreciated the novel synthetic training strategy (R1, R2), and recognized the potential impact of our method on large-scale population imaging (R1, R2, R3). Below, we address the main concerns raised.

R2: Novelty – While R2 notes our novelty is “high level,” this is precisely our contribution: we solve a previously unsolved, clinically relevant problem using a novel combination of classical and deep learning methods. Such a pipeline may seem logical in retrospect, but it required a series of nontrivial insights to develop and implement effectively. We are the first to improve the existing magnitude-only methods, enabling their application in vivo.

R2: Segmentation reliability – We validated on synthetic data and real swaps. Dice scores exceed 0.98 on synthetic data and show high reliability in real cases (Table 2, manual review). Our results demonstrate high reliability, as confirmed by clinical partners, R1, and R3.

R1–R3: Code availability and reproducibility – We will release all code, pretrained models, and inference functions for direct use on NAKO and UKBB data, ensuring full reproducibility. Requested implementation details will be added to the manuscript and to the GitHub repository.

R1-4 / R3: Clarity of Table 3 and obesity stratification – The BMI classifications are provided by the NAKO/UK Biobank via tabular data and are not derived from MRI. We intended to show the prevalence of water-fat swaps, especially since colleagues have had to remove the “underweight” cohort in UK Biobank studies due to the high prevalence of swaps. This table is not a direct evaluation of the method but a reference for other researchers. Furthermore, we aimed to show that swaps are biased and that subgroups are affected differently. Thus, Table 3 highlights the relevance of this previously underestimated problem.

The reviewers kindly requested several minor clarifications and additional details to be added to the manuscript. We address them below and will include the missing information in the camera-ready version.

R1-1: How was the segmentation collected – “A PhD-level researcher with over 5 years of MRI experience selected 500 swap-free NAKO volumes.”

R1-2 / R3: Details Perlin – Thank you; we will include the detailed parameters: “low-frequency (1-3 maxima per image), min-max normalized and applied a random threshold ($\mathcal{U}[0.15, 0.85]$).”

R1-3: Table 2 lacks baseline comparison – This is true; due to limited space, we focused more on the signal-prior aspect. We cited an extensive Master’s thesis by Fanny Asketun and Lisa Hellgren, which goes into more detail.

R2 / R3: Generator details – The image-to-image generator (Palette-Diffusion) uses a U-Net backbone with SiLU activation and DDIM inference (50 steps, t = 1000). Image size was 256×256 with an L1 loss. All details, including the conditioning strategy, can be found in the config files of the pretrained models.

R3: nnU-Net – The nnU-Net predicts voxel-wise labels. The input is the water or fat image and the first two sequence images.

R3: Arms exclusion (Table 2) – This test was performed by a medical expert instructed to ignore swaps in the arms. Arms in UKBB and NAKO are often in regions where the signal loses integrity and are cut off by the frame. They are generally not useful for any medical analysis.

R3: Classified as correct or wrong from the statement – In voxel-wise evaluation, a voxel is classified as correctly reconstructed if its computed value is closer to the reference than the swapped water voxel. We will rephrase this for clarity.

R3 / R1: Minor comments – We will correct grammar and typos as suggested, including capitalization in references; “autochthon” → spinal muscle; “fully inverted” → completely swapped; the Dice score in the method section; and others.
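
The synthetic-swap parameters and the voxel-wise criterion stated in the feedback above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it substitutes simple bilinear value noise for true Perlin noise, works on 2D lists rather than volumes, and reads “closer to the reference than the swapped water voxel” as |reconstructed - reference| < |swapped - reference|. All function names and the `grid` parameter are hypothetical.

```python
import random


def smooth_noise(height, width, grid=3, rng=random):
    """Low-frequency value noise (stand-in for Perlin noise, an assumption).

    A coarse (grid+1) x (grid+1) random lattice is bilinearly interpolated.
    """
    coarse = [[rng.random() for _ in range(grid + 1)] for _ in range(grid + 1)]
    field = []
    for y in range(height):
        gy = y / max(height - 1, 1) * grid
        y0 = min(int(gy), grid - 1)
        ty = gy - y0
        row = []
        for x in range(width):
            gx = x / max(width - 1, 1) * grid
            x0 = min(int(gx), grid - 1)
            tx = gx - x0
            top = coarse[y0][x0] * (1 - tx) + coarse[y0][x0 + 1] * tx
            bottom = coarse[y0 + 1][x0] * (1 - tx) + coarse[y0 + 1][x0 + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        field.append(row)
    return field


def min_max_normalize(field):
    """Rescale a 2D field to the range [0, 1]."""
    lo = min(min(row) for row in field)
    hi = max(max(row) for row in field)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in field]


def synthetic_swap(water, fat, rng=random):
    """Exchange water and fat where thresholded noise exceeds U[0.15, 0.85]."""
    h, w = len(water), len(water[0])
    noise = min_max_normalize(smooth_noise(h, w, rng=rng))
    threshold = rng.uniform(0.15, 0.85)
    mask = [[v > threshold for v in row] for row in noise]
    new_water = [[fat[y][x] if mask[y][x] else water[y][x] for x in range(w)]
                 for y in range(h)]
    new_fat = [[water[y][x] if mask[y][x] else fat[y][x] for x in range(w)]
               for y in range(h)]
    return new_water, new_fat, mask


def voxel_is_correct(reconstructed, reference, swapped):
    """Assumed reading of the criterion: closer to reference than the swap."""
    return abs(reconstructed - reference) < abs(swapped - reference)


# Usage: constant toy images make the swapped region easy to inspect.
rng = random.Random(0)
water = [[100.0] * 16 for _ in range(16)]
fat = [[10.0] * 16 for _ in range(16)]
swapped_water, swapped_fat, mask = synthetic_swap(water, fat, rng=rng)
```

The returned mask would serve as the voxel-wise segmentation target, which matches the statement that the nnU-Net predicts voxel-wise labels.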




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    A thorough study. Reviewers are satisfied with all phases of it, even though many of the steps of the methodology have limited novelty. The authors also promised to make the code and trained models available.


