Abstract

Magnetic Resonance Imaging (MRI) is a powerful medical imaging modality that involves no ionizing radiation. However, because of its long scanning time, patient movement is prone to occur during acquisition. Severe motion can significantly degrade image quality and render the images non-diagnostic. This paper introduces MoCo-Diff, a novel two-stage deep learning framework designed to correct motion artifacts in 3D MRI volumes. In the first stage, we exploit a novel attention mechanism using shifted window-based transformers in both the in-slice and through-slice directions to effectively remove the motion artifacts. In the second stage, the initially corrected image serves as the prior for realistic MR image restoration. This stage incorporates the pre-trained Stable Diffusion to leverage its robust generative capability and the ControlUNet to fine-tune the diffusion model with the assistance of the prior. Moreover, we introduce an uncertainty predictor to assess the reliability of the motion-corrected images, which not only visually indicates motion correction errors but also enhances motion correction quality by trimming the prior with dynamic weights. Our experiments illustrate MoCo-Diff’s superiority over state-of-the-art approaches in removing motion artifacts and retaining anatomical details across different levels of motion severity. The code is available at https://github.com/fengza/MoCo-Diff.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1678_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1678_supp.pdf

Link to the Code Repository

https://github.com/fengza/MoCo-Diff

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Li_MoCoDiff_MICCAI2024,
        author = { Li, Feng and Zhou, Zijian and Fang, Yu and Cai, Jiangdong and Wang, Qian},
        title = { { MoCo-Diff: Adaptive Conditional Prior on Diffusion Network for MRI Motion Correction } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15006},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces MoCo-Diff, a new method using deep learning to fix movement errors in 3D MRI scans. It combines two advanced techniques: one that adjusts images initially and another that refines these adjustments to improve image quality. An uncertainty predictor enhances the correction by dynamically adjusting the process based on the certainty of the corrections.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. MoCo-Diff integrates a transformer and a diffusion model, presenting a cutting-edge approach to effectively tackle MRI motion artifacts.
    2. The paper incorporates an uncertainty predictor to assess and refine the confidence of corrections, making the MRI images more reliable for clinical use.
    3. The method achieves better artifact removal and detail preservation than existing methods; the authors support this with extensive testing across multiple datasets.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Using a dual-branch transformer and diffusion model approach may require significant computational resources, potentially limiting its applicability in less-equipped settings.
    2. While the paper presents results from simulated datasets, the real-world effectiveness of the model remains somewhat uncertain without broader clinical discussion. A short general discussion or limitations section would also help to critically evaluate the presented approach.
    3. Also, the model's specificity to the datasets and motion patterns used during training means it might not generalize well to other types of motion artifacts or different MRI modalities.
    4. To me the introduction feels as if it was generated with ChatGPT: words like ‘vital’, ‘leveraging’, ‘grasping’, and ‘intricate’ are commonly produced by ChatGPT.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • It would help to expand on how exactly the uncertainty predictor works within your framework—how does it interact with other components, and how does it affect the final images?
    • Can you discuss any limitations or challenges you saw in your tests? Where does MoCo-Diff struggle, and why?
    • It would be beneficial to discuss how well MoCo-Diff adapts to different types of MRI scans and motion patterns. This would give a clearer picture of its versatility. Please also discuss the computational demands and scalability of your model.
    • The uncertainty predictor is a standout feature. I would suggest explaining its design and purpose in more detail.
    • Could you clarify what makes the adaptive conditional prior in your diffusion model different from other similar techniques?
    • In case of rebuttal, it would be great to include all necessary experimental details, such as training duration, computational resources, and specific settings, so that others can replicate your work.
    • Some of your figures pack in a lot of info. Maybe simplify these or add more detailed captions to help readers follow along better. For example, Figure 1 shows the workflow but contains many different arrows and elements, and it is not very clear to me.
    • Double-check that all figures and tables are properly referenced in the text and clearly relate to your discussion points.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Please see the comments I gave to the authors to improve the strength of the paper. I feel the topic is very interesting and the method is novel. However, I feel the paper still needs more work, and I am not sure whether this can be done in the rebuttal period.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    Based on the other reviewers’ feedback and the authors’ comments, I am willing to give a weak accept. I would suggest that the authors address as many of the mentioned concerns as possible in the final version of the manuscript.



Review #2

  • Please describe the contribution of the paper

    The paper introduces MoCo-Diff, a novel two-stage deep learning framework for MRI motion correction. MoCo-Diff’s innovations lie in its attention mechanism, adaptive prior strategy and uncertainty predictor, which collectively improve the accuracy and quality of motion correction in 3D MRI volumes.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper are as follows: The authors have provided a clear and well-defined objective for their research. Their dedication to addressing the challenge of motion artifacts in MRI and improving image quality is evident. The proposed MoCo-Diff framework showcases their innovative thinking and commitment to advancing the field of medical imaging.

    Novel Two-Stage Framework: The paper proposes a two-stage framework, MoCo-Diff, for MRI motion correction. This framework combines a Dual Branch Transformer (DBT) model with an adaptive prior strategy for the diffusion model. This novel formulation allows for improved synthesis fidelity and perception within the motion correction domain.

    Attention Mechanism: MoCo-Diff introduces a shift window-based transformer with attention mechanisms in both the in-slice and through-slice directions. This attention mechanism is a novel way to effectively learn 3D motion features through a 2D computation framework. It enhances the accuracy of motion correction by focusing on relevant areas of the image.

    Adaptive Prior Strategy: The adaptive prior strategy in MoCo-Diff controls each step of the generation process with the prior derived from the first stage. This strategy effectively mitigates the inclusion of “fake” details in medical images, resulting in more accurate and realistic restoration. This aspect is particularly interesting as it helps to improve the quality of motion-corrected images.

    Evaluation of Segmentation Performance: The paper evaluates the impact of the recovered tissue details on downstream segmentation. This evaluation helps gauge the quality of the motion-corrected images and demonstrates the effectiveness of MoCo-Diff in artifact removal and detail preservation. This strong evaluation provides evidence of the practical feasibility and clinical relevance of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    One limitation of this paper is that while it discusses 3D consistency, the comparison results provided are only for single images. It would be beneficial to show a sufficiently long sequence of consecutive slices to demonstrate the effectiveness of the proposed method across the volume. If space is a constraint, it would be acceptable to compare only the results of “GT” and “Ours”. Alternatively, line graphs of the errors of different methods on the same sequence of images would also be informative, although visual image comparisons are preferable.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The quality of this paper is satisfactory, and I would be willing to give it a rating of 5 or 6 if the following issues are addressed or explained in the paper:

    1. The paper does not clearly explain or emphasize the benefits of combining the Unet and the diffusion model. It would be helpful to add more description or to highlight the advantages of this combination.
    2. Please provide a detailed description of what the Error Map represents in terms of error measurement (e.g., mean squared error, absolute error), as shown in Figure 2.
    3. Figure 3 represents a motion severity of 40%, but it appears to be less severe than the 40% motion severity depicted in Figure 2. Please clarify this discrepancy.
    4. The paper mentions that the improved algorithm enhances the overall consistency of MRI images in the entire volume, but there is a lack of comparison and demonstration of the results. It would be beneficial to include visual comparisons or showcases of the results.
    5. It would be more meaningful to use visual illustrations to demonstrate the image restoration results with different modules in the ablation study, rather than presenting the results solely in numerical form (Table 2). It is suggested to include the numerical data in the supplementary materials and present the image comparisons in the main text.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The project is relatively comprehensive and has a reasonable level of innovation. However, there are some issues that need further clarification. I will decide whether to give a rating of 5 based on the results of the rebuttal.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The workload of this paper is substantial, and the methods used are relatively novel.



Review #3

  • Please describe the contribution of the paper

    The paper describes a two-stage framework for motion correction in MRI images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (S1) The paper is technically thorough and provides a lot of mathematical details about the proposed approach. (S2) The qualitative and quantitative results look extremely promising.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (W1) Figure 1 is too complicated and cluttered. Please simplify the figure, highlight the workflow, and focus the visuals on the essential parts that are your contribution.
    (W2) Some of the architectural design decisions need more motivation. For example, how does AP-Diff tackle texture loss and oversmoothness?
    (W3) More discussion of computational complexity is needed. How much does inference cost after training? Is it feasible to use this in a clinical setting?
    (W4) Please add a brief discussion of how well the simulated motion distortion reflects real, non-simulated motion distortion.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper provides relevant hyperparameter configurations and will release the source code upon acceptance. This should be satisfactory for its reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (C1) Add labels to the columns of Table 1. Currently, the labels are only in the caption. (C2) Typo: artefacts –> artifacts. (C3) Very informative abstract.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an innovative approach for MRI motion correction. It is well written and outperforms the state of the art.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We greatly appreciate the reviewers’ recognition of our work’s technical novelty (R1, R3, R4) and its practical value (R1, R4). We hope the concerns are addressed as follows.

R3/4: Applicability of inference with limited computational resources. The experimental details are elaborated in Section 3, and we will release the code/model upon acceptance to support reproducibility. During inference, our model requires 10.742GB of GPU memory, which is feasible for a typical single-GPU system. Moreover, our model does not require real-time inference, which reduces the demand for computational resources. Its application in our ongoing large-scale brain cohort construction has rescued a large amount of data that previously failed quality control due to motion artifacts. This in-house application underlines the merit of our method, especially when it is not feasible to reacquire the data from the volunteers.

R3: Adaptation to real-world data. We argue that our method is effective in handling real-world data, as already demonstrated in our paper by validations on three external datasets with real (not simulated) motion artifacts in Section 3 (Fig. 3, Fig. S2). The in-house application stated above also verifies its adaptation to real-world data.

R3: Adaptation to different data types. Fig. S2 shows the model’s adaptation to data from different scanners and imaging parameters (FOV, TR, TE, etc.), demonstrating its efficacy in correcting motion in diverse data. In this paper we focus on a single MRI modality, partially due to the page limit. We agree with the comment and are working on multi-modal validation. We will soon release results in follow-up papers.

R3: Clarification on Uncertainty Predictor (UP). First, we apologize for a typo in Eq. 5, where the loss term in the second line should be the same as that in the first line. We will correct this in the final version. The diffusion model benefits significantly from conditional guidance, and UP provides an adaptive way to improve the guidance precision. As detailed in Section 2.2, UP employs a DBT-equivalent network with three decoders to estimate the guidance and the parameters α and β separately (Eq. S1). The parameters α and β are subsequently used to compute the uncertainty map (Eq. S2). The diffusion model reaches its final prediction through guided iterative sampling, while the uncertainty map produced by UP refines the sampling process by adjusting loss weights (Eq. 5). In ablation studies (Table 2), we validated that UP can prevent PSNR degradation while improving perceptual quality.
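As an editorial illustration only, the idea of adjusting loss weights with an uncertainty map can be sketched as below. This is not the authors' Eq. 5 or Eq. S2, neither of which is reproduced on this page. The function name `weighted_prior_discrepancy`, the weighting rule `1 - uncertainty`, and the use of an absolute difference are all assumptions chosen to keep the sketch minimal.

```python
def weighted_prior_discrepancy(sample, prior, uncertainty):
    """Weighted mean absolute difference between a sample and a prior.

    Hypothetical sketch: each pixel is weighted by (1 - uncertainty), so
    pixels where the prior is judged unreliable contribute less. Inputs are
    2D lists of floats with identical shapes; uncertainty values are
    assumed to lie in [0, 1].
    """
    total = 0.0
    weight_sum = 0.0
    for s_row, p_row, u_row in zip(sample, prior, uncertainty):
        for s, p, u in zip(s_row, p_row, u_row):
            if not 0.0 <= u <= 1.0:
                raise ValueError("uncertainty values must lie in [0, 1]")
            w = 1.0 - u  # assumed rule: high uncertainty means low weight
            total += w * abs(s - p)
            weight_sum += w
    # If every pixel is fully uncertain, the prior contributes nothing.
    return total / weight_sum if weight_sum > 0.0 else 0.0


if __name__ == "__main__":
    sample = [[0.2, 0.9]]
    prior = [[0.2, 0.1]]
    uncertainty = [[0.0, 1.0]]  # second pixel is fully distrusted
    print(weighted_prior_discrepancy(sample, prior, uncertainty))
```

In this toy case the only disagreement between sample and prior sits at the fully uncertain pixel, so it is ignored. Whether the actual method uses a comparable rule would need to be checked against the paper and the released code.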

R1: Lack of visualization showing improved consistency in MRI volume. We observed enhanced through-slice consistency when checking the 3D volumes. Regrettably, given the rebuttal constraints, we cannot show the results now. We intend to publish the code/model and display visual results on the GitHub page after acceptance.

R1/4: Clarification on the benefits of network designs. 1) Integrating the trainable ControlNet with the frozen Unet in diffusion enhances training efficiency and the quality of the corrected images, and mitigates overfitting on relatively small datasets while preserving the knowledge that the large model learned from billions of images; 2) AP-Diff relies on the good initialization provided by DBT and focuses on distinguishing features for authentic texture prediction and enhanced visual quality, thereby tackling texture loss and oversmoothness.

R1/3/4: Minor issues on writing. 1) Fig. 3 displays real-world artifact correction across three external datasets; 2) We apologize for any confusion about the UP structure: it is a DBT-equivalent network with three decoders. In the final version, we will revise and simplify Fig. 1; 3) The Absolute Error Maps range from 0 to 1; 4) We will check and revise all the details and grammar in the final version; 5) A brief discussion: besides high performance on diverse motion types, our model, like other diffusion-based models, should also address acceleration and lightweight challenges.
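As a small editorial illustration of point 3), an absolute error map is the per-pixel absolute difference between a corrected image and the ground truth. If both images are scaled to [0, 1], the map also lies in [0, 1]. The sketch below assumes that scaling; the function name `absolute_error_map` is hypothetical and the code is not taken from the authors' repository.

```python
def absolute_error_map(pred, gt):
    """Per-pixel absolute error between two images scaled to [0, 1].

    Both inputs are 2D lists of floats with the same shape. Because each
    value lies in [0, 1], every entry of the result also lies in [0, 1].
    """
    error = []
    for pred_row, gt_row in zip(pred, gt):
        row = []
        for p, g in zip(pred_row, gt_row):
            if not (0.0 <= p <= 1.0 and 0.0 <= g <= 1.0):
                raise ValueError("intensities are assumed to be in [0, 1]")
            row.append(abs(p - g))
        error.append(row)
    return error


if __name__ == "__main__":
    corrected = [[0.0, 1.0], [0.5, 0.25]]
    reference = [[1.0, 1.0], [0.5, 0.75]]
    for row in absolute_error_map(corrected, reference):
        print(row)
```

A value of 0 marks a pixel that matches the reference exactly, and a value of 1 marks the largest possible disagreement under this scaling.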




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors’ rebuttal has comprehensively addressed reviewers’ concerns within the limited space.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors comprehensively address the concerns raised - the reviewers who answered the rebuttal articulated this clearly. The previous outlier “reject” has agreed to raise to “tentatively accept” based on the good rebuttal.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



