Abstract
Multi-sequence magnetic resonance imaging (MRI) faces critical challenges in balancing accelerated acquisition and image quality: Rapid scanning typically induces degradation, including resolution reduction, increased noise, motion artifacts, and image blurring. While existing image enhancement models partially mitigate these issues, they often exhibit insufficient exploitation of complementary information across multi-sequence data. To address this issue, we propose an interpretable deep learning framework, FDF-VQVAE, for MRI image enhancement through frequency-domain feature disentanglement and fusion. Our framework constructs a dual-branch frequency-domain disentanglement module (DBFD) that precisely decouples high-frequency and low-frequency features of different sequences through parallel high-frequency feature and low-frequency feature extraction pathways. The multi-frequency-domain feature weighting mechanism (MFDFW) adaptively fuses the high- and low-frequency features of different sequences. Finally, feature recombination and decoding achieve MRI enhancement through joint optimization. We conducted denoising, super-resolution, and deblurring experiments on the IXI dataset (546 subjects) with external validation on the BraTS2021 dataset (357 subjects). Experimental results demonstrate that our method significantly outperforms the state-of-the-art approaches in denoising, motion artifact removal, and super-resolution tasks. Our code is available at https://github.com/kkllxh/FDF-VQVAE.
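To make the idea of separating an image into low- and high-frequency content concrete, here is a minimal NumPy sketch. It is not the paper's DBFD module, which uses learned network branches; it only illustrates a fixed, one-level Haar-style split. The function name `split_frequencies` is hypothetical, and the choice of 2x2 block averaging as the low-pass step is an assumption made for illustration.

```python
import numpy as np


def split_frequencies(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split a 2D image (even height and width) into low- and high-frequency parts.

    Illustrative only: a fixed Haar-style split, not a learned disentanglement.
    """
    h, w = image.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Haar approximation band: mean of each non-overlapping 2x2 block.
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    approx = blocks.mean(axis=(1, 3))
    # Bring the approximation back to full size by repeating each block mean.
    low = np.repeat(np.repeat(approx, 2, axis=0), 2, axis=1)
    # Everything the block means cannot represent is treated as high frequency.
    high = image - low
    return low, high
```

By construction the two parts sum back to the input, so the split loses no information; a perfectly flat image has an all-zero high-frequency part.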
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2926_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/kkllxh/FDF-VQVAE
Link to the Dataset(s)
N/A
BibTex
@InProceedings{XieXin_FDFVQVAE_MICCAI2025,
author = { Xie, Xinghe and Han, Luyi and Sun, Yue and Lam, Chi Kin and Zheng, Jian and Tong, Tong and Ke, Wei and Lam, Chan-Tong and Tan, Tao},
title = { { FDF-VQVAE: A Frequency Disentanglement and Fusion Learning Framework for Multi-Sequence MRI Enhancement } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15962},
month = {September},
pages = {186 -- 196}
}
Reviews
Review #1
- Please describe the contribution of the paper
- The proposed dual-branch frequency-domain disentanglement module (DBFD) effectively separates high-frequency and low-frequency features, leveraging the complementary information across different sequences.
- The MSFW mechanism allows for adaptive fusion of features, optimizing the contribution of high-frequency and low-frequency components.
- Extensive experiments on the IXI and BraTS2021 datasets demonstrate the effectiveness of the method in tasks such as super-resolution, denoising, and motion artifact removal, outperforming state-of-the-art approaches.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper provides a thorough evaluation using multiple metrics and datasets, showcasing the robustness and generalization capabilities of the method.
- Experimental results indicate that the proposed method significantly outperforms existing techniques in various MRI enhancement tasks.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The approach of disentangling and fusing features in the frequency domain is a commonly used method.
- There are too few methods for comparison and most of them are not tailored for this task. It is necessary to compare with related works such as MRI SR, denoising, and deblurring.
- The whole set of methods directly uses natural image processing methods, ignoring the use of K-space data. Moreover, all of them are simulated data sets and lack verification on real images.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Innovation and experimental integrity.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper introduces a multi-sequence MRI enhancement method. It first extracts high- and low-frequency features in the frequency domain, and proposes a weighting module to balance them. The weighted features pass through a decoder to generate the final fused images. The model was validated on three tasks: denoising, motion artifact suppression, and super-resolution.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The author decouples high- and low-frequency features from each modality, which could extract useful complementary features.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Limited practicality. While it is natural to combine multi-sequence features for MRI enhancement, it limits the practicality of the proposed model when certain modalities are absent in clinical settings.
- It is worrying that inaccurate cross-modal registration among sequences could harm the model’s practical performance, given that the model heavily relies on high-frequency features, which are sensitive to local textures.
- Unclear training and testing setting. Does the model output a single sequence per inference? Does the model handle all degradations, i.e., super-resolution, denoising, and motion artifacts, all at once? My understanding is that the model still needs retraining for each task.
- In Fig. 1a, does the DBFD module share weights? The author should explain more on technical details.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While the proposed model has a quite intricate design, I have concerns about its applicability, e.g., it requires three modalities to run and might be affected by inter-modal registration. Please refer to details in the weakness section.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
While I have raised my score due to the author addressing most of my concerns, a potential improvement in the method’s practicality lies in a more comprehensive analysis of misalignment. This is important because in-vivo acquisitions, even those from a single session, are susceptible to motion artifacts.
Review #3
- Please describe the contribution of the paper
The paper proposes FDF-VQVAE, a novel framework for multi-sequence MRI enhancement based on frequency-domain feature disentanglement and fusion. Specifically, it introduces a Dual-Branch Frequency Disentanglement (DBFD) module to separately extract high- and low-frequency components from different MRI sequences via parallel pathways, improving feature interpretability. To effectively integrate these components, a Multi-Spectrum Feature Weighting (MSFW) mechanism is designed to adaptively fuse frequency-specific features across sequences, enabling task-specific enhancement such as denoising, super-resolution, and motion artifact removal. Additionally, a Wavelet Transform Decoupling Loss is proposed to enforce frequency disentanglement through wavelet-based supervision. Extensive experiments on the IXI and BraTS2021 datasets demonstrate that the proposed method outperforms state-of-the-art approaches across multiple MRI enhancement tasks (denoising, motion artifact suppression, and super-resolution).
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novelty. The DBFD module addresses a critical gap in prior work (e.g., ResViT, TSF-seq2seq) by explicitly decoupling frequency features, enabling better utilization of complementary information across sequences. The MSFW mechanism provides interpretable weight maps (Fig. 3), showing how high/low-frequency features contribute to different tasks.
- Comprehensive evaluation. The authors have extensively evaluated the proposed method on multiple tasks and multiple datasets with superior results.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The two-branch design and VQVAE components may increase model complexity and inference time, but corresponding metrics such as the number of model parameters and GFLOPs are not reported in the results.
- The design of FDF-VQVAE is complicated. The motivation for having two proposed modules, i.e., DBFD and MSFW, was not well explained.
- In Table 2, there is no information on MSFW.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- Better explain the intuition behind designing the DBFD and MSFW modules.
- Some metrics on model complexity and number of parameters can be reported.
- The differences between methods in Fig. 2 are not very clear. It is recommended to try alternative or improved visualization techniques to better highlight the distinctions.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents a new image enhancement method that shows excellent results on all three downstream tasks, outperforming several comparative methods. Moreover, the authors validate their approach on two publicly available datasets.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We thank all reviewers (R) for their valuable feedback and have responded with the necessary information.
Q1: Novelty & Motivation (R1, R2) While frequency-domain feature disentanglement is a common approach, our method introduces key innovations in both the loss function and network design. (1) We propose the Wavelet Transform Decoupling Loss, which implicitly constrains the extraction of high- and low-frequency features without explicitly incorporating wavelet transforms into the network, avoiding increased computational overhead and gradient instability. (2) We design the DBFD module with high- and low-frequency branches and shared weights. A Hyper-Restormer branch extracts shared low-frequency structural features, leveraging the high anatomical consistency across image sequences, while a Hyper-Dense branch focuses on high-frequency texture details unique to each sequence. Shared weights facilitate more effective extraction of features common across sequences. (3) Our MSFW module adopts a weighted fusion strategy, enabling explicit quantification and evaluation of each sequence’s contribution. The effectiveness of these designs is validated in our ablation study (Table 2).
Q2: Choice of Comparison (R1) Our framework is designed to generalize across multiple MRI sequences and tasks. Existing MRI studies are tailored to specific tasks and often lack the flexibility to extend beyond their original scope. For more intuitive comparison across tasks, we compare with recent general SOTA methods (AST, AdaIR, ResVit, Tsf-Seq2seq), which have been widely used as baselines across diverse medical imaging tasks (Meng, AAAI 2025; Yang, MICCAI 2024 AMIR; Lei, ICCV 2023). In these studies, comparative experiments for a single task generally include 3 to 5 methods. For each task, we selected 4 comparative methods, which we believe is sufficient.
Q3: Use of K-space Data (R1) Due to limited multi-sequence public K-space datasets, we adopted common protocols from prior studies (Feng, MICCAI 2021; Xu, MICCAI 2021; Al-Haj Hemidi, MICCAI 2023; Xu, MICCAI 2024; Lee, MICCAI 2024). In addition, our method can be used for PACS-based or post-processing enhancement of existing reconstructed images.
Q4: Model Complexity (R2) While focusing on interpretability, our model also has the smallest parameter numbers (0.3M, 157 GFLOPs), and this information could be updated in the revision.
Q5: Clarification of Table 2 (R2) In Table 2, w_h and w_l refer to MSFW. In the revision, we will replace w_h and w_l with MSFW.
Q6: Missing Modalities and Unclear Setting (R3) During training, we randomly set w_li and w_hi to zero (then recalculated the weights by softmax) in MSFW to simulate the modality missing, which we believe effectively addresses the issue, referring to Han, MICCAI 2023. Furthermore, in pre-trained models under missing conditions, performance degrades slightly, but our method remains more stable compared to TSF-Seq2Seq and ResVit. This proves that MSFW can better adapt to missing modalities. During validation and testing, all sequences are used as input by default, and the model outputs one target sequence at a time. Currently, separate training is still required for different tasks. We will clarify the setting details in the revision.
Q7: Misaligned Sequences (R3) Multi-sequence MRI is typically acquired in one session and is naturally registered. If slight misalignment exists, recent registration methods (Juan, Scientific Reports 2023; Mok, CVPR 2024; Guo, ECCV 2024) can effectively address it. In extreme cases, misaligned sequences can be excluded during inference to avoid performance degradation.
Q8: Weight Sharing in DBFD (R3) The weights of DBFD are shared, (1) better capturing shared features across sequences, and (2) reducing the number of parameters to enable more efficient learning.
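The missing-modality strategy described in Q6 (drop a sequence's weight, then renormalize the remaining weights with a softmax) can be sketched as follows. This is one possible reading of the rebuttal, not the authors' implementation: the function names `masked_softmax` and `fuse`, the use of a boolean `present` mask, and masking the logits with negative infinity so that absent sequences receive exactly zero weight are all assumptions made for illustration.

```python
import numpy as np


def masked_softmax(logits: np.ndarray, present: np.ndarray) -> np.ndarray:
    """Softmax over sequences that assigns exactly zero weight to absent ones."""
    if not present.any():
        raise ValueError("at least one sequence must be present")
    # Absent sequences get -inf so that exp(-inf) == 0 after the softmax.
    masked = np.where(present, logits, -np.inf)
    # Subtract the maximum for numerical stability (it is finite here).
    exp = np.exp(masked - masked.max())
    return exp / exp.sum()


def fuse(features: np.ndarray, logits: np.ndarray, present: np.ndarray) -> np.ndarray:
    """Weighted sum of per-sequence feature maps, shape (n_sequences, H, W)."""
    weights = masked_softmax(logits, present)
    # Contract the sequence axis: result has shape (H, W).
    return np.tensordot(weights, features, axes=1)


# Hypothetical usage: three sequences, the second one is missing.
features = np.stack([np.full((2, 2), 1.0), np.full((2, 2), 100.0), np.full((2, 2), 3.0)])
weights = masked_softmax(np.zeros(3), np.array([True, False, True]))
fused = fuse(features, np.zeros(3), np.array([True, False, True]))
```

With equal logits and the second sequence masked out, the two remaining sequences share the weight equally, so the fused map is the average of the first and third feature maps and the missing sequence contributes nothing.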
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The paper presents FDF-VQVAE, a novel framework for multi-sequence MRI enhancement that leverages frequency-domain feature disentanglement and fusion. The approach introduces two key innovations: a Dual-Branch Frequency Disentanglement (DBFD) module that separates high- and low-frequency components via shared-weight pathways, and a Multi-Spectrum Feature Weighting (MSFW) mechanism that adaptively integrates these components. These are supported by a novel wavelet-inspired loss formulation to enforce disentanglement, all contributing to a system capable of addressing diverse MRI enhancement tasks such as denoising, motion artifact suppression, and super-resolution.
The reviewers were generally aligned in their evaluations, identifying strong empirical performance across multiple tasks and datasets, and acknowledging the architectural novelty in the way frequency information is processed and fused. The paper also presents a compelling ablation study and interpretability analysis via visualised attention weights.
However, there were also important concerns. Reviewer #1 and #3 pointed to the limited applicability in clinical settings due to the model’s reliance on well-aligned and complete multi-sequence inputs. Reviewer #1 also noted the lack of comparisons with task-specific MRI models and the absence of k-space based methods. Reviewer #2 raised questions about the complexity of the model and the motivations behind each proposed module. These concerns were carefully addressed in the rebuttal, including justification for the task-general framing, dropout simulations for modality-missing scenarios, weight sharing for efficiency, and robustness to slight inter-sequence misalignment. The authors also clarified the training strategy and evaluation conditions.
All reviewers increased their confidence following the rebuttal, with two originally critical reviewers ultimately supporting acceptance. While a more thorough comparison with MRI-specific baselines and additional clarity on clinical robustness would have strengthened the case, the methodological contributions, strong empirical results, and thoughtful rebuttal justify acceptance.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors present a carefully engineered two-branch frequency-disentanglement scheme and report solid numbers on IXI and BraTS; however, after reading the three reviews and the rebuttal, I feel the fundamental concerns flagged by Reviewer #1 remain only partly answered. The idea of splitting high- and low-frequency bands and fusing them is not new in MRI post-processing, and the paper still evaluates only on simulated degradations and image-space data, omitting k-space experiments or real clinical artefacts. Key baselines specific to MRI super-resolution, denoising and motion-correction are absent, so the performance margin is hard to judge; likewise, model size, FLOPs and inference time - crucial for a dual-branch VQVAE - are never reported. Because at least one knowledgeable reviewer still rates the work “weak reject” on grounds of limited novelty and incomplete validation, while the two supporting reviews are both conditional and call for additional analysis, I conclude the manuscript is not yet mature enough for MICCAI and recommend rejection.