Abstract

In this study, we propose a novel Q-space Guided Collaborative Attention Translation Networks (Q-CATN) for multi-shell, high-angular resolution DWI (MS-HARDI) synthesis from flexible q-space sampling, leveraging the commonly acquired structural MRI data. Q-CATN employs a collaborative attention mechanism to effectively extract complementary information from multiple modalities and dynamically adjust its internal representations based on flexible q-space information, eliminating the need for fixed sampling schemes. Additionally, we introduce a range of task-specific constraints to preserve anatomical fidelity in DWI, enabling Q-CATN to accurately learn the intrinsic relationships between directional DWI signal distributions and q-space. Extensive experiments on the Human Connectome Project (HCP) dataset demonstrate that Q-CATN outperforms existing methods, including 1D-qDL, 2D-qDL, MESC-SD, and QGAN, in estimating parameter maps and fiber tracts both quantitatively and qualitatively, while preserving fine-grained details. Notably, its ability to accommodate flexible q-space sampling highlights its potential as a promising toolkit for clinical and research applications. Our code is available at https://github.com/Idea89560041/Q-CATN.
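To make "flexible q-space information" concrete, the sketch below shows one way a single q-space sample (a b-value plus a gradient direction) could be turned into a conditioning vector for a synthesis network. It is an illustration only, not the encoding used in Q-CATN: the function name `qspace_condition`, the four-element layout, and the normalization constant `bval_max=3000.0` are all assumptions made for this example.

```python
import math


def qspace_condition(bval, bvec, bval_max=3000.0):
    """Encode one q-space sample as [normalized b-value, unit direction].

    Hypothetical helper for illustration; bval_max is an assumed
    normalization constant, not a value taken from the paper.
    """
    norm = math.sqrt(sum(c * c for c in bvec))
    if norm == 0.0:
        # A b0 volume has no gradient direction, so keep a zero vector.
        direction = [0.0, 0.0, 0.0]
    else:
        # Rescale the gradient direction to unit length.
        direction = [c / norm for c in bvec]
    return [bval / bval_max] + direction


# Any (b-value, direction) pair can be encoded, not only those that
# belong to a fixed, predefined sampling scheme.
cond = qspace_condition(2000.0, (0.0, 3.0, 4.0))
print(cond)
```

Because the vector is computed per sample, a network conditioned on it could in principle be queried at arbitrary b-values and directions, which is the property the abstract refers to as flexible sampling.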

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3402_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: https://papers.miccai.org/miccai-2025/supp/3402_supp.zip

Link to the Code Repository

https://github.com/Idea89560041/Q-CATN

Link to the Dataset(s)

HCP dataset: https://www.humanconnectome.org

BibTex

@InProceedings{ZhuPen_Qspace_MICCAI2025,
        author = { Zhu, Pengli and Fu, Yingji and Chen, Nanguang and Qiu, Anqi},
        title = { { Q-space Guided Collaborative Attention Translation Network for Flexible Diffusion-Weighted Images Synthesis } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15962},
        month = {September},
        pages = {508 -- 518}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a method called “Q-space Guided Collaborative Attention Translation Networks” (Q-CATN) to synthesize high-angular-resolution DWIs from common MRI data by synthesizing multiple shells in q-space. Q-CATN is composed of a single-mode attention (SAM) encoder, a multi-modal attention fusion (MMAF) module, a q-space nesting module, a single-mode attention (SAM) decoder, and a condition discriminator. The inputs to Q-CATN are the b0, T1-weighted, and T2-weighted images. Q-CATN uses the HCP dataset for training and testing.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The collaborative attention mechanism proposed in this paper can extract enough information from multiple modalities to accurately generate DWIs.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. In this work, no ablation study was performed to verify the effectiveness of each module. Without an ablation study, how do the authors demonstrate the effectiveness of the proposed modules?

    2. The authors mentioned in the conclusion that “Q-CATN was chosen for its superior computational efficiency and faster inference times, crucial for real-time clinical applications.” Please provide more detailed evidence to support this conclusion.

    3. Figs. 2 and 3 provide important visualization results. Please provide more details regarding the analysis of these two figures. How does the proposed method outperform existing ones? Why is it effective in improving the visualization results?

    4. In the experiments, the authors only use the HCP dataset for training and testing. Is this sufficient to demonstrate the generalizability of the proposed method?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method proposed in this paper is relatively new, but there are still some flaws in the experimental process and writing.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a GAN-based method for generating DWI from multi-modality brain MRI images. Its main contribution is to make use of the complementary information between different modalities and to allow flexible sampling in q-space.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    High angular resolution DWI images are the key to tractography analysis, so super-resolution in Q-space is very important. Compared with previous work, this method more effectively utilizes multimodal input information and models the nonlinear effects of continuously changing b-vecs and b-vals on DWI image signals.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    However, since the input contains no DWI images, the model may only capture the nonlinear relationship present in the HCP dataset, so it may generalize poorly to disease data (e.g., tumors) and to other datasets. In addition, the model takes 2D slices as input, which loses structural information. In the supplementary materials, the tractography results of both the proposed method and Q-GAN do not show the brainstem, which may be caused by this.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    There is a paper published in NIPS 2024 titled “Spatio-angular convolutions for super-resolution in diffusion MRI”. The authors should consider comparing against it. No ablation experiments were conducted to verify the effectiveness of each module.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a new method, but there are still some shortcomings.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper introduces Q-space Guided Collaborative Attention Translation Networks (Q-CATN), a novel methodology for multi-shell, high-angular resolution diffusion-weighted imaging (MS-HARDI) synthesis that addresses key limitations in existing approaches. The primary innovation lies in its flexible q-space sampling capability, which eliminates the need for fixed sampling schemes that restrict current methods. Q-CATN incorporates a collaborative attention mechanism that dynamically extracts and integrates complementary information from multiple structural MRI modalities (b0, T1- and T2-weighted images), significantly enhancing synthesis accuracy. The authors implement task-specific constraints to maintain anatomical fidelity in the synthesized diffusion-weighted images, enabling accurate learning of relationships between directional DWI signal distributions and q-space. Experimental validation on the Human Connectome Project dataset demonstrates Q-CATN’s superior performance compared to existing methods (1D-qDL, 2D-qDL, MESC-SD, and QGAN) in both quantitative and qualitative evaluations of parameter maps and fiber tract reconstruction, with particular emphasis on preserving fine-grained neuroanatomical details.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Exceptional Methodology and Presentation: The paper presents the Q-CATN methodology with remarkable clarity and organization, supported by high-quality figures that effectively illustrate complex concepts. The collaborative attention mechanism is particularly well-detailed, showing how it extracts information from single modalities while leveraging complementary data across multi-modal inputs including b0, T1- and T2-weighted images to enhance synthesis accuracy and robustness.
    2. Innovative Technical Approach with Flexible Sampling: Q-CATN’s most significant innovation is its flexible q-space sampling capability, representing a departure from the fixed sampling schemes of previous methods. This approach allows the network to dynamically adjust based on variable q-space information. The implementation of task-specific constraints preserves anatomical fidelity in the synthesized diffusion-weighted images, enabling accurate learning of relationships between directional DWI signal distributions and q-space.
    3. Superior Performance with Practical Applications: Experimental validation on the Human Connectome Project dataset demonstrates Q-CATN’s state-of-the-art performance compared to existing methods in both quantitative metrics and qualitative evaluations. The method excels at preserving fine-grained neuroanatomical details and can generate densely sampled q-space data, facilitating the reconstruction of various diffusion models with substantial benefits for clinical and research applications in neuroimaging.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The paper is free of major weaknesses. The work is solid throughout, with a well-motivated approach, clear methodology, and strong experimental results that advance the field.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    This is a very well written manuscript presenting strong work. Congratulations to the authors. Minor comments and suggestions are:

    1. In Figures 1 and 3, I recommend increasing the font size for elements on graphs and figures to improve readability. Currently, some text appears illegible in the paper format, which makes it difficult to fully appreciate the detailed information these visualizations convey.
    2. The data description (Section 2.6) would benefit from explicitly addressing the resolution of the dataset used. Currently, it’s unclear whether the Human Connectome Project data used is exclusively high-resolution research data or if it includes clinical-quality DWI data (typically ~2×2×6 mm). This information would be valuable in the discussion section to clarify whether Q-CATN has potential for enhancing lower-resolution clinical data, thus having immediate clinical applications, or if it’s primarily intended for research settings. Addressing this point would help readers better understand the scope and potential impact of this excellent work.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (6) Strong Accept — must be accepted due to excellence

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Due to the paper’s significant methodological contributions, particularly the innovative flexible q-space sampling approach that overcomes limitations in existing methods, combined with the exceptional clarity of presentation, comprehensive experimental validation, and demonstrated state-of-the-art performance in MS-HARDI synthesis, I suggest a Strong Accept. The work represents an important technical advancement with substantial potential for clinical and research applications in neuroimaging, supported by meticulous implementation details and compelling results that preserve critical fine-grained neuroanatomical information.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

  1. Generalizability Concerns: While HCP data provides high-quality benchmarks, testing on clinically acquired datasets (e.g., lower SNR scans or pathological cases) would strengthen claims about robustness. Response: Thanks for your comments. We have validated the proposed Q-CATN on four other datasets covering different age groups beyond the HCP dataset, which we are unable to show in detail because of the page limit.

  2. Technical Clarifications: Define the index “i” in Section 2.2’s final equation explicitly. Response: Thanks for your comment. We have revised the final equation in Section 2.2 by changing the index “i” to “n”.

  3. Visualization Enhancement: Adjust Figure 2’s display window to better highlight DWI contrast variations. Response: Thanks for your comments. We noticed that the DWI contrast in Fig. 2 makes the structural details less obvious, but we kept the true contrast of the b2000 and b3000 images, which is more consistent with clinical practice.

  4. Ablation Study: Verify the effectiveness of each module. Response: Thanks for your comments. We conducted comprehensive ablation experiments, ensuring consistent data and training iterations for fair comparisons. As additional input modalities and loss functions are introduced, metrics such as RMSE, LPIPS, and FID decrease consistently, while MS-SSIM, PSNR, and UQI improve steadily, indicating that each modality and loss function we used is justified. Because of page limits, we have prioritized the presentation of the more important experimental results in the manuscript.
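For readers unfamiliar with the metrics named in the ablation response, the sketch below computes two of them, RMSE (lower is better) and PSNR (higher is better), using their standard definitions. It is an illustration only and is not taken from the authors' code: the function names, the flat-list image representation, and the `data_range=1.0` default (intensities assumed to be scaled to [0, 1]) are assumptions for this example.

```python
import math


def rmse(reference, synthesized):
    """Root-mean-square error between two equal-length intensity lists."""
    if len(reference) != len(synthesized) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, synthesized)) / len(reference)
    return math.sqrt(mse)


def psnr(reference, synthesized, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed max intensity."""
    err = rmse(reference, synthesized)
    if err == 0.0:
        # Identical images: the error is zero, so PSNR is unbounded.
        return math.inf
    return 20.0 * math.log10(data_range / err)


# Toy voxel intensities standing in for a reference and a synthesized DWI.
reference = [0.0, 0.5, 1.0]
synthesized = [0.1, 0.5, 0.9]
print(rmse(reference, synthesized), psnr(reference, synthesized))
```

The two metrics move in opposite directions by construction, which matches the pattern described in the response: as synthesis quality improves, RMSE falls while PSNR rises.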




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    This paper introduces a novel method for synthesizing diffusion-weighted images (DWIs) from flexible q-space sampling, addressing a critical need for adaptable acquisition protocols in clinical and research settings. The work is timely, well-structured, and demonstrates a clear understanding of current challenges in DWI reconstruction, particularly the limitations of fixed q-space sampling schemes.

    Strengths: The proposed framework effectively leverages q-space conditioning to enable arbitrary sampling patterns, overcoming key restrictions in traditional approaches. Module motivations are well justified, particularly the integration of cross-attention mechanisms. Experimental validation on HCP data demonstrates promising synthesis accuracy and downstream utility for microstructure estimation.

    Areas for Improvement:

    1. Generalizability Concerns: While HCP data provides high-quality benchmarks, testing on clinically acquired datasets (e.g., lower SNR scans or pathological cases) would strengthen claims about robustness.
    2. Technical Clarifications: Define the index “i” in Section 2.2’s final equation explicitly. Expand the SMA decoder architecture description.
    3. Visualization Enhancement: Adjust Figure 2’s display window to better highlight DWI contrast variations.
    4. Ablation Study: Verify the effectiveness of each module.

    This work makes a meaningful contribution to q-space reconstruction with practical implications for reducing acquisition times. The identified limitations can be addressed through revisions.


