Abstract

In clinical diagnosis and treatment, traditional enhanced imaging techniques often suffer from inherent limitations such as high time costs and radiation risks. Therefore, medical image translation technology provides an efficient and cost-effective alternative. However, images generated by existing medical image generation methods still face challenges, such as a lack of structural consistency and blurred local details. Most methods struggle to simultaneously integrate deterministic structural information, such as anatomical priors, and probabilistic dynamic variations, such as blood flow changes, to guide image generation. To address these challenges, we propose a Coarse-to-Fine Medical Image Translation (C2FMIT) model, which incorporates Deterministic Guidance and Probabilistic Refinement to balance generation controllability and fidelity. First, we design a Deterministic Guidance Branch (DGB) to extract coarse-grained features, such as organ contours, to provide global structural constraints. Then, these deterministic priors are fused into our Probabilistic Refinement Branch (PRB), where Brownian Bridge diffusion is employed for fine-grained optimization, enhancing microvascular textures and dynamic enhancement regions. Notably, we design a Coarse-to-Fine Guided Attention Module (C2FGAM) to achieve progressive optimization from global structure to local details. Experimental results demonstrate that our method achieves superior performance across multiple modalities of functionally contrast-enhanced medical imaging on both public and in-house datasets.
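
For readers unfamiliar with Brownian Bridge diffusion, the following is a minimal sketch of the forward (noising) step as it is commonly written in the BBDM literature, not the authors' implementation. It assumes a linear schedule m_t = t/T and variance delta_t = 2s(m_t - m_t^2); the function names, the scale parameter `s`, and the flat-list image representation are illustrative assumptions. The key property it shows is that the process is pinned at both ends: it starts at the target image and ends exactly at the source image.

```python
import math
import random


def bridge_schedule(t, T, s=1.0):
    """Return (m_t, delta_t) for step t of a T-step Brownian Bridge.

    Assumed schedule: m_t = t / T, delta_t = 2 * s * (m_t - m_t^2).
    The variance is zero at t = 0 and t = T and largest at the midpoint.
    """
    m_t = t / T
    delta_t = 2.0 * s * (m_t - m_t ** 2)
    return m_t, delta_t


def bridge_forward(x0, y, t, T, s=1.0, rng=None):
    """Sample x_t = (1 - m_t) * x0 + m_t * y + sqrt(delta_t) * eps.

    x0: target-domain values (e.g. contrast-enhanced image), flat list.
    y:  source-domain values (e.g. non-enhanced image), same length.
    """
    rng = rng or random.Random(0)  # fixed seed only for reproducibility here
    m_t, delta_t = bridge_schedule(t, T, s)
    sigma = math.sqrt(delta_t)
    return [
        (1.0 - m_t) * a + m_t * b + sigma * rng.gauss(0.0, 1.0)
        for a, b in zip(x0, y)
    ]


if __name__ == "__main__":
    x0 = [0.0, 1.0, 2.0]     # hypothetical target pixels
    y = [10.0, 11.0, 12.0]   # hypothetical source pixels
    for step in (0, 5, 10):
        print(step, bridge_forward(x0, y, step, T=10))
```

Because the noise vanishes at both endpoints, sampling can start from the source image itself rather than from pure Gaussian noise, which is the usual argument for using a bridge in paired image translation.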

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1774_paper.pdf

SharedIt Link: https://rdcu.be/eHwYl

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04984-1_13

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

Duke-Breast-Cancer-MRI: https://www.cancerimagingarchive.net/collection/duke-breast-cancer-mri/

BibTex

@InProceedings{TiaHon_CoarsetoFine_MICCAI2025,
        author = { Tian, Hongnian AND Lv, Tianxu AND Fan, Jiansong AND Pan, Delin AND Li, Lihua AND Pan, Xiang},
        title = { { Coarse-to-Fine Medical Image Translation by Incorporating Deterministic Guidance and Probabilistic Refinement } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},
        pages = {128 -- 137}
}

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes a Coarse-to-Fine Medical Image Translation (C2FMIT) model. The model contains a Deterministic Guidance Branch (DGB) to provide global structural constraints, a Probabilistic Refinement Branch (PRB) to enhance detailed textures, and a Coarse-to-Fine Guided Attention Module (C2FGAM) to fuse the global and local features. The authors provide quantitative and qualitative experimental results showing better performance than traditional image translation models on both public and in-house datasets.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper designs two separate modules to process the coarse and fine features, and a fusion module to merge them. Experimental results show performance improvements over traditional image translation models.
2. The ablation study quantifies the gains from the different modules.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The paper claims that the DGB provides global structural constraints and the PRB enhances detailed textures. However, the paper only provides a quantitative ablation study, which cannot substantiate these claims. This diminishes the main contribution of the proposed work.
2. In Sec. 2.3, Eq. (6), the paper does not explain the F_d term, which leaves it unclear to readers.
3. According to the text, the DGB uses a pretrained VAE and the PRB uses a pretrained VQVAE. This should be reflected in Fig. 1 as well.
4. In Fig. 1, the arrows indicate that the DGB and PRB forward features to the C2FGAM for final image synthesis. However, the text says the final synthesis is done by the decoder of the PRB. Fig. 1 should reflect this as well.
5. The images for the qualitative results are too small. Replace them with larger images or highlight the major differences that the paper wants to emphasize.
6. The reference for the LBBDM work is missing.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The gains from the proposed framework are good, but the weaknesses need to be addressed.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The rebuttal has resolved most of my concerns. Please edit the paper accordingly in the final version if accepted!

Review #2

Please describe the contribution of the paper

This paper proposes a novel coarse-to-fine medical image translation framework (C2FMIT) that integrates deterministic anatomical priors with probabilistic refinement to improve both structural consistency and local detail fidelity. The method consists of three key components: a Deterministic Guidance Branch (DGB) for extracting coarse anatomical features, a Probabilistic Refinement Branch (PRB) based on Brownian Bridge diffusion for modeling fine-grained variations, and a Coarse-to-Fine Guided Attention Module (C2FGAM) for progressive feature fusion. The framework is evaluated on two medical imaging translation tasks (DCE-MRI and CT->CTA) and demonstrates strong quantitative and qualitative performance gains over existing methods.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The framework has a well-structured architecture that logically separates global anatomical guidance from local detail refinement. The proposed model outperforms state-of-the-art baselines across multiple datasets using comprehensive metrics (MAE, PSNR, SSIM), and visual comparisons further support the improvements. Meaningful application: The method addresses a clinically relevant problem which reduces reliance on high-radiation or contrast-enhanced imaging through accurate synthetic image generation. The impact of each module (DGB, PRB, and C2FGAM) is thoroughly analyzed, showing their necessity and complementary roles.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The novelty is limited. The core ideas in this work, i.e., VAE, Brownian Bridge diffusion, VQGAN, are borrowed from existing work. The contribution mainly lies in their integration rather than algorithmic innovation.
2. The motivation for, and insight behind, the proposed approach are unclear. Specifically, the paper lacks a deeper theoretical explanation or interpretability analysis of why these existing works are combined and why this combination improves results.
3. The method assumes access to paired data during training, but does not explore applicability to unpaired or semi-supervised settings, which are common in clinical practice.
4. The experiments lack suitable comparisons with recent state-of-the-art methods for CT or medical images. Given that the core contribution of the paper is the adoption of methods already present in the literature for an application, comparisons with other methods are critical. Yet all the approaches compared in the experiments are designed for RGB images rather than CT or medical image synthesis, which makes the comparison incomplete and unfair.
Additionally, the lack of discussion of critical aspects, such as application in real-world clinics, contributes to the overall assessment that the paper does not meet the necessary standards for MICCAI.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The method provides a solution to a relevant medical imaging problem. However, the novelty is insufficient, relying on a combination of established models. The unclear experimental gains and the unavailable discussion of clinical motivation support a negative assessment. Thus, the lack of fundamental innovation and the missing comparison with recent state-of-the-art methods on CT/medical image synthesis tasks lead to a reject for the MICCAI conference.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

None of my concerns have been addressed. The authors only restated their claims without any justification. For example, the authors insist that a comparison with other state-of-the-art medical image models has been done, but the one they point to (CT2MRI) cannot be found in the manuscript. The referenced work is not even cited. In the CT2MRI paper, CT2MRI was validated on both an in-house dataset and BraTS; why was no experiment on BraTS conducted if the authors claim to compare with CT2MRI?

Review #3

Please describe the contribution of the paper

The main contribution of this paper is the introduction of a novel Coarse-to-Fine Medical Image Translation (C2FMIT) framework that effectively combines deterministic anatomical guidance and probabilistic refinement using the Brownian Bridge diffusion process.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

First, the use of deterministic structural guidance alongside probabilistic modeling effectively addresses critical issues of blurred image features and inconsistent structures typically found in other methods. Furthermore, the method’s clinical relevance is convincingly demonstrated through extensive evaluations on two clinically significant datasets, and it improves on established state-of-the-art methods, showing clear potential for real-world medical image synthesis tasks.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

While the integration of deterministic guidance with probabilistic refinement is novel, the individual architectural components (e.g., U-Net structure and VAE-based feature extraction) and attention mechanisms employed are well-established and commonly utilized in existing literature. Moreover, the experiments, though extensive, are limited to specific imaging modalities (breast MRI and chest CT). The paper does not extensively explore or discuss the generalization of the proposed method to different imaging modalities (e.g., ultrasound, PET), leaving open questions regarding broader applicability.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper provides a strong methodological advancement in medical image synthesis, but could further strengthen claims of novelty and practicality through additional experiments and discussions of generalization.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

Response to Reviewer #1: We thank the reviewer for the comments and suggestions.

A1. Beyond Duke and ChestCT, C2FMIT also achieves outstanding results on the public brain dataset NSCLC in the CT→PET task. Among prior methods, Pix2Pix (MAE=11.42, PSNR=19.42 dB, SSIM=68.00%) and CycleGAN (MAE=8.42, PSNR=22.06 dB, SSIM=65.82%) yield subpar results. Among diffusion models, BBDM (MAE=3.85, PSNR=23.91 dB, SSIM=86.06%) performs best, followed by UNSB (MAE=6.09, PSNR=23.20 dB, SSIM=84.89%). Our method achieves superior performance (MAE=3.41, PSNR=27.25 dB, SSIM=91.91%), outperforming recent state-of-the-art methods.

Response to Reviewer #2: We thank the reviewer for the comments and suggestions.

A1. Because figures cannot be included in the response, we instead provide quantitative evidence. We evaluate edge-SSIM (↑) and ROI GLCM contrast (↑) along with MAE (↓) to assess global structural constraints and local detail fidelity. On the Duke dataset, C2FMIT achieves an edge-SSIM of 84.42%, which drops to 78.92% when the DGB is removed. For fine details, GLCM contrast reaches 344.8 and MAE is 13.51 voxel, while removing the PRB decreases GLCM contrast to 292.9 and increases MAE to 16.32 voxel. These results validate the effectiveness of C2FMIT in global contour preservation and fine-grained texture synthesis.

A2. F_d denotes the global anatomical constraint features extracted by the translator in the DGB, facilitating structural fidelity.

A3. We apologize for the confusion caused by the unclear illustration; the two branches share identical encoders to ensure spatial consistency.

A4. Fig. 1 has been refined accordingly.

A5. For qualitative comparisons and ablation results, we will add zoomed-in vessel regions.

A6. LBBDM has been cited (page 3), and we will ensure full references are properly added to enhance clarity.

Response to Reviewer #3: We thank the reviewer for the comments and suggestions.

A1. The core innovation of C2FMIT lies in proposing a new paradigm for medical image translation: (1) we propose a novel dual-branch framework that explicitly decouples coarse-grained anatomical constraints and fine-grained dynamic details for synergistic generation; (2) we design a dedicated Coarse-to-Fine Guided Attention Module (C2FGAM) that enables progressive feature fusion and refinement.

A2. Conventional approaches face clear limitations. VAE-based models, constrained by continuous latent space assumptions, tend to lose high-frequency details in dynamic enhancement synthesis. For example, their GLCM contrast (↑) in microvascular regions on the Duke dataset is only 292.9, while our method achieves 344.8. In contrast-enhanced regions, these models show blurring due to KL-driven latent regularization, and they fail to capture the voxel-level spatial heterogeneity of vascular textures. Single-branch diffusion-based methods (e.g., BBDM) improve local detail synthesis but often compromise anatomical consistency. For example, BBDM achieves 78.20% edge-SSIM (↑) on ChestCT, while C2FMIT reaches 83.46%. This is due to a fundamental conflict between the stochastic nature of diffusion and the need for rigid anatomical constraints: traditional diffusion lacks explicit integration of structural priors. Our dual-branch C2FMIT resolves this by combining deterministic guidance and probabilistic refinement. C2FGAM progressively fuses structural and detail features via gated residual connections, reducing structural distortions.

A3. C2FMIT addresses radiation risks in dynamic contrast imaging, rather than data scarcity, by enabling cross-modal translation. Future work will explore semi-/weakly-supervised paradigms to enhance generalization.

A4. Compared with the latest medical translation method, CT2MRI [1], C2FMIT achieves +4.40% SSIM and +4.84 dB PSNR on ChestCT, further validating the superior performance.

[1] Choo, Kyobin, et al. “Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model.” MICCAI, 2024.
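
The rebuttal reports MAE, PSNR, and ROI GLCM contrast without stating how they were computed. The sketch below shows the textbook definitions of these three metrics in plain Python, to make the reported numbers easier to interpret. It is not the authors' evaluation code: the intensity range (`data_range`), the number of gray levels, the pixel offset `(dx, dy)`, the non-symmetric co-occurrence counting, and the ROI selection are all assumptions, and different choices will change the absolute values.

```python
import math


def mae(pred, target):
    """Mean absolute error between two equal-length flat sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, data_range=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / mse)


def glcm_contrast(image, levels, dx=1, dy=0):
    """Contrast of a gray-level co-occurrence matrix (GLCM).

    image:  2D list of integer gray levels in [0, levels).
    dx, dy: pixel offset defining which neighbor pairs are counted.
    Contrast is the average of (i - j)^2 over all counted pairs (i, j),
    so sharper local texture gives a larger value.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        return 0.0
    weighted = sum(
        (i - j) ** 2 * counts[i][j]
        for i in range(levels)
        for j in range(levels)
    )
    return weighted / total


if __name__ == "__main__":
    flat = [[1, 1], [1, 1]]      # hypothetical blurred patch
    checker = [[0, 1], [1, 0]]   # hypothetical textured patch
    print(glcm_contrast(flat, levels=2), glcm_contrast(checker, levels=2))
    print(mae([0, 2], [1, 4]), psnr([10.0, 20.0], [12.0, 19.0]))
```

Under these definitions a uniform patch has zero contrast, so the drop in GLCM contrast reported when the PRB is removed is consistent with smoother, less textured output in the evaluated regions.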

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

While some reviewer comments remain, particularly regarding the need for further clarification and missing information, the authors have sufficiently addressed the other major concerns.

Coarse-to-Fine Medical Image Translation by Incorporating Deterministic Guidance and Probabilistic Refinement

Author(s):