Abstract

Medical image synthesis remains challenging due to misalignment noise during training. Existing methods have attempted to address this challenge by incorporating a registration-guided module. However, these methods tend to overlook the task-specific constraints on the synthetic and registration modules, which may cause the synthetic module to still generate spatially aligned images with misaligned target images during training, regardless of the registration module’s function. Therefore, this paper proposes registration-guided consistency and incorporates disentanglement learning for medical image synthesis. The proposed registration-guided consistency architecture fosters task-specificity within the synthetic and registration modules by applying identical deformation fields before and after synthesis, while enforcing output consistency through an alignment loss. Moreover, the synthetic module is designed to possess the capability of disentangling anatomical structures and specific styles across various modalities. An anatomy consistency loss is introduced to further compel the synthetic module to preserve geometrical integrity within latent spaces. Experiments conducted on both an in-house abdominal CECT-CT dataset and a publicly available pelvic MR-CT dataset have demonstrated the superiority of the proposed method. The code is available at: https://github.com/pupuchuan/RegConDIS.
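The consistency idea in the abstract (apply the same deformation field before and after synthesis, then penalize any disagreement between the two outputs) can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes 1D signals instead of 3D volumes, a hand-written linear-interpolation warp instead of a learned registration network, and simple stand-in functions instead of a synthesis network. All names (`warp_1d`, `toy_synthesis`, `shifting_synthesis`, `consistency_loss`) are hypothetical.

```python
def warp_1d(signal, displacement):
    """Resample `signal` at positions i + displacement[i] (linear interpolation, edge-clamped)."""
    n = len(signal)
    out = []
    for i, d in enumerate(displacement):
        pos = min(max(i + d, 0.0), n - 1.0)
        lo = int(pos)               # floor, since pos >= 0
        hi = min(lo + 1, n - 1)
        w = pos - lo
        out.append((1.0 - w) * signal[lo] + w * signal[hi])
    return out


def toy_synthesis(signal):
    """Stand-in for a well-behaved synthesis module: changes intensity only, not geometry."""
    return [2.0 * v + 1.0 for v in signal]


def shifting_synthesis(signal):
    """Stand-in for a synthesis module that also moves structures (shifts by one sample)."""
    return [signal[0]] + list(signal[:-1])


def l1(a, b):
    """Mean absolute difference between two equally long signals."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def consistency_loss(source, displacement, synth):
    """Compare warp-then-synthesize against synthesize-then-warp under one shared field."""
    warp_then_synth = synth(warp_1d(source, displacement))
    synth_then_warp = warp_1d(synth(source), displacement)
    return l1(warp_then_synth, synth_then_warp)


if __name__ == "__main__":
    source = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
    field = [0.0, 0.5, 0.5, -0.5, 0.0, 0.0]
    print("intensity-only synthesis:", consistency_loss(source, field, toy_synthesis))
    print("geometry-changing synthesis:", consistency_loss(source, field, shifting_synthesis))
```

In this toy setting the loss vanishes when the synthesis step leaves geometry untouched and becomes positive when the synthesis step itself moves structures, which is the behaviour a consistency constraint of this kind is meant to discourage.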

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2212_paper.pdf

SharedIt Link: https://rdcu.be/eHwMT

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04937-7_8

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/pupuchuan/RegConDIS

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LiChu_Boosting_MICCAI2025,
        author = { Li, Chuanpu AND Chen, Zeli AND Zhang, Yiwen AND Zhong, Liming AND Yang, Wei},
        title = { { Boosting Medical Image Synthesis via Registration-guided Consistency and Disentanglement Learning } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {78--88}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes registration-guided consistency: identical deformation fields are applied before and after synthesis, and output alignment is enforced through a dedicated consistency loss. It also proposes disentangled representation learning: the synthesis module is designed to disentangle anatomical structures from modality-specific styles, enhancing cross-modal synthesis fidelity.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1. The paper addresses a challenge in medical image synthesis, namely misalignment noise during training, and provides a motivation for tackling it.
2. The authors propose task-specific module constraints: by jointly constraining both the synthesis and registration modules, the method avoids the generation of geometrically inconsistent synthetic images.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The method combines known techniques, and the contribution of each component is not clearly justified or novel. I have reservations about the effectiveness of the proposed processing order between the deformation field and the ACDS module. If applying the deformation field before or after the ACDS module leads to similar or negligible differences, it raises questions about the necessity and effectiveness of the proposed consistency loss. In such a case, the consistency constraint may not be justified, and the overall design of the framework might lack theoretical soundness. I would recommend that the authors provide an ablation or justification to demonstrate the importance of the order and the effectiveness of the consistency loss under different configurations.
2. The proposed Anatomy Consistency Disentanglement Synthetic (ACDS) module aims to separate anatomical structure from modality-specific style, which is a valuable objective. However, similar disentanglement-based strategies have been extensively explored in prior works related to style transfer and cross-modal synthesis. The paper lacks a clear articulation of how ACDS introduces substantial novelty beyond these existing methods. A deeper theoretical insight or a unique architectural design would strengthen its contribution.
3. The methodology section lacks clear logical organization, which hinders the reader’s understanding of the proposed framework. The interactions between modules, their processing order, and the rationale for specific design choices are not sufficiently explained, weakening the overall technical soundness and reproducibility of the method.
Please rate the clarity and organization of this paper

Poor
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method shows limited innovation. Key components, such as the anatomy consistency disentanglement synthetic (ACDS) module and the registration-guided consistency strategy, appear to be extensions or combinations of existing techniques rather than introducing fundamentally new concepts. The paper lacks a clear explanation of how the proposed approach differs significantly from prior works in image synthesis or style transfer, reducing its contribution to the field.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

All concerns are addressed

Review #2

Please describe the contribution of the paper

This work addresses the problem of cross-modality image synthesis in the presence of paired training data that is not perfectly aligned.

This issue is common, as multimodal images are typically not aligned at acquisition, and perfect alignment is not always achievable with current registration techniques.

To tackle this, the authors propose a registration-based consistency approach, where the model enforces consistency between a synthetic image generated from a deformed input and a deformed synthetic image generated from a fixed input. In addition, they introduce a disentanglement method to separate modality-specific style from anatomical content, which is particularly well-suited to this type of problem.

The experimental evaluation is extensive, using two different datasets, and includes comparisons with several synthesis methods as well as an ablation study.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

1/ The proposed approach addresses a relevant yet often overlooked problem in cross-modality synthesis: the use of paired data that is not perfectly aligned.
2/ The authors introduce a consistency loss based on generated deformations, which is well-suited to the problem at hand. VoxelMorph, a standard deep-learning-based registration technique, is used to estimate the deformation between fixed and moving images.
3/ While the anatomy-consistency disentanglement strategy is not new, it results in significant performance improvements.
4/ The proposed approach is 3D.
5/ The experimental evaluation is extensive, including one public and one private dataset. The authors implemented several baselines, and their approach outperforms all of them across three metrics on both datasets. An ablation study complements the experimental results and demonstrates the effectiveness of the two main technical contributions.

[1] Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pair. Xin, et al. MIDL 2024.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The authors missed one other work on a similar topic: Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pair. Xin, et al. MIDL 2024. As a result, the authors neither discuss the limitations of this recent work nor report its performance. Moreover, the proposed registration consistency loss has some similarities with their approach.
2. The proposed approach seems to rely on a complex optimization procedure, as it involves optimizing six different loss terms. This could be unstable in other contexts.
3. No ablation study is provided to assess the stability or sensitivity of the method with respect to its different loss terms.
4. No statistical significance tests are reported to support the performance comparisons.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed approach addresses a relevant problem in cross-modal image synthesis: the presence of misalignments in paired multimodal data. Although computationally heavy, the method builds on existing techniques and adapts them to the target objective. The experimental results are convincing.

However, one key related work is not referenced [1]. The authors in [1] tackle the same problem with a method that has some similarities (registration consistency loss).

[1] Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pair. Xin, et al. MIDL 2024.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

My primary concern was related to the novelty of the proposed approach in comparison to [1]. The authors’ response, “It still lacks task-specific constraints and is costly to optimize”, does not adequately address this concern. In fact, both methods involve costly optimization procedures and leverage a joint strategy combining synthesis and registration to handle misaligned data. As such, the response from the authors is not convincing.

Moreover, the claim that “We achieve statistically significant improvements over all comparisons (p < 0.0001)” would have benefitted from further clarification. Some reported improvements appear marginal, for instance, 30.86 dB ± 0.88 vs. 30.61 dB ± 0.85, or 83.70 ± 2.47 vs. 83.38 ± 2.0. In the absence of detail regarding the statistical tests employed (e.g., whether corrections for multiple comparisons were applied), it is difficult to assess the statistical significance of these results.
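For context on what such a clarification might contain, the sketch below shows one common way to test paired per-case metrics and correct for several baseline comparisons. It is an illustration only: neither the rebuttal nor the review states which test the authors used, the per-case scores are made up, and the function names (`paired_sign_flip_p`, `bonferroni`) are hypothetical. The test shown is an exact sign-flip permutation test on paired differences, followed by a Bonferroni correction.

```python
from itertools import product


def paired_sign_flip_p(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return hits / total


def bonferroni(p_values):
    """Bonferroni-adjusted p-values for a family of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical per-case PSNR values (dB) for eight test cases; not from the paper.
    ours = [30.9, 31.2, 30.1, 29.8, 31.5, 30.7, 30.4, 31.0]
    baseline = [30.6, 31.0, 29.9, 29.7, 31.1, 30.5, 30.3, 30.6]
    p = paired_sign_flip_p(ours, baseline)
    print("raw p:", p)
    print("adjusted for 3 baselines:", bonferroni([p, p, p]))
```

A paired test can report a small p-value even when the group means differ by a fraction of their standard deviations, provided the per-case differences are consistent in sign. This is why the choice of test and of any multiple-comparison correction needs to be stated alongside the p-value.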

Review #3

Please describe the contribution of the paper

This paper proposes a novel framework for medical image synthesis that integrates registration-guided consistency with disentanglement learning. The proposed registration-guided consistency architecture enforces task-specificity by applying the same deformation field before and after synthesis, with an alignment loss to ensure consistent outputs. Additionally, the synthetic module is designed to disentangle anatomical structures from modality-specific style information. An anatomy consistency loss is introduced to further enhance the preservation of anatomical geometry within the latent space.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The registration-guided consistency structure is a key strength, clearly promoting task-specific roles for the synthesis and registration networks, and effectively addressing the misalignment issue in supervised medical image synthesis. The methodology is clearly described, with precise and correct mathematical formulations that enhance reproducibility and understanding. The experimental section is thorough, with well-organized quantitative tables and informative visualizations. The ablation studies are comprehensive and convincingly demonstrate the contribution of each module to the final performance.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

GAN-based approaches, while still commonly used, are less novel compared to recent diffusion models and autoregressive methods. The comparison experiments are predominantly against GAN-based baselines. Only one diffusion-based method is included in the comparison, and more competitive diffusion models (e.g., BBDM, I2SB) are not evaluated.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper presents a well-motivated and technically robust solution to misalignment issues in medical image synthesis. The combination of registration-guided consistency and anatomy-style disentanglement is thoughtfully designed and empirically validated. I would recommend acceptance.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

This paper presents a well-motivated and technically robust solution to misalignment issues in medical image synthesis. The combination of registration-guided consistency and anatomy-style disentanglement is thoughtfully designed and empirically validated. I would recommend acceptance.

Author Feedback

We thank all reviewers for their comments and affirmation: 1) Well-motivated misalignment solution (all); 2) Task-specific guidance (all); 3) Effective anatomy-style disentanglement (R1/R3); 4) Extensive experiments (R1/R3). We address the main concerns below.

C: DA-GAN vs Ours (R1) R: We thank the reviewer for pointing out this related work. Xin et al. (MIDL 2024) employ four aligners only after synthesis, with cycle-based consistency across them. It still lacks task-specific constraints and is costly to optimize. In contrast, we adopt one deformation field shared across dual branches (before/after synthesis) to enforce consistency. Our design is simpler, prevents the synthesis module from directly generating misaligned outputs, and further benefits from anatomy-style disentanglement to improve registration. We will cite and discuss this work in the revision.

C: Loss stability (R1) R: The six losses fall into three functional groups: synthesis (adv, self, cycle), registration (smooth, align), and disentanglement (anatomy), each tied to a specific module. Thus, Table 2 ablations reflect their impact. Due to space limits, we reported module-level results. As noted in Implementation Details, our model converges stably within 200 epochs on both tasks, suggesting loss stability across settings.
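The grouping described in this response can be written down as a small configuration sketch, which also shows how a group-level ablation or a per-term sensitivity sweep might be set up. This is an illustration under assumptions, not the released code: the only weight taken from the rebuttal is the alignment weight of 20, every other weight is a placeholder of 1.0, and the names `LOSS_GROUPS`, `WEIGHTS`, `total_loss`, and `one_at_a_time_sweep` are hypothetical.

```python
# Loss terms grouped by the module they constrain, as listed in the rebuttal.
LOSS_GROUPS = {
    "synthesis": ("adv", "self", "cycle"),
    "registration": ("smooth", "align"),
    "disentanglement": ("anatomy",),
}

# Only align = 20 is stated in the rebuttal; all other weights are placeholders.
WEIGHTS = {
    "adv": 1.0, "self": 1.0, "cycle": 1.0,
    "smooth": 1.0, "align": 20.0,
    "anatomy": 1.0,
}


def total_loss(terms, weights=WEIGHTS, disabled_groups=()):
    """Weighted sum of loss terms, optionally dropping whole groups (module-level ablation)."""
    disabled = {name for group in disabled_groups for name in LOSS_GROUPS[group]}
    return sum(weights[name] * value
               for name, value in terms.items()
               if name not in disabled)


def one_at_a_time_sweep(base_weights, name, factors=(0.5, 1.0, 2.0)):
    """Yield copies of the weights with a single term rescaled, for a sensitivity study."""
    for factor in factors:
        swept = dict(base_weights)
        swept[name] = base_weights[name] * factor
        yield swept


if __name__ == "__main__":
    # Hypothetical per-term loss values for one training step.
    terms = {"adv": 1.0, "self": 1.0, "cycle": 1.0,
             "smooth": 1.0, "align": 0.5, "anatomy": 1.0}
    print("full objective:", total_loss(terms))
    print("without registration group:", total_loss(terms, disabled_groups=("registration",)))
    for w in one_at_a_time_sweep(WEIGHTS, "align"):
        print("align weight", w["align"], "->", total_loss(terms, weights=w))
```

Module-level ablation corresponds to `disabled_groups`, while the per-term sensitivity analysis requested by the reviewer corresponds to something like `one_at_a_time_sweep`, where each configuration would require its own training run.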

C: Statistical analysis (R1) R: We achieve statistically significant improvements over all comparisons (p<0.0001).

C: Necessity of registration-guided consistency (R2) R: As in Table 2 (Row 1 vs. Row 2), applying registration before vs. after synthesis leads to a clear gap, showing that synthesis is sensitive to deformation order. This supports our motivation: without task-specific constraints, the synthesis module may ignore the registration’s role. This ablation already addresses the reviewer’s concern on order. Furthermore, as noted in Implementation Details, we set a large weight (20) on the alignment loss in the consistency constraint to ensure effectiveness even when differences are small. The gain in Table 2 (Row 3) further validates our consistency design.

C: Novelty of ACDS (R2) R: While disentanglement has been explored, our contribution is integrating it into a registration-guided synthesis framework. We propose an anatomy loss in ACDS to preserve the geometric integrity of anatomical features. This loss relies on registration guidance: without it, the loss may instead cause distortion due to anatomical differences across inputs. Our focus is not disentanglement itself, but how it helps the registration module separate anatomy from modality styles to reduce misalignment.

C: Technical soundness and reproducibility (R2) R: (1) The interactions of the main modules are described at the beginning of their respective subsections in the methodology: the registration-guided consistency module provides task-specific constraints, while the disentanglement helps the registration module separate anatomy from style. (2) The overall processing order is illustrated in Fig. 2, following a dual-branch, end-to-end structure. (3) The impact and interactions of each module are validated through ablation in Table 2. (4) Additionally, we have released the code (linked in the abstract) to ensure reproducibility.

C: Novelty of using GAN/components (R2/R3) R: Though GAN-based, our main contribution lies not in the generative components themselves, but in tackling misalignment via registration-guided consistency and disentanglement. This solution can be incorporated into diffusion or autoregressive models. We believe that solving this core issue adds complementary value to advances in generative backbones.

C: Comparison with more diffusion models (R3) R: The diffusion model we compared (Ref. 6) is a 3D method tailored for medical image synthesis, using anatomy masks as priors. In our 3D setting, this model is more relevant than 2D diffusion models (e.g., BBDM, I2SB), which we found to yield inferior results. Due to space limitations, we included this representative baseline.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

The authors have adequately addressed the reviewers’ concerns. I believe they have answered concerns about novelty. I advise the authors to include a more detailed description of statistical tests in their camera-ready paper.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Reject
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

I regret to reject this well-written paper with shared code, but the core idea of disentangling the domain-specific deformation information and jointly solving the synthesis and registration tasks was already well explored several years ago using similar network architectures, and some of that work is cited by the paper. Ablation studies justifying the design of the loss function are missing and should be presented, because the loss is a core component of the methodology. I also agree with R1’s comment about DA-GAN after reading both papers. I believe more experiments and better assessments will lead to a better publication.

Boosting Medical Image Synthesis via Registration-guided Consistency and Disentanglement Learning

Author(s):