Abstract

Cone-beam computed tomography (CBCT) is an essential imaging modality for adaptive radiotherapy, enabling the positioning and real-time verification of anatomical changes. However, CBCT images suffer from artifacts and lack the accurate Hounsfield unit (HU) calibration necessary for dose computation. Additionally, CBCT’s limited field of view (FOV) further complicates its direct application for replanning. To address these limitations, we propose a novel framework leveraging diffusion models to synthesize a synthetic CT (sCT) from CBCT while inpainting the extended FOV using the original planning CT (pCT). Our method integrates with any CBCT-to-CT diffusion framework without degrading its performance, ensuring accurate HU values and comprehensive anatomical coverage for dose computation without requiring new CT acquisitions. Quantitative and qualitative evaluations demonstrate that our approach preserves the baseline CBCT-to-CT translation quality while effectively extending the FOV, offering a streamlined and effective solution for adaptive radiotherapy workflows.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3866_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: https://papers.miccai.org/miccai-2025/supp/3866_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{SpiQue_Diffusing_MICCAI2025,
        author = { Spinat, Quentin and Duran, Audrey and Teboul, Olivier and Paragios, Nikos and Komodakis, Nikos},
        title = { { Diffusing Boundaries: CBCT-to-CT Translation with Extended Field of View } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {149 -- 159}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a novel diffusion-based inpainting framework that extends the field of view (FOV) in CBCT-to-CT translation by incorporating planning CT (pCT) data.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The use of partial conditioning and latent-space inpainting for FOV extension is novel and directly addresses a practical limitation in radiotherapy. The approach is well-motivated for adaptive radiotherapy, enabling accurate dose computation without additional CT scans. The proposed method can augment any CBCT-to-CT diffusion framework, making it highly adaptable. Quantitative evaluation confirms that partial conditioning does not degrade the quality of CBCT-conditioned regions.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    No direct ground truth for extended FOV regions limits quantitative evaluation of the inpainting quality. Limited comparison with existing FOV extension methods (e.g., deformable registration, CNN-based inpainting) reduces clarity on relative advantages. Evaluation focused on visual assessment; lacks standardized perceptual metrics or radiologist-based grading to support qualitative claims. Method depends on accurate deformable registration, which may fail in complex anatomical deformations.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses a clinically relevant problem with a creative inpainting solution using partial conditioning and latent-space diffusion. However, the lack of quantitative evaluation for FOV extension, absence of comparison with existing methods, and heavy reliance on qualitative visuals weaken the empirical support. A strong rebuttal clarifying these aspects and demonstrating more robust validation could shift this to an accept.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a diffusion-based framework to generate synthetic CT from CBCT. The generation process is partially conditioned on the CBCT input and incorporates the corresponding full-size planning CT.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    A novel application of diffusion-based models to generate full view synthetic CT by conditioning the diffusion process on limited field-of-view CBCT and incorporating planning CT.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Method Presentation: The method is not clearly presented. It is difficult to follow which networks are actually being used and how. It would be beneficial if the network architecture was presented with the corresponding input and output.

    Network Architectures: The network architectures of the employed models are not described. For example, the architecture used in the diffusion model, including specific layers, is not detailed. Additionally, other important details like the loss function are missing.

    Performance Metrics: The MAE, PSNR, and FID scores deteriorated, while SSIM remained almost the same compared to the previous method. How does the proposed partial conditioning improve the results?

    Figure Clarity: It is unclear from the figures which method is better.

    Comparison with Other Networks: There is a lack of comparison with other networks, such as even with a simple baseline U-Net.

    Dataset Sources: The sources of the datasets used are not specified.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the application seemed interesting and novel, the ambiguous methodology and evaluation of results make it a weak reject.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Thank you for taking the time to reply to the feedback. If the authors make the changes they agreed to, this paper can be accepted.



Review #3

  • Please describe the contribution of the paper

    The main contribution of this paper is the introduction of a diffusion approach for generating synthetic CT (sCT) from cone-beam CT (CBCT) and planning CT (pCT). Traditional approaches would generate sCT from CBCT only, but CBCT has a limited field of view (FOV); the proposed approach tackles this issue by incorporating the pCT’s larger FOV through an inpainting-based diffusion process. The intention is to benefit from the strengths of both CBCT and pCT during the image translation process: the high anatomic detail from CBCT and the accurate HU intensity values from pCT. The sCTs are evaluated and are shown to be visually more realistic and anatomically consistent compared to a traditional method, particularly at transition boundaries.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Novel application of diffusion models: The expansion of the FOV is useful in practice in radiotherapy. The proposed approach is well-motivated and constructed logically. The use of CBCT and pCT for generating sCT is a promising development in synthetic image generation for radiotherapy.
    • Improved anatomical plausibility: The authors demonstrate that their approach produces sCTs with smoother transitions between CBCT-conditioned and inpainted regions. This can be important for accurate radiation dose computation.
    • Qualitative validation: The paper prioritizes visual evaluation to assess anatomical plausibility, seamless blending, and overall realism, which is appropriate given the lack of ground truth in the extended FOV and the challenges of quantitative evaluation.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Limited quantitative evaluation: While the authors acknowledge the difficulty of quantitative evaluation due to the lack of ground truth, the paper relies heavily on qualitative analysis. More (or rather, different) quantitative metrics, even if imperfect, would strengthen the evaluation. The current Table 1 is not convincing: the evaluation metric values of the proposed approach and a traditional method show tiny differences.
    • Retraining vs. fine-tuning: The authors chose to retrain a diffusion model instead of fine-tuning an existing CBCT-to-CT model due to time constraints. While they acknowledge that fine-tuning could improve efficiency, this limits the scope of the current work. It is unclear what the size and construction of their diffusion model are.
    • Lack of comparison to state-of-the-art: The paper mentions existing methods for FOV expansion, such as interpolation and CNN-based approaches, but does not provide a comparison to the most recent and advanced FOV expansion techniques.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • The authors articulate the problem, their proposed approach, and the key contributions clearly. The writing is mainly clear and easy to follow. The paper is mostly well-structured except for section 4, which could be substantially improved by rewriting and reorganizing the text.
    • Suggestions for improvements:
      • The last paragraph of section 2.1 is not relevant to this paper. Either remove this paragraph or replace it with more relevant information.
      • The authors write both “diffusion” and “VAE” while meaning the same object, which is confusing for the reader (are they using a diffusion model, a VAE, or both models separately?). I suggest limiting the use of “VAE”.
      • Section 4 mentions “The VAE was trained on…”. It is unclear what the VAE looks like regarding layer types and hyperparameters.
      • Could the authors specify which dataset they used? A publicly available dataset or a private dataset from a medical centre?
      • The whole text of “Evaluation challenges” and “Visual Evaluation” in section 4.2 (except for the last sentence) does not belong in a Results section. Moreover, the Results section shows limited results.
      • Table 1: Either use two or three decimals, not both. Also, please mention the evaluation metrics MAE, FID, and Dice in the text.
    • Writing: Page 6: Change “mutlidiffusion” to “multidiffusion” (observe: interchange the characters t and l).
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors did a good job of framing the problem and highlighting the motivation behind their work. The proposed approach has great potential in practice. However, the writing was sloppy, and the results were not convincing.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The proposed method is interesting and can be very relevant to radiotherapy (and perhaps other fields). This is also reflected by the approval for clinical deployment of the collaborating radiologists. The changes to the submission, as indicated by the authors’ feedback, would strongly improve the article. Regarding retraining vs. fine-tuning: The initial submission mentioned “retrain a diffusion model instead of fine-tuning an existing CBCT-to-CT model due to time constraints.”, which is counterintuitive. The authors’ feedback (“We trained from scratch to demonstrate general applicability.”) makes more sense, and hence I would suggest framing it this way.




Author Feedback

We thank the reviewers for their thoughtful feedback and are encouraged by the recognition of our method’s novelty and clinical relevance. Below, we address key concerns and planned revisions.

Quantitative Evaluation and Comparison to Prior Work (R1, R3)
- Comparison to deformable registration & CNN-based inpainting: Our method improves upon deformable registration by design, as it explicitly incorporates it as a component and then uses diffusion-based inpainting to mitigate artifacts and extend the FOV (included results already confirm this visually). CNN-based inpainting without pCT guidance generates results that are visually realistic but inconsistent with the patient’s anatomy, limiting clinical utility.
- As ground-truth is unavailable for extended FOV, we could amend existing experiments using synthetically restricted FOVs to enable quantitative comparisons with the above baselines, if reviewers find this helpful.
- We agree that broader comparisons would be valuable. However, to our knowledge, no published work tackles FOV extension in CBCT-to-CT translation with supporting metrics. Existing methods either address different clinical scenarios (e.g. CT-only or interpolation methods without CBCT-to-CT translation) or lack reported metrics (due to lack of ground truth as in our case), making direct comparison infeasible or uninformative.
- We note that our method has already been approved for clinical deployment by collaborating radiologists, following evaluation based on dose metrics and visual assessments. We will mention this in the revision and note that formal radiologist grading is planned for future work.
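One way to read the synthetically restricted FOV proposal is sketched below. It is a minimal illustration, not the authors' protocol: the cylindrical FOV shape, the radii, the air fill value of -1000 HU, the (z, y, x) axis order, and the helper names `cylindrical_fov_mask` and `restrict_fov` are all assumptions. The idea is to shrink the FOV artificially so that the band between the shrunken and the original FOV has a known reference against which the extended region can be scored.

```python
import numpy as np

def cylindrical_fov_mask(shape, radius_vox):
    """Boolean (z, y, x) mask, True inside a cylinder centred in the axial plane."""
    _, ny, nx = shape
    yy, xx = np.ogrid[:ny, :nx]
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    disk = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius_vox ** 2
    return np.broadcast_to(disk, shape).copy()

def restrict_fov(volume_hu, keep_mask, fill_hu=-1000.0):
    """Blank everything outside keep_mask (fill value is an assumption: air)."""
    return np.where(keep_mask, volume_hu, fill_hu)

# Toy volume standing in for a registered reference CT (uniform 40 HU).
shape = (2, 21, 21)
reference = np.full(shape, 40.0)

full_fov = cylindrical_fov_mask(shape, radius_vox=9)   # original FOV (assumed radius)
small_fov = cylindrical_fov_mask(shape, radius_vox=5)  # synthetically restricted FOV
band = full_fov & ~small_fov                           # region with known reference

# Input the model would see under the synthetic restriction.
cbct_restricted = restrict_fov(reference, small_fov)

# Stand-in for a model output: exact inside the small FOV, 10 HU off outside it.
prediction = np.where(small_fov, reference, 50.0)

# Error of the extended region, measured only where a reference exists.
band_mae = float(np.abs(prediction[band] - reference[band]).mean())
```

Any baseline (registration only, CNN-based inpainting) could be scored on the same `band` mask, which is what would make the comparison quantitative.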

Table 1 clarification (R2, R3) We will clarify that Tab. 1 measures CBCT-to-CT translation metrics within the original FOV only, and not the extended regions. Its purpose is to show that our method enables FOV extension without degrading the model’s base CBCT-to-CT performance, not to evaluate FOV extension performance. We stress that our objective is not to outperform specific CBCT-to-CT models, but to enable FOV extension without degrading performance within the original FOV—an outcome fully supported by Tab. 1.
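To make the scope of Tab. 1 concrete, metrics restricted to the original FOV can be computed by masking before averaging. The sketch below is illustrative only: the function names, the HU data range used for PSNR, and the toy arrays are assumptions, and the paper's exact metric definitions are not given in the rebuttal.

```python
import numpy as np

def masked_mae(sct_hu, ct_hu, fov_mask):
    """Mean absolute HU error over voxels inside the FOV mask only."""
    diff = sct_hu[fov_mask] - ct_hu[fov_mask]
    return float(np.abs(diff).mean())

def masked_psnr(sct_hu, ct_hu, fov_mask, data_range_hu):
    """PSNR in dB over voxels inside the FOV mask only."""
    mse = float(((sct_hu[fov_mask] - ct_hu[fov_mask]) ** 2).mean())
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range_hu ** 2 / mse))

# Toy data: the synthetic CT is 10 HU off inside the FOV and 500 HU off outside.
ct = np.zeros((4, 4, 4))
fov = np.zeros((4, 4, 4), dtype=bool)
fov[:2] = True
sct = np.where(fov, ct + 10.0, ct + 500.0)

mae_in_fov = masked_mae(sct, ct, fov)                       # ignores the 500 HU region
psnr_in_fov = masked_psnr(sct, ct, fov, data_range_hu=2000.0)
```

Because the mask excludes everything outside the original FOV, errors in the extended region do not affect these numbers, which matches the stated purpose of the table.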

Deformable registration dependency (R1) Major registration failures can impact results, but our use of a dilation margin largely mitigates small misalignments. As shown in Fig. 6, even imperfect registration yields smooth, artifact-free transitions, outperforming naive compositing (Fig. 2). We can include additional robustness analyses if permitted.
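The rebuttal does not say which mask is dilated or by how much. As a hedged illustration, the sketch below assumes the region to be regenerated (everything outside the CBCT FOV) is grown by a few voxels into the FOV, so that a band around the boundary is re-synthesized rather than copied. The function `dilate_mask`, the cross-shaped neighbourhood, and the margin value are assumptions; in practice a library routine such as a standard binary dilation would normally be used instead.

```python
import numpy as np

def dilate_mask(mask, margin_vox):
    """Grow a boolean mask by margin_vox voxels using a cross-shaped neighbourhood."""
    out = np.asarray(mask, dtype=bool).copy()
    for _ in range(margin_vox):
        # Pad with False so that np.roll cannot wrap True values across edges.
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = padded.copy()
        for axis in range(out.ndim):
            grown |= np.roll(padded, 1, axis=axis)
            grown |= np.roll(padded, -1, axis=axis)
        out = grown[(slice(1, -1),) * out.ndim]
    return out

# Toy 2D example: a 3x3 FOV inside a 7x7 slice.
fov = np.zeros((7, 7), dtype=bool)
fov[2:5, 2:5] = True

outside = ~fov
regenerate = dilate_mask(outside, margin_vox=1)  # region handed to inpainting
kept = ~regenerate                               # FOV core left untouched
```

With a margin of one voxel, only the centre voxel of the toy FOV is kept verbatim. A larger margin gives the model more room to absorb small registration errors, at the cost of regenerating more of the CBCT-conditioned region.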

Network and loss details (R2, R3) We apologize for the lack of detail. While our framework is agnostic to architecture and training method, we will revise the experimental section to include the following:
- We employ a 3D latent diffusion architecture, with a 3D VAE (SDXL-inspired) and a 3D U-Net denoiser (depth 3, with attention at the bottleneck), using magnitude-preserving layers.
- The U-Net (~300M params) is trained with an L2 denoising loss.
- The VAE (~250M params) uses L1, KL-divergence, spatial-gradient, adversarial, and perceptual losses.
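For reference, an L2 denoising objective for a conditional latent diffusion model is commonly written in the generic form below. The notation is an assumption made for illustration: $z_0$ denotes the VAE latent of the target CT, $c$ the conditioning inputs, $f_\theta$ the U-Net, and $y$ the regression target (noise or clean latent, depending on the parameterization). The rebuttal does not state the prediction target, the noise schedule $(\alpha_t, \sigma_t)$, or any loss weighting, so none of these should be read as the authors' choices.

```latex
\mathcal{L}_{\text{denoise}}(\theta)
  = \mathbb{E}_{z_0,\, c,\, t,\, \epsilon \sim \mathcal{N}(0, I)}
    \left[ \left\| y(z_0, \epsilon, t) - f_\theta(z_t, t, c) \right\|_2^2 \right],
\qquad
z_t = \alpha_t z_0 + \sigma_t \epsilon
```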

Dataset provenance (R2, R3) We apologize for omitting this: all data are private, anonymized CBCT and CT scans from multiple centers. We will clarify this.

Figure and text clarity (R2, R3) We will revise Section 4 to improve clarity and revise figure captions. Fig. 2 (a zoom of Fig. 7) shows discontinuities in naive merging that are not present in our approach. Fig. 4 illustrates that omitting partial conditioning during training leads to poor transitions beyond the CBCT FOV, validating a key design choice of our model.

Retraining vs. fine-tuning (R3) Fine-tuning an existing CBCT-to-CT model is simpler and fully compatible with our approach. We trained from scratch to demonstrate general applicability. If allowed, we can include results showing identical performance under fine-tuning.

We thank the reviewers again and believe the suggested changes will strengthen the final version of our paper.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


