Abstract

Chest X-Ray (CXR) imaging for pulmonary diagnosis raises significant challenges, primarily because bone structures can obscure critical details necessary for accurate diagnosis. Recent advances in deep learning, particularly with diffusion models, offer significant promise for effectively minimizing the visibility of bone structures in CXR images, thereby improving clarity and diagnostic accuracy. Nevertheless, existing diffusion-based methods for bone suppression in CXR imaging struggle to balance the complete suppression of bones with preserving local texture details. Additionally, their high computational demand and extended processing time hinder their practical use in clinical settings. To address these limitations, we introduce a Global-Local Latent Consistency Model (GL-LCM) architecture. This model combines lung segmentation, dual-path sampling, and global-local fusion, enabling fast high-resolution bone suppression in CXR images. To tackle potential boundary artifacts and detail blurring in local-path sampling, we further propose Local-Enhanced Guidance, which addresses these issues without additional training. Comprehensive experiments on a self-collected dataset SZCH-X-Rays, and the public dataset JSRT, reveal that our GL-LCM delivers superior bone suppression and remarkable computational efficiency, significantly outperforming several competitive methods. Our code is available at https://github.com/diaoquesang/GL-LCM.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0658_paper.pdf

SharedIt Link: https://rdcu.be/eHw8x

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-05169-1_22

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/diaoquesang/GL-LCM

Link to the Dataset(s)

JSRT dataset: https://drive.google.com/file/d/1o-T5l2RKdT5J75eBsqajqAuHPfZnzPhj/view?usp=sharing

BibTex

@InProceedings{SunYif_GLLCM_MICCAI2025,
        author = { Sun, Yifei AND Chen, Zhanghao AND Zheng, Hao AND Lu, Yuqing AND Duan, Lixin AND Fan, Fenglei AND Elazab, Ahmed AND Wan, Xiang AND Wang, Changmiao AND Ge, Ruiquan},
        title = { { GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {222--232}
}

Reviews

Review #1

Please describe the contribution of the paper
A new diffusion based bone suppression method for CXR images is proposed, achieving SoTA performance on both internal and external benchmarks.

The important components of the paper are the following:
- A Global-Local Latent Consistency Model is proposed to achieve good performance in bone suppression and local texture preservation
- A Poisson fusion method is explored to merge global and local results
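For readers unfamiliar with the term, the fusion step can be illustrated with a minimal sketch. It assumes that "Poisson fusion" means gradient-domain (Poisson) image blending: inside the lung mask the fused image keeps the gradients of the local result, while its values on the mask boundary come from the global result. The function names, the Jacobi solver, and the toy data are illustrative assumptions and are not taken from the GL-LCM code.

```python
import numpy as np

def neighbor_sum(a):
    """Sum of the four axis-aligned neighbours of every pixel (edges replicated)."""
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]

def poisson_fuse(global_img, local_img, mask, iters=500):
    """Blend local_img into global_img inside mask using Jacobi iterations.

    Assumption: mask is boolean and does not touch the image border.
    """
    g = local_img.astype(np.float64)
    f = np.where(mask, g, global_img.astype(np.float64))
    # Negated discrete Laplacian of the local result: the gradients to preserve.
    lap_g = 4.0 * g - neighbor_sum(g)
    for _ in range(iters):
        update = (neighbor_sum(f) + lap_g) / 4.0
        f = np.where(mask, update, f)  # pixels outside the mask stay global
    return f

# Toy data: the "local" output equals the "global" one plus a brightness offset.
rng = np.random.default_rng(0)
global_out = rng.random((16, 16))
local_out = global_out + 0.3
lung_mask = np.zeros((16, 16), dtype=bool)
lung_mask[4:12, 4:12] = True

pasted = np.where(lung_mask, local_out, global_out)     # naive cut-and-paste
fused = poisson_fuse(global_out, local_out, lung_mask)  # gradient-domain blend
seam_paste = np.abs(pasted - global_out).max()
seam_fused = np.abs(fused - global_out).max()
```

In this toy case the naive paste leaves a visible intensity step of 0.3 at the mask boundary, whereas the gradient-domain blend removes the offset because only the gradients of the local result are kept. A real pipeline would more likely call an optimized solver than a fixed number of Jacobi iterations.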
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Interesting domain and useful application for image generation technique
- Thorough evaluation vs existing methods
- Very well written paper
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The overall performance improvement seems marginal, both qualitatively from the examples and quantitatively given the confidence intervals of the reported numbers.
- Given the performance similarity with BS-LDM, I am not convinced of the necessity of the local (masked CXR) path. I recommend an ablation with just the Global path vs. the Global+Local path with fusion.
- In my experience, a latency of 1~2 min is not that critical in the existing clinical workflow for most cases. I am more curious whether the result itself would improve if the same number of sampling steps were used.
- Why exclude pleural effusion and pneumothorax cases? I thought the goal was to facilitate disease detection after bone suppression.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

Only source code, not the dataset.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Overall seems to be a nice paper, but lacks the most critical ablation.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

I wouldn’t call 0.070 ± 0.018 to 0.056 ± 0.015 for LPIPS on the SZCH-X-Rays a 20% improvement. But happy with the ablation that I missed. So changing to accept.
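The disagreement in this comment comes down to simple arithmetic, which the following sketch works through. It treats LPIPS as a lower-is-better metric and treats the ± values as a generic spread, since this page does not state whether they are standard deviations or confidence intervals. The helper names are hypothetical.

```python
def relative_improvement(baseline, new):
    """Fractional reduction for a lower-is-better metric such as LPIPS."""
    return (baseline - new) / baseline

def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """True if [mean_a - half_a, mean_a + half_a] intersects the interval for b."""
    return (mean_a - half_a) <= (mean_b + half_b) and (mean_b - half_b) <= (mean_a + half_a)

# Values quoted in the comment above (LPIPS on SZCH-X-Rays).
rel = relative_improvement(0.070, 0.056)
overlap = intervals_overlap(0.070, 0.018, 0.056, 0.015)

print(f"relative LPIPS reduction: {rel:.1%}")       # prints 20.0%
print(f"mean ± spread intervals overlap: {overlap}")  # prints True
```

The relative reduction in the mean is indeed 20%, but the two intervals [0.052, 0.088] and [0.041, 0.071] overlap substantially, which is consistent with the reviewer's reluctance to read the gain as a clear 20% improvement.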

Review #2

Please describe the contribution of the paper

This paper proposes a dual-scale generative AI system based on Latent Consistency Model for bone suppression in CXR images. By proposing several modules, this paper achieved better performance and less processing time than existing works on a self-collected and a public dataset, demonstrating the practical potential of the system.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. Novelty: This paper introduces a novel approach to bone suppression by decomposing the problem into two components: global-scale image generation and local-scale lung area processing. Furthermore, the paper addresses potential challenges in local-scale image processing by introducing additional modules, such as the Local-Enhanced Guidance module, which effectively resolve these issues.
2. Sufficient Evaluation: The paper demonstrates methodological advancement by comparing its proposed approach with both universal and task-specific methods on both internal and public datasets. Additionally, the authors conduct detailed quantitative and qualitative ablation experiments to validate the effectiveness of the proposed modules.
3. Clinical Utility: The proposed method achieves an inference time of fewer than 10 seconds per patient, which is approximately 10 times faster than previous works. This significant improvement in speed makes the method suitable for practical application in clinical settings.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. High Computational Cost: While the paper claims computational efficiency, the model requires a GPU with 80GB memory even for a small batch size of 4 during training. This raises concerns about the fairness of comparisons with other works, as computational resources are not standardized across studies. The authors should supplement their results by including metrics related to computational cost, such as model parameters and FLOPs, in Tables 1 and 2. Additionally, the high demand for computational resources may limit practical deployment in real-world hospital settings, where access to GPUs with such large memory is often limited. This contradicts the paper’s claim that the method is “suitable for clinical applications.”
2. Potential Negative Influence on Downstream Diagnostic Tasks: The primary goal of bone suppression is to enhance diagnostic accuracy for downstream tasks such as pulmonary diagnosis. However, in Figure 2, the results appear to erase a focal lesion from the X-ray image (e.g., in the SZCH-X-Rays dataset, a lesion in the mid-left quadrant of the real data is absent in the output of GL-LCM). This observation raises concerns about the method’s ability to retain clinically relevant texture details, as claimed by the authors. To address this, the authors should provide additional results on X-ray images with abnormalities and evaluate the performance of their method on such data to ensure it does not compromise diagnostic utility.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Conducting a more thorough analysis of the computational cost, including metrics such as memory usage, model parameters, and FLOPs, would strengthen the methodological credibility of the proposed approach. Furthermore, presenting more comprehensive quantitative results on abnormal X-ray images, particularly those with clinically relevant pathologies, would demonstrate the method’s practical value in real-world diagnostic scenarios. These additions would not only address potential limitations but also enhance the overall robustness and applicability of the proposed method.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

In this study, the authors present a method that supports accurate diagnosis by preserving the intrinsic characteristics of organs obscured by anatomical structures in chest X-ray (CXR) images. They describe the inherent limitations of existing context-aware bone suppression techniques. The main merit of the proposed method is its use of global-local latent representations, which improves bone suppression performance. This approach demonstrates potential in mitigating artifacts and resolving unnatural image representations. The proposed method consists of three main stages: 1) initial segmentation, 2) dual-path processing to extract global and local features, and 3) fusion of the two results. Overall, each component of the framework is described with clarity, and the manuscript is well-structured and highly readable.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The paper clearly articulates the problem in CXR imaging and effectively highlights the limitations of existing approaches.
2. Each stage of the proposed method is well-motivated, with clear justification and explanation of its purpose.
3. Experiments were conducted on four datasets, and especially the experimental settings were seen as designed to ensure fair comparison.
4. Finally, the proposed method demonstrates superiority in the experimental results.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The rationale for employing specific distributions such as Gaussian and Poisson distributions during the sampling process in the diffusion stage is not sufficiently explained.
2. It is difficult to assess how the computational cost of the proposed method compares to that of existing approaches—whether it has increased or decreased remains unclear.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
1. In the proposed method, how is latent structural consistency independently maintained between the global and local representations prior to their fusion, particularly in the context of achieving global suppression of bone structures while retaining local details? How does the proposed method address potential redundancy arising from duplicated information across the global and local results?
2. CXR imaging noise is generally modeled using a Poisson distribution, like Rician noise in MRI. Given the importance of accurately reflecting the noise distribution for high-quality image restoration, particularly in the diffusion process described in Section 2.1, I recommend that the authors use Poisson noise, not Gaussian noise, within the GL-LCM.
3. The registration problem should be discussed. CXR imaging is inherently affected by registration variability. If non-aligned CXR images with varying anatomical positions are introduced into the GL-LCM, how is the suppression performance impacted?
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The proposed method outperforms the existing diffusion-based CXR bone suppression approaches. It would be valuable to evaluate the robustness of GL-LCM on self-corrected local datasets. Since CXR data may involve alignment and registration issues, verifying that the method performs reliably under such conditions would be especially meaningful.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We thank all reviewers for recognizing our paper’s “advanced methodology” (R1&R2), “comprehensive evaluation” (R1&R2&R3), “useful application” (R2&R3), and “very well written paper” (R1&R3). Below, we provide specific responses for each reviewer.

R1-Q1: Adopting Gaussian noise instead of Poisson noise. While Poisson noise is theoretically more aligned with the nature of X-ray imaging, diffusion models (including LCMs) conventionally adopt Gaussian noise due to its mathematical advantages in iterative denoising. Gaussian noise simplifies gradient-based optimization and ensures stable training, especially in high-dimensional latent spaces. In contrast, incorporating Poisson noise would require significant modifications to the diffusion framework, increasing computational complexity.

R1-Q2: Computational cost comparisons. As outlined in Table 3, the inference time of GL-LCM is about 10% of that required by other diffusion-based methods. Moreover, GL-LCM uses only 50 steps instead of 1000 steps (BS-Diff [1] & BS-LDM [21]), and the number of parameters in GL-LCM is 436.9M, close to that in BS-LDM (421.3M).
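The "mathematical advantages" mentioned in R1-Q1 can be made concrete with a small sketch. With Gaussian noise, the forward process has the closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, so any timestep can be sampled in one shot and inverted exactly when ε is known. Poisson noise is signal-dependent and has no such additive form. The linear β schedule below is the common DDPM default and is an assumption here, not a setting taken from the GL-LCM paper; the function names and the photon count are likewise illustrative.

```python
import numpy as np

def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def gaussian_forward(x0, t, alpha_bar, rng):
    """Sample x_t directly from x_0 using the closed-form Gaussian forward process."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def poisson_noisy(x0, photons, rng):
    """Signal-dependent Poisson noise for intensities in [0, 1] (illustrative)."""
    return rng.poisson(x0 * photons) / photons

rng = np.random.default_rng(0)
alpha_bar = linear_alpha_bar()
x0 = rng.random((4, 8, 8))  # stand-in for a latent tensor

# Gaussian case: x_0 is recovered exactly once the noise eps is known.
xt, eps = gaussian_forward(x0, 500, alpha_bar, rng)
x0_rec = (xt - np.sqrt(1.0 - alpha_bar[500]) * eps) / np.sqrt(alpha_bar[500])

# Poisson case: the noise variance grows with the signal level.
dark = poisson_noisy(np.full(10000, 0.1), 100, rng)
bright = poisson_noisy(np.full(10000, 0.9), 100, rng)
```

The exact recovery of `x0_rec` is what noise-prediction training relies on, while the differing variances of `dark` and `bright` show why a Poisson forward process cannot reuse the same additive formulation without modification.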

R2-Q1: Demand for computational resources. Although we stated in “Implementation Details” that we used an 80G GPU (batch size = 4), a 32G GPU was enough to complete the experiments based on our previous testing. Table 3 further indicates that our method is comparable to the diffusion-based BS-LDM [21] in terms of the model parameters, but with only 10% of the inference time.

R2-Q2: Potential Negative Influence on Downstream Diagnostic Tasks. This is a very significant point. However, the area in the mid-left quadrant shown in Fig. 2 is actually a metal implant, not a lesion. The fact that the soft tissue images generated by GL-LCM do not retain the features of the implant actually reflects its strength in learning accurate soft tissue distribution. GL-LCM aims to suppress bone structures while preserving the details of the lung texture that are crucial for diagnostic purposes. Non-diagnostic objects were not retained in the output, reflecting the potential benefits of GL-LCM in eliminating interfering factors.

R3-Q1: Seemingly marginal improvement of performance. GL-LCM stands out among current SOTA methods, with a 20.00% improvement in LPIPS on the SZCH-X-Rays dataset, surpassing that of the second-best BS-LDM [21] (2.78%) and the third-best Wang et al. [23] (8.86%), as outlined in Table 1. This reflects the significant improvement of GL-LCM in bone suppression performance.

R3-Q2: Ablation with just the Global path vs Global+Local path with fusion. It should be clarified that we have performed this ablation and presented the results in the 1st (×) and 4th (Poisson Fusion (Ours)) rows of Table 5. We mentioned in the paper that “For comparison, the absence of a fusion strategy (LCM baseline) is evaluated using only the global path.” The results indicate that our Global+Local path design improves PSNR/LPIPS by 1.99 dB/38.5% and 2.31 dB/29.7% over just the Global path on SZCH-X-Rays and JSRT, respectively. This justifies the necessity of the local path design.

R3-Q3: Impact of time latency and more timesteps. In a high-throughput screening environment, reducing processing time can significantly improve efficiency. Table 1, Table 2 and Fig. 2 indicate that a 50-step setup is sufficient for good performance of the GL-LCM. In contrast, based on our previous testing, using 1000 steps resulted in a marginal improvement (<2% on LPIPS).

R3-Q4: Excluding pleural effusion and pneumothorax cases. The presence of pneumothorax or pleural effusion leads to loss of lung texture, which severely interferes with the validity of the dual-energy subtraction data, and consequently with the training and evaluation of bone suppression algorithms. Moreover, the typical imaging features of pneumothorax and pleural effusion are sufficiently prominent in conventional radiographs, and it is better to maintain the integrity of the original image in such cases.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

The rebuttal addressed most of the concerns, and there is general agreement that the paper merits acceptance. The authors provided a clear justification for using Gaussian noise in the diffusion process and pointed out that the requested ablation study was already included in the submission, effectively demonstrating the benefit of the local path.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A


GL-LCM: Global-Local Latent Consistency Models for Fast High-Resolution Bone Suppression in Chest X-Ray Images
