Abstract
Field-of-view (FOV) recovery of truncated chest CT scans is crucial for accurate body composition analysis, which involves quantifying skeletal muscle and subcutaneous adipose tissue (SAT) on CT slices. This, in turn, enables disease prognostication. Here, we present a method for recovering truncated CT slices using generative image outpainting. We train a diffusion model and apply it to truncated CT slices generated by simulating a small FOV. Our model reliably recovers the truncated anatomy and outperforms the previous state-of-the-art despite being trained on 87% less data. Our code is available at https://github.com/michelleespranita/ct_palette.
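As a rough illustration of what "simulating a small FOV" can mean, the sketch below blanks out every pixel outside a centred circular region of a 2D slice. The circular, centred FOV, the function name `truncate_fov`, and the fill value of -1024 (air, in Hounsfield units) are assumptions made for this example; the abstract does not specify how the authors generate their truncation masks.

```python
def truncate_fov(slice_2d, radius, fill_value=-1024):
    """Simulate FOV truncation on a 2D slice given as a list of rows.

    Returns (truncated_slice, mask), where mask[y][x] is True for pixels
    that fall outside the kept circular FOV (i.e. the region to outpaint).
    """
    height = len(slice_2d)
    width = len(slice_2d[0])
    # Centre of the image in pixel coordinates (assumed FOV centre).
    cy = (height - 1) / 2.0
    cx = (width - 1) / 2.0

    truncated, mask = [], []
    for y, row in enumerate(slice_2d):
        out_row, mask_row = [], []
        for x, value in enumerate(row):
            outside = (y - cy) ** 2 + (x - cx) ** 2 > radius ** 2
            mask_row.append(outside)
            out_row.append(fill_value if outside else value)
        truncated.append(out_row)
        mask.append(mask_row)
    return truncated, mask


# Toy usage: a 5x5 "slice" of constant intensity, keeping a radius-1 disc.
toy_slice = [[100] * 5 for _ in range(5)]
toy_truncated, toy_mask = truncate_fov(toy_slice, radius=1)
```

The returned mask marks the region a generative model would be asked to fill in, while the untouched pixels serve as the conditioning input.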
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0872_paper.pdf
SharedIt Link: pending
SpringerLink (DOI): pending
Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0872_supp.pdf
Link to the Code Repository
https://github.com/michelleespranita/ct_palette
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Lim_Diffusionbased_MICCAI2024,
author = { Liman, Michelle Espranita and Rueckert, Daniel and Fintelmann, Florian J. and Müller, Philip},
title = { { Diffusion-based Generative Image Outpainting for Recovery of FOV-Truncated CT Images } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15001},
month = {October},
page = {pending}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes CT-Palette for the recovery of truncated chest CT scans. It uses a diffusion model (DM) to generate multiple recovered slices and selects the most representative one as the final prediction, based on muscle and SAT.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The idea of using masks to achieve more efficient recovery sounds intriguing.
2. The description of the paper is clear, which allows me to easily understand all the details.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
While the idea is intriguing, there appear to be some issues with the writing and experimental setup that need attention. My main concern is the weakness of the quantitative results.
1. The paper only presents RMSE metrics for muscle and SAT. While I understand that the primary goal of CT-Palette is to quantify body tissue components accurately, I’m worried about potential issues such as significant generation bias leading to artifacts, blurring, or even incorrect generation. Metrics such as PSNR and SSIM, computed under mask conditions, would strengthen the paper’s credibility.
2. Table 1 would be more comprehensive if it included additional metrics, such as the DSC, along with more statistical information, such as variance and significance tests.
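To make the "under mask conditions" suggestion concrete, here is a minimal sketch of a PSNR restricted to the masked (outpainted) pixels. The function name `masked_psnr`, the list-of-rows image representation, and the `data_range` argument are assumptions for this example, not something defined in the paper or the review.

```python
import math


def masked_psnr(reference, generated, mask, data_range):
    """PSNR computed only over pixels where mask is True.

    reference, generated: 2D images as lists of rows with equal shapes.
    mask: 2D list of booleans marking the pixels to evaluate.
    data_range: difference between the largest and smallest possible value.
    """
    squared_error = 0.0
    count = 0
    for ref_row, gen_row, mask_row in zip(reference, generated, mask):
        for ref_px, gen_px, selected in zip(ref_row, gen_row, mask_row):
            if selected:
                squared_error += (ref_px - gen_px) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images inside the mask
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy usage: errors of 10 at the two masked pixels, intensity range of 100.
reference = [[0, 0], [0, 0]]
generated = [[10, 0], [0, 10]]
mask = [[True, False], [False, True]]
psnr_value = masked_psnr(reference, generated, mask, data_range=100)
```

Restricting the average to the mask keeps the untouched, already correct pixels inside the original FOV from inflating the score.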
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Do you have any additional comments regarding the paper’s reproducibility?
No
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
It is necessary to show more convincing quantitative results from the experiment.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Reject — could be rejected, dependent on rebuttal (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
It is based on the strengths and weaknesses that were mentioned earlier.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
The authors addressed some of my concerns, so I raised the score.
Review #2
- Please describe the contribution of the paper
This study proposes a diffusion model-based method to recover FOV-truncated CT images for body composition analysis. Experiments show that the proposed method outperforms the prior state-of-the-art both quantitatively and qualitatively.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
– The paper proposes a novel way of using a diffusion model to address the important problem of FOV-truncated CT.
– It is good to see that the evaluation contains not only a quantitative comparison with prior methods but also an evaluation by radiologists.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
I think that the authors should not only assess the realism of the generated slices but also evaluate whether the generated slices exhibit characteristics of gender or age. For instance, it would be important to determine if slices generated for males and females are very similar, or if slices for older adults and younger individuals are alike.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
see weakness.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper is easy to follow and has sufficient experiments.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
The authors present an approach for FOV recovery of truncated chest CT scans, which (i) uses a body bounding box detector to estimate the location, and (ii) outpaints images given truncated CT information using diffusion-based image-to-image translation. Experiments show the method outperforms previous state-of-the-art models.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed method achieves good performance.
- The method is technically sound and easy to follow.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- The outpainting is simply formulated as an image-to-image translation task and adopts an existing diffusion-based method, which makes the contribution seem less novel.
- In the inference stage, the multiple inferences are confusing: n slices are generated and the most representative one is chosen based on metrics. This looks like generating multiple candidates and selecting the best one. Since the authors argue for a distribution of possible outputs for a given input, the final output should be a distribution rather than a single selected sample, and the advantage of the distribution should be demonstrated through evaluation; otherwise, the claim may be overstated.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
See weakness.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall this paper is satisfactory.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
The authors’ rebuttal addresses my concerns. Overall I think my review was fair and I would prefer to maintain the rating.
Author Feedback
We thank ALL reviewers for their valuable feedback and appreciate that they found our ideas intriguing (R1) and novel (R3) and acknowledged our qualitative evaluation by radiologists (R3) and our strong performance (R4).
R1:
We agree with R1 on the importance of analyzing the realism of the outpainted images by CT-Palette not only qualitatively by radiologists’ evaluation but also quantitatively. Hence, we had already included the Fréchet Inception Distance (FID) between ground-truth and outpainted images in the submitted supp. material. CT-Palette achieved the lowest FID, i.e., images generated by CT-Palette have the most similar distribution to the ground-truth compared to other models. Additionally, we will now also include PSNR and SSIM in the supp. material upon acceptance.
R1 suggested adding metrics like DSC and statistical information to Tab. 1. Please note that in the submitted supp. material, we had already included the DSC for muscle and SAT between ground-truth and outpainted images and the corresponding standard deviations. Adding the DSC scores to Tab. 1 would either require adding columns (thus using much smaller fonts) or rows (thus taking much more space). We therefore propose to summarize the DSC results in the main text more clearly. We’ve now additionally added standard deviations of the RMSE scores to Tab. 1 for the manuscript. In addition, we’ve now conducted the Wilcoxon signed-rank test to compare the muscle/SAT area distribution of ground-truth images and that of outpainted images. We find that there is no significant difference between the muscle/SAT area distribution of ground-truth images and that of images outpainted by our CT-Palette, while there is a significant difference when the test is conducted on other models.
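For readers unfamiliar with the test mentioned above, the following is a minimal, dependency-free sketch of a two-sided Wilcoxon signed-rank test on paired area measurements. It uses the large-sample normal approximation without tie or continuity corrections, so its p-values are only rough for small samples. The function name and the toy numbers are assumptions for illustration and are not the authors' data or implementation.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    x, y: paired measurements, e.g. ground-truth and outpainted muscle areas.
    Returns (statistic, p_value) with statistic = min(W+, W-).
    """
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    statistic = min(w_plus, w_minus)

    # Normal approximation of the null distribution (no tie correction).
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (statistic - mean) / std
    p_value = min(1.0, math.erfc(abs(z) / math.sqrt(2.0)))
    return statistic, p_value


# Toy usage with made-up paired areas (not values from the paper).
ground_truth_areas = [10, 12, 13, 14, 15]
outpainted_areas = [9, 14, 10, 18, 10]
stat, p = wilcoxon_signed_rank(ground_truth_areas, outpainted_areas)
```

A large p-value means the test finds no evidence that the paired areas differ systematically, which is the outcome the rebuttal reports for CT-Palette.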
R3:
We agree that evaluating whether the generated slices exhibit characteristics of gender/age would be interesting. However, we highlight that we’ve already included both qualitative evaluations with radiologists and quantitative evaluations. Considering the page limit, integrating these studies into the current paper would require removing some of these evaluations, which we consider more important. However, we plan to include the suggested studies in a journal extension.
R4:
While we acknowledge R4’s point that our approach is an application of an existing diffusion-based method, we argue that our work contains additional novelty through (i) the development of an efficient mask generation method (see 3rd paragraph of Section 2 “Method”) and (ii) a new inference scheme designed specifically for body composition analysis (see Section 2.3 “Inference”).
The reviewer mentions confusion regarding the multiple inference in the inference stage. We are sorry that we probably did not explain the multiple inference clearly enough. To clarify: The selection of the best image is not done by comparing each generated image to the ground-truth image, but by selecting the most representative image from the learned distribution, i.e., the ground-truth is not used in this stage. We chose to select the image closest to the median of the muscle and SAT area derived from the generated images as the best image because we found that compared to the mean, the median is a more robust metric against outliers among the generated images. Hence, the best image won’t be affected by outliers. Regarding the reviewer’s point of the final output not being a distribution, it’s unfortunately impossible to sample all images belonging to the distribution. Therefore, we’ve instead utilized the learned distribution by sampling multiple images instead of one. That way, we can select the most representative image among a larger pool of samples. Without multiple inference, i.e., by generating 1 image, the performance of our method slightly drops in all metrics except FID but is still superior to the baselines. We’ve now added these results to the manuscript.
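The selection step described above can be sketched as follows: given the muscle and SAT areas measured on each of the n generated samples, pick the sample closest to the per-tissue medians, without consulting the ground truth. How the two tissues are combined into one distance is not stated in the rebuttal; the relative squared distance used here, along with the function name, is an assumption for illustration.

```python
import statistics


def select_representative(muscle_areas, sat_areas):
    """Return the index of the sample closest to the median muscle/SAT areas.

    muscle_areas, sat_areas: one value per generated sample, same order.
    """
    if not muscle_areas or len(muscle_areas) != len(sat_areas):
        raise ValueError("need two non-empty lists of equal length")

    median_muscle = statistics.median(muscle_areas)
    median_sat = statistics.median(sat_areas)

    # Assumed design choice: divide by the median so that neither tissue
    # dominates the distance just because its areas are larger.
    muscle_scale = median_muscle if median_muscle != 0 else 1.0
    sat_scale = median_sat if median_sat != 0 else 1.0

    def distance(index):
        d_muscle = (muscle_areas[index] - median_muscle) / muscle_scale
        d_sat = (sat_areas[index] - median_sat) / sat_scale
        return d_muscle ** 2 + d_sat ** 2

    return min(range(len(muscle_areas)), key=distance)


# Toy usage: five samples, one of which (index 3) is an obvious outlier.
muscle = [100, 102, 98, 300, 101]
sat = [50, 51, 49, 10, 52]
best_index = select_representative(muscle, sat)
```

Because the medians barely move when a single sample is far off, the outlier at index 3 has no influence on which sample is chosen, which matches the robustness argument made in the rebuttal.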
Meta-Review
Meta-review #1
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
All reviewers agreed on acceptance.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
All reviewers agreed on acceptance.
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
All three reviewers gave “Weak Accept” after the rebuttal.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
All three reviewers gave “Weak Accept” after the rebuttal.