Abstract
Accurate segmentation of lung nodules in computed tomography (CT) images is crucial to advance the treatment of lung cancer. Methods based on diffusion probabilistic models (DPMs) are widely used in medical image segmentation tasks. Nevertheless, conventional DPM encounters challenges when addressing medical image segmentation issues, primarily attributed to the irregular structure of lung nodules and the inherent resemblance between lung nodules and their surrounding environments. Consequently, this study introduces an innovative architecture known as the dual-branch Diff-UNet to address the challenges associated with lung nodule segmentation effectively. Specifically, the denoising UNet in this architecture interactively processes the semantic information captured by the branches of the Transformer and the convolutional neural network (CNN) through bidirectional connection units. Furthermore, the feature fusion module (FFM) helps integrate the semantic features extracted by DPM with the locally detailed features captured by the segmentation network. Simultaneously, a lightweight cross-graph interaction (CGI) module is introduced in the decoder, which uses region and edge features as graph nodes to update and propagate cross-domain features and capture the characteristics of object boundaries. Finally, the multi-scale cross module (MCM) synergizes the deep features from the DPM with the edge features from the segmentation network, augmenting the network’s capability to comprehend images. The Diff-UNet has been proven effective through experiments on challenging datasets, including self-collected datasets and LUNA16.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1228_paper.pdf
SharedIt Link: pending
SpringerLink (DOI): pending
Supplementary Material: N/A
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Su_Crossgraph_MICCAI2024,
author = { Su, Huaqiang and Lei, Haijun and Guoliang, Chen and Lei, Baiying},
title = { { Cross-graph Interaction and Diffusion Probability Models for Lung Nodule Segmentation } },
booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15001},
month = {October},
page = {pending}
}
Reviews
Review #1
- Please describe the contribution of the paper
The authors propose a new lung nodule segmentation network, Diff-UNet, by incorporating a graph module and a diffusion probability network. The main contributions include 1) the incorporation of cross-graph interaction, and 2) feature fusion of global and local context.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) In the evaluation, the proposed method is validated and compared with a range of state-of-the-art methods on two datasets, including a public dataset. 2) The paper is clear, well-structured and easy to follow. 3) The ablation study effectively validates the contribution of the proposed modules.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1) The overall innovation is incremental. Firstly, the incorporation of graph-based modules (CGI in the paper) is not new in segmentation networks. These previous works were not well analysed or compared with the proposed CGI, which makes the contribution less convincing. One example is Xuan, Ping, et al. “Dynamic graph convolutional autoencoder with node-attribute-wise attention for kidney and tumor segmentation from CT volumes.” Knowledge-Based Systems 236 (2022): 107360. Secondly, the contribution of the feature fusion module via feature concatenation is trivial. 2) Reproducibility is a major concern: neither code nor sufficient implementation details are provided. Also, there are insufficient details on the self-collected dataset.
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
The reproducibility is a major concern.
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
The major limitations include incremental technical innovation and limited reproducibility, as specified in the “Main Weakness”. Other comments include: 1) It is not clear why more comparison methods are used in Table 1 (public dataset) than in Table 2 (private dataset). 2) Please clarify the baseline in Table 3. 3) Fig. 3 is not informative and is difficult to interpret. There is no color map, no ground truth, and no input images, which makes it difficult for readers to judge the quality of the segmentation. 4) More details on both datasets are required, especially for the self-collected dataset. Details such as image size, resolution, how the ground-truth annotation was collected, ethics considerations, etc. should be mentioned. 5) More implementation details should be given, not only for the proposed method, but also on how a fair comparison was ensured for the other implemented comparison methods. 6) The link between the technical motivation (irregular nodule shape, and resemblance between nodule and environment) and the proposed contribution (graph-based interaction, feature fusion) is weak. It is not clear how the proposed contribution addresses the aforementioned motivation.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Reject — could be rejected, dependent on rebuttal (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The authors presented a well-validated segmentation method. However, due to concerns about 1) incremental innovation and 2) insufficient information for reproducibility, I recommend weak reject for this paper.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
The authors did not sufficiently address my concerns in terms of novelties in the proposed graph-based module, and feature fusion. The contribution in the paper is a stack of modules, thus is incremental. As a result, I maintain my previous rating of the paper.
Review #2
- Please describe the contribution of the paper
The work proposes a dual-branch network for effective segmentation. A dedicated network to glean semantic information, together with a feature fusion network and a decoder, is employed in this context. In addition, a multi-scale cross module is used to further enhance the features.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed methodology and the experimental analysis are novel.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
More experimental analysis on diverse datasets and conditions is required.
Several types of diffusion techniques are available; it would be helpful to try such techniques and showcase the best combination.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Do you have any additional comments regarding the paper’s reproducibility?
No
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
More experimental analysis on diverse datasets and conditions, along with an analysis of different diffusion principles, would be beneficial.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Given the novelty but limited analysis, I recommend weak accept for the paper.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
Weak Accept — could be accepted, dependent on rebuttal (4)
- [Post rebuttal] Please justify your decision
Based on the rebuttal, I think the authors addressed all my comments; hence I accept the work.
Review #3
- Please describe the contribution of the paper
The paper introduces the incorporation of diffusion probabilistic models and graph-based interactions into the UNet model. Furthermore, a Transformer is integrated into the segmentation model to extract global features. Additionally, it employs multi-level feature extraction and fusion techniques, allowing the proposed model to extract lung nodules of varying sizes.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The key strength of this paper lies in 1) Local and global scale feature extraction using CNN and Transformer. 2) Multilevel feature fusion at two different stages. 3) Graph-based interaction of features in the UNet decoder.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1) The proposed model appears computationally expensive due to the inclusion of two computationally intensive models (Transformer, graph). Thus, there might be difficulties in practical application in a clinical setting. 2) The graph and transformer networks are typically utilized to discover global features or facilitate cross-interaction between features. However, what is the purpose of using graph networks alongside transformers when both serve the same function?
- Please rate the clarity and organization of this paper
Very Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1) The graph-based strategy appears to be a promising approach; however, it comes with extensive computation costs. How have the authors addressed this issue? 2) In Table 3, the authors have presented standard deviation values. What is the purpose of these values? Additionally, it would be beneficial if the authors provided more details on the experimental setup, such as the number of times the experiments were repeated to calculate the standard deviation, computational cost, etc. 3) It is intriguing that the model performed better on the self-collected dataset than on the LUNA16 dataset. Why do the authors think their proposed model achieved better performance on the self-collected dataset? What types of features are present in the self-collected dataset that contributed to this improved performance? Moreover, I suggest the authors include some details about the self-collected dataset. 4) Please include how this work benefits medical staff, along with further directions, in the conclusion section.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1) The paper presents a hybrid approach where different models (CNN, Transformer, GNN) have been used to take advantage of their feature extraction and interaction capabilities. 2) Feature fusion is performed at two different stages: first, in the segmentation network, and later, features from the segmentation network are fused with the features of the diffusion probabilities model.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Author Feedback
Thanks for the rebuttal invitation. We itemize our responses to significant points as follows:
(1) The role of combining Transformer and CGI and the computational cost (R1, R3, R4): The denoising UNet architecture combines CNN and Transformer to enable the DPM to capture global context information through bidirectional connection units. Transformer Bridge uses a multi-head channel attention mechanism to fuse different levels of features and capture long-range dependencies across channels. The decoder’s CGI module focuses on essential nodes and edges through projection and reprojection operations, capturing complex correlations between pixels and their boundary details. Although CGI is widely used in medical segmentation, its application in 3D CT images of lung nodules remains scarce. CGI employed for nodules cannot be directly compared to the segmentation of other lesions due to the heterogeneity among them. Diff-UNet exploits the sparsity of the graph structure and convolutional layers to reduce computational complexity.
(2) Description of Fig. 3 (R1): Fig. 3 depicts cases of 3D visualization of surface distances between the segmented surface and the ground truth. The segmentation result is closer to the ground truth when the green area is larger.
(3) Ablation experiment analysis (R1, R3, R4): To ensure a fair comparison, we build a baseline model consisting of DPM, ResNet encoder, Bridge Transformer, and decoder. Then, we introduce CGI, FFM, and MCM modules into the baseline model and study their impact on segmentation performance. As seen from the ablation experiment results in Table 3, Diff-UNet benefits from DPM’s ability to improve image smoothness and reduce noise through Markov chains and Transformer’s ability to capture global contextual features. Therefore, Diff-UNet can produce accurate segmentation results even in low-contrast or blurry lung nodule areas.
(4) Dataset information (R1, R3, R4): The LUNA16 dataset was collected from the largest public reference database of lung nodules: LIDC-IDRI. The database consists of clinical-dose and low-dose CT scans collected at seven participating academic institutions. This article discards scans with slice thickness greater than 3 mm, resulting in a final list of 888 scans. Our experiment also used CT images of lung nodules collected in the hospital from 2012 to 2019. The self-collected dataset consists of 1299 samples with a resolution of 1 mm. The datasets were ethically reviewed, and informed consent was obtained from the patients. Experienced radiologists used ITK-SNAP software for annotation based on surgical pathology. Compared with LUNA16, the nodules in the self-collected dataset show obvious texture contrast with the surrounding tissues. The unified annotation protocols and standards used in the self-collected dataset can reduce the subjectivity of annotation and help models segment nodules more accurately.
(5) Experiment details (R1, R3, R4): We implement Diff-UNet using PyTorch and calculate the standard deviation using 10-fold cross-validation, evaluating its consistency across different data partitions. The Diff-UNet and comparison models use the Adam optimizer with an initial learning rate of 0.00001 and a batch size of 2. Since the self-collected dataset has unique imaging characteristics, more comparative methods are required to verify its effectiveness.
(6) Medical implications and further directions of this work (R4): Considering that the structure of nodules in CT images is irregular and similar to that of the surrounding environment, manual inspection is time-consuming and relies on radiologists’ experience. This paper proposes a segmentation framework, namely Diff-UNet. This framework utilizes CGI and DPM to weaken CT image noise and capture boundary features to segment nodules. Further directions will explore semi-supervised and unsupervised learning to reduce reliance on large labeled datasets, making models more practical in data-scarce environments.
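The rebuttal states that the reported standard deviations come from 10-fold cross-validation. As an illustration only, the sketch below shows one common way such a figure might be computed: score each fold with the Dice coefficient, then take the mean and standard deviation of the per-fold scores. The function names (`dice_score`, `k_fold_indices`, `summarize_folds`), the contiguous fold split, and the use of the sample standard deviation are assumptions, not details confirmed by the authors.

```python
from statistics import mean, stdev


def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more sample
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def summarize_folds(fold_scores):
    """Mean and sample standard deviation of the per-fold scores."""
    return mean(fold_scores), stdev(fold_scores)


if __name__ == "__main__":
    # 1299 is the self-collected dataset size stated in the rebuttal.
    folds = k_fold_indices(1299, 10)
    print([len(f) for f in folds])

    # Placeholder per-fold Dice values for illustration; NOT results from the paper.
    placeholder_scores = [0.80, 0.90, 0.85]
    print(summarize_folds(placeholder_scores))
```

With this kind of protocol, a small standard deviation indicates that performance is stable across data partitions, which is the consistency the rebuttal refers to. Whether folds were shuffled or stratified is not stated in the source.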
Meta-Review
Meta-review #1
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
I have checked the reviews of this paper and there are no issues.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
I have checked the reviews of this paper and there are no issues.
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
good rebuttal period, one reviewer increased the score, the paper is weak accept.
- What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).
good rebuttal period, one reviewer increased the score, the paper is weak accept.