Abstract

Automatic lung organ segmentation on CT images is crucial for lung disease diagnosis. However, the unlimited voxel values and class imbalance of lung organs can lead to false-negative/positive and leakage issues in advanced methods. Additionally, some slender lung organs, e.g., bronchioles and arterioles, are easily lost during the recycled down/up-sampling procedure, causing severe discontinuity issues. Motivated by these problems, this paper introduces an effective lung organ segmentation method called the Fuzzy Attention-based Border Rendering (FABR) network. Since fuzzy logic can handle the uncertainty in feature extraction, fusing deep networks with fuzzy sets should be a viable route to better performance. Meanwhile, unlike prior top-tier methods that operate on all regular dense points, FABR depicts lung organ regions as cube-trees and focuses only on recycle-sampled border vulnerable points, rendering the severely discontinuous and false-negative/positive organ regions with a novel Global-Local Cube-tree Fusion (GLCF) module. Experimental results on four challenging airway and artery datasets demonstrate that our method achieves favorable performance.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2376_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2376_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zha_Fuzzy_MICCAI2024,
        author = { Zhang, Sheng and Nan, Yang and Fang, Yingying and Wang, Shiyi and Xing, Xiaodan and Gao, Zhifan and Yang, Guang},
        title = { { Fuzzy Attention-based Border Rendering Network for Lung Organ Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15009},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a fuzzy attention-based border rendering network for slender lung organ segmentation. It combines fuzzy logic with an attention mechanism that uses Gaussian membership functions to reduce feature uncertainty, and presents a global-local cube-tree fusion module that explicitly models the border vulnerable points produced by recycled down/up-sampling for accurate lung organ segmentation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The paper presents a Global-Local Cube-tree Fusion module that uses neighbor information to focus on border vulnerable points and improve lung organ segmentation. (2) The paper targets slender lung artery segmentation and proposes methods to preserve its continuity, which is more clinically significant.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. This paper proposes a fuzzy Attention-based Transformer-like Backbone, which shares similarities with the work presented in [1]. The authors failed to demonstrate significant improvements compared to the referenced work.
    2. The motivation of the proposed method is not clear.
    3. The current experimental results are far from sufficient to prove the effectiveness of the proposed method.
    4. Some important details of the method and the experiments are missing.

    [1] Nan, Yang, et al. “Fuzzy attention neural network to tackle discontinuity in airway segmentation.” IEEE Transactions on Neural Networks and Learning Systems (2023).

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. This paper discusses the utilization of adjacent voxel features from the current layer and the next layer to enhance local contextual information. However, the paper goes on to state that these features are further projected and combined with the original feature for the final prediction. Please clarify the motivation behind generating projected features from centroid and non-centroid features.
    2. I suggest conducting experiments by removing the global learnable feature to strengthen the argument that it contains general distribution information. This would make the findings more convincing.
    3. In the Global-Local Cube-tree Fusion module, the paper mentions the process of obtaining border vulnerable voxels by “recycling” downsampling and upsampling to generate masks, followed by evaluating the absolute difference. Please provide more detailed information regarding this procedure.
    4. I still have concerns regarding the omission of adjacent features from the last layer to enhance neighbor information. Please provide clarification on this matter.
    5. This paper requires additional experimental details regarding the dataset, including information about the train/test split and how private data is utilized.
    6. The paper mentions that the quadratic operation complexity of transformers limits their application in 3D high-resolution CT images due to hardware constraints. It is hoped that this paper will provide performance and computation comparisons to support this claim.
    7. This paper needs to provide more details regarding the evaluation metrics, such as DLR, DBR, and AMR.
    8. It is recommended that this paper includes a discussion on the selection of hyperparameters and their impact on the proposed approach.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
    1. The low credibility and novelty.
    2. The experiments are inadequate.
    3. This paper is not well written.
  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    This reviewer appreciates the authors’ efforts to address the concerns raised. The responses have addressed most of our concerns and provided clear justifications for the approach taken. Therefore, I have raised the score to ‘Weak Accept’.



Review #2

  • Please describe the contribution of the paper

    The article concerns itself with the segmentation of either lung airways or arteries in CT scans. The authors make the point that segmenting such structures is challenging because they are inherently multi-scale, but that the downsampling-upsampling layers in most segmentation networks derived from U-Net and equivalent tend to damage these structures.

    Consequently they propose to use a fuzzy attention-based transformer backbone based on ConvNeXt and a “Global-Local Cube Tree feature” fusion module.

    They train and compare their method on 2 public datasets featuring either airway or lung artery segmentation and test on an in-house lung fibrosis dataset and a public airway dataset, achieving very good results, particularly concerning the detected length and branch ratios.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper proposes innovative methods that point out and attempt to resolve the shortcomings of classical deep-learning segmentation methods, particularly concerning the segmentation of structures with very high surface-to-volume ratios (i.e., essentially thin structures). The results on the PARSE22 dataset, which features very thin arterioles, are particularly telling.

    The ablation study is well conducted.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The paper has two main shortcomings

    1- It is not very clear. The description of the whole architecture is very dense and lacks crucial details.

    For example, it is not clear what the authors mean by their “transformer-like” architecture.

    It is not clear what cube-trees are. Are the authors thinking of an oct-tree border representation?

    I don’t understand the relationship between the left part and the right part of the figure 3, particularly the color schemes that do not seem to match.

    2- Reproducibility is an issue

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Reproducibility in MICCAI is held in very high regard. Between the unclear description and the lack of code, I don’t believe the contribution is reproducible, and hence not very useful.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Some elements are very puzzling, for example at the bottom of page 4, they want to make the OR operator differentiable, and for this they use a max operator, which is not differentiable either…

    There are language issues that do not help with clarity.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a very promising method but that appears non-reproducible. Unless the authors clarify their paper and/or propose to publicise their code, I cannot vote for accepting this paper.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a novel fuzzy attention-based border rendering network for lung organ segmentation composed of two parts: 1) a ConvNeXt backbone that uses a novel channel-specific fuzzy attention module, and 2) a global-local cube tree fusion module that refines the coarse segmentation map by identifying vulnerable points on object boundaries. It does so by looking at differences between up-sampled masks predicted at each level of the ConvNeXt decoder and trying to minimize the information loss caused by the down-sampling and up-sampling.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Both the fuzzy attention and the global local cube tree fusion modules are novel contributions. The claims are backed by experiments and ablation studies. Both the qualitative and quantitative results suggest that the proposed model outperforms existing lung organ segmentation models.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Lacks clarity of presentation. A lot of important information is embedded in figures, but it is difficult to read and make sense of. Figure captions are not very informative. The manuscript also has many grammatical errors and typos.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Enough high-level details about the network architecture and training procedure are provided. While some low-level details are missing they shouldn’t hamper model implementation and reproducibility of the results.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please include a few lines to address the following:

    • Why fuzzy logic? Please explain what specific advantage fuzzy attention provides over traditional attention.
    • Is there any specific reason to choose Gaussian membership function beyond its computational simplicity? Would any other type of probabilistic membership function work?
    • How many GMFs were used in the attention module (m=?) How is this number chosen? How sensitive is the model to the choice of m?
    • If the GLCF module only focuses on the border vulnerable points, please explain the loss calculation in equation 6. Is boundary rendering loss computed and backpropagated only through the BVPs? How is it combined with the other loss term?
    • How are the learnable global features derived? What was their embedding dimension d? How was d chosen?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper proposes novel ideas the presentation needs improvement. Several essential details are embedded into figures without any explanation in the main text. The information in the figure is very hard to read and readers are likely to miss something. I would like to see detailed explanation of the figures in the main text or in the supplementary materials.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I don’t see enough improvement in clarity. My earlier concerns were not adequately addressed. Authors have not given me a reason to change my opinion.




Author Feedback

We appreciate all the feedback. Our work was evaluated as a novel (R3, R5), experimentally sound (R3, R5), and reproducible (R5) method. We have proposed a robust segmentation method, FABR, with two novelties: (1) a transformer-like fuzzy-attention network designed to handle uncertainty in feature representations; (2) a novel GLCF module that decouples organ regions into cube-trees, focuses only on recycle-sampled border vulnerable points, and addresses discontinuity and false-negative/positive issues. Comprehensive experiments on four lung organ datasets demonstrate our method’s efficacy.

To R3

  1. We have now added the description “transformer-like architecture (i.e., embedding 4× expansion/compression 1×1×1 convolution layers, like the FFN module of a transformer, in our fuzzy attention module)”.
  2. A cube-tree is like an oct-tree, but it uses a 3×3×3 size rather than the 2×2×2 of an oct-tree.
  3. We have revised Fig. 3 so that the colours match.
  4. For reproducibility, we will release the source code (based on MinkowskiEngine & ConvNeXt).
  5. The torch.maximum operator in the PyTorch library is fully differentiable, as are several other max functions, e.g., max_pool3d.
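The differentiability point in item 5 can be illustrated without any framework. The sketch below is plain Python, not PyTorch, and every function name in it is hypothetical. It shows that an elementwise maximum has a well-defined gradient wherever its two inputs differ, because the upstream gradient is simply routed to the larger input. Tie handling is a convention; splitting the gradient evenly is an assumption made here for illustration.

```python
def maximum_forward(a, b):
    """Elementwise maximum of two equal-length lists."""
    return [max(x, y) for x, y in zip(a, b)]


def maximum_backward(a, b, grad_out):
    """Route the upstream gradient to whichever input was larger.

    At a tie the maximum is not differentiable in the classical sense;
    this sketch splits the gradient evenly (an assumed convention).
    """
    grad_a, grad_b = [], []
    for x, y, g in zip(a, b, grad_out):
        if x > y:
            grad_a.append(g)
            grad_b.append(0.0)
        elif x < y:
            grad_a.append(0.0)
            grad_b.append(g)
        else:
            grad_a.append(0.5 * g)
            grad_b.append(0.5 * g)
    return grad_a, grad_b


def finite_difference(f, x, i, eps=1e-6):
    """Central finite difference of sum(f(x)) with respect to x[i]."""
    hi = list(x)
    lo = list(x)
    hi[i] += eps
    lo[i] -= eps
    return (sum(f(hi)) - sum(f(lo))) / (2 * eps)


# Hypothetical inputs with no ties, so the gradient is unambiguous.
a = [0.2, 0.9, 0.5]
b = [0.7, 0.1, 0.4]
grad_a, grad_b = maximum_backward(a, b, [1.0, 1.0, 1.0])
```

Comparing `grad_a` against `finite_difference(lambda v: maximum_forward(v, b), a, i)` for each index is a quick way to confirm that the routed gradient matches the numerical one away from ties.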

To R4

  1. Our fuzzy attention differs from the referenced work: we replaced the original 1×1×1 convolution with a transformer-like 4× expansion/compression 1×1×1 convolution and added efficient channel-specific SENet layers. The experiments already submitted in the supplementary material show significant improvements over the referenced work in the union metric CCFs (>2% on average), in the multi-level Dice score (>2%), and on each single metric.
  2. The motivations are stated in lines 9-16 of the abstract, in Fig. 1, in paragraphs 1-2 of subsection 2.1, and in lines 1-9 of subsection 2.2.
  3. Due to page limits, the original submission already included comprehensive ablation studies in the supplementary material.
  4. For the GLCF module, we combined the projected local contextual features F_{i,ff}^l with position embeddings, not with the original features. Projecting the features aligns their dimensions, and separating centroid and non-centroid features prevents the non-centroid features from submerging the centroid features, so that both are equally important.
  5. According to MICCAI policies, “New/additional experiment results in the rebuttal will lead to automatic rejection.”; thus, we did not add new results.
  6. Downsampling Fig. 1(c) yields Fig. 1(d), upsampling Fig. 1(d) yields Fig. 1(e), and Fig. 1(f) is the absolute difference between Fig. 1(c) and Fig. 1(e). (In the test phase, Fig. 1(c) is the binarized coarse prediction.)
  7. We did not omit adjacent features from the last layer; they were replaced by penultimate-layer features.
  8. We divided the BAS dataset into 72/18 cases for train/test; studies on the PARSE2022 dataset followed the official train/val/test split. Our in-house datasets were used for testing.
  9. DBR (= Nx/Ny) is the ratio of the number of correctly identified branches Nx (IoU > 0.8) to the number of ground-truth branches Ny. DLR (= Lx/Ly) is the ratio of the total length of correctly detected branches Lx to the total ground-truth branch length Ly. AMR (= Vx/Vy) is the ratio of the false-negative volume Vx to the ground-truth volume Vy.
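The border-vulnerable-point procedure in item 6 (downsample, upsample, then take the absolute difference) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rebuttal does not state which sampling operators are used, so strided subsampling, nearest-neighbour upsampling, a factor of 2, and all function names are assumptions. Volumes are plain nested lists whose side lengths are divisible by the factor.

```python
def downsample(mask, k=2):
    """Keep every k-th voxel along each axis (assumed strided subsampling)."""
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    return [[[mask[z][y][x] for x in range(0, width, k)]
             for y in range(0, height, k)]
            for z in range(0, depth, k)]


def upsample(mask, k=2):
    """Nearest-neighbour upsampling by a factor k along each axis."""
    depth, height, width = len(mask) * k, len(mask[0]) * k, len(mask[0][0]) * k
    return [[[mask[z // k][y // k][x // k] for x in range(width)]
             for y in range(height)]
            for z in range(depth)]


def border_vulnerable_points(mask, k=2):
    """Voxels where the mask differs from its down/up-sampled copy."""
    recycled = upsample(downsample(mask, k), k)
    return [(z, y, x)
            for z, plane in enumerate(mask)
            for y, row in enumerate(plane)
            for x, value in enumerate(row)
            if abs(value - recycled[z][y][x]) > 0]


# Hypothetical 4x4x4 binary mask: a one-voxel-wide tube plus a 2x2x2 block.
size = 4
volume = [[[0] * size for _ in range(size)] for _ in range(size)]
for x in range(size):
    volume[1][1][x] = 1          # thin tube, falls between sampled positions
for z in (2, 3):
    for y in (2, 3):
        for x in (0, 1):
            volume[z][y][x] = 1  # thick block, aligned with the sampling grid

bvps = border_vulnerable_points(volume)
```

In this toy volume the thin tube vanishes after the down/up-sampling round trip and is flagged, while the thick block survives unchanged, which mirrors the discontinuity problem the rebuttal describes for slender structures.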

To R5

  1. We have revised the captions as suggested. Fig. 1 now adds “(f) is the absolute difference of (c) & (e)”. Fig. 2 now adds “BVP detector is shown in Fig. 1. Noting the match between top-right boxes’ and bottom-right bars’ colors”. Fig. 3 now adds “DW: depth-wise convolution”. We have also corrected grammatical errors and typos.
  2. The reasons for choosing transformer-like fuzzy attention are detailed in paragraphs 1-2 of subsection 2.1.
  3. We chose GMFs because they have trainable, statistical parameters. Other MFs can be tried in future work.
  4. For efficiency and efficacy, m = 4 GMFs were used. The model was sensitive to m when m <= 4 and insensitive otherwise.
  5. With MinkowskiEngine, the boundary rendering loss is computed and backpropagated only through the BVPs. The total loss is the sum of the ordinary loss and the boundary rendering loss.
  6. Learnable global features are randomly initialized vectors, with d = {32, 64, 128, 256}.
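To make items 3 and 4 concrete, the sketch below evaluates m = 4 Gaussian membership functions on a scalar feature and aggregates them with a max, the fuzzy OR that Review #2 mentions. It is an illustration under stated assumptions rather than the paper's formulation: the centers, widths, function names, and the final multiplication of the feature by its membership weight are all hypothetical, and in a trained model the centers and widths would be learned.

```python
import math


def gaussian_membership(x, center, sigma):
    """Gaussian membership value in (0, 1], equal to 1 when x == center."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def fuzzy_or_attention(x, centers, sigmas):
    """Aggregate the m membership values with max (fuzzy OR)."""
    return max(gaussian_membership(x, c, s) for c, s in zip(centers, sigmas))


# Hypothetical parameters for m = 4 membership functions.
centers = [-1.0, 0.0, 1.0, 2.0]
sigmas = [0.5, 0.5, 0.5, 0.5]

# Hypothetical scalar features; the last one lies far from every center.
features = [-1.0, 0.25, 3.0]
weights = [fuzzy_or_attention(v, centers, sigmas) for v in features]

# Assumed use of the weights: rescale each feature by its membership.
attended = [w * v for w, v in zip(weights, features)]
```

A feature that sits on a center receives weight 1, and the weight decays smoothly as the feature moves away from all four centers, so features far from every learned mode are suppressed.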




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The proposed method is promising and interesting.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The proposed method is promising and interesting.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The novelty of the method and evaluation results in the experiments are good enough.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The novelty of the method and evaluation results in the experiments are good enough.


