Abstract

Recent developments in heatmap regression-based models have been central to anatomical landmark detection, yet their efficiency is often limited due to the lack of skeletal structure constraints. Despite the notable use of graph convolution networks (GCNs) in human pose estimation and facial landmark detection, manual construction of skeletal structures remains prevalent, presenting challenges in medical contexts with numerous non-intuitive structures. This paper introduces an innovative skeleton construction model for GCNs, integrating graph sparsity and Fiedler regularization, diverging from traditional manual methods. We provide both theoretical validation and a practical implementation of our model, demonstrating its real-world efficacy. Additionally, we have developed two new medical datasets tailored for this research, along with testing on an open dataset. Our results consistently show our method’s superior performance and versatility in anatomical landmark detection, establishing a new benchmark in the field, as evidenced by extensive testing across diverse datasets.
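
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the two quantities it names, not the authors' implementation. The Fiedler value is the second-smallest eigenvalue of a graph Laplacian (its algebraic connectivity), and graph sparsity is illustrated here with an L1 norm on the adjacency matrix. The function names, the weights `l1_weight` and `fiedler_weight`, and the way the two terms are summed are all assumptions made for illustration; the paper's actual loss is not reproduced on this page.

```python
import numpy as np


def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A of a symmetric, non-negative adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def fiedler_value(adj):
    """Second-smallest Laplacian eigenvalue (algebraic connectivity)."""
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(graph_laplacian(adj))
    return float(eigvals[1])


def skeleton_penalty(adj, l1_weight=0.01, fiedler_weight=0.1):
    """Hypothetical regularizer: L1 sparsity term plus a Fiedler term.

    Both weights and the additive combination are illustrative assumptions.
    """
    sparsity = np.abs(np.asarray(adj, dtype=float)).sum()
    return l1_weight * sparsity + fiedler_weight * fiedler_value(adj)


# Path graph on three landmarks: 0 - 1 - 2 (connected).
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)

# Landmark 2 is isolated, so the graph is disconnected.
split = np.array([[0, 1, 0],
                  [1, 0, 0],
                  [0, 0, 0]], dtype=float)

print(fiedler_value(path))   # positive for a connected graph
print(fiedler_value(split))  # (numerically) zero for a disconnected graph
```

The Fiedler value is zero exactly when the graph is disconnected, so a penalty on it pushes a learned adjacency toward weaker connectivity, while the L1 term pushes individual edge weights toward zero. In a trainable model the same computation would be written in an autodiff framework so that gradients reach the adjacency weights.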

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1602_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1602_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Wan_Learnable_MICCAI2024,
        author = { Wang, Yao and Chen, Jiahao and Huang, Wenjian and Dong, Pei and Qian, Zhen},
        title = { { Learnable Skeleton-Based Medical Landmark Estimation with Graph Sparsity and Fiedler Regularizations } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15012},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    • A novel skeleton reconstruction model using Fiedler regularization, which introduces graph-derived structural constraints to GCNs, marking a significant shift from traditional manual methods.
    • The FRGCN model, which includes the TAE and SAE, designed for landmark detection.
    • Extensive experiments demonstrating the model’s superior performance on three medical image datasets.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well-organized, with a clear progression from the introduction of the problem to the methodology, experiments, figures, and conclusion.
    • Ablation experiments are extensive and results are encouraging.
    • The paper is grounded in a solid theoretical foundation, with clear explanations of the concepts such as graph convolution networks and Fiedler regularization.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Generalization Across Different Medical Modalities: The paper primarily focuses on X-ray image datasets. The applicability of the FRGCN model to other medical imaging modalities, such as MRI or CT scans, remains unexplored. The differences in image characteristics and noise levels across modalities could affect the model’s performance.
    • Potential for Overfitting: The authors developed two new datasets for this research. While this is valuable, there is a risk of overfitting the model to these specific datasets, which may reduce its generalizability to other, unseen data.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Comparison with Manual Methods: Beyond performance metrics, how does the FRGCN model compare to traditional manual methods in terms of diagnostic accuracy and consistency?
    • Limitations and Future Work: Can you discuss any identified limitations of the current model and how they might be overcome in future research? What are the next steps for refining and expanding upon the FRGCN?
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is easy to follow and the experimental results are promising. The motivation of regularizing the network by minimizing the Fiedler value of a graph is clear and its effectiveness is proved in the experiments.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    In this paper, the authors propose a novel method called FRGCN, which leverages Fiedler regularization-based sparse graph representation for skeleton reconstruction. This method is agnostic to skeleton structure and is extendable to enhance performance in landmark detection across different medical imaging datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The proposed method is a good extension to the current paradigm of skeleton and landmark detection, considering the infeasibility of designing data-specific graph structures. (2) The combination of TAE and SAE is reasonable for addressing the mentioned task, although there is room for improvement. (3) The experiments, which include the selection of lower limb, pelvic, and cephalogram datasets, are comprehensive enough to support the authors’ claims. (4) The paper presentation is clear and concise. The introduction to the methodology is well-motivated and easy to follow. The mathematical notation is also well-structured.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) Although the motivation and combination of methods are indeed novel, the TAE and SAE components in FRGCN seem to be loosely connected. The only components that join the two parts are a multiplication between (a) and (b) in Figure 1. Would there not be more effective options? (2) In general, spatial attention in TAE may not be the optimal choice since it is considered somewhat old-fashioned. (3) Considering the computational complexity, there should be a better structure to fit this task rather than global attention. Some of the chosen baselines, such as GCN, appear to be outdated. Also, the ablation study for TAE is missing.

    Minor: (1) Although this combination is indeed novel, there are still a few papers that mention the Fiedler regularization approach in GNNs [1-3]. Highlighting the novelty of this work by comparing it with similar works in the domain would be helpful.

    [1] Tam E, Dunson D. Spectral Gap Regularization of Neural Networks[J]. arXiv preprint arXiv:2304.03096, 2023.
    [2] He Y, Gan Q, Wipf D, et al. Gnnrank: Learning global rankings from pairwise comparisons via directed graph neural networks[C]//international conference on machine learning. PMLR, 2022: 8581-8612.
    [3] Jiang, Songyao. Vision-Based Analysis of Human Face and Gesture: Dynamic Modeling, Synthesis and Recognition. Diss. Northeastern University, 2022.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors introduce the details of how to tailor the dataset but don’t mention either how to access it or any open-source plan. Therefore, the reproducibility of the method is questionable.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (1) Consider using a smoother connection between TAE and SAE in future follow-up work. (2) Consider adding an efficiency analysis to compare the gains in running speed, model agnosticism, extendability, and generalizability by using FRGCN.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Novelty of the methodology (specifically for this combination), smooth logic of paper presentation, and potential extendability.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a novel skeleton reconstruction model based on Fiedler regularization, which introduces graph-derived structure constraints to GCNs. They also introduce FRGCN, an effective model for landmark detection, by adding a Target-aware Encoder (TAE) and a Skeleton-aware Encoder (SAE). The improvements of the introduced methods are evidenced by the results on three medical image datasets.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed methods are innovative and have improved upon the conventional approaches. The description of the methodology is fairly clear, with easy-to-understand equations and pseudo code. The framework diagram shows the key components and steps, making it quite easy to follow. These are backed by experiments on three datasets, and the results are convincing in terms of the improvements.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Some of the wording of certain sentences could be improved. The demonstration and discussion of potential clinical applications and impacts are lacking. No link to the source code is mentioned, making it slightly more tricky to reproduce the results.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Some information on the key parameters of the pipeline should be provided for the sake of reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, the proposed pipeline is novel and the description is fairly clear. However, more information on the experimental settings, e.g. values of some key parameters, should be provided for reproducibility. Also, some demonstration and discussion of the clinical implications would be interesting and beneficial for the community.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of the proposed method, and clear description of the methodology. This is backed by the improvements in the results on three datasets.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely appreciate the reviewers’ insightful feedback and recognition of the significance of our study. Below, we address the major concerns raised and outline the enhancements in the manuscript.

R1#: Thanks. We will refine the language as advised. Once approved, the code will be publicly accessible. Also, we’ll enrich the discussion, exploring clinical implications such as assessing the lower limb gravity line and planning orthognathic surgeries.

R3#:
1. Generalization Across Different Medical Modalities: Currently, our focus is on landmark detection in 2D DR images. To expand our scope, we intend to extend our work to 3D and validate our findings across various medical modalities, including MRI and CT scans.
2. Potential for Overfitting: Rigorous validation experiments have been conducted to ensure the generalizability of our model to unseen data. We tested it on two new datasets and two independent public test datasets from the ISBI 2015 cephalograms dataset, as depicted in Supplementary Table 2. Techniques such as data augmentation and regularization were utilized to mitigate the risk of overfitting.
3. Enhancing the Comparison with Manual Methods Beyond Performance Metrics: In the final paper, we’ll complement performance metrics with diagnostic comparisons, including femoral and tibial angles and other relevant parameters for the lower limb dataset. Both FRGCN and manual methods will be used for evaluation.
4. Limitations and Future Work: While our paper validates that the graph structure via Fiedler regularization outperforms manual designs, questions about its optimality persist. In future work, we aim to provide a comprehensive mathematical derivation and proof method to identify the optimal graph structure, integrating perspectives like control theory.

R4#:
1. The TAE and SAE Components in FRGCN Seem Loosely Connected: Acknowledging your observation, we recognize the significance of developing a more efficient connection method. We will explore this direction further, building upon existing methodologies [1,2]. [1] Xu X. Structure-Enriched Topology..TMM, 2022. [2] Dai Y. RSGNet: Relation based skeleton graph..AAAI,2021.
2. Spatial Attention is Considered Old-Fashioned and Computationally Complex: Acknowledging spatial attention as a classic method, we recognize the potential for improvement with more advanced attention mechanisms. In our forthcoming research, we intend to explore advanced methods such as cross-spatial attention or spatially separable attention [3,4], while maintaining the core focus on a Fiedler regularization-based sparse graph representation for skeleton reconstruction. [3] Guo F. B2c-afm.. cross-spatial attention.. TIP 2023. [4] Xiangx C. Twins: Revisiting..Spatial Attention.. NIPS, 2021.
3. Outdated Baselines and Missing Ablation Study for TAE: Our choice of baselines aims to underscore the superiority of FRGCN over manually designed structures. While GCN serves this purpose effectively, we also incorporate a comparison with a novel baseline, ViTPose, which utilizes a transformer architecture. Additionally, we acknowledge the necessity of conducting ablation studies for TAE and commit to including these results in the final manuscript.
4. Highlighting the Novelty of This Work: We will cite the articles you mentioned to highlight the novelty of our work. While Fiedler regularization has been studied in spectral graph theory, our application in optimizing skeleton structure for landmark detection tasks introduces a practical optimization scheme. We appreciate the clarification regarding the focus of the mentioned article, and we will ensure appropriate citations. [5] Chung, F.R. Spectral graph theory. American Mathematical Society, 1997.
5. Efficiency Analysis: Your suggestion to include an efficiency analysis is duly noted. In our future work, we will consider evaluating running speed, model agnosticism, extendability, and generalizability to enhance the comprehensiveness of our study.




Meta-Review

Meta-review not available, early accepted paper.


