Abstract

Teeth alignment plays an important role in orthodontic treatment. Automating the prediction of teeth alignment targets can significantly aid both doctors and patients. Existing methods typically use rule-based approaches or deep learning to generate the teeth alignment target. However, they usually require extra manual design by doctors, produce deformed tooth shapes, or fail to address severe misalignment cases. To tackle these problems, we introduce a pose prediction model that better describes the spatial representation of each tooth. We also incorporate geometric information to fully extract tooth features. In addition, we build a multi-scale Graph Convolutional Network (GCN) to characterize the relationships between teeth at different levels (global, local, intersection). Finally, the target pose of each tooth is predicted, so the movement of each tooth from its initial pose to its target pose can be obtained without deforming tooth shapes. Our method has been validated on clinical orthodontic treatment cases and shows promising results both qualitatively and quantitatively.
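The claim that predicting a pose avoids deforming tooth shapes follows from the fact that a rigid transform (rotation plus translation) preserves all distances between points. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a tooth pose is given as a unit quaternion `(w, x, y, z)` and a translation vector, and the function names `quat_to_matrix`, `apply_pose` and `dist` are hypothetical.

```python
import math

def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix.

    The quaternion is normalized first, so any non-zero input is accepted.
    """
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def apply_pose(points, q, t):
    """Rotate every point by q, then translate by t (a rigid transform)."""
    R = quat_to_matrix(q)
    return [
        tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def dist(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))
```

Because only a rotation and a translation are applied, `dist` between any two points of a tooth is the same before and after `apply_pose`, which is what "without deforming tooth shapes" means geometrically.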

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2802_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Den_TAPoseNet_MICCAI2024,
        author = { Deng, Qingxin and Yang, Xunyu and Huang, Minghan and Jiang, Landu and Zhang, Dian},
        title = { { TAPoseNet: Teeth Alignment based on Pose estimation via multi-scale Graph Convolutional Network } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15012},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces TAPoseNet to predict post-orthodontic teeth alignment targets without deforming teeth shapes. The authors develop a deep learning-based method in order to estimate the tooth poses, a multi-scale Graph Convolutional Network to characterize spatial relationships of teeth, and the prediction of post-orthodontic teeth alignment targets with clinical interpretability. The proposed method has been validated in clinical orthodontic treatment cases, and showed promising results both qualitatively and quantitatively.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    One of the main strengths of the paper is the novel approach of using a multi-scale Graph Convolutional Network (GCN) to characterize the spatial relationships of teeth in different levels (global, local, intersection). This innovative use of GCN allows for a comprehensive understanding of the teeth relationships, enabling the prediction of post-orthodontic teeth alignment targets without deforming teeth shapes. By incorporating geometric information and tooth poses, the method provides a holistic view of the teeth alignment process in order to contribute to more accurate and clinically feasible orthodontic treatment planning. The application of GCN in this context is particularly strong as it addresses the complex spatial relationships inherent in dental anatomy, by demonstrating the potential of advanced deep learning techniques in orthodontic treatment.
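To make the idea of describing tooth relationships at several scales concrete, here is a heavily simplified sketch. It is an assumption-laden illustration, not the paper's architecture: it shows only the neighbour-averaging step of a graph convolution (no learned weights), treats "local" as a chain of adjacent teeth and "global" as a fully connected graph, and leaves out the "intersection" level because this review does not define it. All function names are hypothetical.

```python
def mean_aggregate(features, neighbors):
    """Average each node's feature vector with those of its neighbours."""
    out = []
    for i, f in enumerate(features):
        group = [f] + [features[j] for j in neighbors[i]]
        out.append([sum(v[k] for v in group) / len(group) for k in range(len(f))])
    return out

def chain_graph(n):
    """Assumed 'local' graph: each tooth is linked to its adjacent teeth."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

def complete_graph(n):
    """Assumed 'global' graph: every tooth is linked to every other tooth."""
    return [[j for j in range(n) if j != i] for i in range(n)]

def multi_scale_features(features):
    """Concatenate the local and global aggregations for every tooth."""
    n = len(features)
    local = mean_aggregate(features, chain_graph(n))
    global_ = mean_aggregate(features, complete_graph(n))
    return [l + g for l, g in zip(local, global_)]
```

A trained network would multiply the aggregated features by learned weight matrices and apply a non-linearity; the sketch only shows how different graph structures expose different neighbourhoods to each tooth.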

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The study uses a dataset of 50 patients for training and 25 pairs for testing. This might be a relatively small dataset for deep learning models, and potentially lead to overfitting. TAPoseNet focuses on predicting the target position of each tooth individually. It might not fully capture the complex interactions between teeth and the overall mechanics during orthodontic treatment. The current model cannot handle scenarios with missing teeth or wisdom teeth, which are common occurrences in orthodontics. While the authors mention future work on incorporating occlusion, the current model might not fully account for the contact patterns between upper and lower teeth, which are crucial for a successful orthodontic outcome. The paper doesn’t explicitly discuss the interpretability of the model’s predictions. Understanding how the model arrives at its results would be valuable for orthodontists to gain trust and potentially refine the predictions based on their expertise. The authors used various loss functions, but failed to provide a loss convergence curve. The convergence curve is essential for evaluating model training effectiveness and identifying potential issues such as overfitting. Its absence limits our understanding of the model’s training process and raises concerns about its robustness.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Expand the dataset size and diversity to include a wider range of patient cases (age, ethnicity, types of malocclusions, etc.). Consider collaborating with hospitals or orthodontic clinics to access larger datasets. Incorporate the mechanics into the model. This could involve data on movement, size and shape, and the relationship between upper and lower teeth. Develop methods to handle missing teeth and wisdom teeth. This could involve techniques like imputing missing data, incorporating additional information about missing teeth (e.g., extraction sites), or designing specific modules for handling these cases. Explore techniques for making TAPoseNet’s predictions more interpretable for orthodontists. This could involve using visualization tools to highlight the features most influential in the model’s predictions or developing methods for explaining the model’s reasoning in a human-understandable way. Include a loss convergence curve for each loss function used during training. This will help assess the effectiveness of the training process and identify potential issues like overfitting. While the authors mention future work on incorporating occlusion, consider elaborating on potential methods for integrating contact patterns between upper and lower teeth into the current model architecture. Once the limitations are addressed, consider planning a clinical validation study to evaluate TAPoseNet’s performance in a real-world clinical setting with experienced orthodontists. By addressing these weaknesses and incorporating the suggested improvements, the authors can significantly strengthen TAPoseNet and enhance its potential as a valuable tool for orthodontic treatment planning.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The study on TAPoseNet is interesting, but there are a few reasons why it might need more work before being accepted. First, they only used data from 50 patients, which might not be enough to make accurate predictions for everyone. Second, the model only looks at individual teeth and doesn’t consider how the jaw itself moves during treatment. This could lead to unrealistic predictions. Finally, the model can’t handle cases where teeth are missing, which is a common situation for orthodontists. Because of these reasons, I think the paper needs some major revisions and could be rejected, dependent on rebuttal.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors responded to the major concerns in their paper.



Review #2

  • Please describe the contribution of the paper

    The authors present a deep learning method to derive shape and position information of teeth from point cloud data and, based on that data, identify the required transformation of each tooth to the optimal alignment. The method is shown to work on clinical data and can potentially be applied for orthodontic alignment planning.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    - The paper is well structured. Literature, methods and experiments are clearly described step by step.
    - The method is compared against current existing methods and shows better results.
    - The qualitative results and clinical implications are well described (Fig. 2 and related text).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    - The dataset seems rather small, with only 25 cases that include both pre- and post-treatment data.
    - The paper claims to learn the optimal tooth positions (i.e. transformations) only from the initial positions (it is stated that 50 post-treatment cases are used for training and 25 pre- and post-treatment cases for testing). However, it seems sensible that training the alignment target prediction module would also require paired pre- and post-treatment data (as ground truth for network training). It is not clear how the method can be trained without such pairs; an explanation would fit in the introduction, where this claim is made.
    - Additionally, if the data are not separated between patients, overfitting and data leakage could be problematic. It will be necessary to clarify these potential issues and clearly explain the data split strategy.
    - Fig. 3 a. is not well described: what do left and right show, and what are the values of the colormap? (There is also a typo in the description of Fig. 3 b.)
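The leakage concern above is usually addressed by splitting at the patient level, so that no patient contributes scans to more than one subset. A minimal sketch of such a split follows; the function name `split_by_patient`, the seed and the subset sizes are illustrative assumptions, not details taken from the paper.

```python
import random

def split_by_patient(patient_ids, n_train, n_val, seed=0):
    """Split unique patient IDs into disjoint train/val/test sets.

    Every scan of a patient must then be assigned to that patient's subset,
    which prevents the same patient from appearing in training and testing.
    """
    ids = sorted(set(patient_ids))      # deduplicate and fix the order
    rng = random.Random(seed)           # local RNG for a reproducible shuffle
    rng.shuffle(ids)
    train = set(ids[:n_train])
    val = set(ids[n_train:n_train + n_val])
    test = set(ids[n_train + n_val:])
    return train, val, test
```

For example, `split_by_patient(range(75), n_train=50, n_val=10)` leaves 15 patients for testing, and the three returned sets never overlap.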

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The overall structuring and content is quite good. Major concern is as stated above (weaknesses). It did not become clear to me how the training works without ground truth target poses. It would be necessary to explain this in a revision or rebuttal, together with a more detailed description on data split and training.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall structure and content are of high quality, and the results justify that the proposed method is an advancement over the state of the art. Because of the concerns stated above, the final decision would depend on the revision or rebuttal.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors answered the open questions in the rebuttal. In particular, the data split and the handling of the lack of pre-treatment data in the training dataset were explained. Assuming that these explanations will be included equally clearly in the updated version, and considering the previous overall good impression, I have adapted the rating of the manuscript.



Review #3

  • Please describe the contribution of the paper

    The study develops a pose estimation based framework for automated teeth alignment. The framework estimates the initial pose of the teeth and makes use of GNNs to predict the post-orthodontic teeth alignment by extracting representations at multiple scales.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Comment on novelty: As pointed out in the study, deep learning based methods for automatic teeth alignment exist [1, 2]. Prior work that makes use of graph-based methods for teeth alignment also exists [3]. The novelty of the method lies in the use of teeth pose estimation and multi-scale GNNs for the prediction of target teeth alignment.
    1) The study presents the state of the art very well. The state of the art is categorised, and the strengths and weaknesses of each work are clear.
    2) The study compares the results of the model with various state of the art approaches. A qualitative analysis is also presented.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    3) It is not clear what models are utilised to obtain the quaternion for the initial pose. This should be clarified.
    4) The study mentions that there were 50 patients with post-orthodontic scans, and 25 with both pre- and post-orthodontic scans. The 50 patients with post-orthodontic scans were used during the training. It is not clear what the target/initial pose would be in this case.
    5) The size of the dataset is small compared to other similar studies [1, 3].

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Additional comments: 6) Abbreviation of FDI should be defined. 7) Typographic errors should be fixed.

    References:
    [1] Wei, G., et al. “TANet: towards fully automatic tooth arrangement.” Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XV 16. Springer International Publishing, 2020.
    [2] Yang, L., Shi, Z., Wu, Y., Li, X., Zhou, K., Fu, H., Zheng, Y. “iOrthoPredictor: model-guided deep prediction of teeth alignment.” ACM Transactions on Graphics, 39(6), 216, 2020.
    [3] Wang, C., et al. “Tooth alignment network based on landmark constraints and hierarchical graph structure.” IEEE Transactions on Visualization and Computer Graphics (2022).

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the study proposes some novel methods for the problem of automatic teeth alignment, several points need to be clarified or explained. The dataset size is also small compared to other similar studies. Nonetheless, a “weak accept” is recommended because of novelty in methods and improvement in results over baselines.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors provide satisfactory clarifications for the open points about initial/target poses. Their response to the comment about dataset size is also reasonable. I urge the authors to provide these clarifications in the final version and make use of supplementary material if page limit is a concern.




Author Feedback

Thanks to all the reviewers for your valuable comments. We appreciate your recognition of:

  1. Our contributions (R1 “can potentially be applied for orthodontic alignment planning”, R3 “novel approach”, R4 “The novelty of the method lies in the use of teeth pose estimation and multi-scale GNNs for the prediction”)
  2. State of the art (R1 “clearly described”, R4 “very well” “qualitative analysis”)
  3. Experiments (R1 “well described”, R3 “promising results”, R4 “qualitative analysis”)

Q.1: The dataset size is small (R1, R3, R4); is there any potential overfitting or data leakage? (R1, R3)

The size and diversity of the training dataset have always been a concern of ours. We described the dataset in subsection 3.1. In detail, our work selected a total of 75 patients (50 training, 10 validation and 15 testing) based on medical criteria, aiming to cover common cases of orthodontic treatment. To obtain a larger dataset and address overfitting, we used data augmentation (e.g., random rotations or translations) to reverse-generate diverse initial (pre-orthodontic) positions as input for each training case from its ideal target (post-orthodontic) position, reflecting different orthodontic symptoms. The validation and test sets, in contrast, use real pre- and post-treatment data and are strictly separated from the training data, which avoids data leakage. This training strategy yielded superior model convergence. The corresponding figure was not included due to the page limit.

Q.2: The detailed explanation of the initial pose and target pose of teeth (R1, R4)

We have formulated the definition of pose and explained how we estimate the pose of teeth in the first two paragraphs of subsection 2.1. Specifically, during training, we take the pose of the ground-truth post-orthodontic teeth as the target pose. In each epoch, the same transformation that is applied to a tooth is also applied to its target pose to obtain the pseudo “initial pose”. At the inference stage, the initial pose of each tooth in the input is estimated by the pre-trained Teeth Pose Estimation module, so that we can predict the target pose.

Q.3: Typos and undefined abbreviations (R1, R4)

We will correct them in the final manuscript.

Q.4: Unclear description of Fig. 3.a (R1)

Fig. 3.a is a visualization of teeth pose estimation, as described in the annotation. We show two cases in the figure. In each case, the left sub-figure is the axis-aligned bounding box (red) in the world coordinate system, while the right sub-figure is the oriented bounding box (red) in the estimated local coordinate system. The teeth color just highlights the shape of the teeth point cloud.

Q.5: Lack of medical interpretability in the model’s predictions (R3)

Our predictions have medical interpretability, because the pose of the teeth in the ground-truth data is determined by dental anatomy.

Q.6: Not fully capturing the complex interactions between teeth (R3)

In the last sentence of section 1, we mentioned “we build a multi-scale Graph Convolutional Network to characterize the spatial relationships of teeth in multi-scale from different levels (global, local, intersection)”. This design helps to capture the complex interactions between teeth.

Q.7: Handling missing teeth and wisdom teeth, and the mentioned future work on incorporating occlusion (R3)

Section 4 mentioned that we will handle such issues in the future. Regarding missing teeth, our current work focuses on orthodontic cases that do not require implants or tooth extraction. Regarding wisdom teeth, they are typically assumed to be extracted in orthodontics. Regarding occlusion, our Geometric Information Extraction module can discern geometric information about occlusal grooves. By aggregating such features, our multi-scale Graph Convolutional Network architecture can make the occlusal relationship between the upper and lower jaws more reasonable at different levels.
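The augmentation described in Q.1 and Q.2 (perturbing an ideal post-orthodontic tooth with a random rotation and translation to obtain a pseudo initial position) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the rotation is taken about the tooth centroid, the angle and shift ranges are arbitrary, and `perturb_tooth` and `restore_tooth` are hypothetical names.

```python
import math
import random

def axis_angle_matrix(axis, angle):
    """Rodrigues' formula: rotation matrix for a rotation of `angle` about `axis`."""
    x, y, z = axis
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    c, s = math.cos(angle), math.sin(angle)
    C = 1.0 - c
    return [
        [c + x * x * C, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]

def perturb_tooth(points, rng, max_angle_deg=30.0, max_shift=2.0):
    """Return a pseudo initial tooth and the transform that produced it.

    Each point p becomes R (p - c) + c + t, where c is the tooth centroid.
    The ranges are assumed values, not taken from the paper.
    """
    c = [sum(p[i] for p in points) / len(points) for i in range(3)]
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    if all(a == 0.0 for a in axis):     # guard against a degenerate axis
        axis = [0.0, 0.0, 1.0]
    angle = math.radians(rng.uniform(-max_angle_deg, max_angle_deg))
    R = axis_angle_matrix(axis, angle)
    t = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    moved = [
        tuple(sum(R[i][k] * (p[k] - c[k]) for k in range(3)) + c[i] + t[i]
              for i in range(3))
        for p in points
    ]
    return moved, (R, t, c)

def restore_tooth(moved, transform):
    """Invert perturb_tooth: p = R^T (p' - c - t) + c."""
    R, t, c = transform
    return [
        tuple(sum(R[k][i] * (p[k] - c[k] - t[k]) for k in range(3)) + c[i]
              for i in range(3))
        for p in moved
    ]
```

The inverse transform recovered by `restore_tooth` plays the role of the ground-truth movement from the pseudo initial position back to the target, so each post-orthodontic case can yield many training pairs.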




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The study introduces TAPoseNet, an innovative approach to automatic teeth alignment. The paper presents a high-quality structure and content, showing that the proposed method advances the state of the art, yet it has limitations such as a small dataset of only 50 patients. The reviewers’ concerns have been addressed, and while the decision is quite borderline, there are no flaws in the experimental design, leading me to lean towards acceptance. If accepted, I strongly recommend that the authors incorporate the reviewers’ suggestions into the camera-ready version.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



