Abstract

Reconstructing the 3D anatomical structures of the oral cavity, which originally reside in cone-beam CT (CBCT) scans, from a single 2D panoramic X-ray (PX) remains a critical yet challenging task, as it can effectively reduce radiation risks and treatment costs during diagnosis in digital dentistry. However, current methods are either error-prone or only trained and evaluated on small-scale datasets (fewer than 50 cases), resulting in compromised trustworthiness. In this paper, we propose PX2Tooth, a novel approach to reconstruct 3D teeth from a single PX image with a two-stage framework. First, we design PXSegNet to segment the permanent teeth from the PX images, providing clear positional, morphological, and categorical information for each tooth. Subsequently, we design a novel tooth generation network (TGNet) that learns to transform random point clouds into 3D teeth. TGNet integrates the segmented patch information and introduces a Prior Fusion Module (PFM) to enhance the generation quality, especially in the root apex region. Moreover, we construct a dataset comprising 499 pairs of CBCT scans and panoramic X-rays. Extensive experiments demonstrate that PX2Tooth can achieve an Intersection over Union (IoU) of 0.793, significantly surpassing previous methods, underscoring the great potential of artificial intelligence in digital dentistry.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1637_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1637_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Ma_PX2Tooth_MICCAI2024,
        author = { Ma, Wen and Wu, Huikai and Xiao, Zikai and Feng, Yang and Wu, Jian and Liu, Zuozhu},
        title = { { PX2Tooth: Reconstructing the 3D Point Cloud Teeth from a Single Panoramic X-ray } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    1. PX2Tooth reconstructs 3D dental models from single 2D panoramic X-rays using artificial intelligence. The process uses two AI models: PXSegNet for segmenting teeth in the X-ray and TGNet for building the 3D models. The methodology reduces radiation exposure and treatment costs compared to traditional 3D imaging techniques. PX2Tooth achieves a high accuracy in the Intersection over Union (IoU) score, outperforming previous methods. Authors validated their approach with a large dataset of 499 cases.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Integrates tooth segmentation and 3D reconstruction from 2D X-rays using novel AI models, called PXSegNet and TGNet, streamlining the conversion from image to 3D model.
    2. Uses a larger dataset of 499 cases compared to previous studies.
    3. Achieves high IoU score of 0.793, indicating high precision suitable for clinical use + potential for real-world application.
    4. Applies AI models to transform standard 2D dental X-rays into 3D models, advancing digital dentistry capabilities without more invasive or expensive imaging technologies.
    5. Experiments seem extensive + ablation studies rigorously assess the effectiveness and contribution of each component of the framework.
    6. Figures look good and are well created, specifically Figure 2 and 3.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. If the first step of segmenting the teeth goes wrong with the PXSegNet, I wonder if the whole 3D model could be off?
    2. The paper shows good experimental results and technical feasibility, but it would also be interesting to have a bit more information, or a limitations/discussion section, on clinical validation.
    3. It claims to be better than older methods and also tested on a larger dataset but doesn’t compare itself thoroughly with the latest dental imaging technologies.
    4. The technology is complex and might require more computer power like the GPUs used, limiting where it can be used.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I could not see whether the method is open source or whether there is any way to reproduce the experiments (such as an anonymized GitHub repository).

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please replace words like ‘meticulous’, ‘underscore’ or ‘pioneer’, since they can sometimes appear to have been generated by an LLM like ChatGPT. I also feel that there is some ‘overclaiming’ about the method; if revising the paper, please write more neutrally. Furthermore, I would have wished for at least a short limitations section to discuss this from a more objective point of view. I would also like to know whether the authors’ code and experiments are openly available somewhere, such as a GitHub repository or a website.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The figures look good, the method makes sense, and the authors clearly described their improvement compared to previous methods. I also feel this is an interesting and important topic. Since the authors seem mostly to combine previously existing methods into a new workflow, and nothing is mentioned about whether the code is available, I would argue for weak reject. However, I would also be open to accepting after a rebuttal, depending on the other reviews, and would only ask for minor changes in the manuscript text.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Thank you for the detailed feedback. I would accept this paper, and I feel the authors addressed all reviewer concerns very well and carefully. I suggest that the authors include all comments, justifications, and details from the rebuttal in their manuscript. Specifically, the clarifications of several reviewers’ questions could fit a limitations/discussion section.



Review #2

  • Please describe the contribution of the paper

    The paper provides an approach to reconstruct the 3D structure of teeth and the oral cavity from a single panoramic X-ray (PX) image. The approach, called PX2Tooth, uses two deep learning models, PXSegNet and TGNet, to achieve this. The proposed PXSegNet model is responsible for segmenting the permanent teeth from PX images, which are provided to TGNet as additional input. The TGNet model is trained on CBCT and PX pairs to produce a point cloud for a given PX image. Comparisons with the state of the art and analysis are provided.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is well written and is easy to understand. The proposed approach looks robust and better than the compared methods. The authors have addressed several concerns regarding accuracy from previous methods. Incorporation of a segmentation network prior to reconstruction is an interesting way to incorporate contextual information. Comparison and ablation study are provided.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There are some obvious grammatical problems with the paper that need to be corrected. E.g.:

    1. Sec 2.1: For PXSegNet can be seen in ….
    2. Sec 2.2: And alter the output…

    At some places, enough details are not provided. For example, in sec 2.2, no technical details of the novel Prior Fusion Module (PFM) are presented. There is also no citation to explain that.

    Overall, while the application looks great, the models used are already existing and nothing new there. PXSegNet is based on the widely used UNet model for segmentation, and TGNet uses the popular PointNet model. I do not see much novelty there.

    Furthermore, details of registration between CBCT and PX images are missing in section 3. A major effort seems to have gone into data preparation for training.

    The results in Fig. 2 and 3 show surface models of this approach, but I could not find any discussion in the text on how to perform that. The authors should discuss this.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The paper can be implemented by a graduate or PhD student.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The paper presents an interesting approach to 3D teeth reconstruction from PX images. The approach looks quite data-intensive for training, while the technical contribution looks weak. I suggest the following to strengthen the paper:

    1. Look at various other segmentation models and compare them to see which ones are better. More analysis of segmentation alone could be interesting.
    2. 3D reconstruction from images has come a long way. PointNet is an old technique (from 2017). I suggest that the authors review 3D reconstruction methods that use a single image. The latest techniques use diffusion-based 3D reconstruction, which could be explored here as well.
    3. The outcome of reconstruction is a point cloud. In many applications, that is not sufficient and a smooth surface model is often required. The authors might consider reconstruction approaches that use an implicit representation rather than a point cloud.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I’m inclined to give a Weak Reject due to lack of novelty, but since the application looks promising and the results are impressive, I’m providing a Weak Accept.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper describes a two-step, machine-learning-based process for reconstructing 3D shapes of teeth from 2D X-ray images alone. Their approach outperforms the state of the art.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (S1) The paper is well structured and easy to follow. (S2) The paper tackles a challenging problem and achieves satisfactory results. (S3) Combining 2D segmentation and 3D generative modeling as two separate steps is a clever way to reduce the problem’s complexity. (S4) The paper lists most hyperparameters that are required to reproduce the results. The authors claim they will release the source code upon acceptance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (W1) Fig. 2 contains too many details and needs to be simplified. Especially (a) and (b) can be abstracted much more. (W2) Section 2 and onwards contain significant spelling and grammar mistakes. Those sections need to be updated thoroughly before publication. (W3) Currently it is not clear to me how Figure 4 fits into the whole picture.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Once the source code has been released, the results of the paper should be reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Most of those comments are writing related. (C1) The abstract sounds great (C2) which is further integrated into to guide the generation of TGNet –> which is further integrated to guide the generation of TGNet (C3) remove “,through meticulous segmentation technology,” (C4) remove “The notable advantage of UNet lies in its U-shaped structure, making it highly adept at accurate segmentation of medical images [22].” (C4) spell PointNet with consistent capitalization (C5) reconstruction loss needs to be capitalized consistently and the acronym should be introduced at the words’ first appearance. (C6) We pioneered the construction of a dataset containing 499 cases –> We constructed a dataset containing 499 cases (C7) 3D Tooth –> 3D tooth (C8) I don’t understand this sentence: “, And alter the output which is not merely to determine whether a point belongs to a category but to generate the point cloud representing the position and shape of the teeth.” Please improve (C9) This sentence doesn’t make any sense: “The expansive path features feature map upsampling, 2x2 upconvolutions halving feature channels, and integration with cropped maps from the contracting path plus two subsequent 3x3 convolutions with ReLU.”

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper describes a well thought through technically challenging solution to the problem of 3D tooth reconstruction. For the paper to be accepted, the paper writing has to be improved significantly. This should be doable in a minor revision cycle.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We appreciate the efforts of the AC and all reviewers in handling our manuscript. Below, we provide a general response and individual responses to each reviewer’s comments.

General Response: (1) We will release the code and implementation details on GitHub. (2) We appreciate the reviewers’ detailed comments and apologize for the grammatical errors, which will be corrected in the revised version.

Response to Reviewer #1

Q1.2: Technical details of PFM. PFM integrates 3D features with 2D features to improve accuracy at the tooth tip (Fig. 3B). PFM uses multi-head cross-attention: the Query (Q) is projected from TGNet features via W_q, and the Key (K) and Value (V) are projected from PXSegNet features via W_k and W_v. After cross-attention, the fused outputs feed into subsequent TGNet layers. Detailed descriptions and settings will be in the revised version and code.

Q1.3: The novelty of this work and old techniques. Our contribution lies in a simple, effective framework for 3D CBCT reconstruction from 2D images, validated by a large-scale clinical dataset. Additionally, the proposed PFM integrates 2D information into 3D features, while the MB, UB, and RT losses enhance quality (Fig. 3B, Table 2). We have replaced the backbone with various networks. With TransUnet, 2D and 3D IoU decreased by 4.75% and 1.43%, respectively; with a simplified Unet, they decreased by 10.51% and 5.16%, which shows that our method is effective. We will seek better alternatives and discuss their applicability to dental images in the revised version.

Q1.4: Registration details. We will add the registration details and manual check efforts in the revised version.

Q1.5: Mesh surfaces vs. point clouds. We agree that smooth surface models are important for dental applications, hence we use mesh illustrations. We used the Ball-Pivoting Algorithm to convert point clouds to meshes. Mesh transformation is not our focus; all computations are based on images and point clouds. We will discuss implicit representation and mesh evaluations in the revised version.
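
The PFM description in Q1.2 (queries from the 3D branch, keys and values from the 2D branch) matches the standard multi-head cross-attention pattern. The following is a minimal pure-Python sketch of that pattern only, not the authors' implementation: the function name `cross_attention`, the argument names `point_feats` and `image_feats`, the absence of an output projection, and all shapes are assumptions made for illustration. A real model would use a deep-learning framework with learned weights.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(point_feats, image_feats, w_q, w_k, w_v, num_heads):
    """Fuse 2D features into 3D features (hypothetical stand-in for PFM).

    point_feats: N x d_in rows, assumed to come from the 3D branch (Query side).
    image_feats: M x d_in rows, assumed to come from the 2D branch (Key/Value side).
    Returns N rows of length d_model, one fused vector per query.
    """
    q = matmul(point_feats, w_q)
    k = matmul(image_feats, w_k)
    v = matmul(image_feats, w_v)
    d_model = len(q[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    fused = [[] for _ in q]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        k_h_t = [list(col) for col in zip(*k_h)]
        # Scaled dot-product scores between every query and every key.
        scores = matmul(q_h, k_h_t)
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_out = matmul(weights, v_h)
        # Concatenate this head's output onto each query's fused vector.
        for i, row in enumerate(head_out):
            fused[i].extend(row)
    return fused
```

Because each row of attention weights sums to 1, every fused vector is a convex combination of the value rows; when all 2D feature rows are identical, every query simply receives that row.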

Response to Reviewer #3

Q3.1: The impact of segmentation on generated parts. We replaced Unet with TransUnet to simulate inaccurate segmentation. TransUnet’s 4.57% drop in segmentation IoU led to only a 1.43% drop in 3D IoU, showing our model’s robustness. Missing teeth in the segmentation can be managed, as we generate teeth individually (up to 32). Additionally, our method still needs adjustments to be compatible with supernumerary teeth. Addressing these limitations is part of our future work.

Q3.2: Limitation on clinical validation. Current methods focus on teeth, while CBCT includes broader details such as the jawbone and nerve canals, which our method has yet to validate. The clinical utility of 2D X-rays for 3D CBCT and generalization to complex cases need validation. These limitations will be discussed in the revised version.

Q3.3: Overclaiming and code release. In the revised version, we will replace any inappropriate words, present the paper neutrally to avoid overclaiming, and release the code on GitHub.

Q3.4: GPU cost. Our method is cost-effective, training on a single RTX 3090 for 56.23 hours with 0.692 GB VRAM and 103.94 million parameters. It infers 32 teeth in 6.03 seconds, making it highly efficient for medical center deployment.

Q3.5: Comparison with the latest technologies. The question might address the baselines or the accuracy of the latest CBCT machines. Table 1 and Fig. 3A compare against the latest methods, including X2Teeth and Occudent. Our CBCT machines have a 0.125 to 0.47 mm resolution, which is an acceptable distance error for many clinical applications.
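
The robustness argument in Q3.1 rests on IoU measured in 2D (segmentation) and in 3D (reconstruction). As a hedged illustration of the 3D case, the sketch below computes IoU between two point clouds by first snapping points to a voxel grid. The voxelization step, the `voxel_size` default, and the function names are assumptions for illustration; the rebuttal does not state how the 3D IoU is computed.

```python
import math


def voxelize(points, voxel_size):
    """Map (x, y, z) points to the set of integer voxel indices they occupy."""
    return {
        (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        for x, y, z in points
    }


def iou(pred_voxels, gt_voxels):
    """Intersection over Union of two sets of occupied voxels."""
    union = pred_voxels | gt_voxels
    if not union:
        # Convention chosen here: two empty shapes count as a perfect match.
        return 1.0
    return len(pred_voxels & gt_voxels) / len(union)


def point_cloud_iou(pred_points, gt_points, voxel_size=0.5):
    """IoU between two point clouds after snapping both to the same grid."""
    return iou(voxelize(pred_points, voxel_size), voxelize(gt_points, voxel_size))
```

The result depends on the chosen grid: a coarser `voxel_size` makes two clouds look more alike, so reported IoU values are only comparable when the same resolution is used.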

Response to Reviewer #4

Q4.1: Simplifying Fig. 2 and grammar mistakes. We deeply appreciate your comments. Fig. 2 will be appropriately simplified in the revised manuscript, and we will correct all writing mistakes in the revised version.

Q4.3: Description of Fig. 4. Fig. 4 visualizes tooth-level analysis, aiding future reconstruction improvements. Teeth No. 5 and No. 7 show high IoU (Table 3, Fig. 4B), while No. 6 shows lower IoU with less root detail (Fig. 4A).




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The problem is interesting, and all reviewers give positive scores after rebuttal. I recommend “accept” at this stage.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The problem is interesting, and all reviewers give positive scores after rebuttal. I recommend “accept” at this stage.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The reviewers agree after the rebuttal to accept the paper.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The reviewers agree after the rebuttal to accept the paper.


