Abstract
In open spine surgery, navigation requires registration between the surgical field and volumetric CT. The spine pose changes between preoperative CT (pCT) and intraoperative patient positioning, and can further change after intraoperative CT (iCT) during surgery, degrading navigation accuracy. In this study, we developed a novel, fully automated end-to-end system for spine pose adjustment driven by intraoperative stereovision (iSV) images. Our system includes three innovative modules. First, we present a method to automatically generate weak bone labels in stereo images via co-registration with iCT images. The automated labeling process addresses the labor- and expertise-intensive challenges associated with supervised bone segmentation models, which typically require manually segmented labels for training. Second, we train a fully convolutional deep learning method that integrates complementary information from the color (RGB) and depth (D) images to automatically segment bone using the weak labels. Finally, the segmented bone structures are used to perform a pose-adjusted registration. Data collected from 5 porcine cadavers were used for training and validation, and data from 2 porcine cadavers were used for independent testing. Pose-adjusted registration accuracy across all lumbar levels of test specimens was 2.0±1.1 mm, compared to 2.5±1.5 mm using manual segmentation and 9.1±6.8 mm using a commercially available navigation system. The fully automated pose-adjusting registration framework compensated for spine motion between pCT and intraoperative positioning and overall achieved clinically acceptable accuracy. Our approach was not user- or expertise-dependent and holds potential for wider adoption in open spinal procedures for intraoperative spine motion correction. Code is available at https://github.com/wRossw/Sparse-XM-Spine-Pose-Adjustment.
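The weak-label step described in the abstract (and discussed in the reviews below) amounts to labeling each RGB-D surface point as bone when it lies within a distance threshold of the co-registered iCT bone surface. A minimal sketch, assuming nearest-neighbor point-to-surface distances and the 3 mm threshold mentioned in the reviews; the arrays and function name here are illustrative, not the paper's implementation:

```python
# Sketch of weak bone-label generation: after co-registering an RGB-D point
# cloud with the iCT bone surface, points within a distance threshold of the
# bone surface are labeled bone. Arrays and the 3 mm threshold are assumptions.
import numpy as np
from scipy.spatial import cKDTree

def weak_bone_labels(rgbd_points, ct_bone_points, threshold_mm=3.0):
    """Label each RGB-D point as bone (1) if its nearest neighbor on the
    co-registered CT bone surface lies within `threshold_mm`, else 0."""
    tree = cKDTree(ct_bone_points)       # KD-tree over CT bone-surface points
    dists, _ = tree.query(rgbd_points)   # nearest-neighbor distance per point
    return (dists <= threshold_mm).astype(np.uint8)

# Toy example: a flat "bone surface" at z = 0 (units: mm) and two query points.
ct_surface = np.array([[x, y, 0.0] for x in range(10) for y in range(10)])
queries = np.array([[2.0, 3.0, 1.0],    # 1 mm above the surface -> bone
                    [2.0, 3.0, 8.0]])   # 8 mm above the surface -> not bone
labels = weak_bone_labels(queries, ct_surface)
```

As Review #2 notes, such distance-based labeling inherits any error in the co-registration itself, which is why the per-image fiducial refinement described in the author feedback matters.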
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4661_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/wRossw/Sparse-XM-Spine-Pose-Adjustment
Link to the Dataset(s)
N/A
BibTex
@InProceedings{WarWil_SparseXM_MICCAI2025,
author = { Warner, William R. and Bhattacharya, Indrani and Evans, Linton T. and Mirza, Sohail K. and Paulsen, Keith D. and Fan, Xiaoyao},
title = { { Sparse-XM: Spine Pose Adjustment with RGB-D Bone Segmentation via Cross-Modality Label Transfer } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15968},
month = {September},
}
Reviews
Review #1
- Please describe the contribution of the paper
In this paper, the authors present methods for rigidly registering preoperative CT data to RGBD imaging for pose adjustment. In their proposed pipeline, the authors use a holistically-nested edge detection (HED) architecture to achieve a high-fidelity segmentation in RGBD space to improve their final pose adjustment registration. The authors also emphasize the utility of using co-registered RGBD and intraoperative CT images to generate automatic segmentation labels, which are used as weakly-supervised labels for the HED model.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper is clearly written and well organized, with strong and appropriate validation methods on a porcine dataset.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
One weakness of this paper is that the proposed methods are fairly similar to previously demonstrated prior work. I’d point the authors specifically to this prior work [1]. In this previous method, ground-truth bounding box annotations for spine segmentation in RGBD images are generated automatically by registering CT and RGBD camera space. These ground-truth labels were then used to train YOLOv8 to identify the region of interest containing the spine. Then, the bounding box region was input to the Segment Anything Model (SAM) to create a binary mask of spine segmentation. Even though the authors’ proposed method generates automatic labels slightly differently (using registered CT and RGBD to create a weak label, and then using a network to refine the mask), the authors cite one of the main contributions of their work as being automatic label generation. Other relevant related work using RGBD images for spine registration includes the Liebmann et al. paper cited by the authors ([9]). Even though [9] uses ex-vivo spine data only, the methodology is similar.
[1] https://arxiv.org/abs/2410.01443
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
In Table 1, the Sparse-XM TREs are lower than using the weak or manual masks, but is this improvement statistically significant? The TRE differences between the groups are fairly small.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Although the paper was clear and well written, I do not believe it presents a large enough leap forward in the field to recommend acceptance. While automatic label generation for training segmentation models is important for automation, previously proposed methods exist. Additionally, the reported rigid registration accuracy gains with the improved segmentation model are incremental (less than 1 mm on average).
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The paper describes a method that uses RGB-D images to correct for the spine positioning during surgery as compared to the pre-op data set. These errors can easily amount to several centimeters.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is nicely written
- The paper is easy to understand and also covers a very relevant idea.
- The paper does a basic ablation study.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The paper is often vague on the details. This is to some extent understandable, as the entire process involves many different steps and approaches.
- In Sec 2.2 the authors describe how they produce weak labels for their data sets, e.g., by segmenting the iCT bone images and registering them to the point clouds captured with the SBI sensor (type, model, manufacturer?). Then they assign points that are closer than 3mm as bone! This is in my opinion a bold assumption and a chicken-and-egg problem, as the two point clouds might have been registered wrongfully! Admittedly, they later refine this initial labelling step, but I wonder what success rate such an algorithm has. How often is this assumption not correct?
- In Sec 2.4 the authors write that they compile all the SBI images into one common frame. How is this done? Tracking information seems to be available, but how is this fusion done? Do they just transform the point clouds based on their tracking camera, or do they perform a proposed ICP registration (also taking into account the depth component and the RGB color component)? This is unclear and needs to be described.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper addresses a clinically relevant, yet unsolved, problem; however, many details are still missing to make the paper reproducible.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
The manuscript Sparse-XM proposes a bone segmentation and subsequent registration algorithm that leverages RGB-D image information with preoperative CT scans. The authors use a weak label generation algorithm as an initialization to accelerate the overall registration process, and test the proposed algorithm on lumbar spine levels. The data was collected from 5 porcine cadavers. Their pose-adjusted registration accuracy was 2.0±1.1 mm, which is a good accuracy level for this type of application.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed model fuses RGB images with depth maps. While the idea is not new in the field, the implementation of the technique to improve the segmentation task using adaptive-HED was innovative.
- Use of weak labels, instead of zero-shotting the labelling task, helps improve the segmentation and later the registration task.
- Accuracy of the proposed model (2.0±1.1 mm) surpasses both manual segmentation (2.5±1.5 mm) and commercial systems (9.1±6.8 mm), providing robust evidence of effectiveness.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- While the model shows potential, the number of test data points (252 images) and their diversity are a point of concern for me. I think this work could be strengthened with further testing on a different set of data.
- The authors use the generated weak labels (from step 1) along with RGB-D images to generate and refine bone segmentation. While the authors mention this step in the Abstract, later in the manuscript (introduction section detail) they omit mention of the use of weak labels for the segmentation task (step 2). This was also unclear in Section 2.3.
- The proposed work is not compared to any baseline or state-of-the-art models. Some potential comparisons: a. Gopalakrishnan, Vivek, Neel Dey, and Polina Golland. “Intraoperative 2d/3d image registration via differentiable x-ray rendering.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11662-11672. 2024. b. Ma, Xihan, Xiao Zhang, Yang Wang, Christopher J. Nycz, Arno Sungarian, Songbai Ji, Xinming Huang, and Haichong K. Zhang. “Cross-Modality Registration using Bone Surface Pointcloud for Robotic Ultrasound-Guided Spine Surgery.” Journal of Medical Robotics Research (2025): 2540004. c. Ma, Xihan, Xiao Zhang, Yang Wang, Christopher Nycz, Arno Sungarian, Songbai Ji, and Haichong K. Zhang. “Feasibility of Pointcloud-based Ultrasound-CT Registration towards Automated, Robot-Assisted Image-Guidance in Spine Surgery.” In 2024 International Symposium on Medical Robotics (ISMR), pp. 1-7. IEEE, 2024.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
I suggest the authors to add an overall algorithm table for reproducibility of the work.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- Reproducibility concerns and the lack of comparison with baseline methods to contextualize the proposed model’s performance.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We thank the reviewers for the thoughtful feedback and recognition of the clinical importance and robustness of our methods. R1/R3 (reproducibility): We will include additional implementation details and references to our previous studies. Source code will be released on GitHub. R1 (hardware): The custom system uses two 1080p cameras rigidly fixed to an active tracker. Reference to previous studies will be included. R1 (3mm weak-label threshold): The weak labels were generated using an accurate co-registration between CT and RGB-D data. Specifically, a fiducial-based rigid registration aligned CT and the surgical field to transform RGB-D data to CT space. A refinement registration was then performed between each RGB-D image and CT using fiducials available on each image (average: 8 ± 2 fiducials/image). The overall mean fiducial registration error was 0.6 ± 0.2mm across the dataset. The 3mm threshold was determined based on error sources that contributed to the RGB-D data, e.g., calibration, reconstruction, and tracking. We will add co-registration errors involved in weak label generation in §2.1. R3 (weak-label incorporation): We will include detail on incorporation of weak labels for training in §2.3. Each RGB & D image pair and its weak label formed a training triplet for the adapted HED network. Fig. 2 will be modified for illustration. An algorithm table will be inserted in §2.2 to list steps in weak-label generation with parameter settings. R1 (merge of RGB-D surfaces): We will clarify in §2.4 that RGB-D surfaces are merged into a single coordinate system using only tracking information (no inter-frame ICP) and then down-sampled on a voxel grid before level-by-level registration. R2 (significance of error reduction): We will discuss the statistical and clinical significance of target registration error (TRE) improvement of <1mm in §3.3.
Paired analyses across 12 independent levels in 2 animals showed Sparse-XM TRE was lower (statistically significant at α = 0.05) than with (i) weak labels, (ii) manual segmentation, and (iii) the uncorrected CT. Normality of the paired differences was assessed and the appropriate paired t-test or Wilcoxon signed-rank test applied. Though the TRE reduction between Sparse-XM and manual segmentation may seem small, eliminating expert-dependent tasks allows the approach to be adopted more broadly for clinical applications. Clinically, a 0.5mm TRE reduction corresponds to 10% of a 5mm lumbar pedicle screw safety margin [1]. R3 (data diversity): We agree with R3, and human data collection is underway for a future study. R2 (similarity to prior work): We will add reference [2] and discussion per R2’s suggestion to acknowledge similar previously proposed methods. Our study differs from prior work in three key aspects. (1) Labels: Although other similar automatic labeling methods have been proposed, Sparse-XM generates masks that were directly used to train a segmentation model, as opposed to bounding boxes for an object detection model (e.g., YOLO). (2) Experimental setting: In contrast to explanted spines, our data were acquired in situ in porcine cadavers with surrounding soft tissue in a surgical cavity, a clinically relevant and challenging scenario. (3) Use of depth: D was not used for image-space mask generation in [2]. Our adapted HED network ingests D as a separate channel in addition to RGB to learn features from the geometry contained in depth, and improved segmentation performance compared to RGB only (§2.6). R3 (other modalities): We will add citations and discussion to include registration frameworks using other modalities per R3’s suggestion. Fluoroscopy and ultrasound provide different features from RGB-D data, and a direct, quantitative comparison is outside the scope of this study.
[1] Chua et al., “The Optimal Screw Length of Lumbar Pedicle Screws during Minimally Invasive Surgery Fixation,” Asian Spine J (2019). [2] Massalimova et al., https://arxiv.org/pdf/2410.01443
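The statistical procedure described in the feedback (assess normality of the paired TRE differences, then apply a paired t-test or a Wilcoxon signed-rank test accordingly) can be sketched as follows. The TRE values here are synthetic placeholders, not the study's data, and the function name is illustrative:

```python
# Sketch of the paired significance test described in the author feedback:
# Shapiro-Wilk on the paired differences, then paired t-test if the
# differences look normal, otherwise Wilcoxon signed-rank.
import numpy as np
from scipy import stats

def paired_tre_test(tre_a, tre_b, alpha=0.05):
    """Return the p-value of a paired comparison between two sets of
    per-level TREs, choosing the test based on normality of differences."""
    diff = np.asarray(tre_a) - np.asarray(tre_b)
    if stats.shapiro(diff).pvalue > alpha:      # differences look normal
        result = stats.ttest_rel(tre_a, tre_b)  # paired t-test
    else:
        result = stats.wilcoxon(tre_a, tre_b)   # Wilcoxon signed-rank
    return result.pvalue

# Synthetic per-level TREs (mm) for 12 levels, mimicking two methods where
# one is consistently ~0.5 mm worse (numbers are made up for illustration).
rng = np.random.default_rng(0)
tre_sparse_xm = rng.normal(2.0, 0.5, 12)
tre_manual = tre_sparse_xm + rng.normal(0.5, 0.2, 12)
p = paired_tre_test(tre_sparse_xm, tre_manual)
```

With only 12 paired levels, the normality check matters: the Wilcoxon test protects against the small-sample non-normality that Review #1's significance question implicitly raises.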
Meta-Review
Meta-review #1
- Your recommendation
Provisional Accept
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A