

Abstract

Video-based Surgical Navigation (VBSN) inside the articular joint using an arthroscopic camera has proven to have important clinical benefits in arthroscopy. It works by referencing the anatomy and instruments with respect to the system of coordinates of a fiducial marker that is rigidly attached to the bone. In order to overlay surgical plans on the anatomy, VBSN performs registration of a pre-operative model with intra-operative data, which is acquired by means of an instrumented touch probe for surface reconstruction. The downside is that this procedure is typically time-consuming and may cause iatrogenic damage to the anatomy. Performing anatomy reconstruction by using solely the arthroscopic video overcomes these problems but raises new ones, namely the difficulty in accomplishing keypoint detection and matching in bone and cartilage regions that are often very low textured. This paper presents a thorough analysis of the performance of classical and learning-based approaches for keypoint matching in arthroscopic images acquired in the knee joint. It is demonstrated that by employing learning-based methods in such imagery, it becomes possible, for the first time, to perform registration in the context of VBSN without the aid of any instruments, i.e., in an instrument-free manner.
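As a rough illustration of how matched keypoints become 3D points once the camera pose relative to the fiducial marker is known, the following is a minimal sketch of standard linear (DLT) two-view triangulation. It is not taken from the paper: the function name, the projection matrices `P1` and `P2` (assumed to map marker-frame points to pixels), and the sample values are all hypothetical.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one 3D point from two pixel observations.

    P1, P2: 3x4 projection matrices (assumed to map marker-frame
            coordinates to pixels in frames 1 and 2).
    x1, x2: (u, v) pixel coordinates of the matched keypoint.
    """
    # Each observation contributes two linear constraints on the
    # homogeneous 3D point X: u * (P[2] @ X) - P[0] @ X = 0, etc.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest
    # singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]
```

In practice a point triangulated this way would typically be refined and checked for reprojection error before being used for registration.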

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3454_paper.pdf

SharedIt Link: https://rdcu.be/dV5xn

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72089-5_32

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3454_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Bap_Keypoint_MICCAI2024,
        author = { Baptista, Tânia and Raposo, Carolina and Marques, Miguel and Antunes, Michel and Barreto, Joao P.},
        title = { { Keypoint Matching for Instrument-Free 3D Registration in Video-based Surgical Navigation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15006},
        month = {October},
        pages = {339--348}
}

Reviews

Review #1

Please describe the contribution of the paper

The paper enables registration of a pre-operative bone model with intra-operative data, such as video, in a touchless manner (without using an instrumented touch probe). It focuses on the application to knee surgery, which is argued to be complex because the anatomy lacks texture for keypoint detection and matching.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Clinical Feasibility. The paper proposes a registration solution that does not require an instrumented touch probe, which is important during surgery to avoid undesired complications.
- Sufficiently detailed experimental setup. Authors comprehensively described the dataset, how image pairs were chosen, and registration data generation.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
- Lack of technical novelty. This paper used a standard pipeline for keypoint matching and a standard U-Net for semantic segmentation in arthroscopy, and only applied several combinations of existing feature extraction and feature matching methods.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

Parameters for the different feature extraction and matching methods were not discussed in detail.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
- While a comprehensive comparison of methods is useful, the paper in its current state is lacking in novelty. It would be more impactful if the authors could manage to remove the WM while still being able to perform the registration with a new network.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Reject — should be rejected, independent of rebuttal (2)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper mainly discusses the performance of different variations of existing feature extraction and matching methods, and thus lacks novelty for MICCAI.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Weak Reject — could be rejected, dependent on rebuttal (3)
[Post rebuttal] Please justify your decision

Authors provided some clarification on their contributions and existing works, but the work is still lacking novelty for MICCAI.

Review #2

Please describe the contribution of the paper

This paper introduces a validation of several image feature matching methods within the arthroscopic scene. Additionally, it presents a novel approach utilizing an artificial landmark to enhance 3D reconstruction.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

I found the idea of using an artificial landmark to estimate the 6 DoF position and orientation of the scope interesting.

The paper is well organized and easy to follow.
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The novelty of the method is limited. The concept of performing 3D reconstruction using SLAM/SfM and subsequently registering with pre-operative MRI/CT for surgical navigation has been extensively discussed in previous literature. Many such systems are already in practical use in operating rooms. This paper claims as its major novelty the use of artificial markers to assist the tracking of the scope in arthroscopic environments, because “the very low texture of bony surfaces is an added challenge” (page 3, line 6). However, since this method still relies on image features matched between two images to obtain 3D information, why can't the image features also be used by the SLAM algorithms? Note that this is a very easy fix since feature matching is usually an independent module of SLAM.

This paper claims the proposed method as a “touchless” and “instrument-free” navigation approach, but this is not true since the artificial landmark needs to be firmly attached to the tissue.

Another significant aspect of this paper is its evaluation of various existing image feature extraction and matching methods, providing valuable data to the community. However, this contribution may lack the necessary novelty expected for MICCAI.

Thus, the primary weakness of this paper is its lack of novelty.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

The code or related paper to recognize the landmarks is needed.
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

My major concerns are above.

Please give the CT segmentations, 3D surface models and their alignments for all cases. If there is space limitation you can list at least one and put others in the supplementary materials.

I suggest avoiding the claim of being the ‘first’ touchless navigation method in the abstract. Furthermore, it’s important to acknowledge that this method requires an additional instrument, namely the artificial landmark. There exist navigation approaches that do not need any additional devices.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Reject — should be rejected, independent of rebuttal (2)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The novelty of the approach is constrained, and the motivation is unclear. If the bottleneck of video-based 3D reconstruction in arthroscopic environments is attributed to feature matching and serves as the motivation/rationale for utilizing artificial landmarks, then why continue to rely on image features for obtaining a 3D point cloud through triangulation?
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

N/A
[Post rebuttal] Please justify your decision

N/A

Review #3

Please describe the contribution of the paper

Video-based Surgical Navigation (VBSN) is often used during an arthroscopic procedure, facilitated by registering surgical plans to the anatomy, by localizing anatomical landmarks with a tracked probe and performing a point-cloud-based registration of pre-op images to the target anatomy. Because this is a time-consuming and invasive process, there is an unmet need to develop a registration means that does not require localization of physical landmarks. The authors propose a methodology using a single implanted bone marker that allows the pose of the arthroscope to be determined at all times during the procedure. They validate this approach via a systematic analysis of classical and ML approaches for matching these key points in the knee and demonstrate that the DKM (Dense kernelized feature matching) approach, using the tracked marker, has the potential to register ACL tunnels in a touchless manner with sufficient accuracy for clinical implementation.
Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors have looked at traditional methods of performing touch-less registration of the pre-operative to the intra-operative environment, and identify their shortcomings. Their approach relies solely on images, that include a WM (implanted fiducial marker) captured by the arthroscope. The use of such an implanted marker is certainly novel, but has its downside as well (see below). The inclusion of such a marker in all views definitely leads to the ability to robustly track the camera pose – leading to significant improvements over a purely SLAM-based approach. Because all acquired images are explicitly acquired with respect to the WM-defined coordinate system, surface matching can be robustly captured over a wide field of view. Another advantage of their approach is that, because they have access to continuous known camera motion acquired at a high frame rate, they have the ability to select appropriate baselines that optimize 3D triangulation. To complement the tracking based on the WM, the authors have also developed a deep-learning model for automatic semantic image segmentation to classify images into bone (rigid) and cartilage (non-rigid) components, providing a segmentation mask to ensure that only rigid components are part of the surface matching process.

They have demonstrated the robustness of this approach by comparing its performance with 5 other state-of-the-art methods from the literature.
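The baseline selection mentioned in this review can be illustrated with a minimal sketch: given camera centres expressed in the marker (WM) frame, keep only frame pairs whose parallax angle at an approximate scene point falls within a chosen range. This is an assumption about how such a selection could work, not the authors' procedure; the function names and the 5 to 30 degree thresholds are hypothetical.

```python
import numpy as np

def parallax_angle_deg(c1, c2, point):
    """Angle (degrees) between the rays from two camera centres to a point."""
    r1 = point - c1
    r2 = point - c2
    cos_angle = (r1 @ r2) / (np.linalg.norm(r1) * np.linalg.norm(r2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

def select_pairs(centers, point, min_deg=5.0, max_deg=30.0):
    """Return (i, j, angle) for frame pairs with a usable parallax angle.

    centers: (N, 3) camera centres in the marker frame.
    point:   rough 3D location of the observed surface, same frame.
    min_deg, max_deg: hypothetical thresholds; too small an angle gives
        poorly conditioned triangulation, too large makes matching harder.
    """
    pairs = []
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            angle = parallax_angle_deg(centers[i], centers[j], point)
            if min_deg <= angle <= max_deg:
                pairs.append((i, j, angle))
    return pairs
```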
Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

The WM is rigidly fixed to the bony anatomy in the vicinity of the repair site. It would be very helpful to the reader to have been provided with more details of this device. What are its dimensions? What is it made of? How is it implanted? What kind of tracking precision does it provide? Although the matching procedure is touted as a “touchless” approach, isn’t the fact that it is fastened to the bone using a pin itself invasive? These questions, along with an indication of the feasibility of including the WM implantation step, need to be addressed in the discussion.
Please rate the clarity and organization of this paper

Very Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Do you have any additional comments regarding the paper’s reproducibility?

N/A
Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

Because this procedure seems to be heading toward clinical application, it would be extremely valuable to include the surgeon’s perspective on how practical this approach would be and how it would affect the workflow, as well as comments on the cost-benefit of implementing this approach in the clinic.

Figure 2 contains far too much information. It would be helpful to have the key components of DKM, in relation to the other approaches demonstrated, summarized more succinctly.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

Accept — should be accepted, independent of rebuttal (5)
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This is a compelling example of how an innovative approach (implantation of the WM in the field of view of the arthroscope) can facilitate image guidance.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

Accept — should be accepted, independent of rebuttal (5)
[Post rebuttal] Please justify your decision

I see no reason to change my original score

Author Feedback

We appreciate the reviewers’ constructive comments. While R4 considers our work an “innovative approach” and R3 states it provides “valuable data to the community”, R1 and R3 believe the main weakness of the paper is its lack of novelty. Additionally, R3 and R4 consider that the major contribution of the paper is the use of a fiducial marker for camera pose estimation, which is not what we intended to convey. We hope this rebuttal provides some clarification. Video-based Surgical Navigation as proposed in [16] already utilizes a fiducial marker (WM) that is implanted in the anatomy for the purpose of camera pose estimation. The contribution of this work is, as explained in the last paragraph of page 2, the demonstration that it is possible to perform “3D surface reconstruction and registration without using any instrumentation for digitization” in the context of arthroscopy. To the best of our knowledge, there exists no system or method capable of registering a 3D model with bone anatomy solely from arthroscopic footage and without the aid of instruments such as touch probes or structured-light devices. As stated in the last paragraph of Section 1, our only “required instrumentation is a WM rigidly attached to the anatomy” whose implantation does not disrupt the normal course of the medical procedure as it takes less than 30 seconds. Answering R4’s questions, the WM is a metal 3mm cube with an attached thread that is screwed into bone and provides submillimetric tracking accuracy. Its implantation is invasive but by being placed in bone (and not in cartilage or soft tissue), surgeons are not concerned about it causing any damage to the anatomy. Contrary to R3’s statement that there exist systems that perform SLAM without the aid of any fiducial markers, the literature reports that previous attempts to perform SLAM in arthroscopic footage were unfruitful [16]. 
For this reason, instead of going completely markerless, we decided to keep the WM and start by removing the probe from the procedure to accomplish touchless registration, this being the main contribution of our work. Given the recent advances in keypoint matching in challenging and low-textured scenarios, and since classical feature extraction approaches perform poorly in such conditions, this paper assesses the performance of different learning-based matchers in the task of 3D reconstruction from arthroscopic footage that is dominated by low texture, floating debris, moving tissue and specularities. These adverse conditions cause existing matchers to provide many wrong matches, which are filtered out by the known epipolar geometry retrieved by accurately tracking the WM. Without the WM, the extracted correspondences would be highly contaminated with outliers, hampering 3D registration. Another important contribution of our work is a new deep-learning model for semantic segmentation in arthroscopic video. Such a model makes it possible to identify regions in video frames that correspond to bone and cartilage, such that correspondences in other anatomical parts such as ligaments or tissue that pass the epipolar verification are filtered out. In conclusion, it is demonstrated, for the first time, that it is possible to accomplish touchless registration in the context of arthroscopy by using recent learning-based feature matchers combined with accurate camera pose estimation and semantic segmentation. This opens the way to important applications in the medical field. Addressing particular comments: [R1] “Parameters […] not discussed” - We will include details on the parameters used in each method.
[R3] “[…] bottleneck […] attributed to feature matching […]” - This work demonstrates that the recent advances in the literature of feature matching now make 3D reconstruction and registration in arthroscopic environments a possibility, when combined with accurate camera pose estimation and semantic segmentation.
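The two filtering steps described in this rebuttal (epipolar verification with a known camera pose, followed by rejection of matches outside the segmented anatomy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the relative pose `(R, t)` between the two frames is assumed to come from tracking the WM, `keep_mask1` is assumed to be a boolean segmentation mask for the first image, and the function names and the 2-pixel threshold are hypothetical.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix of a 3-vector."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(K, R, t):
    """Fundamental matrix for cameras x1 ~ K X and x2 ~ K (R X + t)."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv

def filter_matches(pts1, pts2, F, keep_mask1, max_dist_px=2.0):
    """Boolean mask of matches that pass both checks.

    pts1, pts2: (N, 2) pixel coordinates (u, v) of putative matches.
    F:          fundamental matrix from the known relative pose.
    keep_mask1: (H, W) boolean mask of image 1, True where the
                segmentation labels the anatomy of interest.
    max_dist_px: hypothetical epipolar distance threshold in pixels.
    """
    ones = np.ones((len(pts1), 1))
    x1 = np.hstack([pts1, ones])
    x2 = np.hstack([pts2, ones])
    # Epipolar lines in image 2, one per row: l = F @ x1.
    lines2 = x1 @ F.T
    # Point-to-line distance of each pts2 to its epipolar line.
    num = np.abs(np.sum(lines2 * x2, axis=1))
    den = np.hypot(lines2[:, 0], lines2[:, 1])
    dist = num / np.maximum(den, 1e-12)
    # Look up the segmentation label at each keypoint of image 1.
    h, w = keep_mask1.shape
    cols = np.clip(np.round(pts1[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(pts1[:, 1]).astype(int), 0, h - 1)
    on_anatomy = keep_mask1[rows, cols]
    return (dist < max_dist_px) & on_anatomy
```

Because the pose is known rather than estimated from the matches themselves, the epipolar check does not need a robust estimator such as RANSAC, which is one plausible reading of why the WM helps with outlier-heavy correspondences.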

Meta-Review

Meta-review #1

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Reject
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

While some reviewers provided positive feedback, the rebuttal addressing the technical novelty issue raised by another reviewer remains insufficient. Additionally, the reproducibility of the paper has not yet been verified.

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

While one reviewer is positive, the others express limited novelty as a primary reason for rejection. Since this is CAI, I consider this paper to be borderline.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

This is an interesting CAI paper. However, I would argue that the “touchless” claims made are not entirely accurate, and the title should be changed accordingly to reflect this. The method still requires the mounting of a fiducial onto the bone anatomy, which is invasive, even though it does not affect the surgical procedure. The validations on specimens represent an important step forward, as such evaluations are always challenging and provide crucial insights. While there is no technical novelty in the approach, using a small fiducial marker in the view of endoscopy adds an interesting aspect to this work.


Keypoint Matching for Instrument-Free 3D Registration in Video-based Surgical Navigation

Author(s):