Abstract

Deformation recovery from laparoscopic images benefits many downstream applications like robot planning, intraoperative navigation and surgical safety assessment. We define tissue deformation as time-variant surface structure and displacement. Besides, we also pay attention to the surface strain, which bridges the visual observation and the tissue biomechanical status, for which continuous pointwise surface mapping and tracking are necessary. Previous SLAM-based methods cannot cope with instrument-induced occlusion and severe scene deformation, while the neural field-based ones are offline and scene-specific, which hinders their application in continuous mapping. Moreover, neither approach meets the requirement of continuous pointwise tracking. To overcome these limitations, we assume a deformable environment and a movable window through which an observer depicts the environment’s 3D structure on a canonical canvas as maps in a process named impasto. The observer performs panoramic impasto for the currently and previously observed 3D structure in a two-step online approach: optimization and fusion. The optimization of the maps compensates for the error in the observation of the structure and the tracking by preserving spatiotemporal smoothness, while the fusion is for merging the estimated and the newly observed maps by ensuring visibility. Experiments were conducted using ex vivo and in vivo stereo laparoscopic datasets where tool-tissue interaction occurs and large camera motion exists. Results demonstrate that the proposed online method is robust to instrument-induced occlusion, capable of estimating surface strain, and can continuously reconstruct and track surface points regardless of camera motion. Code is available at: https://github.com/bmpelab/trans_window_panoramic_impasto.git

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/4075_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/4075_supp.zip

Link to the Code Repository

https://github.com/bmpelab/trans_window_panoramic_impasto.git

Link to the Dataset(s)

https://github.com/bmpelab/trans_window_panoramic_impasto.git

BibTex

@InProceedings{Che_TransWindow_MICCAI2024,
        author = { Chen, Jiahe and Kobayashi, Etsuko and Sakuma, Ichiro and Tomii, Naoki},
        title = { { Trans-Window Panoramic Impasto for Online Tissue Deformation Recovery } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15006},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a two-step online method, consisting of optimization and fusion, to obtain a 3D panoramic map for deformation recovery from laparoscopic images.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper presents an original way to use data and can obtain a 3D panoramic map for deformation recovery from laparoscopic images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The online method for tissue deformation recovery in this paper is [1], and the 3D reconstruction and 2D tracking methods are [2] and [3]. Thus, this paper appears to propose a simple fusion approach bridging three methods. The novelty of this paper is my main concern; in particular, a comparison with [1] is needed.

    [1] Chen, J., Hara, K., Kobayashi, E. et al. Occlusion-robust scene flow-based tissue deformation recovery incorporating a mesh optimization model. Int J CARS 18, 1043–1051 (2023). https://doi.org/10.1007/s11548-023-02889-z

    [2] Hui T W, Loy C C. LiteFlowNet3: Resolving correspondence ambiguity for more accurate optical flow estimation. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX 16. Springer International Publishing, 2020: 169-184.

    [3] L. Lipson, Z. Teed and J. Deng, “RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching,” 2021 International Conference on 3D Vision (3DV), London, United Kingdom, 2021, pp. 218-227, doi: 10.1109/3DV53792.2021.00032.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    This paper bridges three methods: the online method for tissue deformation recovery, the 3D reconstruction method, and the 2D tracking method. The authors need to compare them and elaborate on the innovations, especially for the deformation recovery method. With the camera window removed, this simple fusion method looks like it cannot predict the deformation of earlier frames.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The novelty of this paper is my main concern

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a way to recreate panoramic deformable endoscopic scenes from stereo video. They represent the deformation of the scene over time with a canonical map that transforms uv coordinates from a texture space to 3D surface space. New observations are fused using optical flow and scene flow in combination with a regularization to ensure smooth deformation in non-observed areas. They evaluate their method qualitatively on datasets with instrument masks, and quantitatively using depth estimation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The authors present a very novel and promising concept: mapping texture coordinates from 3D space to 2D enables representation of deformation. This could be very useful for tracking and mapping applications in the future.

    • The authors demonstrate the use for strain estimation, a promising application.

    • The authors show high-quality qualitative results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Weakness:

    • The authors mention that prior methods cannot deal with occlusion. If methods are given masks, as your method is, then some literature is missing. For example, SuPer [1] is capable of representing motion and tracking points.

    • The authors claim their method is online but there is no mention of speed.

    • The authors claim deformation recovery but use a metric that evaluates depth reconstruction rather than deformation. There are datasets with labelled tissue deformation, of which [1] is one.

    • The authors mention they use only ‘smooth subsets’ for depth, but details are not provided.

    • It is unclear how the old map is fused with the new one in eqn. 2.5. What happens with re-observations?

    • There is no explanation for the empirical selection of the \alpha parameter for smoothness.

    • There is no mention of GT acquisition, or how scanning is performed.

    • The authors state their method is performant under occlusion, but omit any mention of masks in their dataset. These are critical for their method to work.

    • No detail is provided on long-term effects, such as performance when returning to previously observed regions.

    1. Li Y, Richter F, Lu J, Funk EK, Orosco RK, Zhu J, et al. Super: A surgical perception framework for endoscopic tissue manipulation with surgical robotics. IEEE Robotics and Automation Letters. 2020;5(2):2294–301.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I really enjoy the concept of using a canonical texture to represent 3D space, as the real world is 2.5D in a sense. Figure 1 is very compelling.

    I would recommend using NeRFs instead of NeF. I have not seen the latter abbreviation used.

    The use of ‘the observer’ in the text is confusing at times. Maybe rewording it as the camera, in camera coordinate space would make this easier to understand.

    I understand the use of the subscript c for canonical, but I would either: clarify this up front, pick another format or not use subscripts.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the paper is very impressive in its novelty, some of the implementation detail is unclear to me, and the claims seem overly large. Deformation recovery should not be claimed unless there is data showing such. Although depth is a good proxy, it does not actually provide ground truth for deformation. Lack of analysis on datasets with labelled deformation, in addition to implementation details and online performance being unclear are the reasons for my recommendation.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    I have changed my opinion from weak reject to accept.

    In the rebuttal, the author clarifies many of my points, such as the method being online (5s per frame), and noting that re-observation causes replacement of re-observed spaces. They note this could be amended with bundle adjustment. I appreciate the clarification of how ground truth is acquired, and although depth is an indirect proxy for measuring deformation, I agree this is how many papers quantify their method, which means it is not a limitation solely held by this paper. As said in the rebuttal: “There have been two widely used ways to evaluate the accuracy of deformation, geometry-based and tracking-based.”, where geometry-based denotes usage of depth maps.

    I would also like to agree with the rebuttal’s response to R1. I see this paper as a novel method that is different from the [1] cited by R1.

    I recommend an accept due to the extremely interesting novelty of this methodology presented, in combination with the clarification provided in the rebuttal on ground truth.

    1/4 rank in rebuttal stack.



Review #3

  • Please describe the contribution of the paper

    The paper proposes a new online approach to recover the deformable 3D surface from 2D laparoscopic images and 3D point clouds. The paper proposes an energy function for optimization to estimate the surface map deformation based on the measured scene flow map. The estimation of the deformation is followed by a fusion between the estimated deformed surface and the observed surface to get the final recovered surface. The optical flow is further utilized to estimate the surface strain. The authors claim that the approach provides online performance and higher accuracy than existing methods, and in addition enables continuous pointwise tracking.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main novelty of the work is using optimization to estimate the deformation. The paper proposed a novel way to handle the surface deformation recovery issue by using the modeling of the canonical canvas and the proposed energy function for the optimization.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The implementation details of the paper are not well illustrated, which can hinder reproducibility. Additionally, the performance/efficiency and limitations of the approach are not discussed in detail.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    More details should be added to increase the reproducibility of the work. I have some questions regarding the proposed method.

    1. How is the canonical canvas defined? Is it defined manually, or is it defined by using some evaluation of the scene flow map?
    2. What optimization method is used to estimate the deformation map? Do the different start values (canonical scenes and frames) affect the optimized result?

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, the paper proposes a novel method to estimate the surface deformation based on the measured scene flow map. The writing should include more implementation details (see the above comments related to reproducibility) to increase the reproducibility of the work. Also, the authors need to discuss the limitations and performance more thoroughly.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The results shown in the paper are very promising. The novelty of using the optimization method to estimate the surface deformation for the unobservable region, together with the proposed energy function, can be an inspiration for others in the field.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

-Concern about the novelty from Reviewer #1:

We appreciate the concern of Reviewer #1 about the novelty of the proposed work as compared to the previous paper [1] (J. Chen, Int J CARS, 2023). However, we would like to clarify that the proposed method is different from [1] and is not a simple implementation or incremental extension of [1]. We describe the major differences between [1] and the proposed work below.

Limitation of [1]: In [1], the connectivity of the vertices of the tissue mesh remains the same over iterations, which leads to its major limitations (it cannot deal with a newly appearing surface or with changes in surface continuity). As a consequence, [1] can be used neither in a SLAM-like task, where the camera is moving around and new structure continuously appears, nor in a dissection task, where the tissue surface continuity changes. This is also the reason why we did not compare with [1] in the experiment, as [1] does not work in these cases.

How we overcome the limitation of [1]: As for the proposed method, we do not make use of the mesh for representing the 3D structure given the limitations as in [1]. Instead, we propose a novel representation of the 3D structure, which is described as “canonical canvas” and “impasto” in the article. This new 3D representation can be simply described as a 3D-2D mapping, which naturally has derivative structure and is useful in geometry optimization for guaranteeing spatiotemporal smoothness.
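
As a rough illustration of why a 3D-2D mapping "naturally has derivative structure", the following minimal sketch stores a surface as a grid over canvas coordinates and takes finite differences along one canvas axis. It is not the authors' implementation: the grid layout, the function names (`make_plane`, `du`, `stretch_u`) and the use of an edge-length ratio as a strain proxy are all assumptions made for illustration.

```python
import math

# Hypothetical representation: a geometry map is a grid indexed by canvas
# coordinates (row v, column u); each cell stores a 3D point (x, y, z).

def make_plane(rows, cols, spacing):
    """Build a flat geometry map with the given spacing between samples."""
    return [[(u * spacing, v * spacing, 0.0) for u in range(cols)]
            for v in range(rows)]

def du(geometry, v, u):
    """Forward difference of the 3D surface along the canvas u axis."""
    p, q = geometry[v][u], geometry[v][u + 1]
    return tuple(b - a for a, b in zip(p, q))

def stretch_u(reference, current, v, u):
    """Ratio of local edge lengths along u: current length / reference length."""
    ref_len = math.hypot(*du(reference, v, u))
    cur_len = math.hypot(*du(current, v, u))
    return cur_len / ref_len

reference = make_plane(3, 4, spacing=1.0)
# Simulate a 10 % uniform stretch along x (toy data, not from the paper).
current = [[(1.1 * x, y, z) for (x, y, z) in row] for row in reference]
engineering_strain = stretch_u(reference, current, 1, 1) - 1.0
```

Because neighbouring cells on the canvas are neighbouring surface points, a difference between adjacent cells approximates a surface derivative without any explicit mesh connectivity.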

The new 3D representation is much more flexible than a mesh, such that we can optimize and fuse the newly appearing surface into the previously modeled scene. In the experiment part, we demonstrate the proposed method in cases of dissection and camera motion, which is impossible with the previous work [1].

-Concerns from Reviewers #3 and #4:

Canonical canvas: Canonical canvas is initialized as a borderless 2D space corresponding to the first camera pose and will move together with the camera, such that the canvas is always static to the window (camera). On this canvas, the geometry of the scene is represented, optimized and fused.

Optimization of deformation map: We do not directly optimize the deformation map. Instead, we optimize the geometry map (M) as in equation 4. Since there exists spatiotemporal connectivity of the geometry map, immediately we can derive the optimized deformation map from the optimized geometry map.
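
One way to picture deriving a deformation map from geometry maps is a per-cell displacement between two consecutive maps. The sketch below is only an assumption-laden illustration: it presumes the two maps are already aligned so that the same canvas cell refers to the same tissue point, uses `None` for cells without geometry, and the name `deformation_map` is hypothetical. The optimization of equation 4 itself is not reproduced here.

```python
def deformation_map(prev_geometry, curr_geometry):
    """Per-cell 3D displacement between two aligned geometry maps.

    Cells that are missing (None) in either map yield None.
    """
    result = []
    for prev_row, curr_row in zip(prev_geometry, curr_geometry):
        row = []
        for p, q in zip(prev_row, curr_row):
            if p is None or q is None:
                row.append(None)
            else:
                row.append(tuple(b - a for a, b in zip(p, q)))
        result.append(row)
    return result

# Toy 2x2 maps (values are illustrative only).
prev_map = [[(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)],
            [(0.0, 1.0, 5.0), None]]
curr_map = [[(0.0, 0.0, 5.5), (1.0, 0.0, 5.25)],
            [(0.0, 1.0, 5.0), (1.0, 1.0, 5.0)]]
flow = deformation_map(prev_map, curr_map)
```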

Re-observation: Dealing with long-term issues is currently not our focus. At present, the algorithm simply replaces the old scene if a new scene is detected in the same space. However, some techniques, such as bundle adjustment and graph-based optimization, can be implemented to guarantee long-term stability.
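
The replacement behaviour described above can be sketched as a per-cell rule: wherever the new observation has geometry, it overwrites the estimate, and elsewhere the estimate is kept. This is a minimal sketch under assumptions, not the released code: the `None` convention for unobserved cells and the name `fuse_by_replacement` are hypothetical.

```python
def fuse_by_replacement(estimated, observed):
    """Keep the observed point where one exists, otherwise keep the estimate."""
    return [[obs if obs is not None else est
             for est, obs in zip(est_row, obs_row)]
            for est_row, obs_row in zip(estimated, observed)]

# Toy 1x3 maps: the middle cell is re-observed, the last cell is newly seen.
estimated = [[(0.0, 0.0, 5.0), (1.0, 0.0, 5.0), None]]
observed = [[None, (1.0, 0.0, 5.2), (2.0, 0.0, 5.1)]]
fused = fuse_by_replacement(estimated, observed)
```

A rule like this keeps no history for re-observed cells, which is consistent with the statement that long-term consistency would need additional techniques.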

Online vs real-time: We carefully chose the word “online” rather than “real-time” to avoid misunderstandings in the discussion of latency. Currently, the method may take around 5 seconds to process each incoming laparoscopic image. However, this does not prevent the method from being used in future intraoperative applications. The latency can be reduced with proper code optimization and parallel computation.

Instrument mask: We agree that instrument masks are important for accurately removing the interference of the instrument from the 3D reconstructed scene, and we will add such a discussion in the revision. The instrument masks are provided in all the datasets used in the experiment.

Ground truth (GT) acquisition and evaluation method: There have been two widely used ways to evaluate the accuracy of deformation, geometry-based and tracking-based. We appreciate the reviewer’s recommendation of a dataset that enables evaluation in the latter way, which may help to better demonstrate the proposed method. In GT acquisition, a thin tissue was fixed on a grid, and we temporarily paused the movement of the instrument and performed the 3D scan from behind, such that we can get the GT even in the occluded area. The scanned GT was used to represent the current deformation status of the tissue.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The rebuttal addressed reviewers’ questions well. The technical novelty of this paper is sufficient. Accept.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The rebuttal addressed reviewers’ questions well. The technical novelty of this paper is sufficient. Accept.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


