Abstract
Unsupervised video-based surgical instrument segmentation has the potential to accelerate the adoption of robot-assisted procedures by reducing the reliance on manual annotations. However, the generally low quality of optical flow in endoscopic footage poses a significant challenge for unsupervised methods that rely heavily on motion cues. To overcome this limitation, we propose a novel approach that pinpoints motion boundaries, regions with abrupt flow changes, while selectively discarding frames with globally low-quality flow and adapting to varying motion patterns. Experiments on the EndoVis2017 VOS and EndoVis2017 Challenge datasets show that our method achieves mean Intersection-over-Union (mIoU) scores of 0.75 and 0.72, respectively, effectively alleviating the constraints imposed by suboptimal optical flow. This enables a more scalable and robust surgical instrument segmentation solution in clinical settings. The code is publicly available at https://github.com/wpr1018001/Rethinking-Low-quality-Optical-Flow.git
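For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean Intersection-over-Union score can be computed for binary instrument masks. It assumes per-frame averaging and treats two empty masks as a perfect match; the function names `binary_iou` and `mean_iou` are illustrative, and the benchmark's exact evaluation protocol may differ.

```python
import numpy as np

def binary_iou(pred, target):
    """IoU between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)

def mean_iou(preds, targets):
    """Average the per-frame IoU over a sequence of mask pairs."""
    return float(np.mean([binary_iou(p, t) for p, t in zip(preds, targets)]))

# Two toy 2x2 frames: the first overlaps on 1 of 2 pixels, the second matches exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
score = mean_iou(preds, targets)
```

Here the first frame contributes an IoU of 0.5 and the second an IoU of 1.0, so `score` is their average.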
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1043_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: https://papers.miccai.org/miccai-2025/supp/1043_supp.zip
Link to the Code Repository
https://github.com/wpr1018001/Rethinking-Low-quality-Optical-Flow.git
Link to the Dataset(s)
N/A
BibTex
@InProceedings{LiuYan_MotionBoundaryDriven_MICCAI2025,
author = { Liu, Yang and Wu, Peiran and Huo, Jiayu and Zhang, Gongyu and Yuan, Zhen and Bergeles, Christos and Sparks, Rachel and Dasgupta, Prokar and Granados, Alejandro and Ourselin, Sebastien},
title = { { Motion-Boundary-Driven Unsupervised Surgical Instrument Segmentation in Low-Quality Optical Flow } },
booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15968},
month = {September},
page = {374 -- 383}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper introduces an unsupervised method for surgical instrument segmentation by leveraging motion boundaries in low-quality optical flow. The key method includes: High-Quality Area Matching (HQAM) – Identifies reliable motion boundaries while ignoring noisy interior flow. Low-Quality Case Dropping (LQCD) – Discards frames with globally unreliable flow to reduce supervision noise. Variable Frame Rates – Adapts to subtle instrument movements by randomly sampling frame intervals. The method achieves 0.75 mIoU on EndoVis2017 VOS and 0.72 mIoU on EndoVis2017 Challenge, outperforming prior unsupervised approaches without relying on manual annotations or shape priors.
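As a rough illustration of what "identifying motion boundaries" could look like, the sketch below marks pixels where the optical-flow direction changes abruptly between neighbours and then widens that band. It is not the authors' HQAM implementation: the function name `motion_boundary_mask`, the parameters `alpha_deg`, `dilation`, and `min_mag`, and the 4-neighbour comparison are all assumptions, loosely modelled on the angle threshold and dilation that the review mentions as HQAM hyperparameters.

```python
import numpy as np

def motion_boundary_mask(flow, alpha_deg=30.0, dilation=1, min_mag=1e-3):
    """Mark pixels where flow direction jumps by more than alpha_deg.

    flow: array of shape (H, W, 2) holding (dx, dy) per pixel.
    Returns a boolean (H, W) mask. Hypothetical sketch, not the paper's code.
    """
    flow = np.asarray(flow, dtype=np.float64)
    angle = np.arctan2(flow[..., 1], flow[..., 0])
    mag = np.hypot(flow[..., 0], flow[..., 1])
    alpha = np.deg2rad(alpha_deg)

    def angular_diff(a, b):
        # Smallest absolute difference between two angles, in [0, pi].
        d = np.abs(a - b)
        return np.minimum(d, 2.0 * np.pi - d)

    boundary = np.zeros(angle.shape, dtype=bool)

    # Compare horizontal neighbours; ignore pairs where neither pixel moves.
    hit = (angular_diff(angle[:, 1:], angle[:, :-1]) > alpha) & (
        np.maximum(mag[:, 1:], mag[:, :-1]) > min_mag)
    boundary[:, 1:] |= hit
    boundary[:, :-1] |= hit

    # Compare vertical neighbours in the same way.
    hit = (angular_diff(angle[1:, :], angle[:-1, :]) > alpha) & (
        np.maximum(mag[1:, :], mag[:-1, :]) > min_mag)
    boundary[1:, :] |= hit
    boundary[:-1, :] |= hit

    # Grow the boundary band by `dilation` pixels (4-connected).
    for _ in range(dilation):
        grown = boundary.copy()
        grown[1:, :] |= boundary[:-1, :]
        grown[:-1, :] |= boundary[1:, :]
        grown[:, 1:] |= boundary[:, :-1]
        grown[:, :-1] |= boundary[:, 1:]
        boundary = grown
    return boundary

# Toy flow field: left half moves right, right half moves up.
toy_flow = np.zeros((4, 6, 2))
toy_flow[:, :3, 0] = 1.0
toy_flow[:, 3:, 1] = 1.0
thin = motion_boundary_mask(toy_flow, dilation=0)
wide = motion_boundary_mask(toy_flow, dilation=1)
```

In this toy case the thin mask covers the two columns on either side of the direction change, and one dilation step widens it by one column on each side. A direction-only test like this one cannot separate a moving region from a static one whose stored flow angle happens to coincide, so a practical variant would likely also consider flow magnitude.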
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The concept of the method is interesting. The authors propose HQAM, which explicitly targets motion boundaries that are more reliable than interior flow, improving robustness. They also propose LQCD, which dynamically discards worst-case frames, reducing error propagation.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Presentation of the experimental results: The authors compare the proposed method with two unsupervised baseline methods, FUN-SIS and AGSD. Although the method achieves better mean IoU, they do not present visualization results like those in Fig. 4. Similarly, there are no visual results for the ablation study, and the manuscript does not highlight visual results for the challenging cases (stationary instruments, dark areas, and abrupt movements).
- Hyperparameter sensitivity: HQAM’s angle threshold (α) and dilation (d) impact performance (Table 3). On other datasets, results could be different and parameters may require manual tuning.
- Completeness of the experiments: The authors only validate the method on the EndoVis 2017 dataset (in two variations). Another commonly adopted dataset for evaluating tool segmentation is EndoVis 2018. The authors should also evaluate on this dataset to show the robustness of the method.
- The method has a flaw in handling moving backgrounds. Based on the results in the supplementary video, when the tissue deforms, the proposed method can mistake it for the tool.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The motivation and method for unsupervised surgical instrument segmentation sound reasonable, but the results are not very impressive. The authors fail to demonstrate the robustness of the method.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The paper presents an unsupervised surgical instrument segmentation framework that leverages optical flow priors. It introduces a High-Quality Area Matching block to emphasize reliable flow regions, a Low-Quality Case Dropping mechanism to discard frames with severely degraded flow, and a variable frame-rate scheme designed to better capture subtle instrument motions.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well-written and well-organized.
- It presents point-to-point solutions to address potential risks arising from low-quality optical flow, with the limitations of such flow clearly illustrated.
- Overall, the evaluations are convincing, and the challenges related to setting hyperparameters are appropriately discussed.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The segmentation results remain quite coarse for real-world applications. While unsupervised methods are appealing in such scenarios, their practical applicability may be limited. After all, manually labeling or simulating segmentation masks is not particularly costly or prohibitive.
Beyond labeling, there are other, potentially more critical challenges to address for real-life deployment—such as visual artifacts and out-of-distribution inputs. That said, this limitation is not unique to this work but is shared by all unsupervised methods. Some discussions will be helpful.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper provides point-to-point solutions to address potential challenges by leveraging priors from optical flow for unsupervised segmentation, which is both reasonable and well-articulated.
- Reviewer confidence
Not confident (1)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The decision remains unchanged. The limitations noted above are not specific to this paper but apply to all unsupervised segmentation methods.
Despite the coarse results and validation only on EndoVis 2017, the paper still provides point-to-point solutions for overcoming potential challenges in unsupervised surgical instrument segmentation.
Review #3
- Please describe the contribution of the paper
This work proposes three improvements to guide unsupervised surgical tool segmentation based on a model for video object segmentation in natural scenes (RCF). These improvements are (1) a High-Quality Area Matching based on the assumption that spatial motion boundaries have more reliable signal, enforcing an optical flow loss at these areas, (2) Low-Quality Case Drop, where frames are removed per batch based on loss, and (3) Variable Frame Rates at training, to ensure the extracted optical flow captures a wider range of motion.
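The second and third ideas summarized above can be sketched in a few lines. This is an illustrative reading of the reviewer's description, not the authors' code: the helper names `drop_low_quality_cases` and `sample_frame_pair`, the `drop_ratio` and `max_interval` parameters, and the choice to rank frames by a scalar per-frame loss are all assumptions.

```python
import random

def drop_low_quality_cases(per_frame_losses, drop_ratio=0.25):
    """Return sorted indices of frames kept after dropping the highest-loss ones.

    Assumption: a high flow-matching loss signals globally unreliable flow,
    so the worst `drop_ratio` share of the batch is excluded.
    """
    n = len(per_frame_losses)
    n_drop = int(n * drop_ratio)
    # Rank frame indices from lowest to highest loss.
    order = sorted(range(n), key=lambda i: per_frame_losses[i])
    return sorted(order[: n - n_drop])

def sample_frame_pair(num_frames, max_interval=4, rng=random):
    """Pick two frame indices separated by a random interval.

    Larger intervals make subtle instrument motion more visible in the flow.
    """
    if num_frames < 2:
        raise ValueError("need at least two frames to form a pair")
    interval = min(rng.randint(1, max_interval), num_frames - 1)
    start = rng.randrange(0, num_frames - interval)
    return start, start + interval

# Toy batch of four per-frame losses: the frame with loss 0.9 is dropped.
kept = drop_low_quality_cases([0.2, 0.9, 0.1, 0.4], drop_ratio=0.25)

# Sample a frame pair from a 10-frame clip with a seeded generator.
first, second = sample_frame_pair(10, max_interval=4, rng=random.Random(0))
```

In a training loop, the kept indices would select which per-frame losses contribute to the batch average, and the sampled pair would decide which two frames are passed to the optical-flow estimator.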
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Interesting adaptation of natural scene techniques to the surgical domain
- Convincing quantitative performance with additional strengths considering end-to-end design and adaptability to other surgeries
- Paper is well-written
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Unclear terminology when presenting benchmarks (different “stages” in Table 1)
- Incomplete ablation study
- Limited qualitative comparison (does not show results from baselines for surgical tool segmentation)
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While the presented results demonstrate that the proposed method outperforms other unsupervised methods (AGSD and FUN-SIS Stages 1 and 2), it is not immediately clear what these “stages” refer to, as the term is not explicitly defined in the manuscript. Clarifying this terminology would help better contextualize the comparison. Although FUN-SIS Stage 3 outperforms the proposed method, this work offers additional strengths beyond this comparison, including an end-to-end design and independence from shape priors, which still support the novelty of the work.
The ablation study is appreciated to understand the effect of some combinations of the three proposed improvements (HQAM block, LQCD block, and variable frame rates). However, these components were not tested in isolation, or with the remaining combinations. As a result, it is difficult to fully disentangle the effect of each component. Expanding the ablation would strengthen the analysis, though the current results still demonstrate the effectiveness of the proposed approach.
Finally, the qualitative comparisons are only made with RCF, rather than the other surgical-domain baselines. Including domain-relevant visual comparisons would better highlight the areas of the segmentation that improve based on the motion boundaries. Considering the quantitative results and other strengths, this does not critically detract from the work.
Overall, while there are areas where further clarification or analysis would strengthen the work, the proposed method offers a distinct approach by adapting techniques for natural scenes (RCF and Super-BPD) to the surgical domain, achieving convincing quantitative results, resulting in my verdict.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
I appreciate the further clarifications provided in the rebuttal and the authors’ commitment to addressing them in the manuscript. I believe this work is valuable considering the necessity of reliable segmentation for data curation, and additionally has broader implications for downstream algorithms that rely on segmentation as a backbone.
Author Feedback
We thank the three reviewers for their careful reading and for recognizing our work’s novelty, rationale, and effectiveness. Though confident in its simple yet powerful design, we see where it can improve. Your suggestions, namely adding EndoVis2018 (E18) validation, expanding ablation studies, enriching visualizations, and broadening the discussion of clinical impact and future work, have greatly strengthened the manuscript.
Reviewer #1. Thank you for raising key questions that are crucial for clarifying our core contribution.
R1Q1: Qualitative visualizations. The earlier unsupervised efforts, FUN-SIS and AGSD, pose reproducibility challenges: FUN-SIS involves multiple intricate stages and relies on unpublished shape priors, with no released code, while AGSD depends on surgery-specific signals that limit generalization. Consequently, our initial visual comparisons targeted RCF, a more up-to-date, fully unsupervised, open-source baseline from the natural image domain. Per your suggestion, we will include a comparison with AGSD. Reproducing FUN-SIS remains infeasible without its proprietary data.
R1Q2: Challenging-case examples. Sorry for any confusion. These scenarios degrade flow quality and introduce noisy supervision across all frames, not just in the highlighted examples. In our manuscript, we illustrate this with dark-region cases (1 & 3) and a static-tool case (4, top-right), showing how poor flow propagates errors and reduces overall performance.
R1Q3: Hyperparameter sensitivity. We acknowledge that unsupervised settings amplify parameter effects. We discussed sensitivity in Sec. 4.2.3 and the Conclusion. Crucially, even under worst-case settings our method still outperforms the RCF baseline (from 46.09 to 64.74). Robustness is a key direction for future work.
R1Q4: E18 evaluation. To align with prior work, we did not use E18. We agree that broader evaluation is valuable and useful for subsequent work, and we will add experiments.
R1Q5: Tissue-deformation misclassification. Our model infers from single frames (no temporal flow) at test time. Misclassification of tissue occurs when its appearance resembles that of instruments, not because of deformation. We will state this clearly.
Reviewer #2. Thank you for your humility in noting limited domain familiarity while offering valuable insights for our future work. We hope our perspectives may also inspire further discussion.
R2Q1: Practical applicability of unsupervised methods. While supervised models (e.g., SAM) can produce finer masks, annotating every frame in a surgical video, often thousands per procedure, is far more laborious than labeling images. Unsupervised methods remain crucial for these unlabeled datasets. For instance, our motion cues could augment SAM-style instance segmentation for zero-shot tool delineation, and the same principle may benefit other video tasks like depth estimation or action recognition. Though we cannot explore each extension here, these avenues are promising for future work.
R2Q2: Other research challenges. We agree that handling visual artifacts and out-of-distribution frames is critical. We will briefly discuss this in our Future Work section.
Reviewer #3. We deeply appreciate your senior-level analysis and the precision of your feedback, both in detailed points and overall structure. Your comments have greatly clarified areas for improvement.
R3Q1: Undefined “stages” in Table 1. You’re right that we omitted a clear definition due to space and reliance on prior work. We will add a concise explanation of the different stages.
R3Q2: Components tested in isolation. Although Tab. 2–3 report single-module results for the main modules HQAM and LQCD, we agree this may not be obvious. We will reorganize Table 2 to present each component individually, alongside all pairwise combinations, for clearer interpretation.
R3Q3: Qualitative comparisons. Following R1Q1, our focus was on RCF, but we recognize the value of broader baselines. We will include visualizations with AGSD.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
Although the paper has received one rejection, the other reviewers have identified the potential of the work to the larger SDS community. I vote for acceptance.