Abstract
Cataract surgery is the most common surgical procedure globally, with a disproportionately higher burden in developing countries. While automated surgical video analysis has been explored in general surgery, its application to ophthalmic procedures remains limited. Existing research primarily focuses on Phaco cataract surgery, an expensive technique not accessible in regions where cataract treatment is most needed. In contrast, Manual Small-Incision Cataract Surgery (MSICS) is the preferred low-cost alternative in high-volume settings and for complex cases. However, no dataset exists for MSICS. To address this gap, we introduce Sankara-MSICS, the first comprehensive dataset containing 53 surgical videos annotated for 18 surgical phases and 3,527 frames with 13 surgical tools at the pixel level. We also present ToolSeg, a novel framework that enhances tool segmentation with a phase-conditional decoder and a semi-supervised setup leveraging pseudo-labels from foundation models. Our approach significantly improves segmentation performance, achieving a 38.1% increase in mean Dice scores, with notable gains for smaller and less prevalent tools. The code is available at https://github.com/Sri-Kanchi-Kamakoti-Medical-Trust/ToolSeg.
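As a rough illustration of the pseudo-label propagation idea summarized in the abstract (propagating annotations from labeled seed frames to neighbouring frames with a video foundation model), the sketch below uses the video-predictor interface from the public SAM 2 repository. It is a minimal sketch under our own assumptions — the config, checkpoint, and file paths are placeholders, and this is not the authors' actual pipeline.

```python
import torch
from sam2.build_sam import build_sam2_video_predictor  # assumes the public SAM 2 package

# Config and checkpoint paths are placeholders, not the authors' configuration.
predictor = build_sam2_video_predictor(
    "configs/sam2.1/sam2.1_hiera_l.yaml",
    "checkpoints/sam2.1_hiera_large.pt",
)

with torch.inference_mode():
    # Initialise tracking state on a directory of extracted video frames.
    state = predictor.init_state(video_path="surgery_clip_frames/")

    # Seed the tracker with a ground-truth tool mask on an annotated frame
    # (hypothetical file; in practice this would come from the manual annotations).
    seed_mask = torch.load("seed_frame_tool_mask.pt")  # boolean H x W mask
    predictor.add_new_mask(state, frame_idx=0, obj_id=1, mask=seed_mask)

    # Propagate the seed mask across neighbouring frames to obtain pseudo-labels.
    pseudo_labels = {}
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        pseudo_labels[frame_idx] = (mask_logits[0] > 0.0).cpu().numpy()
```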
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1776_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/Sri-Kanchi-Kamakoti-Medical-Trust/ToolSeg
Link to the Dataset(s)
Sankara-MSICS: https://huggingface.co/datasets/SankaraEyeHospital/SankaraMSICS
BibTex
@InProceedings{SacBhu_PhaseInformed_MICCAI2025,
author = { Sachdeva, Bhuvan and Akash, Naren and Ashraf, Tajamul and Müller, Simon and Schultz, Thomas and Wintergerst, Maximilian W. M. and Singri, Niharika and Murali, Kaushik and Jain, Mohit},
title = { { Phase-Informed Tool Segmentation for Manual Small-Incision Cataract Surgery } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15968},
month = {September},
pages = {447 -- 457}
}
Reviews
Review #1
- Please describe the contribution of the paper
This work introduces a new dataset, Cataract-MSICS, for cataract surgery. In addition, the authors provide ToolSeg, a tool segmentation framework. Results are reported on the new dataset and on other datasets.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- new dataset
- new segmentation tool evaluated on the new dataset and others.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- No mention or discussion of how to access the dataset; it has not been made public.
- No mention of how to access the code/weights of ToolSeg; they have not been made public.
The authors must make the dataset public. They must clearly describe how one can acquire the data and where it is stored, and ensure long-term, safe, and accessible storage. It is better to have a dedicated website for this; github.com can be used. For storage, one can use https://huggingface.co/ or another permanent storage host. An open license must be used, such as https://creativecommons.org/licenses/by-nc-sa/4.0/,
or a custom license, which must be clearly described; in that case, a EULA (End-User License Agreement) must be provided.
The code for ToolSeg must be made public, including the model weights; github.com can be used.
All of this should have been provided to the reviewers and must be addressed in the rebuttal.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- The authors criticize the lack of public data, yet their own dataset is not made public. The same applies to the code/weights of ToolSeg.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
My main concern was that the dataset was not made available. The authors promised to make their dataset public after acceptance. I recommend conditional accept.
Review #2
- Please describe the contribution of the paper
The paper introduces Cataract-MSICS, the first dataset for Manual Small-Incision Cataract Surgery (MSICS), featuring 53 videos annotated for 18 phases and 3,527 frames with pixel-level labels for 13 tools. It proposes ToolSeg, a novel framework using a Phase-informed Conditional Decoder (PCD) to leverage surgical phase information and a semi-supervised approach using SAM 2-generated pseudo-labels to improve surgical tool segmentation performance.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Introduced the first dataset for Manual Small-Incision Cataract Surgery (MSICS), addressing an important but previously overlooked surgical procedure in automated analysis
- Proposed ToolSeg with a unique Phase-informed Conditional Decoder (PCD) that effectively uses surgical phase predictions to improve tool segmentation
- Employed a novel semi-supervised method using SAM 2 to generate extensive pseudo-labels, significantly boosting performance and reducing annotation needs
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The paper relies on predicted surgical phases for its core method (ToolSeg variants v2, v3, v4) but does not report the phase recognition accuracy of the MS-TCN++ model on the new Cataract-MSICS dataset. This prevents assessment of how phase recognition errors propagate into segmentation. Furthermore, as a new dataset contribution, benchmarking multiple phase recognition models would strengthen its value.
- The tool segmentation benchmark (Table 4) does not compare with recent methods, especially transformer-based (e.g., MATIS, SurgicalSAM) and prompt-based approaches, making it harder to assess ToolSeg's relative performance.
- The paper does not sufficiently justify why the complex mechanisms within the Phase-informed Conditional Decoder (PCD) (i.e., PAFT, DFBF, CGate) are necessary or better than potentially simpler phase conditioning strategies. Also, do the authors plan to open-source the benchmark?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The insufficient comparison against state-of-the-art methods and the omission of phase recognition accuracy make it difficult to assess the importance of the benchmark and the method's performance. I would be happy to change my rating based on the rebuttal.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
The authors introduce a newly collected and annotated dataset on MSICS Cataract surgery and establish a new phase-informed segmentation method.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strength of this work is the collected dataset. The dataset is the first on MSICS cataract surgery, filling a critical gap in the well-established cataract surgery data science community. The dataset is well annotated, including phase and tool annotations, and of a decent size with 53 videos.
The phase-tool co-occurrence analysis is also interesting.
The tool segmentation method the authors propose is an interesting, straightforward addition to incorporate phase information into the well-established U-Net architecture. The authors show results for many different variants to highlight the contribution of each component.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The main weakness of this work is the incomplete comparison of ToolSeg to SOTA methods.
- Comparison on Cataract-MSICS (Table 4): It seems that the SOTA methods were not trained on the pseudo-labeled data from Cataract-MSICS (judging from the results for U-Net = ToolSeg v0), making the comparison to ToolSeg v4 and v7 unfair. A fair comparison would be to v2/v3 only. Ideally, the authors would retrain the SOTA methods including the pseudo-labels and compare to v4. Generally, for a fair comparison to non-phase-aware methods, phase ground truth cannot be considered, making v5 to v7 invalid for this comparison.
- Comparison on CaDIS (Table 5): Similarly, the methods in this comparison should either all be trained with pseudo-labels or all without (compare to v3 or v4, respectively). As above, v5, v6, and v7 are not relevant to this comparison. Another big issue with this comparison is the inexplicable difference in metrics compared to results previously reported on the CaDIS dataset. The authors report 52.69 mIoU for U-Net, while the original CaDIS paper (Ref. 10) reports much higher results (at least 66.6, depending on the CaDIS task). The authors also inexplicably omit HRNetV2 for CaDIS, even though a) they report its results on Cataract-MSICS, and b) it was the best-performing method on CaDIS in Ref. 10. Stronger methods have also been developed since, e.g., Pissas, T., Ravasio, C.S., Cruz, L.D., & Bergeles, C. (2021). "Effective semantic segmentation in cataract surgery: What matters most?" International Conference on Medical Image Computing and Computer-Assisted Intervention, which the authors also do not compare against.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The comparison of ToolSeg against other methods is unsatisfactory for the reasons outlined above. I still lean towards acceptance based on the contribution of the dataset. I hope the authors can clarify and rework their comparison against the SOTA, especially on the CaDIS dataset.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
My original recommendation remains largely unchanged. I appreciate the authors addressing my concerns regarding the evaluation and SOTA comparison for ToolSeg. I can understand their reasoning for limiting certain comparisons, but there is still no convincing evidence that ToolSeg represents a general performance increase over previous methods. I see ToolSeg as a minor contribution. I lean only slightly towards acceptance, because I see the proposed dataset as the more valuable contribution. If the paper is accepted, I ask the authors to include the crucial experimental details for CaDIS from the rebuttal in the final version of the paper; otherwise the shown results are misleading and confusing.
Review #4
- Please describe the contribution of the paper
The paper presents a dataset for Manual Small-Incision Caratact Surgery, incorporating 53 surgical videos with annotations for surgical phase and tools. In addition to the dataset, the paper presents a tool segmentation model that integrates surgical phase priors and employs SAM 2 for pseudo-label generation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper contributes an annotated dataset.
- Annotations include instruments and phases.
- The number of videos contributes to the variability of annotated cases, improving the data representation of the dataset.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Regarding the proposed segmentation method, how would the described Phase-informed Conditional Decoder compare with simpler feature aggregation methods (why is this model preferred over a simpler approach), for example, a learnable convolution for fusing the image and phase features?
- The use of pseudo-annotations from a foundation model like SAM or SAM 2 is an already explored strategy (for example, [1]).
[1] Huang, Ziyi, et al. "Push the boundary of SAM: A pseudo-label correction framework for medical segmentation." arXiv preprint arXiv:2308.00883 (2023).
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The introduction of a new dataset is an interesting contribution. However, the contribution of the proposed model's phase-informed decoder and of the pseudo-labeling strategy can be seen as a weakness of the approach, considering that simpler aggregation/feature-fusion methods could be employed to incorporate the phase-based features, and this has not been deeply discussed. Similarly, the SAM-based pseudo-labeling strategy has become common practice since the release of various vision foundation models.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
After reading the responses, I consider that the dataset is a contribution that can benefit future models for training and benchmarking. I would be more inclined toward accepting the paper.
Author Feedback
We thank the reviewers for their valuable feedback and will incorporate their suggestions.

Dataset and Code Release (R1, R2): We would like to assure the reviewers that the Cataract-MSICS dataset and the codebase will be publicly released upon acceptance of this paper. Note: As per MICCAI guidelines, we cannot include external URLs in the rebuttal. We have contacted the program chairs seeking permission to do so, but have not heard back yet.

Phase Prediction and Ground Truth Phase Evaluation (R2, R3): Our primary technical contribution is using surgical phase information as a prior for improving tool segmentation. To test our hypothesis, we train MS-TCN++ on I3D features from our dataset, achieving an accuracy of 61.5% in phase prediction. These predicted labels were then used as inputs to the ToolSeg model (v2-v4) to generate phase-aware tool segmentations. To establish a performance upper bound, we also train our model with the ground-truth phase labels (v5-v7). Our results demonstrate that utilizing phase labels (whether predicted or ground truth) can significantly boost segmentation performance. The performance gap between using predicted labels and ground-truth labels can be bridged by better phase recognition models. Improving phase prediction is an open research direction and is orthogonal to our focus on showing the value of phase priors and designing an effective mechanism to fuse them into the segmentation process.

Discrepancy in IoU Scores of CaDIS (R3): The original CaDIS paper defines three tasks with varying levels of tool granularity. As our focus is on utilizing phase priors to improve tool segmentation, we use the CaDIS Task II tool classes, and exclude anatomy (e.g., pupil, iris) and 'other' (e.g., hand, tape) categories, merging them into the background. These excluded classes typically yield higher IoU scores due to their larger size and higher instance counts. As a result, our reported metrics are lower than those in prior work that includes these categories.

Comparison with Ref [2] (R3): Pissas et al. [2] focus on optimizing loss functions and sampling for class imbalance. As this is orthogonal to our phase-informed segmentation method, we do not compare against it.

SOTA Comparison (R2, R3): We compare our approach against multiple SOTA methods (e.g., MATIS-Frame, TernausNet, PAANet) on the Cataract-MSICS dataset. We exclude video-based methods like MATIS or TraSeTR as they require sequential frames as input, while our method operates on individual frames.

Label Propagation via SAM 2 (R4): While prior work has used SAM to generate pseudo-labels on images, our proposed method is different. For instance, Huang et al. [1] combine SAM with an iterative label correction method to detect and refine noisy labels, whereas we use SAM 2, a video-based network, to propagate high-quality annotations from seed frames to neighbouring frames within the video in a zero-shot manner. The effectiveness of our approach is demonstrated by the performance gains achieved by pretraining on pseudo-labels.

Effect of Pseudo-labeling without Phase Information (R3): To isolate the impact of pseudo-labeling, we compare ToolSeg v0 (baseline U-Net) with v1 (U-Net with pseudo-labeled data). v1 outperforms v0 by 7.78% in IoU, demonstrating the effectiveness of pretraining on pseudo-labeled frames even without phase priors.

Comparison with Feature Fusion Methods (R2, R4): We also experimented with other feature fusion strategies, such as DAFT [3], on our dataset. These approaches underperformed our proposed PAFT module by 8% DSC. PAFT's learnable embeddings better capture phase-tool correlations. Additionally, our gating mechanism adds robustness to noisy phase predictions, yielding a further ~2% improvement.

References: [1] Z. Huang, et al. Push the Boundary of SAM… arXiv'23. [2] T. Pissas, et al. Effective Semantic Segmentation in Cataract Surgery… MICCAI'21. [3] S. Pölsterl, et al. Combining 3D Image and Tabular Data… MICCAI'21.
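For readers interested in the phase-conditioning idea discussed in the rebuttal, the sketch below shows one plausible way to modulate decoder features with a learnable phase embedding and a confidence gate. It is a minimal PyTorch illustration under our own assumptions — the module name `PhaseConditionedBlock` and its FiLM-style design are hypothetical stand-ins, not the authors' PAFT/DFBF/CGate implementation.

```python
import torch
import torch.nn as nn

class PhaseConditionedBlock(nn.Module):
    """Minimal sketch: FiLM-style modulation of decoder features by a learnable
    surgical-phase embedding, plus a learned gate that can down-weight the phase
    signal when phase predictions are unreliable. Illustrative only; not the
    paper's actual PAFT/CGate modules."""

    def __init__(self, num_phases: int, channels: int, emb_dim: int = 64):
        super().__init__()
        self.phase_emb = nn.Embedding(num_phases, emb_dim)       # learnable phase embedding
        self.to_scale_shift = nn.Linear(emb_dim, 2 * channels)   # per-channel gamma, beta
        self.gate = nn.Sequential(nn.Linear(emb_dim, channels), nn.Sigmoid())

    def forward(self, feats: torch.Tensor, phase_id: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) decoder features; phase_id: (B,) predicted phase indices
        e = self.phase_emb(phase_id)                              # (B, emb_dim)
        gamma, beta = self.to_scale_shift(e).chunk(2, dim=1)     # (B, C) each
        g = self.gate(e)                                          # (B, C), values in [0, 1]
        gamma, beta, g = (t[:, :, None, None] for t in (gamma, beta, g))
        modulated = feats * (1 + gamma) + beta                    # FiLM-style conditioning
        return g * modulated + (1 - g) * feats                    # gated residual mix

# Tiny usage example with random tensors (18 phases, as in the dataset)
if __name__ == "__main__":
    block = PhaseConditionedBlock(num_phases=18, channels=256)
    feats = torch.randn(2, 256, 32, 32)
    phase_id = torch.tensor([3, 11])
    print(block(feats, phase_id).shape)  # torch.Size([2, 256, 32, 32])
```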
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The final version should thoroughly incorporate the reviewers' comments and suggestions if the paper is accepted.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper introduces Cataract-MSICS, the first dataset for Manual Small-Incision Cataract Surgery (MSICS), featuring extensive video and pixel-level annotations. While one reviewer expressed some remaining reservations about the generalizability of the proposed ToolSeg method’s performance increase, the authors addressed concerns regarding dataset availability by committing to public release and provided clarifications on comparisons. The dataset itself is considered a significant and valuable contribution to the community, benefiting future model training and benchmarking. Therefore, I recommend accepting it.