Abstract

Laparoscopic liver surgery, while minimally invasive, poses significant challenges in accurately identifying critical anatomical structures. Augmented reality (AR) systems, integrating MRI/CT with laparoscopic images based on 2D-3D registration, offer a promising solution for enhancing surgical navigation. A vital aspect of the registration process is the precise detection of curvilinear anatomical landmarks in laparoscopic images. In this paper, we propose BCRNet (Bezier Curve Refinement Network), a novel framework that significantly enhances landmark detection in laparoscopic liver surgery primarily via the Bezier curve refinement strategy. The framework starts with a Multi-modal Feature Extraction (MFE) module designed to robustly capture semantic features. Then we propose Adaptive Curve Proposal Initialization (ACPI) to generate pixel-aligned Bezier curves and confidence scores for reliable initial proposals. Additionally, we design the Hierarchical Curve Refinement (HCR) mechanism to enhance these proposals iteratively through a multi-stage process, capturing fine-grained contextual details from multi-scale pixel-level features for precise Bezier curve adjustment. Extensive evaluations on the L3D and P2ILF datasets demonstrate that BCRNet outperforms state-of-the-art methods, achieving significant performance improvements. Our code is available at https://github.com/jinlab-imvr/BCRNet.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1649_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/jinlab-imvr/BCRNet

Link to the Dataset(s)

L3D dataset: https://github.com/PJLallen/D2GPLand

P2ILF dataset: https://p2ilf.grand-challenge.org/

BibTex

@InProceedings{LiQia_BCRNet_MICCAI2025,
        author = { Li, Qian and Liu, Feng and Yang, Shuojue and Shen, Daiyun and Jin, Yueming},
        title = { { BCRNet: Enhancing Landmark Detection in Laparoscopic Liver Surgery via Bezier Curve Refinement } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15969},
        month = {September},
        pages = {75 -- 85}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    • This work presents an approach that casts the liver landmark segmentation task as a Bézier curve fitting problem. The method consists of a multi-scale feature extraction component using foundation models, a curve proposal component and a curve refinement component.

    • Experiments are performed on two datasets, showing better results than baseline models in terms of Dice, IoU and ASSD.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • This work is the first application of the Bézier curve for segmenting landmarks of laparoscopic images, with potential benefits for downstream 2D-3D registration tasks for liver navigation.

    • The network integrates foundation models, including SAM and Depth Anything, to extract robust and effective features at multiple scales.

    • Experiments were carried out on two different datasets, presenting good performance of the proposed method.

    • The manuscript is well-organized and clearly presented, with elaborate figures and tables that effectively support the presentation.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Method:

    • In my view, most components of the method are directly adopted from DeepSolo/DeepSolo++, including multi-scale feature extraction (modified with two foundation models) <—> Model Architecture, Adaptive Curve Proposal Initialization <—> top-K Bezier Curve proposals, Hierarchical Curve Refinement <—> Point Query Modeling for Text Spotting (left parts are sections in this paper and right parts are in DeepSolo paper), and also several equations. However, no explicit acknowledgment of DeepSolo/DeepSolo++ is provided. The authors should clearly delineate which components are adapted from previous work and which aspects constitute their original contributions. Or the author can clarify that this is not the case.

    • In Sec. 2.1, the authors state that AdelaiDepth is used to extract depth maps, while Fig. 1 indicates that Depth Anything is the depth estimation method. Are different depth estimation methods employed in practice? Also, AdelaiDepth is the one used by D^2GPLand, if I checked correctly.

    • What is Q_c in Fig. 1? Only Q, Q_p, and Q_s are defined in Sec. 2.3.

    Evaluation:

    • For the P2ILF dataset, where landmarks are annotated with a 1-pixel width, it is unclear whether any augmentation was performed on the dataset. Figure 2 suggests that the landmarks might have been dilated. If so, for how many pixels? Could the author provide more details on this?

    • Will the number of Bézier control points influence the final results? In my view, the fewer the control points, the more regular and rigid the curves will be. In laparoscopic surgery specifically, the liver can undergo large deformation caused by surrounding tissues and instruments, resulting in landmarks of various shapes, with sharp corners in extreme cases. Could the authors elaborate on the selection of six control points? Will more control points bring higher accuracy? Is it a tradeoff between accuracy and inference time?

    • Details on the evaluation results are missing. If I checked correctly, the Dice, IoU and ASSD appear to be mean values over the three landmark types. Did the authors weigh the landmarks differently or equally when calculating the mean? Since the sizes of the three landmarks differ (i.e., ridge lines and contours are often larger than falciform ligaments), micro-averaging or other metrics that take size into account, e.g. FWIoU, are highly recommended for a fair comparison. Additionally, in the 2D-3D rigid alignment task, the three landmarks play different roles in liver pose optimization. For example, contour lines are viewpoint-dependent, which results in ambiguities; combining them with the ligament can alleviate such ambiguities. Therefore, the authors are encouraged to report performance on the three landmarks separately as well as their mean. Or can the authors clarify if this does not apply here?

    • Could the author report inference time for one image?

    Misc:

    • sec 4, “…represent critical anatomical landmarkss,”, duplicated s
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While the paper is the first application of Bézier curves in 2D liver landmark segmentation, significant concerns remain regarding the unclear contribution relative to prior work, and incomplete evaluation reporting.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    All my main concerns are addressed by the authors. Although the innovation of the algorithm seems to be limited, the application of deep Bezier curve fitting in liver segmentation is something new. Therefore I increase my rating, and I recommend that the authors restructure their paper to highlight the contribution.



Review #2

  • Please describe the contribution of the paper

    The paper proposes BezierLandNet, a landmark detection framework for laparoscopic liver surgery that models anatomical structures using Bezier curves. It introduces modules for adaptive curve proposal and hierarchical refinement, enabling accurate, continuous landmark representation. The method outperforms prior approaches on L3D and P2ILF datasets.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The model combines RGB and depth cues, and leverages frozen SAM features for stronger anatomical awareness. This multi-modal integration enhances landmark detection in challenging laparoscopic scenes with clutter, deformation, or low contrast.
    2. Strong evaluation and generalization: Extensive experiments on two public datasets (L3D and P2ILF) show consistent improvements in Dice, IoU, and ASSD over 16 baseline models.
    3. The HCR module is novel in surgical image analysis, offering a robust way to iteratively improve landmark precision under occlusion or visual noise.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Since SAM2 has been proposed, this paper needs to discuss and compare medical image segmentation methods based on SAM2, such as:
      • [*1] SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation
      • [*2] SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More
    2. Many experimental details are missing, such as the size of SAM (B, L, H?), the structure of the CNN encoder-decoder (U-Net?), and the structure of the transformer encoder (ViT?).
    3. In Figure 1, the depth map is obtained by Depth Anything Model, while in Section 2.1, the depth map is obtained by AdelaiDepth. Please check this inconsistency.
    4. While applying Bezier curves for anatomical landmark detection in surgical settings is relatively new, the use of Bezier-based representations for curvilinear structures has been well explored in other domains, such as lane detection (Rethinking Efficient Lane Detection via Curve Modeling). BezierLandNet does not substantially innovate upon the prior work, making its contribution more incremental in terms of algorithmic design.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See weaknesses & strengths. If the authors can address the concerns raised in the weaknesses section, I would be inclined to raise my overall score.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    After reading the other comments and the authors’ response, the reviewer still has concerns about the novelty of this paper (similarity to the DeepSolo framework). Additionally, the authors seem to provide new experimental results in their response (133 ms/image), which violates the rebuttal guidelines.



Review #3

  • Please describe the contribution of the paper
    1. Proposed BezierLandNet, an innovative framework utilizing Bezier curves for accurate anatomical landmark detection in laparoscopic liver surgery, enabling more precise 2D-3D registration.
    2. Introduced Adaptive Curve Proposal Initialization (ACPI) and Hierarchical Curve Refinement (HCR) modules to generate pixel-aligned Bezier curve proposals and iteratively refine them, capturing fine-grained contextual details.
    3. Demonstrated significant performance improvements on the L3D and P2ILF datasets, outperforming SOTA methods in terms of curve continuity and detection accuracy.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Replacing traditional segmentation-based methods with Bezier curves provides a holistic and continuous representation of curvilinear structures, significantly improving detection accuracy and efficiency.
    2. BezierLandNet achieved remarkable results on the L3D dataset, surpassing SOTA methods like D2GPLand by 5.53% in Dice, 4.62% in IoU, and 17.25 pixels in ASSD.
    3. The model also demonstrated strong generalization on the P2ILF dataset, maintaining superior performance even with fine-tuning on a smaller dataset.
    4. Each module (MFE, ACPI, HCR) is well-defined, with clear responsibilities. Ablation studies validate their contributions, highlighting the importance of the modular design.
    5. The introduction of Proposal Induction Loss effectively accelerates convergence during early training, improving the quality of curve proposals.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The datasets used (L3D and P2ILF) are relatively small, especially P2ILF (only 167 images), which may limit the model’s generalizability to larger and more diverse datasets.
    2. Significant domain gaps between datasets (e.g., labeling standards, image resolutions) can lead to performance drops after fine-tuning, as noted in the P2ILF evaluation.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    BezierLandNet introduces a novel approach to anatomical landmark detection using Bezier curves, achieving superior performance in terms of continuity, accuracy, and robustness. The framework’s modular design and innovative methodology provide a strong foundation for improving augmented reality navigation in laparoscopic liver surgery. While dataset limitations pose challenges for broader generalization, the overall method shows great promise and is suitable for acceptance, with potential for further optimization and application.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    The method lacks sufficient novelty and is too close to prior work. The experiments are also limited in scope, using only two small datasets. After rebuttal, I recommend rejection.




Author Feedback

We thank the reviewers for insightful comments and the positive feedback regarding our “innovative framework” (R4), “strong evaluation”(R5) with “significant performance improvements” (R4), and “benefits for registration” (R3, R4). We address specific comments below, which will be added in revision.

R3

  1. Technical Contributions: Though our Bezier curve strategy has some high-level similarities with the 2 mentioned methods, we introduce several distinct innovations for surgical challenges:
    • Feature Extraction: Unlike prior methods that use only a ResNet encoder for visual feature extraction, we designed MFE to integrate visual and geometric information, enhancing the accuracy of curve detection for liver landmarks.
    • Proposal Generation: Prior methods generate many redundant curves from all-scale features, leading to divergent training in complex tasks. Instead, ours uses only the lowest-scale features for coarse proposals and introduces a Proposal Induction Loss to accelerate convergence.
    • Curve Refinement: Prior methods refine proposals with all-scale features, which often leads to local minima when initialization is poor, as it is in our challenging task. Our HCR instead introduces a coarse-to-fine strategy, progressively leveraging global-to-local (f4->f1) context to avoid local minima. In addition, using only a subset of the features at each stage allows more refinement iterations at the same computational cost, improving accuracy.
    • Given these innovations, our method surpasses DeepSolo++ by a large margin in Tab. 1.
  2. Method details
    • Depth Estimation Model: We utilize AdelaiDepth. “Depth Anything” in Fig. 1 is a typo and will be corrected.
    • Notation: “Q_c” (Fig. 1) is a typo and should be “Q_s” (semantic query). Corrections will be made.
  3. Landmark Dilation: For fair comparison, we follow D2GPLand and dilate polylines by 30 pixels for a 1920x1080 image.
  4. Point Number: We empirically set the control point number at 6. Fewer points limit anatomical variability representation, while more do not improve results and can introduce artifacts (self-intersections or duplicate paths).
  5. Experiments: We exactly follow D2GPLand and report equally averaged results for 3 classes for fair comparison. We will add FWIoU and per-class metrics in the revision. Inference takes 133 ms/image on an A6000 GPU, sufficient for registration.
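
To make the averaging question concrete, the following is a minimal sketch of the difference between an equally weighted (macro) mean IoU and a frequency-weighted IoU. It is not the authors' evaluation code: the function names, the label convention (0 for background, 1 and 2 for two landmark classes), and the choice to weight by ground-truth pixel counts over the landmark classes only are all assumptions made for illustration.

```python
import numpy as np

def class_ious(pred, gt, classes):
    """Per-class IoU; NaN for a class absent from both prediction and ground truth."""
    ious = []
    for c in classes:
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        ious.append(np.logical_and(p, g).sum() / union if union > 0 else np.nan)
    return np.array(ious, dtype=float)

def macro_iou(ious):
    """Equally weighted mean over classes (every landmark type counts the same)."""
    return float(np.nanmean(ious))

def frequency_weighted_iou(ious, gt, classes):
    """Mean weighted by each class's ground-truth pixel count (assumed weighting)."""
    counts = np.array([(gt == c).sum() for c in classes], dtype=float)
    valid = ~np.isnan(ious) & (counts > 0)
    return float((counts[valid] * ious[valid]).sum() / counts[valid].sum())

# Toy label maps (hypothetical): class 1 is large and perfectly predicted,
# class 2 is small and only half recovered.
gt = np.array([1, 1, 1, 1, 1, 1, 2, 2, 0, 0])
pred = np.array([1, 1, 1, 1, 1, 1, 2, 0, 0, 0])
landmark_classes = [1, 2]

ious = class_ious(pred, gt, landmark_classes)
macro = macro_iou(ious)
fw = frequency_weighted_iou(ious, gt, landmark_classes)
print(ious, macro, fw)
```

In this toy case the small class pulls the macro mean down further than the frequency-weighted one, which is why reporting per-class values alongside either mean removes the ambiguity.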

R4 Dataset Limitations: L3D and P2ILF are the only publicly available datasets for this task to our knowledge. We agree that larger, more diverse datasets are needed to enhance generalization and alleviate domain gaps.

R5

  1. Comparisons: In our early experiments, SAM2-Adapter and SAM2-UNet produced discontinuous and scattered predictions. These are common issues in segmentation-based methods, which predict at the pixel level without explicit continuity constraints, and the two models offered no clear improvement over SAM-Adapter. Due to space limits, these results were not included in the paper but will be discussed in related work/future extensions.
  2. Framework details: We use SAM-ViT-B and a U-Net built upon ResNet50. The transformer encoder adopts MSDeformAttn [26]. Code will be available for reproduction.
  3. Depth Estimation Model: Kindly refer to Q2 in R3.
  4. Innovations over Others: While Bezier-based methods have been explored in other domains (e.g. BezierLaneNet as you mentioned), we propose several distinct innovations to tackle the challenges of the surgical task:
    • BezierLaneNet assumes vertical lanes and predicts curves from 1D horizontal vectors. Our task involves curves with varying orientations and positions, so our ACPI predicts curves from each pixel of 2D feature maps without prior assumptions. We also designed a specialized Proposal Induction Loss for stable training and convergence.
    • Landmarks in surgery show high complexity, so we design a new HCR module with multi-stage refinement for more precise recognition.
    • BezierLaneNet uses only ResNet, which suffices for clear lane structures. Our MFE leverages multi-modality cues of RGB and depth to capture complex liver landmarks.
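
For readers unfamiliar with the representation under discussion, the sketch below samples a Bezier curve from six control points (the number stated in the response to R3), using the standard Bernstein-polynomial form. This page does not give the paper's exact parameterization, so the function name, the sample count, and the example coordinates are hypothetical and only illustrate how a handful of control points defines a continuous curve.

```python
import numpy as np
from math import comb

def sample_bezier(control_points, num_samples=50):
    """Sample points on a Bezier curve in Bernstein form.

    control_points: (n+1, 2) array-like; six points give a degree-5 curve.
    Returns an array of shape (num_samples, 2).
    """
    ctrl = np.asarray(control_points, dtype=float)
    n = len(ctrl) - 1                                   # curve degree
    t = np.linspace(0.0, 1.0, num_samples)[:, None]     # (S, 1) curve parameter
    k = np.arange(n + 1)[None, :]                       # (1, n+1) basis index
    binom = np.array([comb(n, i) for i in range(n + 1)], dtype=float)[None, :]
    basis = binom * t**k * (1.0 - t)**(n - k)           # Bernstein basis, (S, n+1)
    return basis @ ctrl                                 # weighted sum of control points

# Six hypothetical control points in pixel coordinates (x, y).
ctrl = np.array([[100, 400], [250, 320], [420, 380],
                 [600, 300], [780, 360], [900, 420]], dtype=float)
curve = sample_bezier(ctrl, num_samples=50)
print(curve.shape, curve[0], curve[-1])
```

The curve always starts at the first control point and ends at the last, while the interior points only pull the curve toward them; this is one way to see why too few control points restrict the shapes that can be represented.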




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The manuscript presents BezierLandNet, the first system that recasts laparoscopic liver-landmark detection as Bézier-curve fitting. All three reviewers found the direction timely and the experiments thorough. Reviewer 3 initially questioned originality but, after the rebuttal, moved to accept. Reviewers 4 & 5 switched to reject on “limited novelty”, a point that was not central in their first-round reports. I examined the exchange carefully and find the authors’ clarifications convincing.

    While the high-level pipeline echoes DeepSolo++, the paper contributes three surgery-specific advances that are absent from prior curve-fitting work:

    • Multi-modal Feature Extraction (MFE) fuses visual and geometric features—crucial for the low-contrast, deformable liver surface.

    • Adaptive Curve Proposal with Proposal-Induction Loss stabilises optimisation when each frame contains only a handful of salient curves.

    • Hierarchical Curve Refinement uses a coarse-to-fine, global-to-local schedule that escapes local minima and still runs at 133 ms per frame, satisfying real-time guidance needs.

    Ablation results suggest that each component contributes meaningfully to the overall accuracy, and the full model outperforms 16 baselines on the only two public benchmarks. Although the datasets are small, testing shows reasonable generalisation, and the authors candidly discuss limitations and future SAM-2 comparisons.

    Given the clarified novelty, solid empirical gains, and clear surgical relevance, I override the split reviews and recommend acceptance, while urging the authors to emphasise the distinctions from DeepSolo++ even more explicitly in the final revision.


