Abstract
In minimally invasive surgeries, such as endoscopic and ophthalmic procedures, specular highlights on tissue and instrument surfaces can obscure critical details, compromising surgical safety and precision. Traditional methods rely on color segmentation and filtering optimization but are highly sensitive to lighting variations and produce suboptimal restoration. While deep learning enhances detection robustness, its effectiveness is constrained by the scarcity of annotated medical data and unnatural boundary transitions in restored regions. To address these challenges, this paper proposes a two-stage hierarchical network framework. First, a Hierarchical Feature Attention Network (HFA-Net) is designed, integrating spatial-shift segmented attention (S²MLP), dual-flow attention (DFA), multi-scale feature fusion (SFF), and partial mask convolution (PMConv) to achieve precise detection and removal of specular highlights. Second, a large-mask inpainting model (LaMa) is introduced, utilizing dilated mask expansion to enhance contextual awareness and improve texture consistency in the restored regions. To address the scarcity of medical highlight datasets, we construct four specialized datasets covering various surgical scenarios, including ophthalmic injections and instrument reflections, while also incorporating publicly available data to enhance model generalization. Experimental results demonstrate that the proposed method outperforms existing approaches across six datasets in terms of detection accuracy and restoration quality, particularly excelling in complex textures and natural boundary transitions. Our code is available at https://github.com/tkllndxn/highlight-removal.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0536_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/tkllndxn/highlight-removal
Link to the Dataset(s)
CVC dataset download link: https://pan.baidu.com/s/1d8TOgcwZGD7f9aOqfudIOw?pwd=hyc4
BibTex
@InProceedings{LiZef_ATwoStage_MICCAI2025,
author = { Li, Zefeng and Cui, Mingyue and Hu, Daosong and Gong, Jin and Weng, Jingchong and Zhang, Zeyu and Tian, Lele and Li, Mengran and Huang, Kai},
title = { { A Two-Stage Method for Specular Highlight Detection and Removal in Medical Images } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15969},
month = {September},
pages = {23 -- 32}
}
Reviews
Review #1
- Please describe the contribution of the paper
- The paper introduces a two-stage method for specular highlight detection and removal in medical images, evaluated on six different datasets.
- Constructed four datasets covering ocular surface, fundus, surgical instruments, and specular highlights on endoscopic tissue surfaces.
- Proposed a hierarchical feature attention network to detect specular highlights and used metrics such as PSNR and SSIM to assess preservation of image structure, edge blurring, and information loss.
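For readers unfamiliar with the restoration metrics named above, the following is a minimal sketch of PSNR for 8-bit intensities. The function name `psnr`, the flat-list input format, and the `max_value` default are assumptions made for illustration; the paper's exact evaluation protocol is not described on this page.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example: one pixel is off by 10 intensity levels.
reference = [100, 100, 100, 100]
restored = [100, 100, 100, 110]
print(round(psnr(reference, restored), 2))
```

Higher PSNR means the restored image is closer to the reference; SSIM would additionally compare local structure and is not sketched here.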
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Methodological novelty: Proposed HFA-Net with different modules, each with different roles specific to specular reflection detection, and validated their significance through ablation study.
- Six datasets from different modalities: The results are shown on six different datasets from different modalities, including endoscopic and ophthalmic images.
- Well-written: The paper is well-written and easy to follow.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Dataset details: The train, validation and test split details are missing for all six datasets.
- Dataset construction contribution: The paper mentions construction of four datasets as one of the contributions. However, since there is no indication that these datasets will be made publicly available, it’s unclear how this can be considered a contribution to the broader research community. Additionally, important details such as sample distribution are missing, limiting the reproducibility and assessment of the datasets’ value.
- Accuracy metric: It is unclear why accuracy was chosen as one of the evaluation metrics. Given that specular highlights typically occupy only a small region (a few pixels), accuracy may be misleading due to class imbalance. Metrics more sensitive to minority classes—such as precision, recall, or IoU—would provide a more meaningful assessment of performance.
- Clinical relevance: The paper proposes an inpainting method but does not provide any clinical relevance evaluation, such as assessing how segmentation or classification performance is affected after inpainting.
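The concern about accuracy under class imbalance can be illustrated with a small sketch. The pixel counts and the two toy detectors below are hypothetical and are not taken from the paper; they only show how accuracy stays high while recall and IoU expose a weak detector.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two flat sequences of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def detection_metrics(pred, truth):
    """Accuracy, precision, recall, and IoU for the highlight (positive) class."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "iou": safe_div(tp, tp + fp + fn),
    }


# Hypothetical 1000-pixel image in which 2% of the pixels are highlights.
truth = [1] * 20 + [0] * 980
all_background = [0] * 1000        # detector that never fires
half_found = [1] * 10 + [0] * 990  # finds half the highlights, no false alarms

print(detection_metrics(all_background, truth))
print(detection_metrics(half_found, truth))
```

In this toy setting the detector that never fires still reaches 98% accuracy, while its recall and IoU are both zero.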
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- Lack of dataset transparency: Sample distribution and train/validation/test splits for all six datasets are missing, limiting reproducibility. Moreover, the contribution summary should be revised if the datasets are not intended for public release.
- Inappropriate evaluation metric: Accuracy is used despite the small size of specular highlight regions, making it a misleading metric due to class imbalance. More appropriate metrics like precision, recall, or IoU are not reported.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
While the authors did not mention dataset release in the original paper, they have promised to make it public in the rebuttal and have listed dataset curation as one of the contributions. However, the dataset lacks key information necessary for proper evaluation and reproducibility—most notably, detailed sample distribution and ethical approval details. For reference, the dataset details similar to https://link.springer.com/article/10.1007/s00138-017-0864-0 should be included to provide detailed statistics on specular highlight distribution.
Review #2
- Please describe the contribution of the paper
The paper presents a two-stage deep learning framework for specular highlight detection and removal in surgical images. It introduces HFA-Net, which leverages attention mechanisms (S²MLP, DFA, SFF) and partial mask convolution for accurate highlight detection, followed by LaMa-based inpainting with mask dilation to ensure realistic texture restoration. Additionally, the authors construct four specialized medical datasets and demonstrate that their method outperforms existing approaches across six datasets in both detection accuracy and visual quality.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The architecture effectively decouples highlight detection and image restoration, with clear complementary strengths across HFA-Net and LaMa.
- Outperforms previous state-of-the-art on six datasets using ACC, MCC, PSNR, and SSIM. The quantitative improvements are substantial and consistent.
- The integration of attention mechanisms (S²MLP, DFA) and multiscale fusion (SFF) is motivated and validated through an extensive ablation study.
- Creation of four diverse surgical datasets with paired highlight/no-highlight images, which could serve as a valuable community resource.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- While the integration is effective, most components (e.g., LaMa, S²MLP, PMConv) are adapted from prior work. The innovation lies more in system-level design than in algorithmic novelty.
- The paper does not report runtime, model size, or inference latency. Real-time applicability, critical in surgery, is not assessed.
- No feedback or usability analysis from surgical professionals is provided. This limits claims regarding clinical readiness.
- It is not clearly stated whether the newly collected datasets will be made publicly available.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The authors propose a well-designed two-stage framework for specular highlight detection and removal in surgical images, showing strong performance across six datasets and contributing valuable new data. While the integration of existing modules is effective, the methodological novelty is moderate, and some concerns remain around reproducibility, runtime efficiency, and dataset/code availability.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have done a commendable job in addressing the reviewers’ comments.
Review #3
- Please describe the contribution of the paper
This paper introduces a 2-stage method for specular highlight detection and context-aware inpainting. To aid in this task, the authors also generate several datasets for training and evaluation. The introduced method is compared against several existing approaches for segmentation and inpainting and produces better metrics across all datasets used.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The pipeline improves performance on both segmentation and inpainting tasks (notable improvement in metrics).
- Seems like the use of a new/different architecture for this task (specular highlight inpainting) is the source of performance improvements but I am not knowledgeable about the related work for this task.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Figures and figure captions could use better labels (especially row labels on Fig 2 and captions for Fig 4, Table 2).
- No visual comparison against ground truth in Fig 4.
- Seems like almost all components of the introduced 2-stage model came from the cited literature except for PMConv which only differs in limiting convolutions to operating within a masked region, plus the addition of mathematical morphology to model output.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Nice ablation study.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Strong performance over existing methods for specular highlight segmentation and inpainting, but seems like the novelty is somewhat limited. Seems like the primary method novelty is in limiting convolutions to operate within a masked region and employing mathematical morphology on a segmentation mask (applying morphology in this way is itself not novel and in fact the primary use for mathematical morphology).
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We thank the reviewers for their insightful comments on our work and offer the following clarifications. Q1 [R1/R3]: Concerning methodological novelty A1: In fact, our contribution extends beyond system design. We focus on solving the clinical challenge of visual obstruction caused by specular highlights, and through algorithmic optimization ensure that surgeons maintain clear, continuous visual feedback in complex operating environments. Moreover, to balance global and local feature extraction, we initially tried the lightweight S²MLP, but it underperformed in aligning tissue-edge textures and restoring fine details. Therefore, we designed DFA and SFF: DFA uses parallel branches to capture global semantics and local details with multi-scale fusion; SFF establishes cross-scale fusion channels between feature maps of different resolutions to efficiently integrate shallow details with deep semantics. Finally, we introduce PMConv to perform convolution only within the highlight mask and combine it with morphological dilation to expand context, guiding the LaMa inpainting stage to focus on valid regions.
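The mask-expansion step mentioned in A1 can be sketched as a plain binary dilation. This is only an illustration of the general morphological operation: the square structuring element, the `radius` parameter, and the function name `dilate` are assumptions, since the rebuttal does not state the kernel shape or size the authors use.

```python
def dilate(mask, radius=1):
    """Binary dilation of a 2-D 0/1 mask with a (2*radius+1)-wide square element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Mark every pixel in the square neighbourhood, clipped to the image.
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out


# Hypothetical 5x5 detection mask with a single highlight pixel in the centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
expanded = dilate(mask, radius=1)
for row in expanded:
    print(row)
```

Expanding the detected mask by a small margin hands the inpainting stage a region that also covers the soft halo around each highlight, which is one plausible reason for the smoother boundary transitions the authors report.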
Q2 [R1]: Concerning model performance, size, and real-time capability A2: Stage-1 HFA-Net contains only 0.05M parameters; Stage-2 LaMa contains approximately 7M–21M parameters. We tested our pipeline on an embedded intravitreal injection surgical imaging platform (1080p), and the results fully satisfy the stringent real-time requirements of medical surgery.
Q3 [R1]: Concerning the lack of clinical-expert feedback and usability analysis A3: We invited three experts in a real clinical setting for live trials and collected questionnaires before and after use. The experts’ satisfaction scores were 4.6/5, 4.4/5, and 4.5/5, with an average of 4.5/5. We will include these questionnaire results in the paper to further strengthen the clinical relevance of our method.
Q4 [R1/R2]: Concerning code and medical-dataset release and sample distribution A4: We are eager to make our code and datasets publicly available to facilitate further advances in this field. Upon paper acceptance, we will provide complete download links for the code and datasets, along with detailed data-generation procedures and sample-distribution statistics. All datasets are uniformly preprocessed; each subset contains approximately 10,000 samples (highlight pixels account for 1%–10%) and is split 75%/25% into training and testing sets; no additional fine-tuning was performed, nor was an independent validation set used.
Q5 [R2]: Concerning the choice of evaluation metrics for highlight detection A5: We recognize that class imbalance can render accuracy misleading. However, since References [2] and [7] both use Accuracy as the primary metric, and to maintain fair comparison and due to page limits, we retained Accuracy and removed Precision, Recall, and IoU. To correct the overestimation caused by abundant true negatives, we additionally introduce MCC to balance performance evaluation between the minority (highlights) and majority (background) classes. The final version will supplement a comparison table including Precision, Recall, and IoU.
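For reference, the standard definition of the Matthews correlation coefficient (MCC) in terms of pixel-level confusion counts is given below. The symbols TP, TN, FP, and FN (true/false positives/negatives for the highlight class) are notation introduced here for illustration, not taken from the paper.

```latex
\mathrm{MCC} =
  \frac{TP \cdot TN - FP \cdot FN}
       {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

MCC ranges from -1 to 1. A detector that labels every pixel as background has TP = FP = 0, so the numerator is zero and MCC is conventionally reported as 0, even though its accuracy can be very high when highlights cover only a small share of the pixels.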
Q6 [R2]: Concerning the lack of clinical relevance evaluation A6: Indeed, we also acknowledge the lack of formal clinical assessment; to address this, we invited three medical experts to participate in a questionnaire study, where they rated their satisfaction as 4.6/5, 4.4/5, and 4.5/5 (average 4.5/5). Although we have not yet conducted quantitative evaluations on downstream tasks such as segmentation and detection, our qualitative results show that highlight removal yields clearer tissue boundaries and significantly reduces visual interference—enhancing surgical safety without adding surgeon burden.
Q7 [R3]: Concerning unclear figure/table labels and missing ground truth comparisons A7: We will comprehensively optimize all figure and table labels and supplement every visualization with Ground Truth comparisons.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
This paper presents a two-stage deep learning framework for specular highlight detection and removal in medical images. The method combines a custom hierarchical attention network (HFA-Net) for highlight segmentation with a LaMa-based inpainting approach, enhanced through a mask dilation strategy. The authors also contribute four new surgical datasets and demonstrate strong quantitative and qualitative results across six datasets, outperforming existing methods.
The reviewers acknowledge the paper’s strengths, including a clear architectural design, consistent performance gains, a strong ablation study, and the potential impact of the new datasets. However, some concerns were raised. Reviewer #2 questions the appropriateness of accuracy as an evaluation metric for imbalanced data and points out the lack of detail in dataset splits and availability. Other reviewers note that while the components are largely adapted from prior work, their integration is effective. Additional feedback mentions the lack of clinical validation, runtime profiling, and usability assessments, which could strengthen claims of practical applicability.
Given the promising results, clear organization, and methodological soundness—alongside the identified areas for clarification—the AC recommends inviting this paper for rebuttal. The authors are encouraged to address concerns about metric selection, dataset transparency, code availability, and clarify the novelty and clinical relevance of the approach.
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The two-stage framework proposed in this work first detects the region of the endoscopy video that contains specular highlights, followed by an inpainting technique to digitally remove these highlights. The authors make several claims that are not strongly supported by their manuscript. First, the significance of this work is low: one of the stated contributions is “construction of medical scene datasets,” and yet there is no mention of public availability. Thus, the construction of such a database is a necessity for this work, but does not benefit the research community. Second, the reproducibility of this work is difficult. As reviewers noted, this manuscript does not contain sufficient detail. The lack of reproducibility, in addition to the lack of a public dataset, suggests that this work is not likely to advance the research field.
I understand that in the rebuttal, the authors suggested that both the code and database will be publicly available. However, there is no mechanism in the MICCAI review and publication processes to enforce this; thus, the promise of open-source availability cannot be taken as a significant contribution nor an indicator of scientific reproducibility. This gives this submission a competitive disadvantage. Reproducibility should be ensured by the initial/original submission.
Third, inpainting techniques, in general, are not able to recover minute details occluded by specular highlights. As evident from the supplemental video, specifically at the time-index of 18s into the video, it is very evident that the vascular structures in the occluded region were simply being masked: no details were recovered. Hence, the claimed quantitative results (Table 2), while competitive compared to other works, are not representative of the true performance of the work.
Lastly, about half of the techniques used in the comparative study are rather old: [1] from 2015, [2] from 2010, [6] from 2019, [16] from 2013, and [21] from 2019. Thus, the results reported in Table 1 and 2, while competitive, may not be representative against other state-of-the-art methods.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper received mixed reviews. After reading all materials, the AC finds that the proposed method is already widely used in computer vision and that this work makes limited contributions to the medical area.