Abstract
Accurate segmentation of retinal vessels is an important task. Deep learning-based approaches have achieved impressive segmentation performance on images with the same distribution as the training images. However, the performance significantly drops when there is a substantial disparity between the distributions of the training and testing data, which limits the practical applicability of these methods in real-world scenarios.
In this paper, we propose a novel test-time training (TTT) strategy that employs a local contrast-preserving copy-paste (L2CP) method to generate synthetic images in the target domain style.
Specifically, leveraging the thin nature of retinal vessel structures, we apply a simple morphological closing to remove these structures from the test image. This process yields a vessel-free image that retains the target domain’s style, which we then employ as the background component for the synthetic image.
To realistically integrate retinal vessels from source domain images into the background component, our L2CP method pastes the local contrast map of the vessels, rather than their grayscale values, onto the background component. This approach effectively mitigates the issue of significant disparities in grayscale distribution between the foreground and background across the source and target domains.
Extensive TTT experiments on retinal vessel segmentation tasks demonstrate that the proposed L2CP consistently improves the model’s generalization ability in retinal structure segmentation. The code of our implementation is available at https://github.com/GuGuLL123/L2CP.
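To make the abstract's two-step synthesis concrete, below is a minimal sketch under stated assumptions: grayscale (e.g., green-channel) fundus images resized to a common resolution, vessels darker than their surroundings, and illustrative OpenCV/NumPy calls and kernel sizes rather than the authors' exact implementation.
```python
import cv2
import numpy as np

def l2cp_synthesize(test_img, src_img, src_vessel_mask,
                    close_size=15, local_size=7):
    """Illustrative L2CP-style synthesis (not the authors' exact code).

    test_img, src_img: uint8 grayscale fundus images of the same size.
    src_vessel_mask: binary vessel annotation of the source image.
    close_size, local_size: illustrative kernel sizes.
    """
    # 1) Morphological closing removes thin, dark vessel structures from the
    #    test image, yielding a vessel-free background in the target style.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (close_size, close_size))
    background = cv2.morphologyEx(test_img, cv2.MORPH_CLOSE, kernel)

    # 2) Local contrast of the source vessels: how much darker each vessel
    #    pixel is than the mean gray value of its local neighbourhood.
    local_mean = cv2.blur(src_img.astype(np.float32), (local_size, local_size))
    contrast = local_mean - src_img.astype(np.float32)  # positive on dark vessels

    # 3) Paste the contrast (not raw gray values) onto the background at the
    #    source vessel locations, then clip to the valid intensity range.
    synth = background.astype(np.float32)
    synth[src_vessel_mask > 0] -= contrast[src_vessel_mask > 0]
    return np.clip(synth, 0, 255).astype(np.uint8)
```
Pasting the contrast difference rather than raw gray values keeps each pasted vessel darker than its own local background, regardless of how the source and target gray levels differ.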
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3277_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/GuGuLL123/L2CP
Link to the Dataset(s)
N/A
BibTex
@InProceedings{GuYul_TestTime_MICCAI2025,
author = { Gu, Yuliang and Sun, Zhichao and Liu, Zelong and Xu, Yongchao},
title = { { Test-Time Training with Local Contrast-Preserving Copy-Pasted Image for Domain Generalization in Retinal Vessel Segmentation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15966},
month = {September},
pages = {616 -- 626}
}
Reviews
Review #1
- Please describe the contribution of the paper
An approach of test time training for retinal vessel segmentation is proposed. Images in the test domain are generated using the local contrast map of the source image and background of the test domain.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
A local contrast-preserving copy paste method is proposed to generate target domain image.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
In Figure 2, the pipeline is confusing: there is no vessel segmentation output. Does the figure show only the fine-tuning stage? There is also no quality assessment of the generated images, and no comparison with other retinal image generation algorithms, for example style-transfer GANs, which likewise generate target-domain images combining the target style with source vessels.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The comparison study is insufficient.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors address most of my concerns.
Review #2
- Please describe the contribution of the paper
This paper presents three main contributions: a simple and novel strategy called L2CP to realistically integrate vessel structures from the source domain into the target domain background, effectively reducing the grayscale distribution gap between domains; the use of synthetic images as a bridge for test-time model adaptation; and extensive experiments on generalizable retinal vessel segmentation across multiple classical networks and datasets, demonstrating that L2CP consistently improves model generalization in vessel structure segmentation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper is well written, with thorough justification of the proposed method and experiments. It clearly outlines the challenges in current retinal vessel segmentation. The core idea—copy-pasting vessels using local luminance consistency—is simple yet effective, and I find it particularly interesting. The proposed method achieves state-of-the-art results on three public datasets, demonstrating both strong generalization and practical value.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Level 1 (Major):
(Experiment) 1.1 The selection criterion for the source domain image is not clearly explained. Was the source image selected randomly? Was it taken from the source domain training set or test set? Since the similarity between the source and target domains can significantly affect adaptation performance, using a source image that happens to be more similar to the target domain could lead to overly optimistic results. Therefore, the selection strategy of the source image has a substantial impact on the final outcomes.
Moreover, it is unclear whether all methods in the comparison used the same source image. If different methods used different source images, this could lead to unfair comparisons. For instance, if the proposed method was evaluated with the best-performing source image after test-time tuning, while other methods used less optimal source images, the performance advantage could stem from source selection rather than the method itself.
If the experiments were conducted using different source images, it would be more appropriate to report the mean and standard deviation of the results across multiple source images to ensure fairness and reliability.
(Experiment) 1.2 There are many recent test-time training (TTT) methods, such as DLTTA (TMI 2022), SAR (ICLR 2023), and VPTTA (2024). In particular, VPTTA is also developed for fundus images and is highly relevant to this paper. A comparison or discussion of these related works is expected.
[1] Yang, H., Chen, C., Jiang, M., Liu, Q., Cao, J., Heng, P. A., & Dou, Q. (2022). DLTTA: Dynamic learning rate for test-time adaptation on cross-domain medical images. IEEE Transactions on Medical Imaging, 41(12), 3575-3586.
[2] Niu, S., Wu, J., Zhang, Y., Wen, Z., Chen, Y., Zhao, P., & Tan, M. (2023). Towards stable test-time adaptation in dynamic wild world. arXiv preprint arXiv:2302.12400.
[3] Chen, Z., Pan, Y., Ye, Y., Lu, M., & Xia, Y. (2024). Each test image deserves a specific prompt: Continual test-time adaptation for 2D medical image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11184-11193).
(Method and Results) 1.3 The proposed method utilizes vessel structures from the source domain. However, there are structural differences between source and target domain vessel patterns, making it difficult to ensure the anatomical integrity of vessels in the target domain. For example, DRIVE contains regular vessel structures from healthy adults, STARE includes pathological cases with disrupted and tortuous vessels, and CHASEDB1 consists of dense, thick vessels from young children. These subject-related differences introduce substantial variations in vessel topology across datasets. Moreover, the evaluation only reports AUC and F1-score, which are insufficient for capturing such structural discrepancies. It is recommended to include structure-aware metrics such as clDice to better reflect the preservation of vessel morphology and connectivity.
(Experiments) 1.4 The proposed method is referred to as a local contrast-preserving copy-paste technique. However, what would the result look like if one simply copy-pasted the vessel structures directly? In Table 4, did the “copy-paste” baseline paste full vessel structures, or was it based on a rectangular patch (as described earlier in the paper)? If the experiments did not include a comparison using vessel-structure copy-paste, it is difficult to demonstrate that the benefit of the proposed method comes specifically from contrast preservation rather than just from copy-pasting in general.
Level 2 (Intermediate):
(Results) 2.1 From Figure 1, it can be observed that some residual vessels remain in the synthesized background, which could introduce noise into the segmentation. It would be helpful if the authors could visualize the background image after vessel removal and show the final synthesized image in Figure 3. Additionally, Figure 3 lacks visualizations for Tent and CoTTA. It is also unclear which source and target domain datasets are used for each visual example shown in Figure 3.
(Method) 2.2 The paper uses the “mean gray value in the local background.” However, how is “local” defined? What is the spatial range or window size used to compute the local background? How was this range chosen, and how sensitive are the results to this choice?
Level 3 (Minor):
(Code) 3.1 Although all datasets used in this paper are publicly available, it appears that the authors do not provide the source code. Making the code available would enhance the reproducibility of the study.
(Experiments) 3.2 The training setup is not described in sufficient detail. Please clarify the batch size, learning rate, number of training epochs, and other training-related configurations.
(Method) 3.3 After subtracting the region contrast value from the background pixels in the target domain, how do the authors ensure that the resulting pixel values do not fall below zero?
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
I will divide the comments of weakness into three levels of importance, and describe each level roughly in terms of methods, dataset, experiments, results, analysis, formulas, code, figures and so on. The authors may choose to respond selectively based on the relative importance of each comment. Additionally, I would like to clarify that although I have pointed out some limitations in the experiments, this does not imply a request for additional experiments at this stage. Rather, I hope these comments may serve as useful feedback for the authors’ future research. I fully understand the MICCAI reviewing policy and the practical constraints that prevent authors from adding new experiments during the submission phase.
I sincerely hope these suggestions will help inspire and inform the authors’ future work.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper proposes a simple yet effective solution to the problem of cross-domain retinal vessel segmentation. The proposed L2CP method achieves consistent performance gains across multiple public datasets. The experimental design is relatively complete, and the writing is clear, effectively conveying the validity of the method. The proposed idea is intuitive and easy to understand.
However, the paper also has several areas that could be improved. The strategy for selecting the source domain image is not clearly described, which may affect the results. Additionally, comparisons with more recent and representative methods are lacking. The structural evaluation metrics are also insufficient and fail to reflect how well the vessel topology is preserved. Furthermore, since the datasets differ significantly in terms of patient demographics and pathology, the robustness of the method under complex structural or pathological conditions is not thoroughly discussed.
In summary, although there is room for improvement in terms of completeness and rigor, the method is innovative, concise, and performs well in experiments. Therefore, I lean toward a Weak Accept recommendation, and I hope the authors will clarify and improve the above aspects in the final version.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have addressed and clarified most of my concerns. I find the idea of L2CP for test-time training simple yet effective, and it is interesting. Based on the experimental results, the proposed method also appears to be effective. Therefore, I ultimately decide to recommend acceptance.
Review #3
- Please describe the contribution of the paper
The main contribution of this paper is the introduction of a novel test-time training (TTT) strategy called Local Contrast-Preserving Copy-Pasted Image (L2CP) to improve domain generalization in retinal vessel segmentation. The method addresses the problem of significant distribution discrepancies between source and target domains during testing by generating synthetic images that integrate source domain vessels into the target domain’s background while preserving contrast. This synthetic image is then used to fine-tune the model during the testing phase, improving the model’s generalization ability to out-of-distribution data.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Novel Methodology (L2CP): The proposed L2CP method is a novel and effective solution for domain generalization in retinal vessel segmentation. It solves the challenge of domain discrepancy by integrating local contrast maps of vessels into a vessel-free background, preserving domain-specific style while improving the segmentation model’s ability to generalize to unseen data.
- Strong Evaluation: The authors provide extensive experimental results on three public datasets (DRIVE, STARE, CHASEDB1), demonstrating the robustness of L2CP across different retinal vessel segmentation models and datasets. The method consistently improves the segmentation performance, especially in cross-domain evaluations, which strengthens the argument for its generalizability.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Dependency on Source Domain Information: The method relies on one annotated image from the source domain to generate the local contrast map for training, which may limit its applicability in scenarios where such data is not available. While the authors mention this as a limitation, future research could explore ways to generate synthetic images without any source domain annotations, which could broaden the approach’s usability.
- Limited Discussion on Computational Complexity: The method involves morphological operations and fine-tuning at test time. While the paper demonstrates the effectiveness of the approach, there is little discussion on the computational cost.
- No Consideration for Other Retinal Conditions: The paper focuses on general retinal vessel segmentation. However, it does not discuss how this method could be adapted for other retinal conditions (e.g., diabetic retinopathy or glaucoma). Exploring the broader applicability of the L2CP method in other retinal pathologies could strengthen the contribution.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The overall recommendation for this paper is positive due to its novel contribution, strong experimental results, and practical implications for domain generalization in medical image segmentation. The paper introduces an innovative approach to test-time training that addresses a significant challenge in retinal vessel segmentation and demonstrates its effectiveness through thorough experimentation across different datasets. While there are some limitations, particularly regarding the dependency on source domain data and computational efficiency, the strengths of the approach in improving model generalization make it a valuable contribution to the field. The paper is well-structured, and the methodology is clear and reproducible. Therefore, the recommendation would be to accept the paper, possibly with suggestions for further exploration of the computational aspects and broader applications.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have effectively addressed my comments.
Author Feedback
Reviewers R1, R2, and R3 rated the paper 3, 5, and 4, with confidence 3, 4, and 3, respectively. Reviewers find our paper “novel methodology/strong evaluation/well-structured” (R2) and “particularly interesting/simple yet effective/thorough justification/well written” (R3). We will release the code.
1 Pipeline clarification (R1 Q1) We will add the output segmentation of the decoder in the revision. Consistent with standard TTT studies (e.g., CoTTA [22]) that begin with a pre-trained model, we show only the test-time fine-tuning stage; the pre-trained model is not our contribution.
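For readers unfamiliar with this setup, a minimal sketch of such a test-time fine-tuning step is given below; the synthetic image is supervised by the pasted source vessel mask, while the loss, optimizer, and number of steps are placeholders, not the paper's actual configuration.
```python
import torch
import torch.nn.functional as F

def ttt_step(model, optimizer, synth_img, src_vessel_mask, test_img, steps=1):
    """Fine-tune a pre-trained segmentation model on the L2CP image, then
    segment the current test image (illustrative sketch, not the paper's code).

    synth_img, test_img: (1, C, H, W) tensors; src_vessel_mask: (1, 1, H, W).
    """
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(synth_img)  # (1, 1, H, W) vessel logits
        loss = F.binary_cross_entropy_with_logits(logits, src_vessel_mask.float())
        loss.backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        pred = torch.sigmoid(model(test_img))  # segmentation of the test image
    return pred
```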
2 Quality assessment of the generated images (R1 Q2) We assessed the quality of the generated images (compared with alternatives) using task-relevant segmentation evaluation (see Tab. 4). Following your suggestion, we also evaluated the generated images directly using FID. L2CP achieves the lowest FID at 60.9, outperforming mix-up (90.7), copy-paste (81.5), and Fourier (110.1).
3 Comparison with style-transfer methods (R1 Q3) Style-transfer methods (e.g., GANs) usually require a sufficient number (≫ 1) of target-domain images, which are not available in the one-pass TTT scenario considered in this paper (only the current image in the test stream). More importantly, style-transfer methods need to be retrained on each new target domain, bringing heavy training and inference costs at test time, which is precisely what TTT seeks to avoid. This is why we did not compare with style-transfer methods, but with other generation methods (Tab. 4).
4 Synthetic image without any source domain information (R2 Q1) Excellent suggestion. A single annotated source domain image is easy to obtain. We also plan to use fractal methods [16] to synthesize vessel structures without relying on source-domain data.
5 Computational complexity (R2 Q2) In all experiments (except Tab. 2), L2CP uses the same source domain image, and the local contrast map is computed once. Most of the computational cost lies in the morphological closing operation and the fine-tuning; the former has linear complexity. Similar to TENT [20], our L2CP does not need the EMA teacher network used in CoTTA [22] and DPLoT [25]. The whole TTT runtime for our method is about 2.6 s, compared with 2.3 s, 3.7 s, and 3.9 s for TENT, CoTTA, and DPLoT, respectively.
6 Extending L2CP to other retinal conditions (R2 Q3) Excellent suggestion. As noticed by R3, the STARE dataset already contains pathological cases with disrupted and tortuous vessels. In the future, we will extend L2CP to other suggested retinal conditions (e.g., diabetic retinopathy and glaucoma).
7 Selection of the source image (R3 Q1.1) In all experiments (except the ablation study in Tab. 2), L2CP uses the same source domain image. We also conducted an ablation study on the impact of using different source domain images. As depicted in Tab. 2, our L2CP maintains robust generalization regardless of random source-image selection. Future work will report key statistics across diverse source images.
8 Comparison with more methods (R3 Q1.2) Thanks for the suggestions. We compared with TENT (ICLR 2021), CoTTA (CVPR 2022), and DPLoT (CVPR 2024). We will cite the suggested works in the revision and compare with them in future work.
9 Evaluation on clDice (R3 Q1.3) Excellent suggestion. We will include clDice in future work.
10 Detail of copy-paste [4] in Tab. 4 (R3 Q1.4) Indeed, the ‘Copy-Paste’ baseline in Tab. 4 simply copy-pastes the vessel regions (not rectangular patches) of the source-domain image onto the vessel-free background of the target-domain image. This naive paste may invert the vessel/background contrast, degrading performance, which highlights L2CP’s superiority.
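For contrast with the L2CP sketch given after the abstract, a hypothetical reconstruction of this naive baseline follows; the function and variable names are illustrative and not taken from the paper's code.
```python
import numpy as np

def naive_copy_paste(background, src_img, src_vessel_mask):
    """Paste raw source gray values onto the vessel-free target background.

    If a source vessel pixel is brighter than the surrounding target
    background, the pasted vessel ends up brighter than its neighbourhood,
    inverting the expected dark-vessel contrast.
    """
    out = background.astype(np.float32)
    out[src_vessel_mask > 0] = src_img[src_vessel_mask > 0]
    return np.clip(out, 0, 255).astype(np.uint8)
```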
11 Some other details (R3 Q2.1-3.3) We used a 7×7 kernel to define ‘local’. The details of TTT fine-tuning are provided in Sec. 3.2. We follow the same settings as the baseline methods when training the pre-trained model. Any negative values are clipped to 0. We will add clearer visualizations. We are grateful for your suggestions, which will surely help our future research.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors have clearly clarified the issues raised by the reviewers. The explanations in the rebuttal look reasonable and correct to me.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
All reviewers recommend accepting this work due to its novelty and superior performance, supported by extensive experiments. Therefore, acceptance is recommended.