
Abstract

Monocular Depth Estimation (MDE) in cell microscopy provides critical insights into cellular structures, with applications spanning cancer diagnostics, hematological analysis, and tumor margin assessment. However, it presents unique challenges such as sparse z-stacks with limited focal planes, optical aberrations degrading depth precision, and the inherently ill-posed nature of inferring depth from single 2D images. Existing MDE methods often rely on semantic priors, geometric modeling, or self-supervised learning. While effective in macroscopic applications, these approaches struggle with microscopy-specific challenges involving domain-specific feature distributions.

To address these limitations, we propose a novel deep learning-based physics-guided augmentation strategy leveraging Extended Depth of Field (EDOF) images to enhance MDE performance. To demonstrate the effectiveness of our approach, we employ a regression model trained to predict z-stack levels from individual cell images and a UNet-based model to synthesize blurred cell images at intermediate z-levels by modeling the point spread function (PSF) of the imaging process. Experiments on Giemsa-stained peripheral blood smear data demonstrate significant improvements in MDE over training without augmentation and simple augmentation strategies. Ablation studies validate the robustness of our approach, providing a promising framework for advancing medical microscopy-related applications.
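The augmentation idea in the abstract, blurring an all-in-focus EDOF image with a depth-dependent PSF to obtain images at intermediate z-levels, can be illustrated with a minimal sketch. It is not the authors' PSF-Net: the learned kernel is replaced here by a hypothetical isotropic Gaussian whose width grows with |z|, and the names `gaussian_psf`, `blur_with_psf`, `synthesize_intermediate`, and the parameter `sigma_per_z` are assumptions made only for this example.

```python
import numpy as np


def gaussian_psf(z, size=9, sigma_per_z=0.8, eps=1e-3):
    """Toy stand-in for a learned PSF: a Gaussian that widens with |z|.

    Assumption: real microscope PSFs are not Gaussian; this only mimics
    the qualitative "more defocus, more blur" behaviour.
    """
    sigma = eps + sigma_per_z * abs(z)
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()  # normalise so total intensity is preserved


def blur_with_psf(edof, kernel):
    """Filter a 2D image with a square, odd-sized kernel (reflect padding).

    The loop computes a correlation, which equals a convolution here
    because the Gaussian kernel is symmetric.
    """
    k = kernel.shape[0] // 2
    padded = np.pad(edof, k, mode="reflect")
    h, w = edof.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def synthesize_intermediate(edof, z_levels):
    """Return {z: blurred image} for each requested (continuous) z-level."""
    return {z: blur_with_psf(edof, gaussian_psf(z)) for z in z_levels}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    edof = rng.random((32, 32))  # placeholder for a real EDOF cell image
    stack = synthesize_intermediate(edof, [0.0, 0.5, 1.5])
    for z, img in stack.items():
        print(f"z={z}: variance={img.var():.4f}")
```

Each synthesized image would then be paired with its z value as a regression target for the depth model. Separating the kernel from the convolution mirrors the structure the reviews describe (predict a kernel, then convolve), so the toy Gaussian could in principle be swapped for a learned kernel without changing the rest of the pipeline.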

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1037_paper.pdf

SharedIt Link: https://rdcu.be/eHw3v

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-05127-1_22

Supplementary Material: https://papers.miccai.org/miccai-2025/supp/1037_supp.zip

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{VisAbh_Guided_MICCAI2025,
        author = { Viswanathan, Abhishek AND Rajagopalan, A. N. AND Yelamarthy, Nikhil AND Rai, Ankit AND Ramachandran, Pradeep},
        title = { { Guided Augmentation for Monocular Depth Estimation in Cell Microscopy } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15969},
        month = {September},
        pages = {224 -- 234}
}

Reviews

Review #1

Please describe the contribution of the paper

The authors propose a data augmentation strategy to enhance z-level prediction performance in cell microscopy. They generate realistic blurred cell images at intermediate z-stack levels and use them in the training pipeline. They use two kinds of models: (1) z-Net, which, given an image, predicts the z-level based on the blur; and (2) PSF-Net, which, given an Extended Depth of Field (EDOF) image and the desired (continuous) z-level, generates the corresponding blurring kernel, which, convolved with the EDOF image, gives the blurred image for that level. This training augmentation strategy is shown to be beneficial, improving results on the Giemsa-stained peripheral blood smear data. Moreover, it is agnostic to the z-Net prediction architecture, and the improvements transfer to different models.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper presents a simple idea that is clear and well executed. It brings the benefits of data augmentation seen in the deep learning field to the niche of microscopy imaging. The method shows clear improvements on the Giemsa-stained peripheral blood smear dataset. Predicting the z-level instead of directly predicting depth, and modeling the PSF instead of directly predicting the blurred image, yields a physics-informed system.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

1) The authors propose a first training stage using only the discrete z-levels, and a second stage that introduces the augmented data. Why not train with all the available data from the beginning?

2) There is no comparison with other z-level prediction strategies. The authors show that this strategy improves the results they get with AlexNet and other architectures. But do these improvements transfer to current state-of-the-art methods?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The paper shows effective results on a single dataset, but it does not provide a comparison with current state-of-the-art systems.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The authors have addressed my concerns. I would recommend this paper for acceptance.

Review #2

Please describe the contribution of the paper

This paper focuses on Monocular Depth Estimation (MDE) in cell microscopy, a challenging problem due to sparse z-stacks, optical aberrations, and the limited focal planes of typical imaging systems. The authors propose a physics-guided data augmentation strategy that leverages Extended Depth of Field (EDOF) images to synthesize realistic blurry images at intermediate z-stack levels. A z-level prediction model (z-Net) estimates depth from single images, and a PSF-Net models the point spread function to generate consistent blur.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The paper integrates domain knowledge (PSFs, EDOF) into a deep learning pipeline, ensuring realistic blur generation.

Synthesizing intermediate z-level images effectively increases training diversity, helping the model generalize better.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Limited External Benchmarks: Although the paper shows improvements over baseline or conventional augmentation strategies, it does not extensively compare against other specialized MDE frameworks for microscopy. Recommendation: Incorporate competitive benchmarks using well-known or state-of-the-art microscopy depth estimation techniques.

Potential Overfitting to the PSF Model: The PSF-Net implicitly learns a specific imaging system’s blur pattern, which might not accurately represent drastically different hardware setups or aberrations. Recommendation: Validate on multiple microscopes and imaging conditions to show that the learned PSF remains robust or can adapt to new PSFs without retraining from scratch.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Mentioned above.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Reject
[Post rebuttal] Please justify your final decision from above.

Previous Reasons

Review #3

Please describe the contribution of the paper

The authors present a well-motivated and technically sound contribution that addresses a key challenge in monocular depth estimation (MDE) for cell microscopy, specifically the limitations posed by sparse z-stacks, optical aberrations, and domain-specific imaging artifacts. The authors propose a physics-guided data augmentation strategy that utilizes extended depth of field (EDOF) images to create realistic, physically consistent blurred cell images at intermediate z-stack levels. By modeling the microscope’s point spread function (PSF), the approach fills the depth gaps in training data while maintaining structural fidelity. Integrated into a Net-based architecture, the method enhances training diversity and improves depth inference performance. The results, validated on Giemsa-stained peripheral blood smear data, demonstrate measurable improvements over conventional augmentation methods. The contribution stands out for its principled integration of physical imaging knowledge into data-driven learning, providing a microscopy-specific solution that enhances the robustness and generalization of MDE models.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The authors present a novel physics-guided data augmentation method tailored for monocular depth estimation in cell microscopy. By leveraging extended depth of field (EDOF) images and modeling the imaging system’s point spread function, it generates realistic, depth-consistent blurred images to address the challenge of sparse z-stacks. The microscopy-specific approach improves training diversity and accuracy, demonstrating strong performance gains on real biomedical datasets and offering a well-justified, domain-adapted solution.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

While the authors evaluate the effectiveness of their proposed physics-guided augmentation strategy across multiple architectures (ResNet, DenseNet, MobileNet) and demonstrate improvements in MSE compared to non-augmented training, the authors did not discuss any potential limitations of the method or computational cost. In particular, the generalizability of the approach remains uncertain as validation was performed solely on Giemsa-stained peripheral blood smear data. A discussion of potential drawbacks or challenges, such as its applicability to other imaging modalities, stain variations, or cell types would strengthen the proposed method.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

The authors should revise the in-text citation numbering to ensure they appear in the correct sequential order.

In Table 1, as the authors compare their method with existing augmentation strategies, I recommend placing the metrics for their proposed method in the final column. This would enhance clarity and readability. Additionally, the headings in Tables 2 and 3 are somewhat confusing; I suggest that the authors consider reorganizing these tables for improved structure and interpretation.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(6) Strong Accept — must be accepted due to excellence
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I recommend accepting this paper due to its novel physics-guided augmentation strategy, which addresses critical limitations in monocular depth estimation (MDE) for microscopy: sparse z-stacks and domain-specific imaging artifacts. The authors use extended depth of field images to generate physically consistent intermediate slices, which is both innovative and practically valuable. The method demonstrates strong performance across multiple network architectures and shows improvements over baseline and conventional augmentation approaches. Despite being validated on a single dataset, the technique has the potential for broad impact in medical and biomedical imaging and microscopy applications.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The paper introduces a novel physics-informed data augmentation method for estimating monocular depth in cell microscopy. By modeling the point spread function (PSF) and leveraging extended depth of field (EDOF) imaging, the authors generate realistic, depth-consistent blurred images to enhance training data, effectively addressing the challenge of sparse z-stacks. The method is domain-adapted, conceptually sound, and demonstrates consistent performance improvements across various neural network architectures. Although the original manuscript did not discuss potential limitations or provide validation on diverse datasets beyond Giemsa-stained blood smears, the authors acknowledged these issues in their rebuttal and expressed a clear willingness to make the necessary corrections and additions. Given the originality, biomedical relevance, and commitment of the authors to strengthen the final version, I believe it deserves acceptance.

Author Feedback

R1, R2, R3: Thank you for the detailed feedback, which has provided valuable insights.

R1: Training Protocol: We first trained on discrete z-levels (Sec. 3.1) to establish baseline results, then generated augmented data (Sec. 3.3) using PSF-Net. The augmented and original data were then combined for training z-Net. We will clarify this explicitly in the final version.

R1: SOTA Comparisons: Thank you for raising this point. Our comparisons focused on the most relevant baselines for bridging discrete z-gaps in microscopy, including strong interpolation-based methods (linear, B-spline, bicubic, kriging). Our augmentation is a plug-in methodology, independent of the MDE backbone, and can be integrated with future SOTA MDE networks. When using SOTA Vision Transformer architecture and MiDaS MDE framework with a final regression head, we observed z-prediction MSE improvements from 1.26 to 0.5 and 0.37 to 0.11, respectively, thus validating our approach. These results were not included in Table 3 initially due to higher computational cost and limited additional performance gain over lightweight architectures, diminishing clinical utility. We will include them in Table 3 in the final manuscript for partial SOTA comparisons as suggested.

R1, R2: SOTA Microscopy Benchmarks: Thank you for this suggestion. Our literature survey shows that most SOTA MDE approaches in computer vision (self-supervised, geometric modeling, semantic priors) are designed for natural images, often requiring multi-view, stereo, or large annotated datasets, which are not available or applicable for microscopy-specific challenges such as sparse z-stacks, optical aberrations, and domain-specific artifacts. Prior microscopy MDE works (Sec. 2) have used CNNs and PSF modeling but focus on different modalities (e.g., light-field, epi-illumination), often requiring additional data or tools, and lack specialized MDE DL frameworks for standard z-stack microscopy. Data augmentation and physics-based modeling have been used for classification, reconstruction, and segmentation, but to our knowledge, no prior work has proposed physics-guided data augmentation for depth estimation in microscopy, making our approach novel in this niche. However, as mentioned in our response in the previous point, we have validated our method using the MiDaS (small) MDE framework with a modified regression head, serving as a competitive architectural benchmark.

R2: PSF Overfitting: PSF-Net mapping from EDOF images and z-levels to blurred images is inherently system-specific and must reflect underlying optics. We believe retraining PSF-Net for a new system is straightforward and necessary to ensure physical fidelity. We agree that further validation on multiple microscopes and imaging conditions is important future work.

R3: Dataset Generalizability: While the Giemsa-stained PBS dataset provides a challenging testbed due to cell-level heterogeneity, imaging artifacts, and domain-specific variation, and the physics-guided PSF modeling is not tied to a specific cell type or stain, we acknowledge the need to validate on further medical datasets as future work. As a note, although not included in the manuscript due to clinical focus, validation on semiconductor SEM images also showed our method’s adaptability and improved results.

R3: Limitations and Computational Cost: Thank you for this suggestion. As a limitation, reliance on EDOF images could be addressed in future work by computing EDOF via SFF algorithms, as noted in Sec 5. The main computational cost is PSF-Net training; image generation and z-Net inference are efficient and suitable for real-time clinical use. We will add a discussion on limitations and computational cost in the final manuscript.

R3: Presentation: We will incorporate all formatting suggestions (sequential citation, table reorganization, etc.) in the camera-ready version.
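For readers unfamiliar with the interpolation baselines and the MSE metric named in the rebuttal, the following is a minimal sketch of the simplest case: a per-pixel linear blend between two neighbouring measured z-slices, plus a mean-squared-error helper. This is one plausible reading of "linear interpolation" for bridging z-gaps, not the authors' code; the function names `linear_interpolate_slice` and `mse` are hypothetical.

```python
import numpy as np


def linear_interpolate_slice(slice_lo, slice_hi, z_lo, z_hi, z):
    """Per-pixel linear blend of two measured slices at an intermediate z.

    Assumption: the baseline blends intensities directly, with no model
    of the optics (unlike a PSF-based synthesis).
    """
    if z_lo >= z_hi:
        raise ValueError("z_lo must be smaller than z_hi")
    if not z_lo <= z <= z_hi:
        raise ValueError("z must lie between z_lo and z_hi")
    t = (z - z_lo) / (z_hi - z_lo)  # 0 at the lower slice, 1 at the upper
    lo = np.asarray(slice_lo, dtype=float)
    hi = np.asarray(slice_hi, dtype=float)
    return (1.0 - t) * lo + t * hi


def mse(pred, target):
    """Mean squared error between predicted and true z-levels."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


if __name__ == "__main__":
    lower = np.zeros((4, 4))  # placeholder for the slice measured at z = 0
    upper = np.ones((4, 4))   # placeholder for the slice measured at z = 1
    mid = linear_interpolate_slice(lower, upper, 0.0, 1.0, 0.25)
    print(mid[0, 0])
    print(mse([1.0, 2.0], [1.0, 4.0]))
```

A blend of this kind can only produce mixtures of the two measured images, whereas defocus blur generally changes with depth in a way a straight-line mix does not capture, which is the motivation the rebuttal gives for a physics-guided alternative.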

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
My recommendation is based on:
1. vox populi: a 2-1 vote “for”, including (a) a vote switched to “for” post-rebuttal; (b) an “against” vote hedged by a stated lack of expertise in the topic; and (c) a strong and highly-engaged “for” review.
2. A solid rebuttal from authors that addressed issues.
3. The use of domain specifics (physics) is a strong asset, noted by all reviewers. I urge the authors to carefully examine and thoroughly address, where possible, the points raised by R2.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

The rebuttal convinced most of the reviewers.


Guided Augmentation for Monocular Depth Estimation in Cell Microscopy

Author(s):