Abstract

Monocular depth estimation in bronchoscopy can significantly improve real-time navigation accuracy and enhance the safety of interventions in complex, branching airways. Recent advances in depth foundation models have shown promise for endoscopic scenarios, yet these models often lack anatomical awareness in bronchoscopy, overfitting to local textures rather than capturing the global airway structure, particularly under ambiguous depth cues and poor lighting. To address this, we propose BREA-Depth, a novel framework that integrates airway-specific geometric priors into foundation model adaptation for bronchoscopic depth estimation. Our method introduces a depth-aware CycleGAN, refining the translation between real bronchoscopic images and airway geometries from anatomical data, effectively bridging the domain gap. In addition, we introduce an airway structure awareness loss to enforce depth consistency within the airway lumen while preserving smooth transitions and structural integrity. By incorporating anatomical priors, BREA-Depth enhances model generalization and yields more robust, accurate 3D airway reconstructions. To assess anatomical realism, we introduce Airway Depth Structure Evaluation, a new metric for structural consistency. We validate BREA-Depth on a collected ex-vivo human lung dataset and an open bronchoscopic dataset, where it outperforms existing methods in anatomical depth preservation.
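For readers unfamiliar with the CycleGAN formulation the abstract builds on, the cycle-consistency term can be sketched roughly as below. This illustrates only the generic CycleGAN idea (translate to the other domain, translate back, compare with the input) and is not BREA-Depth's implementation: the nested-list image type, the toy generators `brighten` and `darken`, and all function names are assumptions introduced here, and the paper's depth-aware additions are not reproduced.

```python
from typing import Callable, List

Image = List[List[float]]  # tiny 2D grid standing in for a depth map or frame


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized grids."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            total += abs(va - vb)
            count += 1
    return total / count


def cycle_consistency_loss(
    synthetic: Image,
    real: Image,
    syn_to_real: Callable[[Image], Image],
    real_to_syn: Callable[[Image], Image],
) -> float:
    """Generic CycleGAN cycle term: translate, translate back, compare."""
    syn_cycle = real_to_syn(syn_to_real(synthetic))
    real_cycle = syn_to_real(real_to_syn(real))
    return l1_distance(syn_cycle, synthetic) + l1_distance(real_cycle, real)


# Hypothetical stand-ins for the two generators (not the paper's networks).
def brighten(img: Image) -> Image:
    return [[v + 0.5 for v in row] for row in img]


def darken(img: Image) -> Image:
    return [[v - 0.5 for v in row] for row in img]


def identity(img: Image) -> Image:
    return [row[:] for row in img]


synthetic_sample = [[0.0, 1.0], [2.0, 3.0]]
real_sample = [[0.5, 0.5], [0.5, 0.5]]

# Generators that invert each other reconstruct both inputs exactly.
perfect = cycle_consistency_loss(synthetic_sample, real_sample, brighten, darken)
# A backward generator that does not undo the forward one is penalised.
imperfect = cycle_consistency_loss(synthetic_sample, real_sample, brighten, identity)
```

The loss is zero only when the two translators invert each other on both domains, which is what lets CycleGAN-style training work without paired synthetic and real samples.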

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2574_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: https://papers.miccai.org/miccai-2025/supp/2574_supp.zip

Link to the Code Repository

https://github.com/SIRGLab/BREA-Depth

Link to the Dataset(s)

https://www.marcovs.com/bronchoscopy-navigation

BibTex

@InProceedings{ZhaFra_BREADepth_MICCAI2025,
        author = { Zhang, Francis Xiatian and Mackute, Emile and Kasaei, Mohammadreza and Dhaliwal, Kevin and Thomson, Robert and Khadem, Mohsen},
        title = { { BREA-Depth: Bronchoscopy Realistic Airway-geometric Depth Estimation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15968},
        month = {September},
        pages = {75--85}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes BREA-Depth, a framework that integrates airway-specific geometric priors into foundation model adaptation for bronchoscopic depth estimation. It introduces a depth-aware CycleGAN to refine the translation between real bronchoscopic images and airway geometries from anatomical data. The paper also introduces an airway structure awareness loss. This loss enforces depth consistency within the airway lumen while preserving smooth transitions and structural integrity. To evaluate anatomical realism, the paper introduces a new metric: Airway Depth Structure Evaluation. BREA-Depth is validated on a collected ex-vivo human lung dataset and an open bronchoscopic dataset, showing good results. The video submission explains the paper well.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper is well-written and presents a strong idea by integrating depth airway-geometric priors into the model.
    • The paper proposes an interesting approach that employs a depth-aware CycleGAN to refine the synthetic-to-real translation, offering a novel solution for bronchoscopic depth estimation.
    • The introduction of the Airway Structure Awareness Loss is a valuable contribution to this task, as it effectively enforces depth consistency within the airway lumen while maintaining structural integrity.
    • The experiments are thorough and well-conducted.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Fig. 1 could be enlarged for better clarity.
    • The use of CycleGAN is not well explained.
    • The authors focus solely on CycleGAN and do not explore other GAN models (or diffusion models?) that could also be suitable for this task.
    • The experiments are limited to an ex-vivo dataset, which restricts the generalizability of the results to real-world scenarios.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a solid approach, particularly with the introduction of the depth-aware CycleGAN and the Airway Structure Awareness Loss, which contribute effectively to bronchoscopic depth estimation. The experiments are somewhat limited to an ex-vivo dataset, which may restrict the broader applicability of the results. While the method shows promise, further exploration with real-world datasets and additional model comparisons would strengthen the findings. Overall, the paper is well-written and presents an interesting idea, but a few improvements in experiment scope are necessary for a stronger impact.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper
    1. A new method for monocular depth estimation in bronchoscopy capable of achieving competitive performance.
    2. A mathematical model for generating random airways for training data.
    3. A new bronchoscopy dataset with lumen segmentation but without depth labels.
    4. Two new evaluation metrics for depth estimation in bronchoscopy.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Although it is not the first method to use adversarial training via GANs for depth estimation, the method integrates simple yet effective strategies to make it work right, which is often an overlooked problem in training GAN-based architectures. The mathematical model for generating random airways is a solid idea to generate a large amount of training data. Similarly, the airway-aware loss function is a sound idea to improve the performance of the model.
    2. The proposed method performs competitively against the state-of-the-art methods while achieving real-time performance on a consumer-grade GPU.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The Introduction section lacks a clear distinction between the previous methods that have used GANs and other generative / adversarial methods. This makes it difficult to connect the previous work with the proposed method, and thereby also the Introduction with the Methodology section. I suggest a revision of the related-works component of the Introduction section to clarify the differences between the similar previous methods and the proposed method.
    2. The main figure of the paper (Figure 1) is not clear. The input to the models, for example quoting from the paper, “Syn-to-Real translates simulated depth maps and frames from a synthetic airway model into a realistic style”, contradicts the graphical representation of the model which seems to take in only the depth map. I suggest a revision of the figure to make it more clear.
    3. The Methodology section has components that make the proposed architecture difficult to understand. For example, the subsection Depth-Aware CycleGAN has repetitive information in its first and second halves. Furthermore, although it carefully explains why synthetic depth is chosen as an input of the model, no explanation is given for the choice of the synthetic RGB image as the output of the real-to-syn model. How the method is structured is indeed important, but the reasoning behind the choices made in the architecture is equally important. I suggest a revision of the Methodology section to clarify the reasoning behind the choices made in the architecture. More space for such additions can be made by carefully removing the repetitive information.
    4. The combined usage of pseudo-labeled real-images and synthetic images in the training of the model is not clear. I recommend extending on this topic in the Methodology section.
    5. The proposed Depth Contrast Consistency metric forms a coarse evaluation, as it only considers the average depth in two locations of the scene and misses the tubular structure of the airway. Such scenes can often be observed when the camera is not directly parallel to the peripheries of the airway. This leads to questionable deductions from the metric. Even when combined with the Lowest Depth Localization Accuracy metric, the proposed additional evaluation is not sufficient to validate the performance of a depth estimation method for such anatomies.
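To make the concern in point 5 concrete, here is a rough sketch of a contrast-style check of the kind described: compare the mean predicted depth inside the lumen mask with the mean outside it, and report the percentage of frames where the lumen is deeper. The paper's exact definition of Depth Contrast Consistency may differ. The function names, the nested-list depth maps, the binary mask convention (1 = lumen), and the `deepest_pixel` helper are all assumptions introduced for illustration.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]
Mask = List[List[int]]  # assumed convention: 1 = lumen, 0 = airway wall


def region_means(depth: Grid, lumen_mask: Mask) -> Tuple[float, float]:
    """Return (mean depth inside the lumen, mean depth outside it)."""
    inside: List[float] = []
    outside: List[float] = []
    for d_row, m_row in zip(depth, lumen_mask):
        for d, m in zip(d_row, m_row):
            (inside if m else outside).append(d)
    if not inside or not outside:
        raise ValueError("mask must contain both lumen and non-lumen pixels")
    return sum(inside) / len(inside), sum(outside) / len(outside)


def frame_is_consistent(depth: Grid, lumen_mask: Mask) -> bool:
    """A frame passes when the lumen is, on average, deeper than the wall."""
    inside_mean, outside_mean = region_means(depth, lumen_mask)
    return inside_mean > outside_mean


def contrast_consistency(depths: Sequence[Grid], masks: Sequence[Mask]) -> float:
    """Percentage of frames that pass the mean-contrast check."""
    hits = sum(frame_is_consistent(d, m) for d, m in zip(depths, masks))
    return 100.0 * hits / len(depths)


def deepest_pixel(depth: Grid) -> Tuple[int, int]:
    """Row and column of the largest depth value (first one on ties)."""
    best = (0, 0)
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d > depth[best[0]][best[1]]:
                best = (r, c)
    return best


mask = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]
# The lumen mean (5.0) beats the wall mean (2.6), yet the deepest pixel
# sits in the top-left corner, outside the lumen.
misleading = [[9.0, 1.0, 1.0], [1.0, 5.0, 5.0], [1.0, 5.0, 5.0]]
inverted = [[5.0, 5.0, 5.0], [5.0, 1.0, 1.0], [5.0, 1.0, 1.0]]

score = contrast_consistency([misleading, inverted], [mask, mask])
```

The `misleading` frame passes the mean-based check even though its depth maximum lies on the wall, which is the kind of structural error a two-region average cannot detect.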
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. The direction arrows in Figure 1 are colliding with each other, making the figure hard to read. Furthermore, it is too small for a figure that has so much fine structure. I suggest increasing the size of the figure and making the arrows more distinct.
    2. Instead of using a grayscale representation of the depth maps, I suggest opting for a more intuitive color map such as “Turbo” [1].
    3. I suggest adding bold and underlined text to the highest scores of each metric in the tables for a clearer reading experience.

    References:

    1. Mikhailov, Anton. “Turbo, An Improved Rainbow Colormap for Visualization.” Google Research Blog, August 20, 2019. https://research.google/blog/turbo-an-improved-rainbow-colormap-for-visualization/.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I would like to begin with thanking the authors for their work. The paper is overall well-structured yet requires improvements in the figures and the methodology section. The proposed architecture and automated training data generation are sound and the results are promising. However, the proposed evaluation metrics contain concerns regarding the deductions made from them. Although the proposed dataset lacks depth annotations to create a benchmark, which is lacking in this field, it is still a good source of assets for related research. In the light of these remarks, I recommend a “Weak accept”.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper describes a method to generate anatomically consistent depth estimation specifically designed for bronchial anatomy. The method leverages synthetic data in a CycleGAN formulation and introduces an airway structure awareness loss that enforces larger depths within airway lumen than outside airway lumen. Authors also introduce evaluation metrics that are tailored to the assessment of anatomical consistency in depth predictions within airway lumen. Finally, authors introduce a physiologically realistic bronchoscopy simulation and explain in detail the implementation of key geometric constraints that ensure realism.
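The constraint summarized above, larger predicted depth inside the airway lumen than outside it, could be expressed as a hinge penalty on the gap between the two region means. The sketch below is only one possible reading and is not the paper's Airway Structure Awareness Loss: the function name, the `margin` parameter and its default, and the nested-list inputs are assumptions, and any smoothness or within-lumen consistency terms the authors use are omitted.

```python
from statistics import fmean
from typing import List


def lumen_contrast_hinge(
    depth: List[List[float]],
    lumen_mask: List[List[int]],
    margin: float = 0.5,  # hypothetical margin, not taken from the paper
) -> float:
    """Penalise frames whose lumen is not deeper than the wall by `margin`."""
    inside = [
        d
        for d_row, m_row in zip(depth, lumen_mask)
        for d, m in zip(d_row, m_row)
        if m
    ]
    outside = [
        d
        for d_row, m_row in zip(depth, lumen_mask)
        for d, m in zip(d_row, m_row)
        if not m
    ]
    if not inside or not outside:
        raise ValueError("mask must contain both lumen and non-lumen pixels")
    gap = fmean(inside) - fmean(outside)
    # Zero loss once the lumen exceeds the wall by at least the margin.
    return max(0.0, margin - gap)


mask = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]  # assumed convention: 1 = lumen
plausible = [[0.25, 0.25, 0.25], [0.25, 1.0, 1.0], [0.25, 1.0, 1.0]]
implausible = [[1.0, 1.0, 1.0], [1.0, 0.25, 0.25], [1.0, 0.25, 0.25]]

plausible_loss = lumen_contrast_hinge(plausible, mask)
implausible_loss = lumen_contrast_hinge(implausible, mask)
```

A hinge of this form stops contributing gradient once the ordering is satisfied, so it nudges the prediction toward anatomically plausible ordering without dictating absolute depth values.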

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The detailed explanation of how the physiologically realistic bronchoscopy simulation is developed and how realism is ensured via the various geometric constraints is impressive. The authors’ explanation of various implementation details and justification of decisions is clear. Results look interesting.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Not necessarily a major weakness, but it would be interesting to also see some evaluations, for instance, of standard depth metrics on simulated data where accurate ground truth is available.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    In Section 2, Methodology, is ‘enhances’ the appropriate word to describe Real-to-Syn? Real-to-Syn simplifies real frames by reducing texture and enabling the model to focus on geometric features. Please describe model components appropriately for what they represent.

    In Section 3, a batch size of 2 seems quite small. Is there a reason for this? Did the authors conduct experiments with larger batch sizes and not see any significant differences in the trained model?

    In Section 3, bottom of page 7, authors say they achieve the highest Depth Contrast Consistency at 97.27% although 3cGAN shows Depth Contrast Consistency at 99.27%. This is acceptable but authors should represent the results accurately and/or explain why this is acceptable.

    In Section 3, under “Classical Depth Estimation Performance”, authors mention that their results show limited improvement due to low quality ground truth from phantom data. In this case, it would be interesting to see performance on these metrics in a synthetic dataset where ground truth is of higher quality.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Interesting methodological contributions that take specifics of bronchial anatomy into consideration. Authors should, however, address additional comments.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We thank all reviewers for their thoughtful and constructive feedback. We are encouraged by the positive recognition of our contributions, including airway-specific priors, the depth-aware CycleGAN, and novel anatomical evaluation metrics. Below, we address the main concerns raised:

  1. CycleGAN Justification and Diffusion Alternatives (Reviewer #2 & #3) In response to R2 and R3’s suggestion to justify our choice of CycleGAN and consider alternative models such as diffusion (R2: “focus solely on CycleGAN… diffusion models?”; R3: “not the first method to use adversarial training”), we selected CycleGAN for its efficiency, modularity, and strong performance in unpaired synthetic-to-real translation. Our version incorporates airway-specific depth constraints for anatomical preservation (Section 3.2). Diffusion models are promising but their sampling process, which typically requires hundreds of iterative steps, is computationally intensive and slows down inference. It also makes it difficult to directly integrate task-specific losses such as our proposed Airway Structure Awareness Loss. We will clarify this at the start of Section 3.
  2. Limited Discussion of Prior GAN Work (Reviewer #3) R3 noted that the Introduction does not clearly differentiate prior adversarial methods from our own. While we cite related GAN-based work in Section 1 (Refs 2, 9, 13, 26), we acknowledge that more explicit positioning would improve clarity. Due to MICCAI’s strict page limits, we focused on methodological contributions, but we will improve this discussion in the final version.
  3. Figure 1 and Methodology Clarity (Reviewer #2 & #3) All reviewers noted issues with Figure 1 (R2: “could be enlarged”, R3: “input to model unclear”, R3: “colliding arrows”). We will enlarge and clarify the figure, explicitly separate the two branches (Syn→Real and Real→Syn), and revise it for readability. Section 3.2 already describes the structure, but we will further clarify input-output mappings and architectural decisions in the revised figure and caption.
  4. Use of Pseudo-Labeled and Synthetic Data (Reviewer #3) R3 asked for clarification on how pseudo-labelled real images and synthetic images are used. Section 4 explains that we use real bronchoscopy images with pseudo-depth generated by Depth Anything [24] and pair them with synthetic depth-RGB data. This dual-source setup enables the model to generalize well to real data while leveraging structural priors from synthetic sources. We will make this interplay more explicit in the revised text.
  5. Evaluation Metrics and Generalization (Reviewer #3) R3 questioned whether our proposed metrics (Depth Contrast Consistency, DCC) are sufficient. We agree that DCC is coarse on its own, which is why we also introduce the Lowest Depth Localization Accuracy metric in Section 4.2. Table 1 and ablation results further validate their combined value. We will emphasize this rationale in revision.
  6. Additional Clarifications (Reviewer #4) Batch size: Section 4 notes that the batch size was constrained by GPU memory; larger sizes offered no gains. Word choice: We agree “simplifies” is more accurate than “enhances” to describe Real→Syn (Section 3.2) and will revise accordingly. DCC accuracy statement: We will correct the text for Table 1 and clarify the interpretation. Synthetic evaluation: We agree that this could strengthen the analysis and will check whether the MICCAI guidelines permit the inclusion of such results in the final version. If not, we will include them in a future journal extension.

Once again, we thank all reviewers for their detailed and constructive input, which will help strengthen the final version of our submission.




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


