Abstract

Hyperspectral imaging (HSI) is emerging as a promising novel imaging modality with various potential surgical applications. Currently available cameras, however, suffer from poor integration into the clinical workflow because they require the lights to be switched off, or the camera to be manually recalibrated as soon as lighting conditions change. Given this critical bottleneck, the contribution of this paper is threefold: (1) We demonstrate that dynamically changing lighting conditions in the operating room dramatically affect the performance of HSI applications, namely physiological parameter estimation and surgical scene segmentation. (2) We propose a novel learning-based approach to automatically recalibrating hyperspectral images during surgery and show that it is sufficiently accurate to replace the tedious process of white reference-based recalibration. (3) Based on a total of 742 HSI cubes from a phantom, porcine models, and rats, we show that our recalibration method not only outperforms previously proposed methods, but also generalizes across species, lighting conditions, and image processing tasks. Due to its simple workflow integration as well as high accuracy, speed, and generalization capabilities, our method could evolve as a central component in clinical surgical HSI.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3323_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3323_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Bau_Deep_MICCAI2024,
        author = { Baumann, Alexander and Ayala, Leonardo and Studier-Fischer, Alexander and Sellner, Jan and Özdemir, Berkin and Kowalewski, Karl-Friedrich and Ilic, Slobodan and Seidlitz, Silvia and Maier-Hein, Lena},
        title = { { Deep intra-operative illumination calibration of hyperspectral cameras } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15006},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes an autoencoder-based method for predicting the illumination distribution of an operating room (OR) imaging scene. The goal is automated recalibration of a hyperspectral imaging camera as illumination conditions in the OR change, without the need for the traditional white reference panel.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The downstream tasks of semantic segmentation and physiological parameter estimation were well implemented across species (porcine and rat).

    The paper describes the use of an autoencoder-based method to predict the white tile reference spectral illumination distribution of a scene rather than directly predicting the recalibrated form of an imaged tissue. This disentanglement of the space of all possible illumination distributions in a scene from the space of all possible tissue spectra configurations is a good approach.

    The authors generated a dataset of real and simulated scene illumination distributions captured on white reference images to encompass a wide range of illumination conditions encountered in an OR. The authors also generated a dataset of well-calibrated HSI cubes and generated uncalibrated versions of the HSI cubes by multiplying them with the white reference images of varying illumination distributions. This is a good approach, as the relationship between the calibrated cubes, uncalibrated cubes, and the white reference images follows the calibration equation defined in [4, 15] based on the assumption that the dark reference of the HSI camera is negligible.

    A wide array of stray confounding lighting conditions was simulated in the paper using parameters such as illumination angle, illumination distance, and illumination intensity. The use of LED-based surgical lights as the stray illumination source, while the halogen lights of the TIVITA HSI camera are the main scene illumination, is a good simulation of an operating room illumination condition.
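
    The relationship described above can be sketched in a few lines. This is only an illustration of the standard white/dark reference calibration equation, not code from the paper: the function names, array sizes, and random data are hypothetical, and the dark reference defaults to zero to mirror the stated assumption that it is negligible.

```python
import numpy as np


def simulate_uncalibrated(calibrated, white, dark=None):
    """Forward model: raw = calibrated * (white - dark) + dark."""
    if dark is None:
        # Assumption from the text: the dark reference is negligible.
        dark = np.zeros_like(white)
    return calibrated * (white - dark) + dark


def recalibrate(raw, white, dark=None, eps=1e-8):
    """Inverse model: calibrated = (raw - dark) / (white - dark)."""
    if dark is None:
        dark = np.zeros_like(white)
    # eps guards against division by zero in very dark pixels.
    return (raw - dark) / np.maximum(white - dark, eps)


# Hypothetical toy cube: height x width x spectral bands.
rng = np.random.default_rng(0)
height, width, bands = 4, 5, 16
reflectance = rng.uniform(0.05, 0.95, size=(height, width, bands))
white = rng.uniform(0.5, 2.0, size=(height, width, bands))

# Multiply a calibrated cube by a white reference to get an uncalibrated one,
# then divide by the same white reference to recover the calibrated cube.
raw = simulate_uncalibrated(reflectance, white)
recovered = recalibrate(raw, white)
print(np.allclose(recovered, reflectance))
```

    In this toy setting the white reference is known exactly, so the recovery is exact up to floating-point error; in the setting the paper targets, the white reference would have to be estimated from the uncalibrated cube.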

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The claim “in open surgery, spectral data is affected by changes in illumination and must be correctly calibrated whenever lighting conditions vary [8]. The gold standard to achieve this is to switch off all light sources before acquisition” on page 1 does not follow from the references [15, 24]. Neither reference seems to address recalibration due to variation in lighting. The TIVITA hyperspectral imaging device was used in both cited studies, fitted with integrated halogen lamps to illuminate the imaging scene. In both cited studies, there was no other light source in the operating room apart from the halogens. So illumination was controlled, there were no changes in illumination during imaging, and there was no need for recalibration. It seems intuitive that changes in illumination may affect spectral data, but the problem definition in this paper is not well structured, as the cited studies avoid the problem altogether by controlling illumination.

    The claim “Spectral imaging has so far seen the development of one deep learning approach for multi-illuminant calibration, factorizing reflectance and illumination through an unrolling network [17] (Li et al., 2021). However, this research was conducted on outdoor scenes and does not generalize well to the illumination setup in an OR” on page 2 was not substantiated. There is a need to show evidence that the approach proposed in [17] will not generalize to illumination setup in an OR. The approach proposed in [17] was not directly for calibration, instead, it was for estimating the illumination distribution in an imaging scene without the need for a white reference. The estimated illumination distribution can then be used downstream for calibration.

    The claim “We were the first to provide in vivo evidence that dynamically changing lighting conditions in the OR can cause dramatic failures in HSI analysis” in the discussion section on page 4 was not substantiated. The study by [2] (Ayala et al., 2020) seems to already demonstrate this claim.

    The claim “…our method represents the only calibration model that is capable of maintaining high accuracy independently of the downstream task and domain, indicating high applicability for clinical use cases” on page 8 is a major claim that was not sufficiently substantiated in this paper. For example, the authors did not benchmark against the deep unrolling network approach in [17]. It is not apparent that the proposed method will outperform the method in [17].

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The claim “…our work presents the first learning-based light calibration method for hyperspectral imaging” on page 8 is a major claim that was not sufficiently substantiated in this paper. The method proposed in [17], a learning-based method that uses a transformer network and may outperform the 3D ResNet used in this paper, challenges this claim. The authors of [17] tested their method on hyperspectral images captured with the Specim IQ, covering both outdoor and indoor imaging scenes, albeit with no biological tissue in the scenes. This paper should have used [7] as the baseline method to benchmark against.

    The first contribution stated on page 2 paragraph 2: “We are the first to experimentally demonstrate that previously proposed calibration methods fail in in vivo surgical settings” does not take [17] into account. Also, this contribution is significantly different from the first stated contribution in the abstract “We demonstrate that dynamically changing lighting conditions in the operating room dramatically affect the performance of HSI applications, namely physiological parameter estimation, and surgical scene segmentation.” The authors need to harmonize and reconcile these two statements.

    The quality of this paper could have been significantly improved to the point of full acceptance if the authors had focused more on the specific ways their proposed method was better than previous methods (particularly the method in [17]) rather than focusing on the claims that they were the first to address the problem statements when there is evidence to the contrary.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors made some significant claims about their method that were not sufficiently substantiated. The decision not to benchmark against [17], a major method that addresses the same problem statement as this paper, negatively impacted this paper. The quality of this paper could have been significantly improved to the point of full acceptance if the authors had focused more on the specific ways their proposed method was better than previous methods (particularly the method in [17]) rather than focusing on the claims that they were the first to address the problem statements when there is evidence to the contrary.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    My major criticism on the lack of comparison to [17] was satisfactorily addressed in the author feedback. The inaccessibility of the [17] method’s illumination data is a valid constraint. However, I believe it would still have been highly beneficial to include the application of the method of [17] in this paper’s OR-targeted illumination estimation and HSI cube calibration task. Including the non-competitive results, based on the application of [17], in the paper or supplementary material would have served as a great basis and motivation for this paper’s OR-targeted approach, as there would be clear evidence that [17] does not perform well in an OR setting.

    The delineation of the novel insights from novel methods in the author feedback sufficiently addresses my comments on the novelty claims in the paper. Integrating this into the discussion section of the paper is important.

    Overall, the author feedback satisfactorily addresses my concerns about the paper.



Review #2

  • Please describe the contribution of the paper

    This paper demonstrated the impact of changing lighting conditions in the operating room on HSI analysis in vivo, and proposed a learning-based approach to automatically recalibrate HSI images during surgery. The proposed method exhibits superior performance compared to conventional methods and demonstrates generalization across species, lighting conditions, and image processing tasks. Additionally, it showed the effectiveness of the proposed method in improving the performance of downstream tasks.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The strength of this paper lies in its demonstration of the impact of changing lighting conditions in the operating room on HSI analysis in vivo, followed by the proposal of a learning-based approach capable of automatic calibration, which exhibited high performance.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The weaknesses of this paper seem to be that the lighting conditions are inherently dependent on the dataset. Since the proposed method is applicable to other operating room environments, it would be desirable to validate it in other settings. Additionally, it may be worth investigating environments outside the operating room as well.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In the discussion section, Figure 6 is referred to, but it seems that there is no Figure 6 in the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-organized, and the proposed learning-based approach for automatic recalibration of HSI images during surgery is carefully demonstrated to be effective in improving the performance of downstream tasks. This is expected to expand the scope of clinical applications for HSI, increasing its potential as a more useful diagnostic and therapeutic support tool.

  • Reviewer confidence

    Not confident (1)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work introduces a learning-based light calibration method for hyperspectral imaging in the operating room.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    (1) The topic of light calibration methods for hyperspectral imaging is of clinical significance. (2) The paper is generally well-written.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    (1) It is not very clear why the proposed learning-based method is better than existing models, such as AngularGAN. The proposed model appears to use a 3D-ResNet as its core learning architecture. Why this 3D-ResNet works better than AngularGAN needs more explanation. (2) In Fig. 2, how the white references are obtained through interpolation is not clear to me.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    More detailed analysis or experiments to compare the proposed method and existing models would be helpful to understand the advantages of the proposed approach.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper is of high quality in terms of topic, method, experiments and writing. There are some points that need more illustration for better understanding of the advantages of the proposed method. Therefore, I rate this paper as “Accept”.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper addresses the important aspect of white light correction of hyperspectral images. For HSI, it is standard practice in the literature to acquire a white reference image at the beginning of the image acquisition to correct the illumination. This is extremely susceptible to intraoperative illumination changes, which often occur. The paper presents a deep network method to perform the white light correction adaptively.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength of the paper is the simulation of white reference images to achieve a wide range of illumination settings for LED and halogen. Further, it evaluates the performance not only on classical colorchecker data but also shows the improvement on segmentation tasks.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The methods section is missing some information about implementation details, but these are really minor points.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The paper is well written and structured. The topic is highly relevant and well presented. This means that there are no major points that need to be addressed by the authors. However, there are a few minor points that can be addressed. The method is described as an autoencoder architecture that uses a ResNet, but a more detailed description of how the ResNet is designed is completely missing (only the number of parameters is given in the Supp). In the interests of reproducibility, this should be described briefly, also to make it easier for the reader to put it into context with the compared methods. Further:

    • In the paragraph ‘test datasets:’, there is a reference ‘(left)’, which I don’t understand.
    • In the Discussion, page 8, there is a reference to Fig. 6, which does not exist. Did you mean Fig. 5?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Strong Accept — must be accepted due to excellence (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The recommendation is based on the highly relevant topic, which is addressed and presented here in a clear and structured manner.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    I can see the points criticized by all of the reviewers. The rebuttal is OK if these points find their way into the discussion. In addition, I also don’t like statements like ‘we are the first’, as it is really easy to miss some papers; toning down the statement can convey the same message.




Author Feedback

We thank reviewers R3, R4, and R6 for their (very strong) support of our work and will focus on R5’s comments.

Key criticism - Lack of comparison to [17] (proposed for non-medical scenes): We had actually tested the method prior to submission, but the method’s illumination data was not accessible such that we were not able to fully reproduce it. We had also applied their method with our OR-targeted illuminations (note that this is already a core contribution of this work!) but did not obtain competitive results, which is why we discarded the work as out of scope (not designed for surgical requirements). We will add this important information to the discussion.

Let us further clarify our contributions (R5): Novel insights: 1) We are the first to experimentally demonstrate that previously proposed calibration methods fail in in vivo surgical settings. Note that this holds true despite prior work [2] because the latter worked exclusively in an ex vivo surgical setting - a completely different setting especially in the context of HSI because tissue perfusion has a radical effect on the spectra. 2) A secondary novel insight is the radical effect of the miscalibration on the downstream task of segmentation - also not investigated by any prior work. 3) One of the most surprising insights was that our method generalizes to completely different OR settings. The acquisition of data from different species in varying perfusion states was a crucial contribution to reveal this important finding.

Novel methods: 4) We propose the first learning-based approach to automatically recalibrate hyperspectral images during surgery, thereby solving an important clinical workflow issue. A key domain-targeted design choice was the physics-based simulation of illumination conditions to be expected in the operating room (see below).

WHY does our method perform so well (also compared to AngularGAN R6)?

  • We implemented a data-centric approach that leverages our prior knowledge on lighting conditions in the OR. Specifically, we assume a primary light source attached to the hyperspectral camera and straylight originating from ceiling light and surgical overheads that illuminate the surgical scene under various distances and angles. During method development we observed a substantial performance drop when omitting the simulation-based augmentations of real illuminations.
  • A key design choice was to learn the calibrated white tile image rather than the calibrated image itself (as proposed by AngularGAN). Outputting of a calibrated image might lead to an implicit encoding of scene properties. In fact, when we trained our network to reconstruct the calibrated image rather than the white tile image, downstream task performance dropped drastically.
  • We perform a spatially resolved calibration. Ablation studies performed in the development phase had confirmed our hypothesis that calibration with a single vector reduces performance substantially (see discussion).
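
The difference between spatially resolved calibration and calibration with a single vector, mentioned in the last point above, can be illustrated with a toy example. This is a sketch under stated assumptions, not the authors' network or ablation: the array sizes, the linear spatial falloff of the illumination, and all variable names are hypothetical, and the white reference is taken as known rather than predicted.

```python
import numpy as np

rng = np.random.default_rng(0)
height, width, bands = 8, 8, 16  # hypothetical cube size

# Ground-truth reflectance cube (height x width x bands).
reflectance = rng.uniform(0.1, 0.9, size=(height, width, bands))

# Hypothetical illumination: one spectrum scaled by a spatial falloff,
# so that one side of the image is dimmer than the other.
spectrum = rng.uniform(0.5, 1.5, size=bands)
falloff = np.linspace(1.0, 0.5, width)[None, :, None]
white = spectrum[None, None, :] * falloff * np.ones((height, 1, 1))

# Uncalibrated cube, assuming a negligible dark reference.
raw = reflectance * white

# Spatially resolved calibration: divide by the per-pixel white reference.
per_pixel = raw / white

# Single-vector calibration: divide every pixel by one mean spectrum.
mean_spectrum = white.mean(axis=(0, 1))
single_vector = raw / mean_spectrum

err_pixel = np.abs(per_pixel - reflectance).mean()
err_single = np.abs(single_vector - reflectance).mean()
print(f"per-pixel error: {err_pixel:.2e}, single-vector error: {err_single:.2e}")
```

When the illumination varies across the field of view, a single vector cannot account for the spatial variation, so a residual error remains in this toy setting even though the white reference is known exactly.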

We agree with R5 that it would be valuable to integrate all of these insights into the discussion.

Generalization across illumination conditions (R3): Our method only requires the primary light source (affixed to the camera and thus not undergoing dynamic changes) to be similar to the one used for generating the training set. The rat recordings are out of distribution with respect to the illumination conditions in the training set, indicating generalization not only across species but also across lighting conditions.

Claim of the need for turning lights off (R5): We apologize for the confusion. The standard approach is indeed to turn off all external light sources before acquisition, as described in [24 p.4] (animal study) and [15, p. 7] (wound monitoring). However, implementing this protocol proves unfeasible in the human OR [8, p. 14]. Our work tackles this problem.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    All reviewers recommend acceptance, and I believe this work is interesting and strong enough for publication at MICCAI. I will go with the reviewers’ universal opinion, and recommend acceptance.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    All reviewers recommend acceptance, and I believe this work is interesting and strong enough for publication at MICCAI. I will go with the reviewers’ universal opinion, and recommend acceptance.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The rebuttal properly addressed concerns from reviewers. Accept

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The rebuttal properly addressed concerns from reviewers. Accept


