Abstract

Recognizing the glioma boundary is challenging because glioma is a diffusely growing malignant tumor. Although fluorescence molecular imaging, especially in the second near-infrared window (NIR-II, 1000-1700 nm), helps improve surgical outcomes, fast and precise recognition remains in demand. Data-driven deep learning technology shows great promise in providing objective, fast, and precise recognition for glioma boundaries, but the lack of data poses challenges for designing effective models. Automatic data augmentation can improve the representation of small-scale datasets without requiring extensive prior information, which is suitable for fluorescence-based glioma boundary recognition. We propose Explore and Exploit Augment (EEA) based on multi-armed bandit for image deformations, enabling dynamic policy adjustment during training. Additionally, images captured in white light and the first near-infrared window (NIR-I, 700-900 nm) are introduced to further enhance performance. Experiments demonstrate that EEA improves the generalization of four types of models for glioma boundary recognition, suggesting significant potential for aiding in medical image classification. Code is available at https://github.com/ainieli/EEA.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0141_paper.pdf

SharedIt Link: https://rdcu.be/dV1Mw

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72069-7_13

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0141_supp.pdf

Link to the Code Repository

https://github.com/ainieli/EEA

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Xia_Data_MICCAI2024,
        author = {Xiao, Anqi and Han, Keyi and Shi, Xiaojing and Tian, Jie and Hu, Zhenhua},
        title = {{Data Augmentation with Multi-armed Bandit on Image Deformations Improves Fluorescence Glioma Boundary Recognition}},
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        pages = {130--140}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The manuscript titled, “Data Augmentation with Multi-armed Bandit on Image Deformations Improves Fluorescence Glioma Boundary Recognition” demonstrates a data augmentation methodology, Explore and Exploit Augment (EEA) for improving the performance of DL models in fluorescence glioma boundary detection based on NIR-II fluorescence imaging.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of the paper lie in the novelty and the biomedical scope and applicability of the method in being incorporated into the training pipeline of DL models for better generalization, to ultimately help the surgeons in precise glioma resection.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The weakness of the paper lies in not including appropriate statistical tests with the model performance results. It would have been nice to see the performance of the model on the NIR-I, NIR-II, and WL channels individually, to demonstrate how EEA improves glioma boundary recognition based on NIR-II fluorescence imaging specifically.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Comments to authors: The manuscript titled, “Data Augmentation with Multi-armed Bandit on Image Deformations Improves Fluorescence Glioma Boundary Recognition” demonstrates a data augmentation methodology, Explore and Exploit Augment (EEA) for improving the performance of DL models in fluorescence glioma boundary detection based on NIR-II fluorescence imaging. The main strengths of the paper lie in the novelty and the biomedical scope and applicability of the method in being incorporated into the training pipeline of DL models for better generalization, to ultimately help the surgeons in precise glioma resection. The weakness of the paper lies in not including appropriate statistical tests with the model performance results. It would have been nice to see the performance of the model on the NIR-I, NIR-II, and WL channels individually, to demonstrate how EEA improves glioma boundary recognition based on NIR-II fluorescence imaging specifically. The manuscript could benefit from incorporating the abovementioned aspects.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Major factors to justify my recommendation include: Requires improvement in data representation.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper presents a data augmentation strategy for training DL models on NIR image data of glioma. The augmentation strategy is based on different image transformations, which are applied to the image in different strengths to augment the data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The augmentation strategy makes it possible to increase the amount of data available with the aim of making training with limited data sets more robust.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The augmentation methodology was divided into two parts, operators and magnitude (D and N). It can be assumed that both parts have a significant influence on the result in their own way. It is therefore a pity that an analysis of the performance of individual operators is missing and that an analysis of magnitude is only outlined. I would have expected more in-depth insights here, especially for the magnitude, as this is the easier part to do.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    In addition to the above-mentioned weaknesses of the publication, I have the following points that should be addressed in a rebuttal. I would find it more useful to name the augmentation operators used, or to include a reference to Table S2 in the manuscript, instead of naming the operators that are not used. Is there a typo in Table S2 for Solarize with 256, and shouldn’t it be 255? I find the description of the data set structure with 3-channel and 9-channel very complicated; please reformulate this section. The abbreviation AA is not introduced. It is described in Section 3.1 that automatic DA can have a negative effect on recognition. Has this been specifically investigated for EEA and how is it reflected here? This is of course relevant for a deeper interpretation of the performance results. Why were the four Ns used in Section 3.2 and not more steps around 11?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    It is a well-executed analysis of a relevant issue that has few weaknesses and open points.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a novel strategy for selecting the optimal image augmentation, using a reinforcement learning-like method to optimize the magnitude of specific operations in a chain of randomly selected operations. The proposed approach is tested against several relevant data augmentation strategies on a closed glioma dataset, using four CNN model architectures.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed novel strategy is very well conceived. The problem the strategy mitigates, i.e. finding a near-optimal image augmentation setup to improve model performance in glioma image classification when only a smaller-sized dataset is available, is worth exploring. The paper is an interesting read.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Chosen hyperparameter values for benchmark models can highly influence the results. Although the reported results suggest that the proposed method outperforms other benchmark methods, I am not convinced that this experiment is entirely unbiased.

    I am unsure if this is the proper place to describe this, but the paper is difficult to follow in some respects because of the verbal way of presenting the ideas, and some grammatical and syntactical errors in the text.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Section 2.3. The method assigns reward zero to operations that are not sampled. How does this reflect after processing B? Are there any zero r-s in the pQ table, following processing of one batch? If there are, then this could be a flaw in the method, considering the rewards for sampled operations are always positive. I believe this should be discussed.

    Section 3. Implementation Details. The authors should explain the choice of hyperparameter values for the models used. Moreover, they should explain why the experimental setup for the benchmark doesn’t favour their methods, i.e. that the experiment is unbiased.

    The paper is difficult to follow in some respects because of the verbal way of presenting the ideas, and some grammatical and syntactical errors in the text. I suggest the text is corrected by someone who is familiar with the field but is not acquainted with the specific method – for fresh perspective. Also, I believe the notation used for describing the method (variable names) could have been made simpler (e.g. I believe it would suffice to use “m” instead of “m^d”, without losing any information; for a specific position in the table (d,m), Q^{d,m}, …) – please consider simplifying it.

    The authors purposely chose to perform random selection of operators, for simplicity. It would be interesting to explore including this in the RL problem as well, and then perform an ablation study - but this can be something for future work.

    Minor remarks:

    • “Therefore, operators such as color and contrast that require the grayscale image are removed from O.” – is this statement correct?
    • “Note that identity is within the candidate operator set, thus the deformations with less than D operators applied are also included in the transformations.” - please clarify/rewrite.
    • “For each image $I'_i$, we sample D transformations …” – I believe this should be small letter d.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Medical datasets – especially annotated – are often limited in size. Therefore, any successful attempt at developing strategies aiding in supervised model learning is worthy of reporting. I believe that this work will be interesting to MICCAI audience.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank all reviewers for their constructive comments. We hope our responses address your concerns properly.

R1

  • Suggestions on statistical tests. We performed 5-fold evaluation for all experiments. However, due to the page limit, the statistical variance is not shown in the manuscript.

  • Separate experiments for the impact on WL, NIR-I, and NIR-II. Since the previous work DLS-DARTS has demonstrated the advantage of multi-modal over single-modal data in fluorescence imaging-based glioma tissue analysis, we directly adopt the multi-modal setting for all experiments. Note also that the purpose of this work is to show the effectiveness of EEA in fluorescence glioma boundary recognition. Therefore, although the impact of EEA on individual data modalities is worth studying, it does not substantially affect the conclusion of our paper. Due to the page limit, this experiment is not included in our manuscript. In future work, we will further evaluate the detailed impact of EEA on different intraoperative imaging modalities.

R2

  • Questions on the valid range of Solarize. The range used in EEA directly follows the previous work RandAugment, which uses a maximum value of 256 rather than 255.

  • Confusing descriptions of the 3- and 9-channel data. We emphasize that WL images contain RGB channels, while NIR-I/II images contain only one channel. Therefore, the 3-channel data is the concatenation of WL after RGB-to-grayscale conversion, NIR-I, and NIR-II. The 9-channel data is the concatenation of the original WL with NIR-I and NIR-II after grayscale-to-RGB conversion.

  • Questions on the negative impact of EEA. This has not been specifically investigated for EEA, but we agree it is worth evaluating. An intuitive way is to show the t-SNE of the feature distributions of the augmented samples, which may reveal samples lying on the border between the non-tumor and tumor clusters, or even forming unexpected small clusters: over-transformed samples that have a negative impact on model prediction. This will be considered in future work.

  • Settings for hyperparameter N. We select 11 and 31 because these are the magnitude levels used in the previous works AA (AutoAugment) and RA (RandAugment), respectively. A value of 2 represents simply deforming or not, and 3 adds one less-deformed choice to 2, representing cases with different deformation levels. We agree that further investigation around 11 is worth trying. However, the original purpose of this experiment is not to find the best N for glioma boundary recognition, but to show that the selection of N is important for EEA; 11 is simply a satisfactory choice.
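
The 3-channel and 9-channel layouts described in the response above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the images are assumed to be co-registered NumPy arrays of equal height and width, and the BT.601 luma weights are an assumed choice because the rebuttal does not say which RGB-to-grayscale conversion is used.

```python
import numpy as np

# ITU-R BT.601 luma weights. Assumption: the rebuttal does not specify
# which RGB-to-grayscale conversion the authors use.
_LUMA = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def rgb_to_gray(rgb):
    """(H, W, 3) -> (H, W): weighted sum of the colour channels."""
    return rgb.astype(np.float32) @ _LUMA


def gray_to_rgb(gray):
    """(H, W) -> (H, W, 3): repeat the single channel three times."""
    return np.repeat(gray.astype(np.float32)[..., None], 3, axis=-1)


def stack_3_channel(wl, nir1, nir2):
    """Grayscale WL, NIR-I, NIR-II -> (H, W, 3). Hypothetical helper."""
    return np.stack(
        [rgb_to_gray(wl), nir1.astype(np.float32), nir2.astype(np.float32)],
        axis=-1,
    )


def stack_9_channel(wl, nir1, nir2):
    """RGB WL, RGB-expanded NIR-I, RGB-expanded NIR-II -> (H, W, 9)."""
    return np.concatenate(
        [wl.astype(np.float32), gray_to_rgb(nir1), gray_to_rgb(nir2)],
        axis=-1,
    )
```

In this sketch both layouts carry the same NIR information; the 9-channel variant keeps the colour information of the WL image, while the 3-channel variant collapses it to a single intensity channel.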

R3

  • Unclear details about the pQ-table and reward. First, we clarify that the positions in the pQ-table (with shape DxN) cannot reflect the sampled operators. The pQ-table is designed for magnitudes (or deformations) at different depths. The sentence “For positions that $O^{d,m^d}_i$ are not sampled, the reward is assigned 0.” refers to the magnitudes that are not sampled, rather than the operators. Second, we adopt an epsilon-greedy strategy that, with probability epsilon, selects magnitudes at random rather than from the pQ-table. Therefore, although in theory zero rewards can remain in the pQ-table, in practice this is hard to observe, especially since our epsilon decay strategy starts updating the pQ-table from fully random selection.

  • Hyperparameter selection. The hyperparameters of the different models are taken from the original papers and then tuned through grid search. The experimental setups differ from the benchmarks because there are feature gaps between natural images and fluorescence images. One remaining issue is that no separate validation set was established to tune the hyperparameter values, due to the limited number of patients. However, the goal of this paper is to demonstrate the improvement EEA brings to glioma boundary recognition. We believe the hyperparameters of the classification models will not significantly affect the conclusion.
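
The epsilon-greedy selection over a D x N table of magnitudes discussed in the first point above can be illustrated with a minimal sketch. It is not the EEA implementation: the class name, the moving-average update, the greedy argmax in the exploit step, the batch-mean reward, and all default values are assumptions made for illustration. Only the table shape, the zero reward for unsampled positions, and the decay from fully random selection come from the response.

```python
import random


class MagnitudeBandit:
    """Hypothetical epsilon-greedy bandit over a depth x levels table."""

    def __init__(self, depth, levels, eps_start=1.0, eps_end=0.1,
                 eps_decay=0.99, lr=0.1, seed=None):
        self.depth, self.levels = depth, levels
        self.eps, self.eps_end, self.eps_decay = eps_start, eps_end, eps_decay
        self.lr = lr
        # One value per (depth, magnitude level) position.
        self.q = [[0.0] * levels for _ in range(depth)]
        self.rng = random.Random(seed)

    def sample(self):
        """Pick one magnitude index for every depth."""
        choice = []
        for d in range(self.depth):
            if self.rng.random() < self.eps:
                # Explore: uniformly random magnitude level.
                choice.append(self.rng.randrange(self.levels))
            else:
                # Exploit: level with the highest table value (assumed rule).
                row = self.q[d]
                choice.append(max(range(self.levels), key=row.__getitem__))
        return choice

    def update(self, choices, rewards):
        """Update the table from one batch of per-image choices and rewards."""
        total = [[0.0] * self.levels for _ in range(self.depth)]
        for choice, reward in zip(choices, rewards):
            for d, m in enumerate(choice):
                total[d][m] += reward
        n = max(len(rewards), 1)
        for d in range(self.depth):
            for m in range(self.levels):
                # Positions that were not sampled keep a batch reward of 0.
                batch_reward = total[d][m] / n
                self.q[d][m] = (1 - self.lr) * self.q[d][m] + self.lr * batch_reward
        # Decay epsilon so selection moves from random towards greedy.
        self.eps = max(self.eps_end, self.eps * self.eps_decay)
```

With `eps_start=1.0` every early draw is random, so each position has a chance to receive a positive reward before greedy selection begins to dominate. Under the assumed moving-average update, a position that is never sampled stays at zero, which is the case raised in the review.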




Meta-Review

Meta-review not available, early accepted paper.


