Abstract

Mycetoma, categorized as a Neglected Tropical Disease (NTD), poses significant health, social, and economic challenges due to its causative agents, which include both bacterial and fungal pathogens. Accurate identification of the mycetoma type and species is crucial for initiating appropriate medical interventions, as treatment strategies vary widely. Although several diagnostic tools have been developed over time, histopathology remains the most widely used method owing to its speed, cost-effectiveness, and simplicity. However, it relies on expert pathologists to perform the diagnostic procedure and accurately interpret the results, a constraint that is particularly acute in resource-limited settings. Additionally, pathologists face the challenge of stain variability during histopathological analyses of slides. In response to this need, this study pioneers an automated approach to mycetoma species identification using histopathological images from black skin patients in Senegal. To mitigate color variations, we integrate several stain normalization techniques, namely Macenko, Vahadane, and Reinhard, and combine them with the MONAI framework and the DenseNet121 architecture. Our system achieves average accuracies of 99.34%, 94.06%, and 94.45% on the Macenko, Reinhard, and Vahadane datasets, respectively. The system is trained on an original dataset of histopathological images stained with Hematoxylin and Eosin (H&E), meticulously collected, annotated, and labeled from various hospitals across Senegal. This study represents a significant advancement in the field of mycetoma diagnosis, offering a reliable and efficient solution that can facilitate timely and accurate species identification, particularly in endemic regions like Senegal.
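The Reinhard normalization mentioned above can be illustrated with a minimal sketch. Note that the published method matches statistics in the Lαβ color space; for a dependency-free illustration this sketch matches per-channel mean and standard deviation directly in RGB, and it is not the authors' implementation:

```python
import numpy as np

def reinhard_like_normalize(source, target):
    """Match per-channel mean/std of `source` to `target`.

    Reinhard et al. (2001) perform this matching in the l-alpha-beta
    color space; here we match statistics directly in RGB as a sketch.
    """
    out = np.empty_like(source, dtype=np.float64)
    for c in range(source.shape[-1]):
        s_mu, s_sd = source[..., c].mean(), source[..., c].std()
        t_mu, t_sd = target[..., c].mean(), target[..., c].std()
        # shift/scale source statistics onto the target's statistics
        out[..., c] = (source[..., c] - s_mu) / (s_sd + 1e-8) * t_sd + t_mu
    return np.clip(out, 0.0, 1.0)

# toy stand-ins for a source tile and a reference (target-stain) tile
rng = np.random.default_rng(0)
src = rng.uniform(0.2, 0.8, size=(32, 32, 3))
tgt = rng.uniform(0.4, 0.9, size=(32, 32, 3))
norm = reinhard_like_normalize(src, tgt)
```

After normalization, the per-channel means of `norm` coincide with those of the target tile, which is the property all three normalizers exploit to suppress inter-slide stain variation.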

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1516_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zin_Towards_MICCAI2024,
        author = { Zinsou, Kpêtchéhoué Merveille Santi and Diop, Cheikh Talibouya and Diop, Idy and Tsirikoglou, Apostolia and Siddig, Emmanuel Edwar and Sow, Doudou and Ndiaye, Maodo},
        title = { { Towards Rapid Mycetoma Species Diagnosis: A Deep Learning Approach for Stain-Invariant Classification on H&E Images from Senegal } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes the use of deep learning for the classification of mycetoma into 4 subclasses from H&E images. The main challenge is the variation in staining, so the authors propose to apply stain normalization during inference. They compare three well-known stain normalization methods, namely Macenko, Vahadane, and Reinhard. The authors collect a dataset with regions extracted from slides at different magnifications and use those for training and testing. Evaluating a DenseNet121 model over 4-fold cross-validation shows that Macenko gives the best performance, with 99% accuracy and F-score.
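The 4-fold evaluation protocol summarized above can be sketched generically. This is not the authors' pipeline: it uses scikit-learn with synthetic 4-class features as a hypothetical stand-in for the image data, and reports the per-fold mean and standard deviation of accuracy (the spread Reviewer #3 asks for later):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# synthetic 4-class stand-in for extracted image features (hypothetical data)
X, y = make_classification(n_samples=400, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# stratified 4-fold split so each fold preserves the class balance
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="accuracy")
mean_acc, std_acc = scores.mean(), scores.std()
```

Reporting `mean_acc` together with `std_acc` makes it possible to judge whether two normalization methods differ meaningfully or only within fold-to-fold noise.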

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1- Paper is well written and easy to follow.

    2- The paper is well motivated. The authors attempt to automate the subclassification of mycetoma, which is categorized as a Neglected Tropical Disease (NTD), in Senegalese patients to reduce the reliance on expert pathologists.

    3- The achieved performance reaches 99% accuracy and F-score.

    4- The authors collect a dataset of Mycetoma with different subtypes.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1- The method lacks novelty. It is simply a direct application of DenseNet and stain normalization on the Mycetoma subclassification task from H&E images.

    2- There are not enough experiments and comparisons for an application paper; for instance:

    2.a- Stain, color, and brightness augmentation are known approaches to improve training under staining variations, as is applying stain normalization during both training and inference. The paper does not explore any of these techniques.

    2.b- The paper uses only DenseNet121. It is not clear whether different backbone architectures would have similar behavior.

    3- When classifying WSIs, multiple instance learning approaches are usually used because we don’t know where to look in the WSI. Here the authors rely on selected regions from the WSIs for classification, and the same is done during inference. In practice, during inference the model needs to look at the WSI without prior knowledge of specific diagnostic locations. This makes the evaluation unreliable.

    4- The authors mention that the data is collected from diverse hospitals and that imaging is performed with “the ’Leica icc50 e’ microscope, connected to a desktop”. It seems that the variation in staining is not due to different sites and machines but is intentional for experimental purposes. The paper does not explain how the variation in staining was controlled or the experimental settings used.

    5- The community would benefit from making the dataset public to further develop methods that address the problem.

    6- The dataset collected consists of 1289 images collected from whole slide images (WSIs) with different magnification levels. There isn’t enough details on how the images are selected. For instance, what is the size of the original image sizes before augmentation, how are regions selected, is the same region selected across different magnification levels, etc.

    7- The dataset is relatively small in terms of the number of patients per class. When there are only 5 cases from one class divided among training, validation, and testing, I question the reliability of the results and conclusions.

    Minor:

    1- Some of the references are repeated, example: references 8 and 10 are the same.

    2- The references style should be the same across all references.

    3- Incomplete sentence: “However, its reliance on expert pathologists to perform the diagnostic procedure and accurately interpret the result, particularly in resource-limited settings.”.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    There are some essential details missing with regard to dataset collection, including how the images were collected from the slides and how the variation in staining was obtained. Please refer to the weaknesses section for more details.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please refer to strengths and weaknesses sections.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1- The evaluation of WSI classification based on pre-selected regions that exhibit the disease is not practical or reliable.

    2- The number of cases in each class is also too small for testing and for having a reliable evaluation that evaluates generalizability.

    3- Since this is mainly an application paper, it needs to have more details about the translation to the specific application and also more rigorous and extensive evaluations and experiments.

    Please refer to details in the weaknesses sections.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Strong Reject — must be rejected due to major flaws (1)

  • [Post rebuttal] Please justify your decision

    From the other reviewers’ comments it is clear that this is an important problem for the medical community. However, the rebuttal did not mitigate my concerns w.r.t. the experiments and the novelty. My major experimental concerns still stand: 1- The evaluation approach is unreliable: it uses pre-selected regions from the slides for WSI classification. 2- The dataset is too small: the number of cases in each class is too small for testing to yield a reliable evaluation. 3- There is not enough comparison with different model architectures, or with augmentation and training strategies that handle stain variation.



Review #2

  • Please describe the contribution of the paper

    The paper focuses on mycetoma diagnosis. It presents an automated approach to mycetoma species identification using histopathological images from black skin patients in Senegal. By integrating various stain normalization techniques and employing deep learning methods, the system achieves good accuracy in identifying mycetoma species. It is trained using a dataset of histopathological images that were collected, annotated, and labeled by the authors.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper presents a nice piece of work in the field of mycetoma diagnosis by an automated approach to species identification using histopathological images from black skin patients in Senegal. The integration of various stain normalization techniques, such as Macenko, Vahadane, and Reinhard, effectively mitigates color variations, enhancing the reliability of species identification. Notably, the study achieves high accuracy rates of 99.34%, 94.06%, and 94.45% on the Macenko, Reinhard, and Vahadane datasets, respectively. Moreover, the development of a histopathological image dataset specific to Senegal and the implementation of a unified classification approach using the MONAI framework with DenseNet121 contribute to the novelty and robustness of the methodology.

    However, there are some areas for improvement. The relatively small cohort of 27 patients limits the generalization of the findings. The authors are requested to comment on this point, and future research should involve the collection of larger datasets with a greater number of patients to improve model generalization and performance. Overall, the paper presents a promising approach to mycetoma diagnosis.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Access to the data should be made available to allow comparison with alternative methods.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    I recommend to provide the dataset collected and used in the study along with access to the code. This would greatly enhance the transparency and reproducibility of the research findings.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See above.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    see above

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors collected a dataset of mycetomas in Senegal and stained the samples with H&E. They then compared three stain normalization methods (plus a control), both on their own terms and in terms of their impact on the accuracy of a DL classification pipeline.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The introduction does an excellent job of setting the medical use case, and how the ML methods studied add value to this use case. It cannot be overstated how valuable (and rare) this is in ML papers.

    The treatment of the different normalizing methods is of high interest, beyond mycetoma diagnosis.

    The dataset, if made available, would be of high value.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There is a contradiction regarding the species in the dataset: are they 4 fungal species, or 2 fungal and 2 bacterial?

    The citation list leaves out some useful references. For example, WHO documents are typically wonderful basic references (the list includes one but leaves out others). There are also valuable references at the CDC mycetoma website. The French-language references are certainly welcome, but providing equivalent English references would help some of the readership, myself included, thank you :).

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    There is no mention of the dataset being made public. Thus reproducibility of the results is not possible. However, I believe their method could be reproduced adequately.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Thank you for a well-done, interesting and actionable paper.

    Some miscellaneous comments, of which two are important (labeled as such):

    “although distinguishing between eumycetomas and actinomycetoma and identifying specific species pose significant challenges”: This is crucial: correct identification determines treatment.

    “Moreover, histopathological analyzes…”: “analyses”

    “MONAI framework”: cite at first mention? Or both places? Or maybe just where you did so…

    Is Omar’s [18, 19] classification into eumycetoma and actinomycetoma sufficient to guide treatment? My very limited understanding is that this is the crucial distinction.

    “Consequently, there is a necessity to explore avenues for automating mycetoma species identification.” and “into four mycetoma species”: is this differentiation relevant to the treatment plan, or do they all receive the same anti-fungals? Also, is there reason to believe that the methods described would distinguish bacterial from fungal infections? E.g., might the described methods translate to improvements in Omar’s method?

    2.2: Perhaps mention here that a summary of the main methods used (the Macenko, Vahadane, and Reinhard methods mentioned earlier) is given later, because the reader immediately wants to know some details.

    Fig 7: can text within the figure be made bigger (it’s hard to read)?

    “67 slice” typo

    (Important) 3.1: describes two actinomycetoma species and two eumycetoma species in the dataset. This directly contradicts statements in the Introduction and Conclusion about identifying 4 eumycetomas. Please resolve or clarify this.

    “we utilized two metrics [SSIM and PSNR]”: how do these metrics relate to the clinical need (to differentiate pathogens with different treatment plans)?

    Fig 8: Can the text within the figure be made larger?

    Fig 9 (important): I expect there are std devs (confidence intervals) for these results, due to the multiple patients and images. These std devs are important to assess whether the methods have meaningfully different performance.

    “Evaluation was performed using the accuracy, precision, recall and f1-score metrics.”: How do these metrics relate to the clinical diagnostic needs? For example, distinguishing bacterial from fungal is vital. Is distinguishing fungalA from fungalB important (should it get equal weight)? But maybe at 98% accuracy the point is moot.
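On the SSIM/PSNR comment above: both are image-fidelity metrics that quantify how much a normalization step distorts the image, rather than clinical utility directly. As a point of reference, here is a minimal PSNR implementation in pure NumPy on toy data (SSIM is typically taken from `skimage.metrics.structural_similarity` rather than re-implemented); this is an illustrative sketch, not the paper's evaluation code:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(R^2 / MSE)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

# toy grayscale image plus mild Gaussian noise
rng = np.random.default_rng(1)
img = rng.uniform(size=(64, 64))
noisy = np.clip(img + rng.normal(scale=0.05, size=img.shape), 0, 1)
score = psnr(img, noisy)  # higher dB = closer to the reference image
```

A normalized tile that stays close to its reference yields a high PSNR; a sharp drop would flag that the normalizer is destroying image content, which is the sense in which these metrics complement the downstream classification scores.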

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Strong Accept — must be accepted due to excellence (6)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The framing of the medical use case, and how ML relates to it, is well-done and gives a fine example of what we all should be doing (since the clinical, not ML, needs are ultimately what matter).

    The analysis of the stain normalization methods is actionable in a wide array of use cases, so many readers will take away useful findings.

    It treats an unfamiliar (well, Neglected) disease that deserves our attention and care.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Strong Accept — must be accepted due to excellence (6)

  • [Post rebuttal] Please justify your decision

    I wish to urge and extend my previous arguments in favor of this paper:

    1. The paper clearly frames the clinical context which the ML serves, and shows exactly what problem the ML addresses. This is an excellent example of truly centering the clinical need.
    2. The topic of stain normalization is very important, not just here but in (for example) malaria and helminthiasis diagnostics among other use cases.
    3. The paper ran experiments of specific methods (which I was not familiar with) on a relevant, concrete dataset. I came away with actionable guidance as to methods I can apply to my own work.
    4. The medical target itself is a valuable and novel contribution to the awareness of the MICCAI community. We devote almost no attention to Neglected Tropical Diseases, even though these affect hundreds of millions of people and represent high value targets for ML.
    5. The dataset is perhaps smaller than ideal - this is often the case for clinical data. However, it is a valuable window on the real world of a particular clinical need. While we would all prefer the data to be made public, that is not a requirement for publication at MICCAI.
    6. I agree with Reviewer 5’s comments about image details (item 6). I hope the authors can edit the paper to include this information.
    7. This paper has a different flavor than the standard complicated-architectures-on-curated-dataset submission. This is a strength: It is practical research grounded in clinical realities. I feel that MICCAI needs more work of this sort, to increase the likelihood that some of our algorithms can make the leap from papers to helping sick people.




Author Feedback

We thank all reviewers for their insightful feedback and suggestions.
Reviewers #3 & #6, R3Q5: By incorporating data from a broader range of patients, we can improve the robustness of the developed models across different clinical scenarios. It will allow a more accurate representation of disease variability, leading to more reliable diagnostic tools.

R3Q6-R6Q8-R5Q6.5: The dataset is currently private. We will provide a public link for the code.

R6Q10: There are 2 types of mycetoma: eumycetoma (fungal) and actinomycetoma (bacterial). In this study, we considered 2 fungal species and 2 bacterial species (Table 1), totaling 4 mycetoma species. Identifying the species is crucial for appropriate treatment, as different species of fungi or bacteria may respond differently to medication. This knowledge enables healthcare providers to prescribe the most suitable treatment regimen for better outcomes. Table 2 presents a general summary of the metric evaluations for each dataset. While we meticulously evaluated each species, these detailed findings have not been disclosed publicly. The details will be made available with the code.

Reviewer #5, Q6.1: This study aims to compile data to create a novel dataset and develop an automated method for identifying mycetoma species using histopathological images of black skin patients from Senegal. Given the lack of a pre-existing dataset, we undertook a thorough process of data collection, annotation, and labeling in collaboration with dermatologists from Sudan and Senegal. Currently, no automated method exists for species identification. The existing approach proposed by the researcher Omar only differentiates between eumycetoma and actinomycetoma, requiring the conversion of color images to grayscale to mitigate staining color bias. Our contribution is to propose the first method that identifies mycetoma species using color images, as these are essential for distinguishing the different species. To address the challenge of color variability, we employed three stain normalization methods, resulting in the creation of three distinct datasets. We evaluated the effectiveness of these normalization methods using the MONAI framework + DenseNet121 to classify four mycetoma species.

Q6.2: Color augmentations are not currently used alongside stain normalization, in order to thoroughly evaluate the impact of the normalization techniques in contrast to using the original images. The experiment (MONAI + DenseNet) aims to showcase the effectiveness of stain normalization in enhancing species classification. In future studies, we plan to integrate color augmentations to enrich our analysis and conduct additional experiments with different model architectures.

Q6.3: The primary clinical indicator of mycetoma in vivo is the presence of a distinctive “grain” within the affected tissue. For pathologists conducting histopathological examinations, the key focus lies in identifying these grains and observing their appearance and structure under the microscope. Each mycetoma species presents unique grain patterns, sizes, and colors, which can be discerned through specific histochemical staining techniques. For data collection, we gathered 1289 images showcasing the presence of these grains.

Q6.4: Various factors (page 2: Moreover-images) contribute to color variability. Recognizing that pathologists lack the capacity to prevent or control these issues entirely, we have focused on finding solutions. Consequently, we have opted to utilize normalization techniques to reduce and mitigate these color variations among histopathological images.

Q6.6: Original image sizes range from 1600×1200 to 4608×3456 pixels, depending on the magnification and the observed species. Regions containing mycetoma grains are selected, and images are captured at three magnification levels for each grain across the entire slide.

Q6.7: Acknowledging the limited size of our dataset and the small number of patients, we plan to expand our dataset in future research to further validate and enhance our findings.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper proposes a workflow for the automatic detection of mycetoma in histopathological images.

    The reviews for this paper were highly divided, both before and after rebuttal. From my assessment, this stems from two views on this paper: 1) the methodological view, where the paper (only) pipelines existing methods, and does not provide an evaluation of alternatives to the choices in this pipeline (e.g., other network architectures); and 2) a view on the (neglected) disease & use case, with an understudied disease and a diverse dataset considering patients from different hospitals.

    From my perspective, both of these views are valid and somewhat difficult to consolidate within this review process. I’d still recommend to accept this paper for the following reasons:

    • The majority of reviewers voted towards accepting this paper.
    • From the reviewer comments and a brief assessment, it seems like the paper does not contain severe methodological errors, and while the ROI selection process should be assessed as well, a manual selection can be feasible within the intended use case.
    • The paper is (also) submitted under “clinical translation” and highlights corresponding issues with diagnosis of neglected tropical diseases.
    • Given the focus & location of this year’s MICCAI, I believe this paper may add value to challenges & opportunities for the MICCAI community.

    I’d like to strongly encourage the authors to provide missing details, especially with regard to the dataset, in the revised version of their paper.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


