Abstract

You Only Look Once (YOLO)-based object detectors have shown remarkable accuracy for automated brain tumor detection. In this paper, we develop a novel BGF-YOLO architecture by incorporating Bi-level Routing Attention (BRA), Generalized feature pyramid networks (GFPN), and a Fourth detecting head into YOLOv8. BGF-YOLO contains an attention mechanism to focus more on important features, and feature pyramid networks to enrich feature representation by merging high-level semantic features with spatial details. Furthermore, we investigate the effect of different attention mechanisms, feature fusions, and detection head architectures on brain tumor detection accuracy. Experimental results show that BGF-YOLO gives a 4.7% absolute increase of mAP50 compared to YOLOv8x, and achieves state-of-the-art performance on the brain tumor detection dataset Br35H. The code is available at https://github.com/mkang315/BGF-YOLO.
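For readers unfamiliar with the metric, mAP50 is the mean average precision computed with an intersection-over-union (IoU) threshold of 0.5 for matching predictions to ground truth. The sketch below only illustrates the IoU test behind that threshold; the function names and the box format `(x1, y1, x2, y2)` are assumptions for illustration and are not taken from the paper or its repository. Whether a prediction with IoU exactly equal to 0.5 counts as a match varies between evaluation toolkits; an inclusive comparison is assumed here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_iou50(pred_box, gt_box):
    """True if a predicted box would match the ground truth at the 0.5 IoU threshold
    (inclusive boundary assumed)."""
    return iou(pred_box, gt_box) >= 0.5
```

Average precision is then the area under the precision-recall curve built from such matches, and mAP50 averages it over classes.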

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0908_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/0908_supp.pdf

Link to the Code Repository

https://github.com/mkang315/BGF-YOLO

Link to the Dataset(s)

https://www.kaggle.com/datasets/ahmedhamada0/brain-tumor-detection

BibTex

@InProceedings{Kan_BGFYOLO_MICCAI2024,
        author = { Kang, Ming and Ting, Chee-Ming and Ting, Fung Fung and Phan, Raphaël C.-W.},
        title = { { BGF-YOLO: Enhanced YOLOv8 with Multiscale Attentional Feature Fusion for Brain Tumor Detection } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15008},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The proposed modifications significantly improve tumor detection compared to YOLOv8.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Experimental results show that BGF-YOLO gives a 4.7% absolute increase of mAP50 compared to YOLOv8x, and achieves state-of-the-art on the brain tumor detection dataset Br35H.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Frankly speaking, this paper seems to be a combination of various existing modules. For example, as the authors mentioned: “we develop a novel BGF-YOLO architecture by incorporating Bilevel Routing Attention (BRA), Generalized feature pyramid networks (GFPN), and Fourth detecting head into YOLOv8”.
    2. The technical details described in this paper are not clear, leading to a lack of understanding of how the authors implemented it. For example, “We modify the structure of the FPN-PANet in the YOLOv8 to achieve multilevel feature fusion among different layers by strengthening the multipath fusion of the networks”.
    3. Furthermore, the authors only briefly compared several detection methods without thoroughly comparing with brain tumor detection methods.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    no

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please see the weakness part

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Limited in Novelty

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper introduces a modified version of YOLOv8 with a Bi-level Routing Attention mechanism, a Generalized Feature Pyramid Network for feature fusion, and a fourth detecting head: a combination of three interesting techniques for improving brain tumor detection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Used the latest version of YOLO as the state-of-the-art option for object detection and further improved it for brain tumor detection.
    • Justified the choice of techniques with a relevant ablation study demonstrating their efficacy.
    • The model’s benchmark against other architectures and the ablation studies are well detailed and demonstrate the efficacy of the proposed approach.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • Brain tumor detection seems not so clinically relevant since we can do segmentation which is by far better in addressing such a disease.
    • Unclear weaknesses of other architectures.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Introduction jumped right into technical details; it could have eased the reader into the technical aspects.
    • The paper’s paragraphs are poorly structured and have a weak flow of ideas / definitions …
    • Unclear graphical representation of the architecture.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Interesting approach combining three state-of-the-art techniques to improve the latest version of YOLO, but not convincing in terms of clinical relevance or the weaknesses of existing approaches.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes BGF-Yolov8, which includes Bi-level Routing Attention, Generalized-FPN, and a fourth detecting head. It is the first application of Yolov8 to a brain tumor segmentation dataset.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper presents a new Yolov8 architecture that yields better results than state-of-the-art methods (the original Yolov8 and its derivatives).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weaknesses concern the dataset used (further datasets should be used to verify that the accuracy generalizes) and the comparison of architectures, which should also cover execution time, memory…

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The architecture of this paper seems reproducible (the illustration may just be resized to be more readable). The code is given and clear details of the architecture conception are written in the article.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The authors may improve the readability of their main figure by increasing the size of its elements. Further comparisons, such as execution time and memory consumption, should be given to compare BGF-Yolov8 with architectures from the literature. The authors should test other brain tumor datasets in the same way as in this article to show that their method generalizes. In the annexes, the authors should also show detection results from other architectures on the same images.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This article shows a new derivative of Yolov8 which provides more accurate results compared to the literature for a single brain tumor dataset. However, some further analyses should be given.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #4

  • Please describe the contribution of the paper

    The paper describes a novel architecture called BGF-YOLO, which enhances the object detection performance of YOLOv8 for brain tumor detection in MRI images. The proposed BGF-YOLO achieves state-of-the-art performance on the brain tumor detection dataset Br35H, outperforming the baseline YOLOv8 and other advanced methods. The main contributions are:

    1. Incorporating a Generalized Feature Pyramid Network (GFPN) in the neck for effective multi-scale feature fusion.
    2. Leveraging a Bi-level Routing Attention (BRA) mechanism to focus on salient features and reduce redundancy.
    3. Adding a fourth detection head to handle objects at richer scales.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper’s main strengths lie in its novel architectural contributions, thorough evaluations, state-of-the-art performance, and the exploration of a new application domain for the YOLOv8 object detector. The proposed BGF-YOLO architecture incorporates several novel components into the YOLOv8 object detector, specifically tailored for accurate brain tumor detection from MRI images.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Insufficient Comparative Analysis with State-of-the-Art Medical Image Detectors: While the paper compares BGF-YOLO’s performance with YOLOv8, DAMO-YOLO, and RCS-YOLO, all YOLO-based object detection models, this comparison is limited within the YOLO framework. It would be beneficial to include additional comparisons with other state-of-the-art object detection models, such as MaskRCNN or Segmentation Anything, especially considering that segmentation is a more detailed level of object detection.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    (1) Evaluate the proposed method on larger and more diverse datasets, potentially including other types of medical imaging modalities or pathologies, to assess its generalizability. (2) Compare the performance of BGF-YOLO with state-of-the-art medical image detection models specifically designed for brain tumor or lesion detection tasks, in addition to the general object detection models considered in the current evaluation. (3) Upon reviewing the anonymized code repository, I noticed that there is mention of RT-DETR-X and YOLOv9-E performance in README, yet I couldn’t find any discussion of these models in the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The architecture of BGF-YOLO is innovative, and the current comparison validates its effectiveness.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We appreciate the positive comments from reviewers and constructive suggestions for improvements. However, there is still misapprehension and imprecision in the reviewer’s findings to be clarified, which may affect the assessment of the validity of the paper.

1) Significance of the study (detection vs. segmentation) i) Reviewer #4: Brain tumor detection seems not so clinically relevant since we can do segmentation which is by far better in addressing such a disease. ii) Reviewer #4: But no convincing in terms of clinical relevancy. iii) Reviewer #6: It would be beneficial to include additional comparisons with other state-of-the-art object detection models, such as MaskRCNN or Segmentation Anything, especially considering that segmentation is a more detailed level of object detection. Response: Due to low costs, brain tumor detection is useful for automated medical screening. If needed, further scans are required, and brain tumor segmentation will play a role in confirming the diagnosis of an abnormality. In clinical practice, patients/doctors probably choose single plane/modality MRI scans to save money because the prices of MRI scans vary in different types. The detection models are suitable for this scenario. We’d like to clarify that Mask-RCNN and Segment Anything Model (SAM) are segmentation models, while the main task of this study/dataset is object detection not segmentation.

2) Novelty of the paper i) Reviewer #5: Frankly speaking, this paper seems to be a combination of various existing modules. For example, as the authors mentioned: “we develop a novel BGF-YOLO architecture by incorporating Bilevel Routing Attention (BRA), Generalized feature pyramid networks (GFPN), and Fourth detecting head into YOLOv8”. ii) Reviewer #5: Limited in Novelty. Response: We hope to clarify that our model is not merely a combination of various existing modules but introduces a new YOLO-based architecture to overcome the challenge of brain tumor detection. Moreover, we also introduced modifications in each individual module; for example, we designed a new enhanced GFPN-structure neck with CSP, Conv, Upsample, and Concat submodules. Furthermore, we also introduced an additional 160×160 detecting head aligned with the new structure of feature fusion networks in the neck part.
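To make the "additional 160×160 detecting head" concrete, the sketch below relates head grid size to feature-map stride. It assumes a 640×640 input and the three standard YOLOv8 detection strides of 8, 16, and 32; under those assumptions a 160×160 grid corresponds to a stride of 4. The input size, the stride of the fourth head, and the function name are illustrative assumptions, not details confirmed by the rebuttal.

```python
def head_grid_sizes(input_size=640, strides=(4, 8, 16, 32)):
    """Map each detection-head stride to the side length of its square output grid.

    input_size and the stride-4 entry are assumptions for illustration; the
    rebuttal only states that the added head operates on a 160x160 grid.
    """
    return {stride: input_size // stride for stride in strides}


if __name__ == "__main__":
    for stride, side in head_grid_sizes().items():
        print(f"stride {stride:>2}: {side}x{side} grid")
```

A finer grid gives each cell a smaller receptive region of the image, which is the usual motivation for adding a higher-resolution head when small objects matter.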

We group and summarize the major concerns of reviewers, which we carefully addressed as follows:

1) Figure quality improvement. i) Reviewer #1: Readability of their main figure. ii) Reviewer #4: Unclear graphical representation of architecture. Response: We will improve the figure quality in the revision by enlarging the elements and adjusting the color tone to highlight the major elements in the figure.

2) Further comparisons i) Reviewer #5: Without thoroughly comparing with brain tumor detection methods. ii) Reviewer #6: Limited within the YOLO framework. Medical image detection models. iii) Reviewer #6: Discussion of RT-DETR-X and YOLOv9-E. iv) Reviewer #1: Some further comparison such as the time execution. v) Reviewer #1: Results from the detection with other architectures. Response: We’ve compared it with RCS-YOLO, which is the best-performing method applied to brain tumor detection in previous studies. Moreover, we have conducted additional experiments to compare with other non-YOLO-based state-of-the-art object detection methods such as RT-DETR and Co-DETR. The results have been shown in the anonymous GitHub, which we will include in the main text and Table 1. We have also conducted further comparisons in terms of time execution. The potential limitation of the proposed BGF-YOLO is mainly on increased computational complexity. The inference speed of BGF-YOLO is 223.0 ms, which is longer than YOLOv8 with 78.9 ms. However, the BGF-YOLO gives a substantial improvement in accuracy despite a slight increase in computational effort. We will report the parameters of each method in Table 1, for example, YOLOv8x | 68.2M, BGF-YOLO | 84.29M. We will include sample detected images in the supplementary material in comparison with other different architectures.
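The overhead figures quoted in this response can be put side by side with a short calculation. The sketch uses only the numbers stated above (223.0 ms vs. 78.9 ms per inference, 84.29M vs. 68.2M parameters); the variable and function names are illustrative, and the rebuttal does not say which YOLOv8 variant the 78.9 ms figure refers to.

```python
def relative_increase(baseline, new):
    """Fractional increase of `new` over `baseline` (0.25 means 25% more)."""
    return (new - baseline) / baseline


# Figures quoted in the author feedback.
YOLOV8_LATENCY_MS = 78.9    # YOLOv8 variant not specified in the rebuttal
BGF_YOLO_LATENCY_MS = 223.0
YOLOV8X_PARAMS_M = 68.2
BGF_YOLO_PARAMS_M = 84.29

latency_ratio = BGF_YOLO_LATENCY_MS / YOLOV8_LATENCY_MS
param_increase = relative_increase(YOLOV8X_PARAMS_M, BGF_YOLO_PARAMS_M)

if __name__ == "__main__":
    print(f"latency ratio: {latency_ratio:.2f}x")
    print(f"parameter increase: {param_increase:.1%}")
```

On these figures, the per-image inference time is roughly 2.8 times that of YOLOv8, while the parameter count grows by about 24%.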

3) Generalizability of our model i) Reviewer #6: Evaluate the proposed method on larger and more diverse datasets. ii) Reviewer #1: Other brain tumor datasets to generalize their method. Response: To demonstrate our model’s general applicability to other computer vision datasets, we evaluated the proposed model on a facemask dataset collected during the Coronavirus Disease (COVID-19) pandemic. See the External Validation section on the anonymous GitHub, where our model performs better than YOLOv8. We will apply our method to other types of medical imaging modalities or pathologies in future work.

4) More details for clarity i) Reviewer #4:

  • Unclear weaknesses of other architectures.
  • Introduction jumped right into technical details …
  • poorly structured and a weak flow …

ii) Reviewer #5: The technical details … For example, “We modify the structure of …”

Response: We will highlight again the weakness of other architectures in the Introduction, which has already been discussed in detail under the Method section. For example, “… YOLOv8 still suffers…” in para. 3, Sec. 2.2. We have structured the Introduction section starting by describing a general background, motivation, and the limitations of existing methods for brain tumor detection (para. 1). As the YOLO-based model has achieved the best detection performance for brain tumor detection, we focus mainly on the review of the YOLO architecture to identify potential limitations for further improvements (para. 2) followed by recent improvements of YOLOv8 for object detection in natural images (para. 3). In the final paragraph (para. 4), we describe the contribution of our work by incorporating new attention mechanisms, multiscale feature fusion networks, and an enhanced detecting head. Besides, we have improved the paper’s overall organization and ensured better flow connections and consistency of ideas and term definitions when describing our method. We have already provided the technical details of each of the modified and new components in our architecture. For example, “the structure of FPN-PANet is modified by utilizing CSP to add skip connections and share dense information across various spatial scales…” in para. 3, Sec. 2.1.




Meta-Review

Meta-review not available, early accepted paper.


