Abstract

Neonatal face detection is the prerequisite for face-based intelligent medical applications. Nevertheless, this area has received minimal attention in existing research. The paucity of open-source, large-scale datasets significantly constrains current studies, a constraint further compounded by issues such as large-scale occlusions, class imbalance, and precise localization requirements. This work aims to address these challenges from both data and methodological perspectives. We constructed the first open-source face detection dataset for neonates, involving images from 1,000 neonates in the neonatal wards. Utilizing this dataset and adopting NICUface-RF as the baseline, we introduce two novel modules. The hierarchical contextual classification aims to improve the positive/negative anchor ratios and alleviate large-scale occlusions. Concurrently, the DIoU-aware NMS is designed to preserve bounding boxes of superior localization quality by employing predicted DIoUs as the ranking criterion in NMS procedures. Experimental results illustrate the superiority of our method. The dataset and code are available at https://github.com/neonatal-pain.
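
The idea of ranking by predicted DIoU in NMS can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes boxes in (x1, y1, x2, y2) format, uses the standard DIoU definition (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box), and treats `pred_diou` as a hypothetical per-box output of a DIoU head. The function names, the plain-IoU suppression test, and the 0.4 threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def diou(a, b):
    """Distance-IoU: IoU penalized by the normalized distance between box centers."""
    # Squared distance between the two box centers.
    rho2 = ((a[0] + a[2]) - (b[0] + b[2])) ** 2 / 4 + ((a[1] + a[3]) - (b[1] + b[3])) ** 2 / 4
    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou(a, b) - rho2 / c2


def nms(boxes, rank_scores, iou_thresh=0.4):
    """Greedy NMS; the ranking criterion is whatever rank_scores holds."""
    order = sorted(range(len(boxes)), key=lambda i: rank_scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop remaining boxes that overlap the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


# Toy example: boxes 0 and 1 overlap heavily; box 2 is a separate face.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
cls_score = [0.90, 0.80, 0.70]   # classification confidence
pred_diou = [0.60, 0.85, 0.70]   # hypothetical DIoU-head predictions

keep_by_cls = nms(boxes, cls_score)    # [0, 2]
keep_by_diou = nms(boxes, pred_diou)   # [1, 2]
```

In the toy data, ranking by classification score keeps box 0, whereas ranking by the hypothetical predicted DIoU keeps box 1, the box the localization head rates higher.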

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2862_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTeX

@InProceedings{Zha_Towards_MICCAI2024,
        author = { Zhao, Yisheng and Zhu, Huaiyu and Shu, Qi and Huan, Ruohong and Chen, Shuohui and Pan, Yun},
        title = { { Towards a Deeper insight into Face Detection in Neonatal wards } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper
    1. The authors have developed the first open-source neonatal face detection dataset. This dataset is comprehensive, and contains over 5000 images from 1000 neonates, significantly enriching the resources available for research in this critical but underexplored area.

    2. The manuscript introduces two innovative modules: Hierarchical Contextual Classification (HCC) for addressing class imbalance and occlusions, and DIoU-Aware Non-Maximum Suppression (DAN) for enhancing bounding box localization, significantly improving the accuracy and reliability of neonatal face detection systems.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The creation of a new, open-source dataset specifically for neonatal face detection addresses a significant gap in the field.

    2. The introduction of the HCC and DAN modules is well-articulated and supported by empirical data. These modules significantly improve detection performance by enhancing anchor balance and localization accuracy, respectively.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The evaluation is restricted to the NFD dataset and benchmarks against only a single existing face detection method. Expanding the comparison to include multiple state-of-the-art methods could better position the proposed approach within the field.

    2. The paper lacks detailed information about the diversity and representativeness of the dataset, which is crucial for understanding its applicability and limitations in real-world settings.

    3. There are inconsistencies in the manuscript, such as mismatched values between Table 2 and Figure 3(a) and repeated citations. These need correction to enhance the manuscript’s credibility and readability.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    This manuscript addresses neonatal face detection by developing an open-source dataset and introducing novel detection modules. However, the evaluation could be enhanced in several ways:

    1. Incorporate additional external datasets and compare the proposed methods against multiple state-of-the-art face detection techniques to better establish robustness and effectiveness.

    2. Provide a detailed account of the demographic and medical variability of the neonates in the dataset. This transparency is crucial for assessing the applicability and identifying potential biases.

    3. Address minor errors throughout the manuscript to improve readability and ensure consistency across all sections.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper introduces the first open-source neonatal face detection dataset and innovative methodologies specifically designed for neonatal wards. These advances effectively address significant challenges in neonatal monitoring technologies and hold great potential to enhance clinical outcomes. Despite these strengths, the manuscript requires corrections to address certain deficiencies.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    Thanks to the authors for the feedback, which answered most of my questions. The application is still my concern. Yes, it can be used for face recognition, but are there any other applications? Moreover, face recognition is almost impossible when faces are heavily occluded.



Review #2

  • Please describe the contribution of the paper

    The authors introduce a new dataset of neonatal faces collected in hospital, together with hierarchical contextual classification and DIoU-aware NMS, to improve neonatal face detection.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    New neonatal dataset: Neonatal datasets are scarce in the field, which hinders the advancement of AI for neonatal care. The dataset the authors claim to open-source consists of 5000 images from 1000 neonates, which could help to push the field forward.

    Algorithm for improving face detection: Hierarchical contextual classification and DIoU-aware NMS were introduced. Neither technique is new, but it is innovative that the authors use the advantages of both to solve problems in neonatal face detection, such as class imbalance and faces that are often blocked by clothes or other objects.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Details of the dataset are lacking: Although the dataset can help to push the field forward, it is limited to the face detection problem. For example, it would not be useful if other authors wished to leverage the dataset for pain score classification or temperature estimation. Moreover, it is unclear where the dataset is shared online.

    Limited novelty: Hierarchical contextual classification has been proposed by Qiu et al. for object detection. Qiu et al. - Hierarchical Context Features Embedding for Object Detection

    Lack of clarity in data splitting: The 5000 images are from 1000 neonates, so I suppose each neonate has 5 images. How did the authors split the dataset? Did the authors perform a patient-level split to prevent data leakage?

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors claim to have constructed the first open-source neonatal dataset, but it is not clear where they are releasing it.

    Authors did not claim to release the source code.

    Authors did provide explanation on the methodology but it was brief. Suggest authors to release the source code for reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Lack of clarity: It would be helpful if the authors explained how the data was split. Was a patient-level split used to avoid data leakage?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In addition to the introduction of valuable neonatal datasets, it is noted that the propositions of hierarchical contextual classification and DIoU-Aware NMS have been previously presented, thereby offering limited novelty in terms of methodological advancement.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    The authors provide sufficient explanation in response to the comments.



Review #3

  • Please describe the contribution of the paper

    There are two major contributions in this paper: 1) the authors built the open-source neonatal face detection (NFD) dataset; 2) the DIoU-aware NMS was designed to preserve bounding boxes with superior localization quality.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The highlights of this paper include: 1) the construction of the public NFD dataset; 2) the hierarchical contextual classification branch; 3) the definition of the loss function for HCC; 4) the DIoU-aware NMS for bounding box detection.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The process delineated in Figure 2 is ambiguous because several symbols, such as ‘x’ and ‘+’, are not defined. Additionally, the presence of two pyramid feature maps in the right portion of the figure is not explained. 2) The rationale behind configuring a six-level hierarchy within the architecture is not addressed and requires discussion. 3) The methodology for establishing bounding boxes in instances where the neonatal face is partially obscured is unclear and necessitates elaboration. 4) The computation platform is not mentioned in the article.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Fig 2 requires additional detail regarding the layer composition within the network architecture to enhance the reader’s understanding. 2) The ablation study, as presented, necessitates a comprehensive redesign to adequately assess the contributions of different components to the model’s performance.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Contributing positively to the overall evaluation are: 1) the construction of the public Neonatal Facial Dataset (NFD), and 2) the employment of Distance-IoU (DIoU) for bounding box prediction. Conversely, the design of the ablation studies is a detracting element that impacts the overall score negatively.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    All my concerns have been addressed and I basically agree with the response. This paper can be accepted after rebuttal.




Author Feedback

We thank all reviewers for appreciating the new dataset, method design, and clinical value.

1) Reply to R1

Details of Datasets: We will provide GitHub links for readers to download the dataset. Our dataset is the prerequisite for downstream tasks. For example, failure to detect faces, especially at critical moments of pain, can lead to incorrect pain score classification. In Section 5.4, we demonstrate how our dataset benefits downstream tasks, emphasizing that its value extends beyond just face detection.

Code: As stated in the Abstract, we will release the code.

Clarification on HCC: Our HCC and the Hierarchical Context Embedding (HCE) by Qiu et al. share a name but have different goals and methods. HCE targets false positives, while HCC aims to reduce false negatives. HCE embeds hierarchical segmentation features into detection features, while ‘Hierarchical’ in HCC refers to hierarchical classification, i.e., the first step of classification is added to improve the positive/negative anchor ratio in the second step. Moreover, HCC’s context comes from detection features, not segmentation features, and the way of embedding context in HCC is entirely different from that of HCE.

Data Splitting: We clarify that each neonate is associated with five images, and the dataset has been rigorously partitioned to maintain patient exclusivity.
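
A patient-exclusive partition of this kind can be sketched as follows. This is an illustrative example, not the authors' splitting script: the file-naming scheme, the `split_by_neonate` helper, the split fractions, and the seed are all hypothetical. The key point is that neonate IDs, not individual images, are shuffled and assigned to subsets, so all five images of a neonate land in the same subset.

```python
import random


def split_by_neonate(image_to_neonate, val_frac=0.1, test_frac=0.2, seed=0):
    """Assign every image of a given neonate to exactly one subset."""
    # Shuffle neonate IDs (not images) so a neonate never straddles subsets.
    neonates = sorted(set(image_to_neonate.values()))
    random.Random(seed).shuffle(neonates)

    n_test = int(len(neonates) * test_frac)
    n_val = int(len(neonates) * val_frac)
    test_ids = set(neonates[:n_test])
    val_ids = set(neonates[n_test:n_test + n_val])

    splits = {"train": [], "val": [], "test": []}
    for image, neonate in sorted(image_to_neonate.items()):
        if neonate in test_ids:
            splits["test"].append(image)
        elif neonate in val_ids:
            splits["val"].append(image)
        else:
            splits["train"].append(image)
    return splits


# Toy stand-in for the dataset: 1000 neonates with five images each.
image_to_neonate = {
    f"n{n:04d}_img{k}.jpg": f"n{n:04d}" for n in range(1000) for k in range(5)
}
splits = split_by_neonate(image_to_neonate)

# Leakage check: no neonate ID may appear in more than one subset.
ids = {name: {image_to_neonate[img] for img in imgs} for name, imgs in splits.items()}
leak_free = (
    ids["train"].isdisjoint(ids["val"])
    and ids["train"].isdisjoint(ids["test"])
    and ids["val"].isdisjoint(ids["test"])
)
```

Splitting at the image level instead would let near-identical images of the same neonate appear in both training and test data, which is the leakage the reviewer asked about.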

2) Reply to R3

Method Evaluation: We limited our evaluation to the NFD dataset due to the scarcity of publicly available datasets. Neonatal faces differ vastly from adult faces, so evaluation on adult datasets would be an inaccurate measure of the method’s effectiveness in practical applications. We share the concern about the completeness of our evaluation. To address it, we quantified the performance gain of our method for downstream tasks in Section 5.4, adding a further evaluative dimension. Moreover, we compared against two methods rather than one: RetinaFace (called NICUface-RF in [6], and our baseline) and YOLO5Face. Both are leading methods in adult face detection. Other methods, such as Poly-NL and ASFD, do not provide code. Because the NFD dataset is self-built, it is hard to reproduce these methods on NFD without code.

Details of Datasets: Due to space constraints, we mainly describe the participant age, data annotation, dataset partitioning, and challenges embedded in the dataset; the remaining information is provided on GitHub.

Inconsistencies: We verified that the numbers in Table 2 and Fig. 3(a) are consistent rather than mismatched. Table 2 uses percentages, while Fig. 3(a) expresses values as decimals. We will standardize these representations to percentages and refine the consistency of the citations.

3) Reply to R4

Figure 2: We will add definitions of the ‘x’ and ‘+’ symbols. As detailed in Section 4, we implement a feature pyramid where the classification, regression, and DIoU heads assess each anchor across all levels of the feature pyramid. Thus, the inputs to the three heads are the same, namely the Pyramid Feature Map.

Six-Level Hierarchy: This is the commonly used feature pyramid that follows the baseline design and is not the core of our work.

Obtaining Bounding Boxes Under Occlusion: Our method uses anchor-based detection with 102,300 anchors at a 640x640 resolution. Such dense anchors provide ample alternatives for bounding boxes, so the key to coping with occlusion lies in the classification accuracy of the anchors (avoiding missed detections due to occlusion) and the localization precision (selecting the ones with high localization precision among dense anchors). The two correspond to our proposed HCC and DAN, respectively.
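
The anchor count quoted above can be checked with simple arithmetic. The sketch below shows one configuration that yields exactly 102,300 anchors at 640x640: feature maps at strides 4, 8, 16, 32, and 64 with three anchors per location. These strides and the anchors-per-location value are assumptions for illustration; the rebuttal does not state them, and the `count_anchors` helper is hypothetical.

```python
def count_anchors(image_size, strides, anchors_per_location):
    """Number of anchors per pyramid level for a square input image."""
    per_level = {}
    for stride in strides:
        side = image_size // stride          # feature-map width and height
        per_level[stride] = side * side * anchors_per_location
    return per_level


# Assumed configuration: five strides, three anchors per location.
per_level = count_anchors(640, strides=(4, 8, 16, 32, 64), anchors_per_location=3)
total = sum(per_level.values())

print(per_level)  # {4: 76800, 8: 19200, 16: 4800, 32: 1200, 64: 300}
print(total)      # 102300
```

Under these assumptions, roughly three quarters of the anchors come from the stride-4 level, which is why small or partially occluded faces still have many candidate boxes.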

Ablation Study: R4’s critique of the ablation study stems from a misunderstanding of our method, which we explained earlier. We discuss the overall influence of the proposed modules and fully analyze their detailed design.

Computation Platform: The platforms are mainly AMD 7950X, NVIDIA P40, and PyTorch 1.12 with CUDA 11.3.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A


