Abstract

This study presents a novel approach for automating nutritional status assessments in children, designed to assist health workers in public health contexts. We introduce DomainAdapt, a novel dynamic task-weighing method within a multitask learning framework, which leverages domain knowledge and Mutual Information to balance task-specific losses, enhancing the learning efficiency for nutritional status screening. We have also assembled an unprecedented dataset comprising 16,938 multipose images and anthropometric data from 2,141 children across various settings, marking a significant first in this domain. Through rigorous testing, this method demonstrates superior performance in identifying malnutrition in children and predicting their anthropometric measures compared to existing multitask learning approaches. The dataset is available at: iab-rubric.org/resources/healthcare-datasets/anthrovision-dataset

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/4215_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/4215_supp.pdf

Link to the Code Repository

N/A

Link to the Dataset(s)

https://www.iab-rubric.org/resources/healthcare-datasets/anthrovision-dataset

BibTex

@InProceedings{Kha_DomainAdapt_MICCAI2024,
        author = { Khan, Misaal and Singh, Richa and Vatsa, Mayank and Singh, Kuldeep},
        title = { { DomainAdapt: Leveraging Multitask Learning and Domain Insights for Children’s Nutritional Status Assessment } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    1) A dataset collected from 2,162 children across various settings; 2) 16,938 multipose images and anthropometric data for automating malnutrition screening in children to address global health inequities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors captured image data in real-world settings for automating nutritional status assessments in children. Their approach and dataset can assist health workers in public health contexts.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This paper has several critical weaknesses, which are listed below: 1) Inadequate Explanation of Multiple Label Prediction Methods: The paper lacks clarity regarding the methods used for predicting multiple labels, particularly the range of alpha and beta values. Additionally, the impact of the dynamic weight sharing mechanism and pose-wise feature sharing is unexplained, leaving gaps in understanding crucial aspects of the model’s operation.

    2) Insufficient Baseline Experiments: The absence of references and baseline experiments raises questions about the reliability and reproducibility of the proposed approach.

    3) Lack of Details on Pose-wise Feature Fusion: The paper fails to provide sufficient details on how pose-wise feature fusion is implemented within the model architecture and how it impacts performance. This gap in explanation is concerning, especially considering that Table 4 highlights superior performance of individual poses over fusion methods, raising doubts about the rationale behind using fusion.

    4) Limited Evaluation Metrics and Comparison: Relying solely on accuracy as an evaluation metric may not adequately validate the approach. The paper overlooks the importance of employing mean average precision and F1 score for a more comprehensive evaluation, which could offer insights into the model’s performance across different classes. Additionally, the absence of baseline results from state-of-the-art methods for comparison further weakens the paper’s credibility.

    5) The basic exploratory data analysis details of the anthropometric dataset (for each factor) are missing.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Using Other Evaluation Metrics: Instead of relying solely on accuracy, which may not capture the nuances of a model’s performance, employing mean average precision for all predicted labels and F1 score for individual labels offers a more comprehensive evaluation. Mean average precision considers the precision-recall trade-off, while F1 score balances precision and recall for each label, providing insights into the model’s performance across different classes.

    2) Brief Explanation of the Architecture Figure: The architecture figure typically illustrates the design and components of the model being used. A brief explanation of the architecture figure would involve highlighting these components and explaining how they contribute to the model’s functionality and performance. For the pose-wise features, the authors used two different lightweight models, but a) references and baseline experiments are needed. For feature extraction, it is common to use the pretrained weights of a model trained on a large dataset to obtain similar features, so b) why did the authors not do this, and which approach did they use to obtain the features for fusion?

    3) Missing Details of Pose-wise Feature Fusion: The absence of details regarding pose-wise feature fusion indicates a gap in the explanation of how this technique is implemented within the model architecture and how it impacts performance. Table 4 shows that the individual pose (lateral) performs much better than all poses (fusion), so why did the authors use the fusion method?

    4) Complex Analysis in Tables 1 and 2: Tables 1 and 2 likely present results from the experimental evaluation of the proposed approach. The inclusion of various metrics such as RMSE (Root Mean Square Error) and accuracy across different tasks (T1 to T9) suggests a comprehensive analysis of the model’s performance under different conditions or tasks. For example, RMSE may be used to evaluate regression tasks, while accuracy is commonly used for classification tasks. The use of these metrics across multiple tasks provides a holistic view of the model’s capabilities and limitations.

    5) Domain Adoption: The term “Domain Adopt” likely indicates the process of adopting a model or technique from one domain and applying it to address a different problem in another domain. This suggests that the study aims to utilize existing methods or models developed for one domain to solve a distinct problem in another domain. However, the presence of both ‘domain guided’ and ‘domain adopt’ terminologies without clear differentiation raises questions about the author’s rationale and the intended distinction between these terms within the context of the proposed task. Clarification is needed to understand the precise usage and implications of each term in the study’s context.
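The metric concerns raised in points 1) and 4) above can be illustrated with a minimal sketch. It uses made-up labels rather than the paper's data, and the function names are hypothetical. It shows how, under class imbalance, accuracy can look acceptable while the per-class F1 score exposes a failing minority class, and how RMSE is computed for a regression target.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores

def rmse(y_true, y_pred):
    """Root mean square error for a regression target."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Toy, made-up labels: 8 "healthy" and 2 "malnourished" samples.
y_true = ["healthy"] * 8 + ["malnourished"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["healthy"] * 10

acc = accuracy(y_true, y_pred)    # 0.8, which looks reasonable in isolation
f1 = per_class_f1(y_true, y_pred) # F1 for "malnourished" is 0.0

# Toy regression values (hypothetical heights in cm).
height_error = rmse([100.0, 110.0], [103.0, 106.0])
```

Mean average precision additionally needs per-sample confidence scores rather than hard labels, so it is left out of this sketch.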

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    1) Domain Adopt: The title is confusing as the term “DomainAdopt” mostly refers to the adaptation of a model or technique from one domain to another to address a different problem. However, the current paper is not about domain adaptation. Also, it is not clear why the different terminologies ‘domain guided’ and ‘domain adopt’ are employed.

    2) Lack of Novelty in Architecture: The architecture lacks novelty. The proposed model does not introduce any innovative or unique features compared to existing architectures.

    3) Absence of Baseline Results: The absence of baseline results comparing the proposed

    4) Also see my comments under the weaknesses section.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • [Post rebuttal] Please justify your decision

    Based on the issues identified and the rebuttal, I suggest rejecting the paper for the following detailed reasons:

    1. The title “Domainadopt” is unclear and confusing. It does not meet the standard of clarity expected in academic writing. A more straightforward and descriptive title would help readers understand the focus of the paper better. If the area chair permits, I suggest revising it for better comprehension.

    2. The author mentions computing the RMSE (Root Mean Square Error) for the regression targets, which is correct for a regression task. However, in the rebuttal the author also discusses the F1 score for T1, T2, and T3, which is typically used for classification tasks, not regression. The author does not explain why or how the F1 score was calculated, leading to confusion about its relevance and application in this context.

    3. The paper lacks technical detail in the Architecture. The author did not address any technical comments or provide reasoning for the methods and claims presented. This absence of explanation makes it difficult to understand or trust the results and conclusions.

    4. The author claims that their method demonstrates ‘robustness in real-world settings with multiple poses,’ but there is no explanation or supporting references to justify this statement. The results presented do not adequately support this claim, making it seem unsubstantiated.

    Overall, the paper needs significant revisions to improve clarity, provide appropriate technical details and justifications, and support its claims with evidence.



Review #2

  • Please describe the contribution of the paper

    An approach for nutritional status detection has been proposed. A large dataset has been collected for validation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Research questions are defined properly. The dataset has been collected for the evaluation.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The contribution for the methodology is limited. The result section is also not very impressive. The standard deviation of results has not been reported.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Fig. 1 can be improved. More details of the methods can be added to the caption. The description of the results should be improved. The mathematical symbols should be written properly.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper has some merits. The contributions are not explained properly.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors addressed my questions.



Review #3

  • Please describe the contribution of the paper

    This paper introduces a new dataset and algorithm for detecting malnutrition in children. The authors propose a method for multi-task training and leveraging domain knowledge to train the network. Their model processes multi-pose images through both VGG and ResNet feature extractors. The ResNet extractor is utilized for malnutrition classification, while the VGG extractor serves as the regression pathway for aligning features with task-specific information. The outputs from these pathways are then merged and passed through a dynamic weight sharing mechanism, optimized using domain-based information.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper is written very clearly and has a strong evaluation. The authors examine the impact of their proposed method on class imbalance and compare it against other types of training. Moreover, they enhance the reliability of their proposed method by integrating domain knowledge alongside image-based classification.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Using images for malnutrition classification can present many problems. The dataset would need to be very carefully curated to mitigate any unintentional biases. Furthermore, there is an underlying assumption that BMI, height, weight and other features presented in the dataset are directly related to malnutrition. In addition, discussion of whether this method can be applied across populations from across the globe is important to understand the impact of this model. However, there is no discussion along these lines in the paper. These are important ethical considerations which should be addressed.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    This paper presents a novel application of deep learning in the medical domain for classifying malnutrition in children. The authors curate a new dataset comprising multi-pose images along with health data such as height, weight, and BMI, among others. They then utilize WHO and CDC statistics to calculate z-scores for these features and employ the obtained values to classify ground truth malnutrition. These images are subsequently input into a dual-branch network, where one branch extracts features from a ResNet for classification and another branch extracts features using a VGG model for regression. These features are merged after further processing with linear layers and trained using domain-specific knowledge to output the final classification. The training encompasses multiple tasks to create a robust model. The authors assess their model’s performance with various training strategies, demonstrating that the proposed domain adaptation improves performance, especially under severe class imbalance. They also evaluate performance across multiple tasks, providing a comprehensive analysis and novel applications. However, there is minimal discussion regarding biases in the dataset and the generalizability of the model to broader populations. For instance, it remains uncertain whether this model can be applied to global populations using the curated dataset. Important ethical considerations, such as examining biases learned by the model from images and ensuring equal treatment of images across all genders and ages, are not addressed in this paper. The lack of insights into dataset characteristics and ethical considerations represents a weakness in the paper.
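The z-score labelling step described above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the reference median and standard deviation are hypothetical placeholders rather than values from WHO or CDC tables, published growth standards use a more elaborate (LMS-based) formula than a plain standardisation, and the cut-offs of -2 and -3 standard deviations follow the commonly used WHO convention, which the paper may or may not apply in exactly this form.

```python
def z_score(value, ref_median, ref_sd):
    """Plain standardisation against a reference population (a simplification)."""
    return (value - ref_median) / ref_sd

def classify(z):
    """Map a z-score to a status using the common -2 / -3 SD convention."""
    if z < -3:
        return "severe"
    if z < -2:
        return "moderate"
    return "normal"

# Hypothetical reference values for one age/sex group (NOT real WHO/CDC numbers).
REF_MEDIAN_CM = 138.0
REF_SD_CM = 6.0

child_height_cm = 124.0  # made-up measurement
z = z_score(child_height_cm, REF_MEDIAN_CM, REF_SD_CM)
status = classify(z)
```

In practice the reference values would be looked up per age and sex, which is what makes the resulting label a domain-derived ground truth rather than a raw threshold on height or weight.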

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper addresses a novel application. However, discussions on the dataset and the ethical considerations of building the dataset for malnutrition detection should be included.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    Thanks for the authors’ rebuttal. It partially addressed my concerns.




Author Feedback

R1, R3 and R4: Contribution and Baseline Experiments: Our primary contribution is introducing a novel multitask learning methodology with a unique dynamic, domain- and mutual information (MI)-based task loss weighing metric. This learning methodology resulted in a more robust model with better classwise accuracies than other SOTA/baseline multitask learning methodologies detailed in Sec 3 and Tab 2-3 and is thus broadly applicable to other medical domains with similar complexities between tasks. We’ll improve clarity on this.
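To make the idea of domain- and MI-based task loss weighing concrete, here is a minimal sketch. It is not the authors' formulation, which is not given on this page: the blending rule, the `mix` parameter, the task names, and all numbers are assumptions chosen for illustration. It estimates mutual information between two discrete label sequences, blends fixed domain priors with those MI scores, and uses the normalised result to weight per-task losses.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of MI (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def task_weights(domain_prior, mi_scores, mix=0.5):
    """Blend fixed priors with MI scores (hypothetical rule), then normalise to sum to 1."""
    raw = {t: mix * domain_prior[t] + (1 - mix) * mi_scores[t] for t in domain_prior}
    total = sum(raw.values())
    return {t: w / total for t, w in raw.items()}

def weighted_loss(losses, weights):
    """Total multitask loss as a weighted sum of per-task losses."""
    return sum(weights[t] * losses[t] for t in losses)

# Identical sequences share log(2) nats; independent ones share none.
mi_same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])

# Made-up priors, MI scores, and losses for two hypothetical tasks.
weights = task_weights({"T1": 1.0, "T2": 0.5}, {"T1": 0.2, "T2": 0.3})
total = weighted_loss({"T1": 2.0, "T2": 1.0}, weights)
```

Recomputing the MI scores at each epoch, as the rebuttal describes ("MI at epoch"), would make the weights dynamic while the domain priors stay fixed.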

R1: Assumption of Anthropometric Features: BMI, height, weight, and all others utilized in the paper are established indicators of malnourishment by WHO and CDC. However, the main essence of the methodology is to not depend on a single indicator but to utilise the predefined (domain based) and learned (MI at epoch) dependencies between multiple tasks to improve model performance. For instance, using logistic regression, Height (AUC=0.58), Weight (AUC=0.63), BMI (AUC=0.70), and Age (AUC=0.54) individually show weak to moderate predictive power. However, when these features are combined, they achieve a higher AUC of 0.87.

R1, R3: Bias, Ethical Considerations and Dataset Curation: The dataset includes diverse demographics (age, gender) and diversity in terms of data collection (illumination, pose, environmental setting). It includes 2162 individuals with an average age of 10.5 years and a gender distribution of 1355 males and 786 females, which depicts the inherent bias. Model testing on a web-scraped dataset of 3050 subjects (annotated by expert medical professionals) from three ethnicities (White, Black, and Brown) showed an accuracy of 80.58% for binary malnourishment classification, showing generalisability. We shall detail the curation process, data statistics and this case study in the supplementary material. We have obtained ethics committee approvals and guardian permissions for the use of this data for research, and will include relevant ethical discussions in the manuscript.

R3: Title and Domain Terminology: By the term “DomainAdapt” we intended to emphasize integrating domain knowledge within our dynamic task-weighing method and how it is adapting the training and loss function. While we believe the current title captures the essence of our approach, we are open to considering alternative titles such as “DomainKnow” to reflect the core contributions better and avoid confusion with domain adaptation literature.

Multiple Label Prediction Methods and Pose-wise Feature Fusion: We provided information on multiple label prediction in Sec 2.1 and 3 - Rep. of Tasks. We appreciate the comments and will improve the clarity and details, including the range of alpha and beta, in the updated manuscript. The ablation study in Tab 4 shows that the lateral pose has a higher RMSE than other poses, making it not the best choice. The all-pose model performs best or on par for regression, ensuring robustness in real-world settings with multiple poses. For classification, accuracy has better classwise accuracy balance as demonstrated in Tab 2-3. The posewise feature fusion is done by concatenating the features after passing through their respective feature extractors, as depicted in Fig 1 and Section 2. We shall improve clarity in the manuscript and figures. Multiple poses are used keeping in mind a real-world setting where we may come across varied poses, and this learning methodology incorporating multiple pose fusion exhibited increased robustness and generalizability.
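The concatenation-based pose fusion described above can be sketched in a few lines. This is an illustrative assumption about the mechanics only: the pose names other than "lateral", the feature values, and the vector sizes are made up, and the real model would operate on tensors produced by its feature extractors rather than Python lists.

```python
def fuse_pose_features(features_by_pose, pose_order):
    """Concatenate per-pose feature vectors in a fixed pose order."""
    fused = []
    for pose in pose_order:
        fused.extend(features_by_pose[pose])
    return fused

# Hypothetical per-pose features, one short vector per pose for one child.
features_by_pose = {
    "frontal": [0.1, 0.2],
    "lateral": [0.3, 0.4],
    "back": [0.5, 0.6],
}
# A fixed order keeps each pose at the same position in the fused vector.
pose_order = ["frontal", "lateral", "back"]

fused = fuse_pose_features(features_by_pose, pose_order)
```

Keeping the pose order fixed matters because downstream linear layers learn position-specific weights, so swapping poses between samples would mix up what each weight sees.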

R3, R4: Evaluation Metrics and Architecture Figure: We appreciate the reviewers’ inputs. Although additional metrics for all experiments were not included due to space constraints, we will add more detailed tables in the supplementary material. For example, the mAP for T1, T2 and T3 is 0.94, 0.92 and 0.75 respectively, and F1 is 0.70, 0.85 and 0.69 respectively. We will include standard deviation in Tab 1-4 and elaborate on figure captions. Fig 1 will be improved to reflect suggestions.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper presents a novel approach for automating nutritional status assessments in children, designed to assist health workers in public health contexts. The authors introduce “DomainAdapt,” a novel dynamic task-weighing method within a multitask learning framework, which leverages domain knowledge and Mutual Information to balance task-specific losses, enhancing the learning efficiency for nutritional status screening. A dataset comprising 16,938 multipose images and anthropometric data from 2,162 children across various settings was generated, marking a significant first in this domain. After careful consideration of the authors’ rebuttal, two of the three reviewers lean towards a weak accept, whereas the third reviewer rejects the paper. I agree that the authors have somewhat addressed the major concerns and questions raised by the reviewers; however, the lack of technical/algorithmic details compromises the potential innovativeness of the paper. That said, I lean towards rejecting the paper.




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    I appreciate the authors for submitting a well-written manuscript. The rebuttal addressed the main concerns raised by two of the three assigned reviewers. However, reviewer #3’s concerns about missing technical details and limited evaluation metrics and comparison are still not addressed. Because of this, I recommend rejection.




Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper received mixed reviews and the criticism relates to novelty. This meta reviewer argues that the paper makes a valuable contribution despite its limitations and is suitable for the health equity track. In particular, the paper methodology is generally sound and focuses on automating nutritional status assessments in children. The dataset used in the project is extensive and unique, adding to the translational value of this approach. Thus, the paper makes a good starting point for further research and is well in scope for the health equity track.



