Abstract

Despite recent advancements in federated learning (FL) for medical image diagnosis, addressing data heterogeneity among clients remains a significant challenge for practical implementation. A primary hurdle in FL arises from the non-IID nature of data samples across clients, which typically results in a decline in the performance of the aggregated global model. In this study, we introduce FedMRL, a novel federated multi-agent deep reinforcement learning framework designed to address data heterogeneity. FedMRL incorporates a novel loss function to facilitate fairness among clients, preventing bias in the final global model. Additionally, it employs a multi-agent reinforcement learning (MARL) approach to calculate the proximal term (μ) for the personalized local objective function, ensuring convergence to the global optimum. Furthermore, FedMRL integrates an adaptive weight adjustment method using a Self-organizing map (SOM) on the server side to counteract distribution shifts among clients’ local data distributions. We assess our approach using two publicly available real-world medical datasets, and the results demonstrate that FedMRL significantly outperforms state-of-the-art techniques, showing its efficacy in addressing data heterogeneity in federated learning.
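
For readers unfamiliar with the proximal term mentioned above, the standard FedProx local objective is sketched below with a per-client coefficient. This is only an illustration: the subscript on μ reflects the abstract's statement that μ is chosen per client, and the symbols F_k (client k's local loss) and w^t (the global model at round t) are notation assumed here. FedMRL's own objective, which also includes a fairness term, is not reproduced.

```latex
\min_{w}\; h_k(w;\, w^{t}) \;=\; F_k(w) \;+\; \frac{\mu_k}{2}\,\bigl\lVert w - w^{t} \bigr\rVert^{2}
```

A larger μ_k keeps the local update closer to the global model, while μ_k = 0 recovers plain local training as in FedAvg.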

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2368_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2368_supp.pdf

Link to the Code Repository

https://github.com/Pranabiitp/FedMRL

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Sah_FedMRL_MICCAI2024,
        author = { Sahoo, Pranab and Tripathi, Ashutosh and Saha, Sriparna and Mondal, Samrat},
        title = { { FedMRL: Data Heterogeneity Aware Federated Multi-agent Deep Reinforcement Learning for Medical Imaging } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The study introduces FedMRL, a new framework for federated multi-agent deep reinforcement learning that targets the issue of data heterogeneity across different clients. FedMRL uses a multi-agent reinforcement learning approach to determine the proximal term in the personalized local objective function, ensuring that it converges towards the global optimum.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) Solid literature review; paper is well-written. Presentation is clear.

    2) The use of DRL for federated learning across different clients seems to be new.

    3) The combination of QMIX for calculating the proximal term and SOM for server-side weight aggregation should be new and reasonable.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    This work essentially presents a general learning model, which would be best published in venues related to learning, such as those where the major references FedAvg [12], FedProx [7], FedNova [18], and FedBN [9] were published.

    The innovation in this work is not related to MIC / CAI / Clinical Translation / Health Equity. Therefore, MICCAI may not be an appropriate venue for evaluating and publishing this work.

    In Question 3, “Please categorize the relevancy of the paper,” I don’t believe this paper relates to any of the four categories provided. I chose “MIC” only because I had to select one.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The innovations in this work primarily focus on federated learning. Therefore, I would recommend submitting the work to conferences such as ICLR, ICML, NeurIPS, etc.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work essentially presents a general learning model, which would be best published in venues related to learning, such as those where the major references FedAvg [12], FedProx [7], FedNova [18], and FedBN [9] were published.

    The innovation in this work is not related to MIC or CAI. Therefore, MICCAI may not be an appropriate venue for evaluating and publishing this work.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper introduces FedMRL, a novel FL framework specifically designed to manage data heterogeneity in medical imaging diagnosis. The approach integrates multi-agent RL to dynamically adapt the optimization process for decentralized data, addressing the challenges posed by non-IID data distributions among various clients, such as hospitals.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Main strengths of FedMRL:

    1) Uses multi-agent reinforcement learning for dynamically adjusting the proximal term (μ) in the local training of each client, adapting the training process to the specific data characteristics of each client, improving the relevance and effectiveness of the model updates.

    2) Incorporates a loss function to ensure fairness among clients - this loss function minimizes discrepancies in loss values across clients, aiming for a more equitable model performance irrespective of client data distribution.

    3) Employs a self-organizing map to adjust weights during model aggregation, based on the similarity of client data distributions to the global model, which helps to mitigate the impact of data distribution shifts, promoting a more stable convergence to a global optimum.
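
The server-side step described in point 3 amounts to a weighted average of client parameters. The sketch below shows only that averaging step, under stated assumptions: the function name `aggregate`, the dictionary layout of the parameters, and the idea that the similarity-based weights arrive as plain numbers are all hypothetical. How FedMRL derives the weights from the SOM is not shown here.

```python
from typing import Dict, List, Sequence


def aggregate(client_params: Sequence[Dict[str, List[float]]],
              weights: Sequence[float]) -> Dict[str, List[float]]:
    """Weighted average of client parameters (hypothetical helper).

    `weights` stands in for whatever similarity scores the server
    computes; they are normalized here so they sum to 1.
    """
    if not client_params or len(client_params) != len(weights):
        raise ValueError("need exactly one weight per client")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = float(sum(weights))
    norm = [w / total for w in weights]

    averaged: Dict[str, List[float]] = {}
    for name, first in client_params[0].items():
        # Average each coordinate of this parameter across clients.
        averaged[name] = [
            sum(a * params[name][i] for a, params in zip(norm, client_params))
            for i in range(len(first))
        ]
    return averaged


clients = [{"w": [0.0, 2.0]}, {"w": [4.0, 6.0]}]
result = aggregate(clients, [1.0, 3.0])
```

With equal weights this reduces to an unweighted mean of the client models; unequal weights shift the global model toward the clients the server trusts more.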

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Authors acknowledge potential scalability issues with increasing numbers of clients and the high computational demands of MARL and SOM-based aggregation. The methods, while effective, may not be practical in real-world scenarios where hundreds or thousands of hospitals might participate in federated learning systems. The computational overhead related to calculating personalized μ values and performing adaptive weight aggregation using SOM could be significant, especially in environments with limited resources.

    • The paper compares FedMRL with several state-of-the-art methods but does not include comparisons with other reinforcement learning-based or advanced federated learning strategies that address data heterogeneity. For instance, methods such as FjORD (Federated learning with Joint Optimization for Reduced Disparity) which also aim to address heterogeneity in federated learning setups could provide an example for comparison (Smith et al., 2021, Arxiv).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    • You have clearly outlined the datasets used and the general approach for training and testing. However, more details on the data preprocessing steps, model parameter settings, and validation procedures would enhance reproducibility.

    • The results section is detailed with comparisons to state-of-the-art methods. However, the impact of FedMRL’s novel components (adaptive μ values and SOM-based weight adjustments) on performance could be further clarified with an ablation study. Consider adding an ablation study that systematically removes or varies components of the FedMRL framework to show their individual contributions to the overall performance. This would provide deeper insight into which elements are most critical for handling data heterogeneity.

    • The manuscript is generally well-written but occasionally suffers from jargon that may not be accessible to all readers in the broader community.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and introduces a novel methodology with strong results. The pseudocode and publicly available datasets make it easily reproducible. However, as mentioned above, some information is missing; adding it would make the paper a clear accept.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The contributions of this paper are valid; however, I am uncertain whether MICCAI is the right place to publish it.



Review #3

  • Please describe the contribution of the paper

    FedMRL incorporates a novel loss function to facilitate fairness among clients, preventing bias in the final global model.

    FedMRL employs a multi-agent reinforcement learning (MARL) approach to calculate the proximal term (μ) for the personalized local objective function, ensuring convergence to the global optimum.

    FedMRL integrates an adaptive weight adjustment method using a Self-organizing map (SOM) on the server side to counteract distribution shifts among clients’ local data distributions.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Promising way to utilise MARL in FedProx to handle the Non-IID data. Utilising SOM for the Non-IID data aggregation is novel too.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The loss function is a simple summation of three terms; normalization or weight control is missing. How are these three terms scaled, and how is their relative importance during learning controlled?

    Clinical meaning is limited. It is basically a FL model evaluated with medical data.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Carefully consider the design of Eq. 3.

    More details about the ablation tests could be added.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is meaningful and promising research on federated learning, but its actual clinical meaning is limited.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The extended discussion highlights the effectiveness of the work, and the provision of source code addresses the reproducibility issue. However, the fundamental problem remains unchanged: this is an innovation in Federated Learning (FL), not in Medical Image Computing (MIC). Nonetheless, it cannot be denied that it plays a beneficial role in distributed training scenarios across multiple hospitals. Therefore, I maintain a ‘weak accept’ stance and leave the final decision to the meta-reviewer.




Author Feedback

We thank the reviewers for their positive and constructive suggestions, highlighting our novel and effective method for medical image classification. Here are our responses:

  1. Not related to MIC or CAI (R1, R4): We appreciate the acknowledgment of our paper’s strengths and the suggested venues such as ICLR, ICML, and NeurIPS. We will certainly consider these for future work. However, we emphasize that the primary objective of our work is medical image classification using federated learning (FL). Our work addresses the crucial challenge of data heterogeneity in FL-based medical image classification tasks. We utilized popular datasets for skin cancer and diabetic retinopathy, demonstrating that our approach mitigates challenges associated with FL-based computer-assisted diagnosis. For comparative analysis, we employed well-known baselines. Notably, in MICCAI 2022 and 2023, there were a total of 20 accepted FL-based papers using similar baselines, underscoring the relevance of our approach. We believe our proposed approach is a valuable contribution to the healthcare domain and aligns with MICCAI’s objectives. We humbly request reviewer R1 to reconsider the score and acknowledge our work’s significance.

  2. More details on the data preprocessing, model parameter settings, and validation procedures would enhance reproducibility (R3): Due to page limitations, we could not provide extensive details in the manuscript. Some of these are mentioned in Section 4.2. We will include the remaining details in the camera-ready paper and share a GitHub link to facilitate reproducibility.

  3. The manuscript is generally well-written but occasionally suffers from jargon (R3): We will review the entire manuscript and revise it to reduce jargon, ensuring better accessibility for all readers.

  4. Potential scalability issues with increasing numbers of clients and the high computational overhead (R3): Our model is designed for hospitals, clinics, and research institutions, enabling cross-silo FL with typically a few hundred clients. While the paper focuses on the efficacy of FedMRL in addressing data heterogeneity, future research will optimize computational aspects and explore deployment strategies for large-scale FL systems. We will explore distributed computing and resource-sharing models to alleviate the server-side burden. Additionally, we will investigate more computationally efficient MARL algorithms.

  5. Comparisons with other RL-based or advanced FL strategies (R3): We have compared our work with four popular state-of-the-art baselines addressing data heterogeneity. Most existing RL-based works use a single-agent approach, differing from our multi-agent approach. In future work, we will explore these RL-based methods for comparative analysis.

  6. Ablation study (R3, R4): Due to page limitations, we were unable to include ablation studies. We will add a small section discussing the outcomes in the camera-ready paper to provide readers with an understanding of the different parts of the architecture.

  7. The normalization or the weight control in the loss function and carefully consider the design of Eq 3 (R4): The loss function in FedMRL balances local loss, proximal regularization, and fairness across clients without explicit weight control. These terms collectively address data heterogeneity and ensure fair convergence across hospitals. The proximal term (μ) is dynamically adapted through our multi-agent RL approach, providing implicit weight control based on client state and performance feedback. This adaptive mechanism adjusts the regularization term’s importance according to each client’s data distribution and model performance. Exploring optimal weight settings and their impact on performance is a promising direction for future work, potentially leading to extensions of the current work.

  8. GitHub link for reproducibility (R1, R4): We will provide the GitHub link in the camera-ready version of the manuscript.
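
The weighting question discussed in point 7 can be made concrete with a small sketch of a three-term local objective. Everything here is an assumption for illustration: the function names, the form of the fairness term (absolute gap between a client's loss and the mean loss across clients), and the optional weights `alpha` and `beta` are hypothetical and are not taken from the paper's Eq. 3. With `alpha = beta = 1.0` the terms are simply summed, which matches the "no explicit weight control" description above.

```python
from typing import Sequence


def proximal_penalty(w: Sequence[float], w_global: Sequence[float], mu: float) -> float:
    """FedProx-style penalty: (mu / 2) * squared distance to the global model."""
    return 0.5 * mu * sum((a - b) ** 2 for a, b in zip(w, w_global))


def fairness_penalty(local_loss: float, all_client_losses: Sequence[float]) -> float:
    """Hypothetical fairness term: gap between this client's loss and the mean loss."""
    mean_loss = sum(all_client_losses) / len(all_client_losses)
    return abs(local_loss - mean_loss)


def local_objective(local_loss: float,
                    w: Sequence[float],
                    w_global: Sequence[float],
                    mu: float,
                    all_client_losses: Sequence[float],
                    alpha: float = 1.0,
                    beta: float = 1.0) -> float:
    """Sum of local loss, proximal penalty, and fairness penalty.

    alpha and beta are assumed knobs for explicit weight control;
    the defaults reproduce a plain unweighted sum.
    """
    return (local_loss
            + alpha * proximal_penalty(w, w_global, mu)
            + beta * fairness_penalty(local_loss, all_client_losses))


plain = local_objective(0.8, [1.0, 2.0], [0.0, 0.0], mu=0.1,
                        all_client_losses=[0.8, 0.4, 0.6])
no_fairness = local_objective(0.8, [1.0, 2.0], [0.0, 0.0], mu=0.1,
                              all_client_losses=[0.8, 0.4, 0.6], beta=0.0)
```

In this sketch μ already scales the proximal term, so an adaptive μ acts as implicit weighting for that term only; the fairness term has no such knob unless something like `beta` is introduced.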




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Reviewers are not sure this paper is appropriate for MICCAI, because it is “pure” federated learning research with experiments on two medical imaging datasets. I recommend acceptance with the intention of encouraging researchers from outside to participate in MICCAI. But this paper can be rejected if it is considered out of the scope of MICCAI.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    Reviewers are not sure this paper is appropriate for MICCAI, because it is “pure” federated learning research with experiments on two medical imaging datasets. I recommend acceptance with the intention of encouraging researchers from outside to participate in MICCAI. But this paper can be rejected if it is considered out of the scope of MICCAI.



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper can be accepted with minor revisions. I hope the authors will update their paper again before submitting the final version.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    The paper can be accepted with minor revisions. I hope the authors will update their paper again before submitting the final version.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This work has three reviewers, and two of them are in favor of accepting it. Although this work has minor drawbacks, I think it can be accepted by MICCAI 2024. The authors are advised to revise the paper based on the reviewer comments, and especially to explain why the proposed federated learning method is dedicated to medical images.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    This work has three reviewers, and two of them are in favor of accepting it. Although this work has minor drawbacks, I think it can be accepted by MICCAI 2024. The authors are advised to revise the paper based on the reviewer comments, and especially to explain why the proposed federated learning method is dedicated to medical images.


