Abstract
Federated learning is a decentralized training approach that keeps data under stakeholder control while achieving superior performance over isolated training. While inter-institutional feature discrepancies pose a challenge in all federated settings, medical imaging is particularly affected due to diverse imaging devices and population variances, which can diminish the global model’s effectiveness. Existing aggregation methods generally fail to adapt across varied circumstances. To address this, we propose FedCLAM, which integrates client-adaptive momentum terms derived from each client’s loss reduction during local training, as well as a personalized dampening factor to curb overfitting. We further introduce a novel intensity alignment loss that matches predicted and ground-truth foreground distributions to handle heterogeneous image intensity profiles across institutions and devices. Extensive evaluations on two datasets show that FedCLAM surpasses eight cutting-edge methods in medical segmentation tasks, underscoring its efficacy. The code is available at https://github.com/siomvas/FedCLAM.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3215_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/siomvas/FedCLAM
Link to the Dataset(s)
N/A
BibTex
@InProceedings{SioVas_FedCLAM_MICCAI2025,
author = { Siomos, Vasilis and Passerat-Palmbach, Jonathan and Tarroni, Giacomo},
title = { { FedCLAM: Client Adaptive Momentum with Foreground Intensity Matching for Federated Medical Image Segmentation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15965},
month = {September},
pages = {251 -- 261}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes a new federated framework for medical image segmentation, named FedCLAM. The framework incorporates a Foreground Intensity Matching (FIM) loss to address inter-institutional heterogeneity by penalizing discrepancies in intensity distributions between predicted and ground-truth foreground regions. In addition, the method introduces a client-adaptive, momentum-based aggregation strategy derived from each client’s loss reduction during local training.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The introduction of the Foreground Intensity Matching (FIM) loss is a notable strength, which is specifically designed to address inter-client heterogeneity in medical image segmentation.
- The experimental section is thorough and well-structured. The authors compare their method against a diverse set of baselines, covering a wide range of federated segmentation approaches. The dataset usage is relatively comprehensive, and the inclusion of detailed ablation studies helps validate the effectiveness of the proposed components.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The quality of the method illustration is a notable weakness. Several visual elements, such as disconnected lines and inconsistent shapes for identical components, indicate a lack of rigor in figure design. These issues undermine the clarity of the method and give the impression of carelessness.
- There is an inconsistency between the method description and the pseudocode regarding the update rule. Equation (3) does not clearly specify whether the speed vectors are computed using values from the previous round or only from the current one, which may cause confusion about the temporal dependencies in the proposed formulation.
- The explanation of the proposed Foreground Intensity Matching (FIM) loss lacks clarity. While the idea of aligning intensity distributions between predicted and ground-truth regions is interesting, the formulation incorrectly refers to “ground-truth intensity values,” which do not exist—ground-truth labels are categorical and contain no intensity information. In practice, intensity values are derived from the input image using the predicted or ground-truth masks, but this critical detail is not explicitly stated in the text or equations, potentially leading to misunderstandings about the loss computation.
- In the ablation study, the dice score decreases as the weight of the FIM loss increases, suggesting that the loss may negatively impact model optimization when overly emphasized. This raises concerns about its stability and overall contribution to performance.
- The paper should report the number of samples per client in the dataset description, as this information is critical for assessing whether performance differences are influenced by data imbalance—particularly since FedAvg uses client data quantity as aggregation weights.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
- Please be more rigorous in the design of figures. Visual quality and clarity are not merely aesthetic considerations—they reflect the overall level of care and precision in the work. Inconsistent shapes, disconnected lines, and unclear layout can diminish the perceived credibility of the method.
- Ensure consistency in mathematical notation throughout the paper. Every symbol or variable introduced should be clearly defined, and formulas presented in the main text should match those used in the pseudocode. Discrepancies between them can confuse readers and hinder reproducibility.
- In Section 3.2, adding brief explanations of the roles of multiplication and addition within the speed vector computation would help convey the intuition behind the formulation and improve logical clarity.
- In Section 3.3, the idea of addressing inter-site intensity variation is a valuable and innovative contribution. However, the method description repeatedly refers to “ground-truth intensity values,” which is misleading, as ground-truth masks are categorical and contain no intensity information. It is only by inspecting the code that it becomes clear the intensity values are actually extracted from the input image using the predicted or ground-truth masks. This crucial detail should be explicitly stated in the text to avoid confusion and to accurately reflect the actual implementation of the Foreground Intensity Matching (FIM) loss.
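To make the computation described above concrete, here is a minimal, illustrative sketch of a foreground intensity matching loss of this kind: intensities are pooled from the input image under the predicted and ground-truth masks, and their first two moments are compared. The function name `fim_loss` and the choice of matching mean and variance are assumptions for illustration; the paper’s exact formulation may differ.

```python
import numpy as np

def fim_loss(image, pred_mask, gt_mask, eps=1e-6):
    """Illustrative foreground intensity matching loss (hypothetical
    simplification, not the paper's exact formulation).

    Intensities come from the *input image*, pooled under the soft
    predicted mask and the binary ground-truth mask; the loss penalizes
    discrepancies between the two intensity distributions.
    """
    # Mask-weighted mean intensity under each mask.
    pred_mean = (image * pred_mask).sum() / (pred_mask.sum() + eps)
    gt_mean = (image * gt_mask).sum() / (gt_mask.sum() + eps)
    # Mask-weighted intensity variance under each mask.
    pred_var = (pred_mask * (image - pred_mean) ** 2).sum() / (pred_mask.sum() + eps)
    gt_var = (gt_mask * (image - gt_mean) ** 2).sum() / (gt_mask.sum() + eps)
    # Match the first two moments of the two foreground distributions.
    return (pred_mean - gt_mean) ** 2 + (pred_var - gt_var) ** 2
```

Note that the loss is zero when the predicted mask pools exactly the same intensities as the ground-truth mask, which is the alignment behavior the review asks the text to state explicitly.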
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While the paper presents a novel perspective and addresses a relevant problem, the overall presentation lacks clarity and rigor. The method description and diagrams are not carefully crafted, and some important details, such as how the intensity values are obtained in the proposed loss, are missing from the text and only become clear after examining the code. This makes the paper difficult to follow for readers who rely on the manuscript alone. Furthermore, the ablation results are not fully convincing, as performance drops when the FIM loss weight increases. These issues collectively affect the overall quality of the work.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The paper introduces a novel federated learning framework for segmentation. The proposed framework features a new client-adaptive, momentum-based aggregation method, together with dampening factors, to address the data heterogeneity issue in federated learning. In addition, the framework proposes a new foreground intensity matching loss to further harmonize contrast and brightness biases across datasets. The paper also performs extensive evaluation on two datasets and shows SOTA performance against multiple baseline models.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed client-adaptive momentum and dampening terms innovatively address the current challenge of non-IID federated learning through adaptive aggregation adjustment based on each client’s local updates.
- The proposed Foreground Intensity Matching loss innovatively encourages the model to learn site-specific contrast/intensity-based features to further improve segmentation performance, and could be generalized to centralized learning setups as well.
- The study performed detailed evaluation and thorough ablation study, which showed the proposed framework’s SOTA performance on two public datasets and explored the contribution of each introduced module through detailed ablation study.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The adopted backbone segmentation model is still the vanilla U-Net. It would be interesting to compare the performance of the proposed framework on different backbone models to further evaluate performance across model architectures.
- Although the paper compares the proposed method with an extensive set of baseline federated learning models, it would be interesting to compare against centralized training and validate how much of a gap there is between the proposed federated learning method and the centralized method.
- Although the proposed method is implemented in a federated setting, the paper does not discuss privacy mechanisms within the framework design, which may not represent real-world federated learning setups.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper clearly presents the current advances and challenges of federated learning on data with non-IID distributions. Based on these challenges, the paper proposes innovative modifications to both the aggregation process and the loss functions to suppress the effect of data heterogeneity. The paper also presents extensive validation on two distinct datasets, demonstrating better performance than prior works. The innovation in the federated learning framework and the structured validation support acceptance of the paper.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
There are three major categories of parameter aggregation methods in federated learning: (1) cost-based weighted averaging, (2) similarity-based weighted averaging, and (3) optimization-based updates. This study proposes a novel method that falls under the category of optimization-based updates. The main contribution lies in integrating locally trained and validated information into momentum and dampening terms. These terms are used by the aggregator to construct a velocity vector that guides the update direction from the previous global parameters to the next ones, scaled by a learning rate. In addition, the study introduces a new loss component based on pixel-level similarity, specifically tailored for medical imaging tasks.
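As a rough illustration of the optimization-based update scheme described above, a server-side velocity update might look like the following. This is not FedCLAM’s exact rule (whose momentum and dampening terms are derived from each client’s local loss reduction); the function name, the global momentum coefficient `beta`, and the scalar `dampening` are assumptions for illustration.

```python
import numpy as np

def server_update(global_params, client_deltas, client_momenta,
                  velocity, server_lr=1.0, beta=0.9, dampening=0.1):
    """Illustrative server-side momentum aggregation (hypothetical
    simplification, not the paper's exact update rule).

    client_deltas:  per-client parameter updates (local minus global)
    client_momenta: per-client scalars, e.g. derived from loss reduction
    velocity:       running velocity vector from the previous round
    """
    # Normalize per-client momentum factors into aggregation weights.
    weights = np.asarray(client_momenta, dtype=float)
    weights /= weights.sum()
    # Momentum-weighted combination of client update directions.
    combined = sum(w * d for w, d in zip(weights, client_deltas))
    # Velocity carries over from the previous round; dampening
    # attenuates the contribution of the current round.
    velocity = beta * velocity + (1.0 - dampening) * combined
    return global_params + server_lr * velocity, velocity
```

The velocity vector makes the temporal dependency explicit: the round-`r` update depends on the round-`r-1` velocity, which is the point the review below asks the paper to clarify in its Eq. (3).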
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
(1) One of the main strengths of the paper is the proposal of a novel method for computing momentum and dampening terms in the model aggregation process in federated learning. This approach offers a fresh perspective on how local training can be effectively utilized to guide global updates. (2) Additionally, the study introduces a loss component that reflects the characteristics of medical images, aiming to mitigate the training instability often observed with traditional losses such as Dice and binary cross-entropy. (3) Another notable strength is the use of an optimization-based update scheme that appropriately considers both global and local optimizers.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
One of the main weaknesses of the paper is the claim that the proposed method outperforms state-of-the-art algorithms, without sufficient consideration of how performance can vary depending on local data distributions. In federated learning, the behavior and effectiveness of algorithms are highly sensitive to heterogeneity in client data, which can significantly affect both training and final performance.
Therefore, a more comprehensive evaluation using diverse performance metrics specific to federated learning is necessary. For example, the convergence score used in the FeTS Federated Learning Challenge provides a meaningful measure that captures how quickly different algorithms reach their maximum performance within the same training budget. Incorporating such metrics would strengthen the evaluation and provide a more nuanced comparison.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Convergence score is a time-to-convergence metric calculated as the area under the validation learning curve over a specific period, where the horizontal axis represents simulated runtime and the vertical axis represents the current best score. The score itself is computed as the average of the Dice scores for enhancing tumor, tumor core, and whole tumor over the validation split of the training data. Detailed information can be found at: https://github.com/FETS-AI/Challenge/blob/main/Task_1/README.md
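For reference, the area-under-the-curve computation described above can be sketched as follows. This is a simplified stand-in for the challenge’s official scoring code, not a reimplementation of it; the function and variable names are illustrative.

```python
import numpy as np

def convergence_score(runtimes, val_scores):
    """Trapezoidal area under the running-best validation curve,
    normalized by total simulated runtime (simplified sketch of the
    FeTS challenge convergence metric)."""
    t = np.asarray(runtimes, dtype=float)
    # "Current best score" curve: the running maximum of the
    # validation scores seen so far.
    best = np.maximum.accumulate(np.asarray(val_scores, dtype=float))
    # Trapezoidal area under the running-best curve.
    area = np.sum((best[1:] + best[:-1]) / 2.0 * np.diff(t))
    return area / (t[-1] - t[0])
```

A method that reaches its peak score early accumulates more area under the running-best curve, so it receives a higher convergence score than one that only matches that peak late in training.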
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(6) Strong Accept — must be accepted due to excellence
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents an impressive performance on evaluation against various state-of-the-art models, demonstrating the effectiveness of the proposed approach. Particularly noteworthy is the use of momentum and dampening terms in the federated model aggregation process, which addresses the instability that often arises when local training information is incorporated into aggregation. Furthermore, the ablation study provides valuable insights into the selection and impact of the hyperparameters associated with these mechanisms.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We’d like to thank all the Reviewers for their valuable time and effort in reviewing our paper. There were some interesting suggestions and comments raised in the Reviews, which we address below:
Reviewers #2 and #3 raised excellent points with regards to possible extensions of our experiments, namely expanding to more backbone architectures, comparing centralised performance, adding privacy measures like DP, and comparing convergence scores. We plan to acknowledge these avenues in the camera-ready paper and endeavour to pursue them further.
Reviewer #1 had some constructive feedback which will help us improve the manuscript for the camera-ready version. Concretely:
- We will clarify how the speed vectors are initialised in Eq. (3) and add a $r-1$ exponent to the RHS to clarify temporal dependencies.
- We will rephrase the term “Ground-truth intensity” to “Intensity of ground-truth foreground” where it appears, so that the FIM loss formulation is clearer and aligned between our text and our code.
- We will add further details of the number of samples per client in our dataset description.
Once again, we’d like to thank the Reviewers for their kind comments and acknowledgment of our contributions. Your feedback has been invaluable in helping us improve the quality of our work, and we look forward to incorporating these refinements in the final version.
Meta-Review
Meta-review #1
- Your recommendation
Provisional Accept
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A