Abstract
Federated learning (FL) provides a promising paradigm for collaboratively training machine learning models across distributed data sources while maintaining privacy. Nevertheless, real-world FL often faces major challenges, including the communication overhead of transferring large model parameters and the statistical heterogeneity arising from non-independent and identically distributed (non-IID) data across clients. In this work, we propose an FL framework that 1) provides inherent interpretations using prototypes, and 2) tackles statistical heterogeneity by utilising lightweight adapter modules that act as compressed surrogates of local models and guide clients towards generalisation despite varying client distributions. Each client locally refines its model by aligning class embeddings toward prototype representations and simultaneously adjusts the lightweight adapter. Our approach replaces the communication of entire model weights with that of prototypes and lightweight adapters. This design ensures that each client’s model aligns with a globally shared structure while minimising communication load and providing inherent interpretations. Moreover, we conducted our experiments on a real-world retinal fundus image dataset, which provides clinical-site information. We demonstrate the inherent interpretability of the approach and perform a classification task that shows improvements in accuracy over baseline algorithms.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3955_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/berenslab/FedAdapterProto
Link to the Dataset(s)
N/A
BibTex
@InProceedings{OfoSam_PrototypeGuided_MICCAI2025,
author = { Ofosu Mensah, Samuel and Djoumessi, Kerol and Berens, Philipp},
title = { { Prototype-Guided and Lightweight Adapters for Inherent Interpretation and Generalisation in Federated Learning } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15966},
month = {September},
pages = {466 -- 475}
}
Reviews
Review #1
- Please describe the contribution of the paper
In this work, the authors propose an FL framework that is claimed to 1) provide interpretations using prototypes, and 2) tackle statistical heterogeneity by utilising lightweight adapter modules. Each client locally refines its model by aligning class embeddings toward prototype representations and simultaneously adjusts the lightweight adapter. This approach is claimed to be communication-efficient.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The novelty lies in the clever design of adapters and prototypes to make the approach communication-efficient and interpretable. However, the reviewer needs more evidence and experimentation to decide whether these claims are generalizable.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
A. Although communication efficiency is claimed, it is not quantified, and no comparison is presented with other algorithms or with other communication-efficient federated learning approaches. For example:
- M. Chawla, G. R. Gupta, S. Gaddam and M. Wadhwa, “Beyond Federated Learning for IoT: Efficient Split Learning With Caching and Model Customization,” IEEE Internet of Things Journal, vol. 11, no. 20, pp. 32617-32630, Oct. 2024.
- M. Beitollahi and N. Lu, “FLAC: Federated learning with autoencoder compression and convergence guarantee”, Proc. IEEE Global Commun. Conf., pp. 4589-4594, 2022.
- F. Sattler, S. Wiedemann, K.-R. Müller and W. Samek, “Robust and communication-efficient federated learning from non-IID data”, arXiv:1903.02891, 2019.
B. Interpretability studies are not clear. It is not sure if there has been a human validation of the interpretability claims. No metric is presented. Only some sample figures are presented.
C. Only 1 dataset is used and the limitations of the approach have not been discussed.
D. More questions: How to choose prototypes? How many prototype examples need to be chosen per client? Does that depend on number of classes? Is the prototype the same for every client? In the federated setting, who is supposed to choose prototypes?
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The claims haven’t been justified with proper experimentation. The design has promise, but the evaluation needs to be thorough. Please refer to the weakness section to see the limitations.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper proposes the first federated learning (FL) framework to integrate adapter-based fine-tuning with prototype-based learning. In addition, although the proposed framework does not yield SOTA performance, it allows inherent interpretation of the local features learned by each local model.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is the first to combine template matching and adapters in federated learning. The introduced prototype-based learning incorporates a learnable multi-scale template matcher to explicitly learn multi-scale features for sliding-window interpretation of feature importance, which is a novel way to explain how the model handles the input information.
- The paper presents extensive benchmark comparison between proposed model and prior works.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- Although the proposed work includes extensive benchmarking comparing the proposed framework with prior works, an ablation study examining the effectiveness of each proposed module is missing.
- The description of the prototype regularisation is unclear. A clear explanation of the correct-class and wrong-class prototypes, as well as of the loss functions used for lclst and lsep, is needed to understand how the prototypes are updated locally.
- Based on the description in the paper, the major difference from FedAdapter with proximal constraint is the added prototype-based learning, which lowers the performance of the model, but the paper does not clearly interpret the reason for this phenomenon.
- The paper reports only accuracy for the classification task, while more robust metrics (AUROC, mean Average Precision, etc.) are not included, which weakens the overall assessment of model performance.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper proposes an interesting lightweight prototype-based method in federated learning to improve the interpretability of local models while maintaining efficient communication among sites. However, the detailed design of the prototype-based learning loss function needs to be clearly explained, and an insight into why the added prototype module reduces performance compared to FedAdapter is also required in order to evaluate the impact of the paper.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors’ rebuttal answers the major concerns raised in the initial review. However, based on the rebuttal, the federated adapter with proximal constraint achieves the best performance, which shows that the proposed method is not the optimal solution for data heterogeneity across sites; the authors should reframe their contribution as achieving efficient communication while preserving similar performance, to avoid confusion. Overall, the proposed work is innovative and should be accepted.
Review #3
- Please describe the contribution of the paper
The transfer of model weights for large models across networks can be challenging for federated learning training with limited bandwidth or other scarce resources.
This work transfers lighter-weight components ("adapters" and "prototypes") as condensed representations of the local models, which are then aggregated to perform the federated training. The authors demonstrated their method on retinal fundus images (EyePACS), used a pretrained ResNet50 model, and performed federated training with four clients.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper uses federated learning with two main benefits: reducing the transfer of heavy model file sizes to minimize communication overhead, and using different clients to aim for explainability of the trained networks.
The FedAdapter approach with a proximal constraint achieved strong results (slightly better than the original prototypes; ~86.7 AUC versus 78 AUC for FedAvg).
On the APTOS dataset (as an external dataset), the classification was ~95% accurate.
The paper looks at interpretability: Figure 4 shows how the different clients identify different regions, revealing any inherent weaknesses of the model.
Prototype learning (much like federated learning) is still a new and developing field.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
A discussion of where this would be applicable would be greatly appreciated. Any hospital that has access to advanced imaging equipment will have access to the internet; otherwise, how is a healthcare practitioner supposed to analyze the images? While the work is valuable for theoretical understanding and possible frameworks, it would benefit from describing more potential application scenarios.
The paper mentions aligning the embeddings of the prototypes, and proposes a general methodology, but lacks details on how the prototypes are selected initially and refined during the federated learning process. This information would greatly enhance the paper.
Although this is slightly addressed in the introduction, more text could be included on a more detailed discussion of how combining adapters and prototypes overcome where prior works fall short.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Federated learning is still quite young, and there are improvements to be made, especially in reducing the model weights that must be transferred; the adapter-and-prototype framework explores methods for doing so. The combination appears to be unique, although both adapters and prototypes have been used before. The work also seems to improve on the state of the art. The retinal datasets, although used previously in FL contexts, do not appear to be frequently published on.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We thank the reviewers for their insightful feedback. They appreciated our novel proposed model designed to efficiently communicate only adapters and prototypes in a federated learning setting while providing interpretations.
Applicability [R1]: We apologise for the omission. Our target users are resource-constrained or bandwidth-constrained nodes such as 1) community hospitals in low- and middle-income countries with internet connectivity but with typical upstream rates of < 5 Mbps and expensive data plans, 2) on-premises Picture Archiving and Communication Systems (PACS) that are permitted to exchange parameters but not pixels or 3) mobile medical (diabetic retinopathy (DR)) screening units that connect only intermittently (e.g. nightly sync over 4G or satellite).
Questions on prototypes [R1, R3]: In our work, choosing prototypes depends on several hyperparameters, as correctly pointed out. Specifically, we used the same number of prototypes per client. The number of prototypes depends on the number of classes C and clients K (i.e. C × K). The prototypes start identically from the global parameters and are adapted by each client during local training. In the federated learning setting, the server chooses the prototypes in the sense of providing initial weights, and the clients adapt them during local training. We will clarify these questions in the paper.
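As an illustration of the flow described above (server initialises a shared prototype set, clients adapt local copies, server averages them each round), here is a minimal NumPy sketch; the function names, the plain class-wise averaging rule, and the simulation of local training as noise are assumptions, not the paper's implementation:

```python
import numpy as np

def init_prototypes(num_classes, dim, rng):
    """Server initialises one global prototype per class; every client
    starts each round from this shared set."""
    return rng.normal(size=(num_classes, dim))

def aggregate(client_prototypes):
    """Server averages the locally adapted prototypes class-wise to form
    the global set for the next communication round."""
    return np.mean(np.stack(client_prototypes), axis=0)

rng = np.random.default_rng(0)
C, K, dim = 5, 4, 32                  # classes, clients, embedding size
global_protos = init_prototypes(C, dim, rng)
# Each client adapts its copy during local training (simulated as noise here).
client_protos = [global_protos + 0.01 * rng.normal(size=(C, dim))
                 for _ in range(K)]
global_protos = aggregate(client_protos)
assert global_protos.shape == (C, dim)
```

In one round, only the K × C prototype vectors travel to the server, rather than full model weights.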
Combining adapters and prototypes [R1]: In our work, the adapters mitigate client-specific feature shift by giving each client a low-rank residual, while the prototypes impose a shared interpretable structure on the logit space, helping the global model, and therefore also the local models, resist label shift while offering case-level explanations. Their combination gives both personalisation and interpretability. We will include this explanation in the work.
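A low-rank residual adapter of the kind mentioned above can be sketched in a few lines of NumPy; the rank, dimensions, initialisation, and names below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def make_adapter(d, r, rng):
    """Low-rank adapter: down-projection A (d x r) and up-projection B (r x d).
    B starts at zero so the residual branch is initially a no-op."""
    return {"A": rng.normal(0.0, 0.02, size=(d, r)), "B": np.zeros((r, d))}

def adapter_forward(h, adapter):
    """Add the low-rank residual to frozen backbone features h."""
    return h + h @ adapter["A"] @ adapter["B"]

rng = np.random.default_rng(0)
adapter = make_adapter(d=512, r=8, rng=rng)
h = rng.normal(size=(4, 512))          # a batch of backbone features
out = adapter_forward(h, adapter)
assert np.allclose(out, h)             # B is zero, so the residual is zero
# Only A and B are communicated: 2 * d * r values instead of d * d.
n_params = adapter["A"].size + adapter["B"].size
```

Communicating A and B (here 8,192 values) instead of a full d × d layer (262,144 values) is what makes the adapter a compressed surrogate of the local model.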
Ablation study [R2]: Our method is a combination of FedPrototypical and FedAdapter, so these individual methods already serve as ablations of each module. We will add more information to Table 1, including other performance metrics (in this case, AUC). We acknowledge that an ablation study over different adapter architectures (e.g. LoRA) and prototype hyperparameters is missing. We will address how each choice affects accuracy, interpretability, and communication cost in future work.
Describing prototype regularisation [R2]: We apologise for the omission and will clarify this in the paper. The correct-class prototypes are representative of the correct class being classified, while wrong-class prototypes represent instances that are not in the correct class. The cluster loss (lclst) pulls each feature vector towards the nearest prototype of its own class, while the separation loss (lsep) pushes it away from the wrong-class prototypes. The gradient of the combined regulariser is used to update the prototype vectors locally on each client; the resulting prototypes are averaged by the server to form the global set for the next communication round.
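The cluster/separation regulariser described above can be illustrated with a NumPy sketch; the squared-distance formulation, the simplification to one prototype per class, and all names are assumptions for illustration, not the paper's exact losses:

```python
import numpy as np

def proto_losses(z, y, prototypes):
    """Cluster/separation regulariser over class prototypes.

    z:          (N, d) feature vectors
    y:          (N,) integer class labels
    prototypes: (C, d), one prototype per class (simplified)
    Returns (l_clst, l_sep): mean squared distance to the correct-class
    prototype, and mean squared distance to the nearest wrong-class one.
    """
    d2 = ((z[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (N, C)
    correct = d2[np.arange(len(y)), y]          # distance to own class
    d2_wrong = d2.copy()
    d2_wrong[np.arange(len(y)), y] = np.inf     # mask out the own class
    wrong = d2_wrong.min(axis=1)                # nearest wrong-class prototype
    return correct.mean(), wrong.mean()

rng = np.random.default_rng(1)
prototypes = rng.normal(size=(3, 16))
y = rng.integers(0, 3, size=8)
z = prototypes[y] + 0.1 * rng.normal(size=(8, 16))  # features near own prototype
l_clst, l_sep = proto_losses(z, y, prototypes)
# Minimising l_clst while maximising l_sep pulls features toward their
# class prototype and pushes them away from wrong-class prototypes.
assert l_clst < l_sep
```

In training one would minimise a weighted combination such as l_clst − λ·l_sep, with gradients flowing into both the features and the prototype vectors.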
Interpreting the difference from FedAdapter with proximal constraint [R2]: The adapters with proximal constraint alone have the full capacity of the classifier head, whereas our model shares part of that capacity with the prototypes. The small accuracy gap is therefore expected; the gain is the provision of visual explanations.
Communication efficiency claims [R3]: We apologise for omitting this part from the work. We will add the number of parameters of each algorithm and their corresponding bytes to Table 1 to clarify communication efficiency claims.
Interpretability studies [R3]: While bounding boxes show evidence of DR lesions (Figure 3), we did not add human validation of the interpretability claims, and this will be added to the limitation of the work. We plan to do a further study which includes human validation of the interpretations.
Dataset and limitations [R3]: Besides the EyePACS dataset, we reported a zero-shot evaluation on the APTOS dataset (Sec. 4.3).
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The rebuttal addresses the key concerns raised by the reviewers, as confirmed by R1 and R3. I carefully examined the response to R2 and believe the clarification can be helpful.