Abstract

Federated learning (FL) has emerged as a promising approach for collaborative medical image analysis, enabling multiple institutions to build robust predictive models while preserving sensitive patient data. In the context of Whole Slide Image (WSI) classification, FL faces significant challenges, including heterogeneous computational resources across participating medical institutes and privacy concerns. To address these challenges, we propose FedWSIDD, a novel FL paradigm that leverages dataset distillation (DD) to learn and transmit synthetic slides. On the server side, FedWSIDD aggregates synthetic slides from participating centres and distributes them across all centres. On the client side, we introduce a novel DD algorithm tailored to histopathology datasets which incorporates stain normalisation into the distillation process to generate a compact set of highly informative synthetic slides. These synthetic slides, rather than model parameters, are transmitted to the server. After communication, the received synthetic slides are combined with original slides for local tasks. Extensive experiments on multiple WSI classification tasks, including CAMELYON16 and CAMELYON17, demonstrate that FedWSIDD offers flexibility for heterogeneous local models, enhances local WSI classification performance, and preserves patient privacy. This makes it a highly effective solution for complex WSI classification tasks.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1647_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/f1oNae/FedWSIDD

Link to the Dataset(s)

CAMELYON16 dataset: https://camelyon17.grand-challenge.org/Data/
CAMELYON17 dataset: https://camelyon17.grand-challenge.org/Data/

BibTex

@InProceedings{JinHao_FedWSIDD_MICCAI2025,
        author = { Jin, Haolong and Liu, Shenglin and Cong, Cong and Feng, Qingmin and Liu, Yongzhi and Huang, Lina and Hu, Yingzi},
        title = { { FedWSIDD: Federated Whole Slide Image Classification via Dataset Distillation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15973},
        month = {September},
        pages = {183--193}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper introduces FedWSIDD, an innovative federated learning algorithm tailored for the classification of whole slide images (WSIs). FedWSIDD utilizes techniques such as dataset distillation, stain normalization, and template matching to create synthetic slides from real WSIs. This approach addresses the variability in computational resources and multiple instance learning (MIL) architectures across different institutions. The method demonstrates superior test classification accuracy compared to other state-of-the-art federated learning techniques on the CAMELYON16 and CAMELYON17 datasets, providing a practical solution for federated learning in resource-constrained digital pathology contexts.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    One of the key advantages of this paper is its use of dataset distillation to generate synthetic slides for transmission to the server, instead of traditional model weights. This innovative approach reduces the need to share real patient data and accommodates differences in MIL algorithms and heterogeneous computational capacities across various centers. The authors conducted a thorough evaluation, benchmarking FedWSIDD against other standard and personalized federated learning methods, clearly illustrating its benefits over competing techniques. The disclosure of source code enhances the work’s utility for the broader research community.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Despite its novel contributions, the complexity of the method may pose challenges for adoption, particularly in medical centers with limited technical expertise. While synthetic data offers privacy benefits, it may compromise explainability, as evidenced by difficulties in interpreting patterns on synthetic slides in Figure 3. Additionally, while the methodology excels in classification tasks, the reliance on synthetic data might omit complex disease patterns in pathology slides, potentially impacting applications such as staging and survival prediction. The evaluation scope is limited to specific datasets like CAMELYON16 and CAMELYON17; expanding to broader datasets could enhance the generalizability of the findings. Furthermore, the paper lacks a discussion on the implementation in real-time clinical workflows, which is crucial for its practical application in medical settings.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The authors state that in each distillation round, one real slide (xj) and one synthetic slide (sj) are selected for each class. Clarification on the selection process and whether synthetic slides are categorized into different classes would be beneficial in the text. If I understand correctly, the authors use M synthetic slides per class (M=10 as implemented). Each synthetic slide is distilled using one real slide from the same class. Given a sufficient number of distillation rounds, each synthetic slide will be feature-matched to all real slides. Consequently, all aggregated extracted features of synthetic slides as per Equation (3) would be consistent across the index j. The authors are encouraged to comment on how the number of distillation rounds affects the expressive power of synthetic slides. A discussion on the method’s limitations would be valuable, alongside a detailed description of the stain normalization process, particularly how it applies to synthetic slides. A more comprehensive account of data preprocessing steps, including stain normalization and synthetic data generation, would be essential for replicating the experimental conditions. Additionally, it would be beneficial to provide an analysis of the computational requirements and overhead resulting from synthetic data generation and stain normalization processes, particularly in practical deployments.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method seems to be beneficial to the research community. However, there are unclear parts in the writing that need further clarification. Moreover, the limitations of the method were not discussed.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    Many thanks to the author for addressing my comments. I suggest accepting this paper.



Review #2

  • Please describe the contribution of the paper

    The paper presents a new federated learning method applicable to the problem of WSI classification.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper combines the idea of federated learning with a specific application in WSI classification. They do this while avoiding the need to have each patch classified, by assuming that multiple instance learning methods will be applied at each node. What is shared between nodes are synthetic patches that (hopefully) share the relevant features of the original patches. By sharing only synthetic patches, the problem of violating data sharing rules is avoided. The results show that, at least in the benchmarks used, the method outperforms both standard FL methods and what the authors call personalized FL methods.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    It is unclear to me how the authors can guarantee that the synthetic patches, obtained by a distillation method, preserve the original properties of the real patches.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The results shown on the CAMELYON datasets indicate that the methods leads to effective distributed learning, and the approach is sound, despite the doubts posed above about the fidelity of the synthetic patches when compared with the real patches.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors answered, to some degree, my central objection, that the synthetic patches may not carry enough information.



Review #3

  • Please describe the contribution of the paper

    The main contribution of the paper is the introduction of FedWSIDD, a novel federated learning framework specifically designed for weakly supervised classification of Whole Slide Images (WSIs). Unlike conventional FL approaches that require sharing model parameters or assume homogeneous computational resources across institutions, FedWSIDD proposes to share synthetic slides, generated through a custom dataset distillation algorithm. This approach improves inter-institution collaboration while preserving patient data privacy and allowing flexibility in local MIL (Multiple Instance Learning) model selection. Moreover, the framework integrates stain normalization into the distillation process, enhancing the quality and generalizability of synthetic slides. Extensive experiments on benchmark datasets (CAMELYON16 and CAMELYON17) demonstrate superior classification performance and robustness of FedWSIDD, especially in heterogeneous and privacy-sensitive environments.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strengths of this paper lie in the clinical relevance of the addressed problem and the originality of the proposed solution. The authors introduce a novel federated learning framework based on dataset distillation, tailored for WSI classification under weak supervision. The formulation is well-motivated, and the method demonstrates both practical scalability and adaptability across heterogeneous models. The paper provides a thorough description of the data processing and experimental setup, ensuring reproducibility. A broad and well-designed set of experiments is presented, including ablation studies and performance evaluation across diverse configurations. Particularly noteworthy is the inclusion of statistical significance testing. An additional strength, though not emphasized by the authors, is the reduced variance in performance across trials, indicating that the proposed method not only improves mean performance but also increases training stability compared to existing FL techniques.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    I did not identify any major weaknesses in the paper. If anything, one suggestion for improvement would be to provide a more in-depth discussion regarding the choice of the number of synthetic patches B. While the experiments offer empirical evidence supporting the use of a limited number of synthetic slides, a more conceptual and methodological rationale behind this design choice would have been valuable. This could help readers better understand the trade-offs involved and potentially guide future applications or extensions of the method. Furthermore, since this work proposes a federated learning algorithm where communication between clients and the central server is a key component, an analysis of the system load in terms of the volume of data transmitted during each round of training would have been a valuable addition to assess scalability and practical feasibility in real-world distributed settings.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I decided to recommend an accept because the topic is highly relevant and timely. The paper is well written, the proposed method is clearly presented and technically sound, and the experimental evaluation is both comprehensive and appropriate. Moreover, the comparison with state-of-the-art approaches is well conducted, supporting the claims of the authors and demonstrating the effectiveness of the proposed solution.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We thank R1 and R2 for their positive assessments and R3 for the constructive feedback.

R1, R3: The effectiveness and fidelity of the synthetic patches. Effectiveness: Our preliminary results indicate that training solely on synthetic slides can still achieve performance comparable to training on local real data, suggesting that synthetic slides effectively capture discriminative properties of real slides. Moreover, FedWSIDD leverages both real and synthetic slides during local training, ensuring that complex disease patterns in the original data are not overlooked. Future work will extend evaluations to more complex tasks (such as staging and survival prediction). Fidelity: Synthetic patches are initialized randomly and iteratively aligned with randomly selected real patches during distillation, allowing each synthetic patch to learn distinct trajectories and capture diverse aspects of the real data [5, 27]. Regarding the number of distillation rounds, our current configuration follows the standard setting in prior work [12]. Further ablation studies will be explored in a future journal submission.

R2-1: Rationale behind the choice of the number of synthetic patches. The rationale aligns with the objective of dataset distillation, leading us to begin with small values for M and B. As shown in Section 3.4 and Fig. 2, unlike natural image datasets where more synthetic data often improves performance, the gains on WSI datasets plateau beyond a certain point. This likely reflects the limited data variation in WSI datasets, where a smaller set of synthetic samples can effectively capture key features.

R2-2: Analysis of system load. Using CAMELYON16 as a reference: FedWSIDD transmits 20 synthetic slides (10 per class), with each slide comprising 100 patches of size 3x64x64. This results in a data volume of ~9 MB per round.
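The transmitted volume above can be sanity-checked with a quick back-of-envelope calculation; the snippet below assumes the patches travel as raw 8-bit RGB (the actual serialization is not specified in the feedback, and the quoted ~9 MB presumably reflects a compressed or reduced-precision encoding):

```python
# Back-of-envelope payload size for FedWSIDD's one-shot communication,
# assuming uncompressed 8-bit RGB patches (serialization format assumed).
slides = 20          # 10 synthetic slides per class, 2 classes (CAMELYON16)
patches = 100        # B = 100 patches per slide
c, h, w = 3, 64, 64  # patch shape reported in the feedback
raw_bytes = slides * patches * c * h * w
print(f"raw payload: {raw_bytes / 1e6:.1f} MB")  # ~24.6 MB uncompressed
```

Either way, the payload is orders of magnitude smaller than repeatedly exchanging model parameters over many communication rounds.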

R3-1: The method complexity may pose challenges for adoption. Compared to other FL frameworks that require multiple communication rounds, FedWSIDD uses one-shot communication, which reduces the overall complexity. Moreover, we have open-sourced the code and are committed to providing support to facilitate adoption.

R3-2: Expanding experiments to broader datasets. We chose CAMELYON16/17 as they are widely used in prior works [19]. Besides, we have also conducted preliminary evaluations on TCGA-IDH. Additional results will be included in a future journal submission.

R3-3: Clarify the stain normalisation. For each class, we randomly initialize M synthetic slides consisting of B patches, each as learnable parameters. We then sample a real slide from the same class and apply a differentiable stain normalisation [8] to both real and synthetic patches. Specifically, a template patch with pixel mean and standard deviation closest to the dataset average is selected, and its stain vector and concentration are pre-computed offline. During training, stain normalisation is applied via matrix multiplication, aligning both real and synthetic patches to the template’s colour appearance using the derived stain vector and concentration. We will revise the manuscript to clarify this process further.
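The matrix-multiplication step described above can be sketched as a generic Macenko-style stain transfer. The NumPy sketch below is illustrative only: the paper's version is differentiable (e.g. implemented in PyTorch per [8]), and the names `W_src`/`W_tmpl` are assumed placeholders for the pre-computed stain-vector matrices of the source image and the offline-selected template:

```python
import numpy as np

def stain_transfer(patch, W_src, W_tmpl):
    """Map a patch's colour appearance onto a template's stain basis.

    patch: (H, W, 3) uint8 RGB image.
    W_src, W_tmpl: (3, 2) H&E stain-vector matrices, assumed
    pre-computed offline (e.g. via Macenko estimation).
    """
    # Convert RGB to optical density (Beer-Lambert law).
    od = -np.log((patch.astype(np.float64) + 1.0) / 256.0)
    od_flat = od.reshape(-1, 3).T                           # (3, H*W)
    # Least-squares stain concentrations under the source basis.
    conc, *_ = np.linalg.lstsq(W_src, od_flat, rcond=None)  # (2, H*W)
    conc = np.clip(conc, 0.0, None)
    # Re-render the concentrations with the template's stain vectors.
    od_norm = W_tmpl @ conc                                 # (3, H*W)
    out = 256.0 * np.exp(-od_norm) - 1.0
    return np.clip(out, 0, 255).T.reshape(patch.shape).astype(np.uint8)
```

Because the core operations (log, matrix multiply, exp) are all differentiable, the same transform can sit inside the distillation loop and let gradients flow back into the learnable synthetic patches.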

R3-4: Computational overhead. Using CAMELYON16 as a reference, local training takes ~1 hour per client. With M=10, B=100, each distillation step takes ~3 hours per client. For comparison, other baselines require multiple global communication rounds (we use 20 rounds following [19]), resulting in a total runtime of ~20 hours. In contrast, FedWSIDD's one-shot communication significantly reduces overhead.

R3-5: Limitations. The trade-off between realism and privacy is discussed in Section 3.4; to address it, future work will explore generative models to enhance interpretability. While FedWSIDD was not explicitly tested in real clinical workflows, it was evaluated in a practical homogeneous local-model setup (Section 3.3), and ongoing collaborations with multiple medical centers will further assess its clinical applicability.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


