Abstract

In medical imaging, domain adaptation (DA) enables the transfer of knowledge from models trained on labeled source domains to unlabeled target domains that exhibit distribution shifts. In the real world, medical images often carry multiple disease-related labels. However, existing multi-label domain adaptation (MLDA) algorithms face two primary challenges in addressing multi-label domain shifts: inadequate capture of disease features and insufficient integration of information from each individual class. To tackle these challenges, we propose a novel approach, Wasserstein Adversarial Learning with Class-Level Alignment, designed to align feature distributions for medical MLDA. By utilizing adversarial learning guided by the Wasserstein distance, our approach captures more complete domain-invariant representations of lesion regions. Additionally, we introduce a class-level alignment loss that leverages individual class information to further reduce domain discrepancies. Extensive experiments on real medical datasets demonstrate that our method significantly enhances medical multi-label domain adaptation and outperforms existing state-of-the-art algorithms.
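To make the two components concrete, the sketch below shows one plausible PyTorch form of the training signals: a WGAN-GP-style critic that estimates the Wasserstein distance between source and target features, and a class-level alignment term that compares probability-weighted per-class feature means across domains. This is a minimal illustrative sketch; the critic design, gradient-penalty weight, and soft class weighting are assumptions, not necessarily the paper's exact implementation.

    import torch
    import torch.nn.functional as F

    def critic_wasserstein_loss(critic, feat_src, feat_tgt, gp_weight=10.0):
        # WGAN-GP-style critic objective: the critic's mean score gap estimates
        # the Wasserstein distance between source and target feature batches.
        w_dist = critic(feat_src).mean() - critic(feat_tgt).mean()
        # Gradient penalty on interpolates keeps the critic approximately 1-Lipschitz.
        alpha = torch.rand(feat_src.size(0), 1, device=feat_src.device)
        inter = (alpha * feat_src + (1 - alpha) * feat_tgt).requires_grad_(True)
        grads = torch.autograd.grad(critic(inter).sum(), inter, create_graph=True)[0]
        gp = ((grads.norm(2, dim=1) - 1) ** 2).mean()
        return -w_dist + gp_weight * gp, w_dist      # critic minimizes -W + GP

    def class_level_alignment(feat_src, prob_src, feat_tgt, prob_tgt):
        # For each label, compare probability-weighted feature means of the two
        # domains so that every class contributes its own alignment signal.
        loss = 0.0
        for c in range(prob_src.size(1)):
            ws, wt = prob_src[:, c:c + 1], prob_tgt[:, c:c + 1]
            mu_s = (ws * feat_src).sum(0) / (ws.sum() + 1e-6)
            mu_t = (wt * feat_tgt).sum(0) / (wt.sum() + 1e-6)
            loss = loss + F.mse_loss(mu_s, mu_t)
        return loss / prob_src.size(1)

In a full training loop, the feature extractor would be updated to minimize the source-domain binary cross-entropy plus the estimated Wasserstein distance and the class-level alignment term, alternating with critic updates.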

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5107_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/lwjie595/WAL_CLA

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LiuWen_Solving_MICCAI2025,
        author = { Liu, Wenjie and Miao, Fuyou and Wang, Xu},
        title = { { Solving Medical Multi-Label Domain Adaptation via Wasserstein Adversarial Learning with Class-Level Alignment } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15966},
        month = {September},

}


Reviews

Review #1

  • Please describe the contribution of the paper

    C1. The authors introduce a Wasserstein adversarial learning process guided by the Wasserstein distance to enhance the model's ability to capture the intrinsic characteristics of medical images in MLDA. C2. The authors introduce a class-level alignment loss that explicitly aligns the feature distributions of each class to better integrate individual label information.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    S1. The paper structure is reasonable and the method is described clearly. S2. The experiments are comprehensive and reasonable. S3. The motivation is reasonable.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    W1. The authors do not present more visualization results. W2. The authors should evaluate the method on more datasets, such as the synthetic double-moon dataset, Office-31, Office-Home, and so on.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The experiments are not enough.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors introduce Wasserstein Adversarial Learning with Class-Level Alignment (WAL-CLA), designed specifically for medical multi-label domain adaptation.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. They explore a challenging problem in the medical domain: domain adaptation combined with multi-label classification.
    2. They utilize the Wasserstein distance, which appears to be a more stable measure for source-target alignment.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. More explanation is needed of why multi-label data in the medical domain is challenging and how it differs from the standard multi-class setting.
    2. The explanation in the introduction of why most existing multi-class DA algorithms are not good enough for multi-label domain-shifted problems is insufficient. What exactly do existing multi-class DA algorithms lack?
    3. I am skeptical about the domain splits; more evidence is needed of how these splits represent challenging distribution shifts in a real clinical environment.
    4. What about the trade-off between, or interference among, the losses, i.e., the binary cross-entropy and the adversarial WGAN-like objective?
    5. It lacks generalizability. They should evaluate their model on larger cohort data.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Authors should elaborate on the unique challenges of multi-label domain adaptation for medical data. Also, it would be beneficial to provide deeper theoretical support for the class-level alignment, and validate the real impact on clinically relevant tasks and metrics.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a method for multi-label domain adaptation, using adversarial learning based on the Wasserstein distance combined with a class-level alignment loss. The method is specifically designed for multi-label problems, such as the chest X-ray classification tasks on which the paper is evaluated. The evaluation indicates that this approach outperforms several baselines.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is an interesting extension to the Wasserstein-based approach, tailored to the multi-label application.

    • The experiments include a small ablation study showing the contribution of the two components.

    • The introduction provides a clear justification of the method. The implementation is fairly clearly explained.

    • The authors made their code available.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The discussion of the results is somewhat limited. For example, it would have been interesting to discuss why the proposed method seems to work less well on Fron->Lat and Lat->Fron.

    • The evaluation covers two datasets, but they are both chest X-ray datasets. First, it would have been nicer to include a different application. Second, given that they are both chest X-ray datasets, it would have been interesting to do the adaptation between those two datasets.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • Table 2: Typo: “Targe” is missing a t at the end.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In general, a nice paper with a method that seems to work.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I’d like to thank the authors for their responses in the rebuttal. Overall, I still think this is an interesting paper. I would have liked to see a comparison on a non-chest-X-ray dataset, but I think the current evaluation is sufficient to get an idea of the performance.




Author Feedback

We sincerely thank all reviewers for their thoughtful and valuable feedback. We will carefully correct confusing explanations and refine the paper to improve readability in revision. Below we respond to the reviewers’ questions.

Q1: Experiments (R1,R2,R3) We appreciate these constructive suggestions. To the best of our knowledge, ChestX-ray14 and CheXpert are among the largest publicly available X-ray datasets. Specifically, ChestX-ray14 contains 112,120 images, while CheXpert contains 224,316 images. We believe that the two datasets used in our study are sufficiently large to validate the effectiveness of our method. In addition, Office-31 and Office-Home are mainly designed for multi-class domain adaptation (DA), where each sample is associated with a single label, rather than for multi-label DA (MLDA). The synthetic Double Moon dataset is typically used as a toy example for DA. Also, none of these three datasets is tailored to medical applications, which limits their relevance to our setting. Moreover, many publicly available medical datasets lack multi-label annotations, which limits their applicability to our setting. To address this, we are currently collaborating with the medical imaging center of our local hospital to curate new multi-label ultrasound datasets. In the future, we will further extend our method to a broader range of medical modalities and clinical applications in an extended journal version. Since new results are not allowed in the rebuttal, we will include more results on the fundus dataset ODIR-5K, as well as transfer results between the two chest X-ray datasets, in the revision if permitted.

Q2: Limited discussion (R1) Thanks for the comment. The method works less well on Fron→Lat and Lat→Fron mainly because of the severe data imbalance between the two domains when split by view: the Fron domain contains over 190k samples, while the Lat domain includes only about 30k samples. This large imbalance poses a significant challenge to the Wasserstein adversarial alignment. We will discuss our results in more detail in the final version.

Q3: Visualizations (R2) Thanks for the suggestion. We will include Grad-CAM-based attention map visualizations to better illustrate which regions of the image the model attends to during the MLDA process.
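As a rough illustration of the kind of visualization meant here, the following is a minimal hand-rolled Grad-CAM sketch for a multi-label classifier; the backbone, target layer, label count, and sigmoid-based target score are assumptions for illustration, and a library such as pytorch-grad-cam could be used instead.

    import torch
    import torch.nn.functional as F
    from torchvision import models

    def grad_cam(model, target_layer, image, class_idx):
        # Returns a heatmap for one label of a multi-label (sigmoid) classifier.
        feats, grads = {}, {}
        h1 = target_layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
        h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

        score = torch.sigmoid(model(image))[0, class_idx]   # per-label probability
        model.zero_grad()
        score.backward()

        weights = grads["a"].mean(dim=(2, 3), keepdim=True)             # GAP of gradients
        cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))   # weighted activations
        cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
        h1.remove(); h2.remove()
        return (cam / (cam.max() + 1e-8)).squeeze().detach()

    # Hypothetical setup: ResNet-50 backbone with 14 ChestX-ray14 labels.
    model = models.resnet50(weights=None)
    model.fc = torch.nn.Linear(model.fc.in_features, 14)
    heatmap = grad_cam(model.eval(), model.layer4[-1], torch.randn(1, 3, 224, 224), class_idx=0)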

Q4: Challenge of multi-label data (R3) Thanks for your advice. As illustrated by the example on Page 2, multi-label data refers to cases where a patient's X-ray image may exhibit multiple co-occurring diagnostic labels, such as Atelectasis and Effusion. For clinicians, this increases the risk of overlooking one or more relevant labels during diagnosis. In contrast, multi-class data refers to instances where each sample is annotated with only a single diagnostic label. We will add more explanation in the revision.

Q5: Multi-class DA lack (R3) Multi-class DA models mainly focus on assigning a single label to each sample, which limits their ability to capture the co-occurrence of interrelated labels and the dependencies among multiple disease indicators. In contrast, MLDA models are designed to capture the presence of multiple diagnoses simultaneously.
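A compact, hypothetical sketch of the difference (not the paper's code): a multi-class DA model scores mutually exclusive classes with softmax and cross-entropy, whereas an MLDA model treats each finding as an independent sigmoid output with binary cross-entropy, so co-occurring diagnoses such as Atelectasis and Effusion can both be positive.

    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 3)                       # 4 chest X-rays, 3 findings

    # Multi-class: exactly one label per image (mutually exclusive classes).
    y_multiclass = torch.tensor([0, 2, 1, 0])
    ce_loss = F.cross_entropy(logits, y_multiclass)  # softmax over the 3 classes

    # Multi-label: each image may carry several findings at once.
    y_multilabel = torch.tensor([[1., 0., 1.],
                                 [0., 1., 1.],
                                 [0., 0., 0.],
                                 [1., 1., 0.]])
    bce_loss = F.binary_cross_entropy_with_logits(logits, y_multilabel)  # per-label sigmoid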

Q6: Domain splits (R3) To illustrate the distribution shift across domains, we take the label statistics of the Fron and Lat domains from the CheXpert dataset as an example. In the Fron domain, 41.7% of samples are labeled with ‘Pleural Effusion’ and 26.4% with ‘Edema’. In contrast, in the Lat domain, only 29.6% and 7.8% of samples carry these labels, respectively. Moreover, as shown in Tables 1 and 2, a ResNet trained on the source domain performs significantly worse when transferred to the target domain than a ResNet trained directly on the target domain.
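For reference, per-domain label prevalence of the kind cited above could be computed with a few lines of pandas; the CSV path and column names below are illustrative assumptions based on the CheXpert label format.

    import pandas as pd

    # Hypothetical layout: one row per study, a 'view' column (Frontal/Lateral)
    # and one 0/1 column per finding.
    df = pd.read_csv("chexpert_labels.csv")
    findings = ["Pleural Effusion", "Edema"]

    prevalence = df.groupby("view")[findings].mean().mul(100).round(1)
    print(prevalence)   # e.g. Frontal: 41.7 / 26.4 vs Lateral: 29.6 / 7.8, as reported above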

Q7: Trade-off of losses (R3) The impact of each loss component on model performance is shown in Table 3. We conducted a parameter sensitivity analysis on the PA→AP and AP→PA tasks of ChestX-ray14 to determine the weights of the different loss terms. Due to limited space, the full details are not presented.
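As a hedged sketch of how such a weighting might look (the weights and grid values are hypothetical, not the reported settings), the total objective for the feature extractor could be combined and swept as follows.

    # Hypothetical loss weights; a sensitivity analysis would sweep values like
    # these on PA→AP and AP→PA before fixing them.
    lambda_adv, lambda_cla = 0.1, 1.0

    def total_loss(bce_source, wasserstein_dist, class_align):
        # Fit source labels, shrink the estimated Wasserstein distance, and
        # pull per-class feature statistics together.
        return bce_source + lambda_adv * wasserstein_dist + lambda_cla * class_align

    grid = [(a, c) for a in (0.01, 0.1, 1.0) for c in (0.1, 1.0, 10.0)]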




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


