Abstract

Developing robust machine learning algorithms is of utmost importance for biomedical imaging applications. This is non-trivial, as networks are generally trained on datasets taken from relatively homogeneous samples dominated by statistically more probable disease classes, leading to unbalanced class distributions. One possible solution is to resolve the intrinsic biases towards the dominating classes in the training datasets by collecting more data from a more diverse sample, which is often prohibitively expensive. Another solution is to directly apply established uncertainty estimation measures for more robust predictions; these, however, are computationally demanding and insensitive to class imbalance. To address this issue, we propose a novel class-aware and uncertainty-aware pseudocoreset framework consisting of the following components: 1) an efficient framework with last layer Laplacian approximation, 2) class-aware calibration with error-based regularization, and 3) a Wasserstein distance-based regularization which explicitly imposes uncertainty-awareness. We evaluate our method on In-Distribution calibration, Out-of-Distribution inference, and class balance in two public skin cancer datasets taken from samples from different geographical locations with differing skin colors. Our method outperforms various baseline uncertainty quantification and Bayesian pseudocoreset methods.
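
As an illustration of the third component, the 2-Wasserstein distance between two Gaussian distributions has a closed form. The sketch below is not taken from the paper or its repository: it assumes diagonal covariances, in which case the squared distance reduces to the squared distance between the means plus the squared distance between the per-dimension standard deviations. The function name and argument names are hypothetical.

```python
import math


def w2_squared_diag(mean_p, var_p, mean_q, var_q):
    """Squared 2-Wasserstein distance between two Gaussians.

    Assumption: both covariances are diagonal, given as lists of variances.
    """
    if not (len(mean_p) == len(var_p) == len(mean_q) == len(var_q)):
        raise ValueError("all inputs must have the same length")
    # Squared Euclidean distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mean_p, mean_q))
    # For diagonal covariances the trace term becomes a sum over
    # squared differences of standard deviations.
    cov_term = sum(
        (math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_p, var_q)
    )
    return mean_term + cov_term


if __name__ == "__main__":
    print(w2_squared_diag([0.0], [1.0], [3.0], [4.0]))
```

With full covariance matrices the trace term involves matrix square roots, which this simplified version avoids.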

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2092_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/fx-erick/LLLP

Link to the Dataset(s)

N/A

BibTex

@InProceedings{EriFra_Last_MICCAI2025,
        author = { Erick, Franciskus Xaverius and Müller, Johanna Paula and Li, Zhe and Kainz, Bernhard},
        title = { { Last Layer Laplacian Pseudocoresets for Robust Medical Image Analysis } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},
        pages = {277 -- 287}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    In this paper the authors lay out a framework for a class- and uncertainty-aware classification using Bayesian pseudocoresets. This framework improves efficiency by using Laplacian approximation, adds a class-aware regularisation term, adds an uncertainty-aware regularisation term based on calibration, and uses Wasserstein-2 distance regularisation to minimise divergence between pseudocoresets and the original datasets. The authors then conducted several sets of experiments using the ISIC 2019 and ASAN datasets. The authors experimented with in-distribution vs. out-of-distribution, class balance experiments, and an ablation study with the components of the framework LLLP.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed framework helps make existing methods for uncertainty quantification applicable for more medical image analysis methods with a specific focus on skin lesion classifications.
    • The authors have proposed an approach that focuses on computational efficiency in order for these methods to be used in real clinical applications.
    • The methodology is clearly explained, with each component of the pipeline described in sufficient detail to justify its inclusion.
    • The authors have conducted a multi-part evaluation, conducting multiple experiments focusing on different aspects of the proposed method.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • For evaluating uncertainty quantification and out-of-distribution systems, it has recently been recommended that the area under the risk-coverage curve be used as a more robust form of evaluation. See “A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification.”
    • Expected Calibration Error (ECE) was used in this paper for the class-aware regularisation term and for evaluation; ECE is generally disfavoured within the MICCAI community, with a preference for KDE-ECE, which uses a kernel density estimate instead of binning. See “Metrics reloaded: recommendations for image analysis validation.”
    • The ASAN skin dataset is made up of close-up photographs, whereas the ISIC dataset is made up of dermoscopic images. This means that training on one set and evaluating the other will lead to issues with distribution shift.
    • There is not enough information in the paper to reproduce these experiments; the code used or supplemental material with more details would help with reproducibility.
    • Minor Issue: Colour jitter was used as part of the data augmentation; this is not a good augmentation to use with skin lesions as slight differences in colour can be key for classifications.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well formatted and explained, with a well-justified novel framework that combines pre-existing methods with some novel contributions to make them work together. There are some issues stopping this from being an Accept, as mentioned in the weaknesses section.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A
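
For readers unfamiliar with the calibration metric discussed in this review, the following is a minimal sketch of the binned Expected Calibration Error. It assumes equal-width confidence bins and top-label confidences; it is not the authors' implementation, and it is not the KDE-based variant the reviewer recommends. The function and parameter names are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted top-label probabilities in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, hit in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += hit
        count[idx] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_sum[b] / count[b]
        avg_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    print(expected_calibration_error([0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0]))
```

The result depends on the number of bins, which is one reason binning-free estimators are sometimes preferred.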



Review #2

  • Please describe the contribution of the paper
    • The authors propose to optimize the Bayesian pseudocoresets using a last layer Laplacian approximation followed by a closed-form 2-Wasserstein distance.
    • In addition, the authors propose to optimize the expected calibration error in addition to the cross entropy loss from Zhao et al.
    • Results show improved accuracy, calibration and negative log likelihood, and an ablation of the different portions of the loss shows that all portions are important.
  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • While last layer Laplacian approximation for uncertainty estimation has been published in multiple prior works, its application in constructing pseudocoresets seems novel
    • Good ablations on the three part loss function and comparison to prior methods
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Last layer Laplacian approximation makes the computation of 2-Wasserstein distances simpler, but it comes at the cost of computing the Hessian for every pseudocoreset. This was not discussed in the paper.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Clear organization, non-trivial contributions, good experiments and ablations

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A
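
To make the Hessian cost raised in this review concrete, the sketch below computes the last-layer Hessian for a deliberately simplified case: a binary logistic last layer with a Gaussian prior on its weights. This is an assumption chosen for brevity, not the paper's setting or the Laplace library's API; all names are hypothetical. The Hessian has size d × d for d last-layer features, and in a Laplace approximation its inverse serves as the posterior covariance.

```python
import math


def last_layer_hessian(features, weights, prior_precision=1.0):
    """Hessian of the negative log posterior w.r.t. last-layer weights.

    Assumption: binary logistic output, isotropic Gaussian prior.
    features: list of feature vectors (penultimate-layer activations).
    """
    d = len(weights)
    # Start from the prior contribution: prior_precision * identity.
    hessian = [
        [prior_precision if i == j else 0.0 for j in range(d)]
        for i in range(d)
    ]
    for phi in features:
        logit = sum(w * x for w, x in zip(weights, phi))
        p = 1.0 / (1.0 + math.exp(-logit))
        curvature = p * (1.0 - p)
        # Each sample adds a rank-one term: curvature * phi phi^T.
        for i in range(d):
            for j in range(d):
                hessian[i][j] += curvature * phi[i] * phi[j]
    return hessian


if __name__ == "__main__":
    print(last_layer_hessian([[1.0, 2.0], [0.0, 1.0]], [0.0, 0.0]))
```

The cost grows with the number of samples times d squared, which stays small only because d counts last-layer parameters rather than all network weights.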



Review #3

  • Please describe the contribution of the paper

    The paper proposes a class aware pseudocoreset construction methodology by combining various regularization terms.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper attempts to solve an important problem in medical AI: class imbalance is intrinsic to healthcare applications and is often overlooked in common AI methodologies. Using the ASAN skin dataset for outlier benchmarking presents a real-world use case and challenge in skin cancer analysis.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The complete flow of the framework could be better explained. The method builds on prior work but assumes all readers are already familiar with the “classical” flow and terminology; the average MICCAI attendee would be helped by a more guided, self-explanatory description of the method. Several design choices aim to reduce the computational load of training (e.g. not using trajectory matching, the concept of first creating pseudocoresets itself, …), but how much these components actually contribute to a more efficient system is not quantified in the experiment section.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The fact that the method tries to solve a real-world use case, e.g. evaluating skin lesions on darker skin, shows the merit of the proposed approach. However, the different components should be explained for a broad medical AI audience who do not necessarily know what pseudocoresets are, why they are needed/beneficial, how this all relates to the Laplacian approximation, …

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We thank the reviewers for the constructive comments and address the discussion points below.

ECE, OOD detection metrics: We implemented the common metrics used in various uncertainty quantification methods to ensure a fair comparison with baseline and state-of-the-art methods. We are aware that newer, more robust metrics are available, and we would additionally implement them in future studies to develop a more robust evaluation.

Choice of datasets and distribution shifts: We explicitly aim to detect semantic outliers, rather than to quantify or detect explicit distribution shifts, in line with currently established state-of-the-art and baseline methods. In these works, neural networks were similarly trained on in-distribution datasets and evaluated on out-of-distribution datasets. Quantifying and incorporating explicit information about distribution shifts, however, is a potential future avenue that could further improve our method’s robustness.

Last-layer Laplacian approximation: The last-layer formulation provides a significantly lighter-weight alternative for uncertainty quantification, without the computational burden of re-training or fine-tuning the networks that is incurred by the Bayesian neural networks used in previously established pseudocoresets [2,3,4] and by classic UQ baselines such as ensembles and SNGP. While the computation of Hessians can be heavy, the Laplace-redux library [1] allows for efficient partial Hessian approximations. Additionally, the partial Hessian approximations are computed only for the last layer instead of the entire network’s weights. Taking into account both training and inference time, our method still offers a lighter-weight alternative to other baselines.

Comparison of computational efficiency: We will include a further analysis of computational efficiency in the final version. We will release our source code to facilitate further explorations and studies.

[1] Daxberger, E., Kristiadi, A., Immer, A., Eschenhagen, R., Bauer, M., Hennig, P.: Laplace redux – effortless bayesian deep learning, NeurIPS 2021.
[2] Kim, B., Choi, J., Lee, S., Lee, Y., Ha, J.W., Lee, J.: On divergence measures for bayesian pseudocoresets, NeurIPS 2022.
[3] Kim, B., Lee, H., Lee, J.: Function space bayesian pseudocoreset for bayesian neural networks, NeurIPS 2023.
[4] Manousakas, D., Xu, Z., Mascolo, C., Campbell, T.: Bayesian pseudocoresets, NeurIPS 2021.
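
As a pointer for the newer metrics mentioned in the rebuttal, the area under the risk-coverage curve (AURC) raised by Review #1 can be sketched as follows. This is a generic illustration under the usual definition (samples ranked by confidence, selective risk averaged over all coverage levels); it is not part of the authors' evaluation, and the function name is hypothetical.

```python
def aurc(confidences, errors):
    """Area under the risk-coverage curve.

    confidences: per-sample confidence scores (higher = more certain).
    errors: 1 if the prediction was wrong, 0 if it was right.
    """
    n = len(confidences)
    if n == 0 or n != len(errors):
        raise ValueError("inputs must be non-empty and of equal length")
    # Rank samples from most to least confident.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    cumulative_errors = 0
    total_risk = 0.0
    for k, i in enumerate(order, start=1):
        cumulative_errors += errors[i]
        # Selective risk when only the k most confident samples are kept.
        total_risk += cumulative_errors / k
    return total_risk / n


if __name__ == "__main__":
    good = aurc([0.9, 0.8, 0.7, 0.6], [0, 0, 1, 1])
    bad = aurc([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0])
    print(good, bad)
```

A lower value means that errors are concentrated among low-confidence predictions, so the two rankings in the usage example differ even though their accuracy is identical.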




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


