Abstract

Recent advances in histopathology vision-language foundation models (VLFMs) have shown promise in addressing data scarcity for whole slide image (WSI) classification via zero-shot adaptation. However, these methods remain outperformed by conventional multiple instance learning (MIL) approaches trained on large datasets, motivating recent efforts to enhance VLFM-based WSI classification through few-shot learning paradigms. While existing few-shot methods improve diagnostic accuracy with limited annotations, their reliance on conventional classifier designs introduces critical vulnerabilities to data scarcity. To address this problem, we propose a Meta-Optimized Classifier (MOC) comprising two core components: (1) a meta-learner that automatically optimizes a classifier configuration from a mixture of candidate classifiers and (2) a classifier bank housing diverse candidate classifiers to enable a holistic pathological interpretation. Extensive experiments demonstrate that MOC outperforms prior art on multiple few-shot benchmarks. Notably, on the TCGA-NSCLC benchmark, MOC improves AUC by 10.4% over the state-of-the-art few-shot VLFM-based methods, with gains up to 26.25% under 1-shot conditions, offering a critical advancement for clinical deployments where diagnostic training data is severely limited. Code is available at https://github.com/xmed-lab/MOC.
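
The following minimal PyTorch sketch illustrates the design the abstract describes: a bank of parameter-free candidate classifiers scores each patch, and a small trainable meta-learner mixes them per patch. The cosine-similarity candidates, the temperature values, the hidden width, and the mean-pooled slide aggregation are all illustrative assumptions, not the paper's exact implementation; the rebuttal below confirms only that the sole trainable part is a two-layer perceptron operating at the patch level.

```python
# Minimal sketch of a meta-optimized classifier (MOC-style) head.
# Assumptions (not from the paper): cosine-similarity candidates built
# from frozen VLFM text embeddings, mean-pooled slide aggregation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaOptimizedHead(nn.Module):
    def __init__(self, feat_dim: int, text_embs: torch.Tensor,
                 temps=(0.01, 0.05, 0.1)):
        """
        text_embs: (num_classes, feat_dim) frozen class-prompt embeddings
                   from the VLFM text encoder.
        temps: softmax temperatures; each defines one parameter-free
               candidate classifier with a different decision sharpness.
        """
        super().__init__()
        self.register_buffer("text_embs", F.normalize(text_embs, dim=-1))
        self.temps = temps
        # The only trainable part: a two-layer perceptron (per the rebuttal)
        # mapping each patch feature to weights over the candidate bank.
        self.meta_learner = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, len(temps)),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (num_patches, feat_dim) frozen VLFM patch embeddings
        x = F.normalize(patch_feats, dim=-1)
        sims = x @ self.text_embs.T                               # (P, C)
        # Candidate bank: parameter-free classifiers with distinct emphases.
        candidates = torch.stack(
            [F.softmax(sims / t, dim=-1) for t in self.temps], dim=1
        )                                                         # (P, K, C)
        # Per-patch mixture weights over candidates (patch-level meta-learning).
        w = F.softmax(self.meta_learner(patch_feats), dim=-1)     # (P, K)
        patch_probs = (w.unsqueeze(-1) * candidates).sum(dim=1)   # (P, C)
        # Aggregate patch-level predictions into a slide-level prediction.
        return patch_probs.mean(dim=0)                            # (C,)
```

In this sketch, only `meta_learner` receives gradients; the VLFM encoders and the candidate bank stay frozen, which is consistent with the rebuttal's low-computational-cost claim.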

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0042_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/xmed-lab/MOC

Link to the Dataset(s)

N/A

BibTex

@InProceedings{XiaTia_MOC_MICCAI2025,
        author    = {Xiang, Tianqi and Li, Yi and Zhang, Qixiang and Li, Xiaomeng},
        title     = {{MOC: Meta-Optimized Classifier for Few-Shot Whole Slide Image Classification}},
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year      = {2025},
        publisher = {Springer Nature Switzerland},
        volume    = {LNCS 15964},
        month     = {September},
        pages     = {425--434}
}


Reviews

Review #1

  • Please describe the contribution of the paper

The main contribution of this article is the proposal of a Meta-Optimized Classifier (MOC), which addresses the limitations of current WSI classification methods based on vision-language foundation models in situations of data scarcity. In addition, the proposed method achieves excellent performance, surpassing traditional multiple instance learning methods trained on large-scale datasets.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

This paper introduces an innovative framework consisting of two key components:
    • Meta-learner: a meta-learner that automatically optimizes the classifier configuration by selecting from a diverse set of candidate classifiers, thereby maximizing classification performance.
    • Classifier bank: a classifier bank that houses a variety of candidate classifiers, enabling a more comprehensive pathological interpretation and improving the model’s ability to adapt to a wide range of pathological features.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

One of the main weaknesses of the paper is that while it emphasizes the few-shot classification capability of vision-language foundation models as its key innovation, it does not fully leverage the potential of these models. The paper does not sufficiently demonstrate how VLFMs are effectively utilized in addressing few-shot learning challenges. Additionally, the proposed meta-optimizer, which is presented as an innovative component, lacks thorough justification. The use of a simple linear layer does not convincingly reflect the concept of meta-optimization, and the explanation of how it contributes to few-shot learning remains underdeveloped. Overall, the logical foundation and innovation of the meta-optimizer appear insufficiently explored.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    My overall score for this paper was primarily based on my evaluation of the proposed method’s design and innovation. While the paper addresses an interesting problem in the context of few-shot learning for VLFMs, I found that the method lacks sufficient novelty. The key innovation, namely the meta-optimizer, is not convincingly presented, and its functionality appears to be overly simplistic, with the use of a linear layer not effectively demonstrating the concept of meta-optimization. Additionally, the paper does not fully exploit the potential of VLFMs in tackling the few-shot classification task, which limits the impact of the proposed approach.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes Meta-Optimized Classifier (MOC), a novel few-shot learning framework for VLFM-based whole slide image classification under data-scarce conditions. MOC features a meta-learner that selects optimal classifier configurations from a diverse classifier bank, enabling robust and holistic pathological interpretation. It achieves significant performance gains, improving AUC by up to 26.25% in 1-shot settings and 10.4% on TCGA-NSCLC, advancing the clinical applicability of vision-language models in histopathology.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The proposed VLM-based few-shot WSI classification addresses a highly promising research direction. It has the potential to alleviate data scarcity, leverage the capabilities of existing foundation models more effectively, and holds significant value for novel or rare disease scenarios.

    2. The paper is clearly written, with well-structured language and easy-to-follow explanations.

    3. The main idea is novel and should be encouraged.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. A critical concern lies in Table 1 and Table 2, where the baseline “MOC w/o (M, u)” yields identical results across different shot settings. What is the reason for this? Furthermore, this baseline already significantly outperforms all other compared methods, which raises serious concerns. A baseline should represent a simplified version of the proposed method without its key innovations. If such a baseline already exceeds the performance of all competing approaches, it appears highly unreasonable. The authors must clearly explain this anomaly.

    2. Additionally, based on my extensive experience in this field, it is extremely difficult for any model to achieve strong performance with only 1-shot training samples. Yet, the proposed method reaches an AUC of 88.29 in the 1-shot setting, which is highly surprising and implausible. The authors must provide a detailed explanation for this result. Moreover, how are the few-shot training samples selected? How are the test samples chosen? Is the test set visible during training? Please also provide per-fold results to better understand performance consistency.

    3. Given the exceptionally strong results reported, the authors are encouraged to release the code publicly to support reproducibility.

    4. Regarding the method: how are the candidate classifiers selected? I remain skeptical as to whether, under a 1-shot training scenario, the meta-learner can effectively learn how to select the optimal classifier. The authors should provide more justification or empirical evidence to support this.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Despite my concerns about the results, I acknowledge the methodological novelty and the promising research direction. Therefore, I am inclined to give a weak accept in this first-round review. I encourage the authors to thoroughly address all the concerns, upon which I will consider adjusting my score. My overall attitude is cautiously optimistic.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper introduces MOC, a meta-optimized classifier for few-shot WSI classification using vision-language foundation models (VLFMs). MOC consists of (1) a meta-learner that dynamically selects an optimal classifier configuration from a bank of diverse candidate classifiers and (2) a classifier bank that houses diverse candidate classifiers, enabling comprehensive pathological interpretation. Compared to conventional few-shot VLFM-based methods, MOC replaces static classifier designs with a dynamic and parameter-efficient meta-optimization strategy. Authors validate the method on multiple few-shot benchmarks, including TCGA-NSCLC, outperforming state-of-the-art approaches by up to 10.4% in AUC, with performance gains further increasing to 26.25% under 1-shot conditions.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Classifier diversity through classifier bank: The proposed approach introduces classifier diversity in a principled way, encouraging complementary diagnostic signals rather than relying on a single model.
    2. Modular and interpretable design: The classifier bank is composed of clearly defined candidate classifiers, each with interpretable roles, which allows for easy experimentation and extension to new domains or classifier types.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Limited architectural novelty: Many individual elements (e.g., classifier ensembles, prompt tuning, patch scoring) have been seen in prior works. The innovation is primarily in the combination.
    2. Scalability concerns: The paper lacks analysis of computational cost and does not examine trade-offs between performance gains and added complexity.
    3. Insufficient discussion: The paper does not sufficiently discuss the limitations of the proposed method.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. Figure 2 uses an excessive range of font sizes, which makes the figure look inconsistent.
    2. Figure 3 could be enlarged for better visibility.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors clearly outline key implementation details, including consistent data splits, backbone models, preprocessing steps, and prompt settings across all compared methods.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

We thank all reviewers and the AC. R1-R3 all recognize our method’s exceptional performance, calling it innovative (R1 & R3), promising (R1), clear (R1, R2), and interpretable (R2). Key concerns include Novelty (R2, R3); Exploitation of VLFMs (R3); Meta-Learner Rationale & Architecture (R3); Experiment Details (R1); and Model Efficacy (R2).

  1. Experiment Details (R1)
     • “Baseline” (R1.Q1): By removing all trainable components from MOC, our baseline becomes a zero-shot method. We report the average zero-shot results across all test sets and shots, which is why they are identical. Unlike the compared methods, which rely on parameter-intensive classifiers prone to overfitting, we introduce a novel paradigm using parameter-free classifiers. Both our baseline and MOC mitigate the catastrophic forgetting seen in previous work, achieving superior performance. Notably, MOC further improves the baseline results by over 3.29%.
     • “1-shot Results” (R1.Q2&Q4): Previous methods aim to optimize a universal, parameter-intensive classifier, which overfits on 1-shot samples and suffers catastrophic forgetting. In contrast, our candidate classifiers are parameter-free, preserving the VLFM’s pathology understanding. As the candidates have complementary focuses, a suboptimal relative significance causes limited harm and improves with more training data (see Tab. 1 & 2).
     • “Data Split & Result Details” (R1.Q2): We perform 5-fold experiments with strict train/val/test separation. Few-shot samples are randomly drawn from the train set, and the test set remains unseen during training (a sketch of this protocol follows this list). Results are reported as mean ± standard deviation, with a low std indicating performance consistency (e.g., 2.65 vs. TOP’s 6.95 in 1-shot NSCLC).

  2. Reproducibility (R1.Q3): Code and exact data splits will be released upon acceptance.

  3. Novelty (R2.Q1, R3): We are the first to apply meta-learning to few-shot pathology classification. Unlike prior methods, which treat all patches equally with a single, parameter-intensive classifier prone to outliers in few-shot scenarios, our approach leverages meta-learning to dynamically customize an optimal classifier for each patch. This is achieved via a bank of parameter-free candidate classifiers, each with a different diagnostic emphasis.

Our major differences and advantages:

  • Customized classifiers for each patch: Conventional classifier ensembles treat patches equally and operate at the slide level. In contrast, our meta-learning operates at the patch level, tailoring a unique classifier to each patch and achieving superior performance (see Tab. 4).
  • Stronger explainability: Conventional patch scoring uses a single black box as the classifier. MOC is a meta-combination of multiple parameter-free, interpretable classifiers with different diagnostic emphases.
  • Prompt tuning: There seems to be some misunderstanding. Our method doesn’t involve any prompt tuning, as our instruction prompts are fixed.
  4. Model Efficacy (R2.Q2): MOC has low computational cost, as the only trainable part is a two-layer perceptron (a back-of-envelope parameter count follows this list).

  5. Future Work (R2.Q3): MOC is flexible and complementary to other few-shot work (e.g., prompt tuning); we will pursue a unified few-shot pipeline.

  6. Figure Layout (R2): Figures will be adjusted for better visibility in the final version.

  7. Exploitation of VLFM (R3): We exploit the VLFM’s potential by aggregating its multi-aspect pathology findings for prediction. Unlike prior methods (see Tab. 1 & 2) with deficient classifiers, our MOC utilizes diverse candidate classifiers to capture the different diagnostic emphases hidden inside the VLFM’s findings, and a meta-learner to aggregate them comprehensively.

  8. Meta-Learner Rationale & Architecture (R3): “Meta” means learning to learn, consistent with our “learning to customize classifiers.” MOC’s meta-learner dynamically composes a unique classifier for each patch. With few-shot samples and few candidate classifiers, a two-layer perceptron suffices for the meta-learning objective (see Tab. 4).
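
To make the data-split answer in item 1 concrete, here is a hypothetical sketch of the k-shot sampling protocol the rebuttal describes: few-shot samples are drawn at random from the training split of each fold, and the test split is never touched during training. All function and variable names are illustrative assumptions, not the authors' released code.

```python
# Hypothetical k-shot sampling sketch: draw k slides per class from the
# TRAIN split only; the test split stays unseen during training.
import random
from collections import defaultdict

def sample_k_shot(train_slides, k, seed=0):
    """train_slides: list of (slide_id, label) pairs from the train split."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for slide_id, label in train_slides:
        by_class[label].append(slide_id)
    # k slides per class, drawn without replacement from the train split.
    return {label: rng.sample(ids, k) for label, ids in by_class.items()}

# One 5-fold run with strict train/val/test separation; results would be
# reported as mean +/- std over folds, as in the rebuttal. `folds` is a
# hypothetical mapping from fold index to its pre-defined splits.
# for fold in range(5):
#     support = sample_k_shot(folds[fold]["train"], k=1, seed=fold)
```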
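On the efficiency claim in item 4, a back-of-envelope count illustrates why a two-layer perceptron meta-learner is cheap to train. The feature dimension, hidden width, and bank size below are assumptions chosen for illustration, not values reported in the paper.

```python
# Illustrative parameter count for a two-layer perceptron meta-learner
# (all dimensions are assumptions, not reported values).
feat_dim, hidden, n_candidates = 512, 256, 8
params = (feat_dim * hidden + hidden) + (hidden * n_candidates + n_candidates)
print(params)  # 133384 trainable parameters -- tiny next to a frozen VLFM
```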




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


