Abstract

Despite the notable success of current Parameter-Efficient Fine-Tuning (PEFT) methods across various domains, their effectiveness on medical datasets falls short of expectations. This limitation arises from two key factors: (1) medical images exhibit extensive anatomical variation and low contrast, necessitating a large receptive field to capture critical features, and (2) existing PEFT methods do not explicitly address the enhancement of receptive fields. To overcome these challenges, we propose the Large Kernel Adapter (LKA), designed to expand the receptive field while maintaining parameter efficiency. The proposed LKA consists of three key components: down-projection, channel-wise large kernel convolution, and up-projection. Through extensive experiments on various datasets and pre-trained models, we demonstrate that the incorporation of a larger kernel size is pivotal in enhancing the adaptation of pre-trained models for medical image analysis. Our proposed LKA outperforms 11 commonly used PEFT methods, surpassing the state-of-the-art by 3.5% in top-1 accuracy across five medical datasets. The code is available at: https://github.com/misswayguy/LKA.
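
The three components named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation (the linked repository is the reference for that). The residual connection, the absence of biases and of a nonlinearity, the zero padding, and all names (`lka_forward`, `w_down`, `kernels`, `w_up`) are assumptions made for illustration only.

```python
def matmul(x, w):
    """Multiply an (n x a) list-of-lists by an (a x b) list-of-lists."""
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*w)]
            for row in x]


def depthwise_conv2d(grid, kernels):
    """Channel-wise k x k filtering with zero padding (output keeps H x W).

    grid:    H x W x C nested lists
    kernels: C kernels, each k x k with k odd; channel c only sees kernels[c]
    """
    h, w, c = len(grid), len(grid[0]), len(grid[0][0])
    k = len(kernels[0])
    pad = k // 2
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ch in range(c):
                acc = 0.0
                for dy in range(k):
                    for dx in range(k):
                        yy, xx = y + dy - pad, x + dx - pad
                        if 0 <= yy < h and 0 <= xx < w:
                            acc += grid[yy][xx][ch] * kernels[ch][dy][dx]
                out[y][x][ch] = acc
    return out


def lka_forward(tokens, h, w, w_down, kernels, w_up):
    """Down-projection -> channel-wise large-kernel conv -> up-projection.

    tokens: (h*w) x d token features in row-major spatial order.
    The residual addition at the end is an assumption, not taken from the paper.
    """
    z = matmul(tokens, w_down)                                  # (h*w) x r
    grid = [[z[y * w + x] for x in range(w)] for y in range(h)]  # h x w x r
    grid = depthwise_conv2d(grid, kernels)
    z = [grid[y][x] for y in range(h) for x in range(w)]        # (h*w) x r
    delta = matmul(z, w_up)                                     # (h*w) x d
    return [[t + u for t, u in zip(t_row, u_row)]
            for t_row, u_row in zip(tokens, delta)]


# Toy example: a single non-zero token at the centre of a 3 x 3 grid.
identity = [[1, 0], [0, 1]]
ones_3x3 = [[1] * 3 for _ in range(3)]
tokens = [[0, 0] for _ in range(9)]
tokens[4] = [1, 2]
out = lka_forward(tokens, 3, 3, identity, [ones_3x3, ones_3x3], identity)
```

In this toy example every one of the nine positions receives a contribution from the centre token, which is the sense in which a larger kernel widens the receptive field of the adapter. The parameter cost of the convolution grows with the bottleneck width and the kernel area only, because each channel has its own single kernel.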

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2911_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/misswayguy/LKA

Link to the Dataset(s)

Blood cell dataset: https://www.kaggle.com/datasets/paultimothymooney/blood-cells
BUSI dataset: https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset
Brain tumor dataset: https://www.kaggle.com/datasets/masoudnickparvar/brain-tumor-mri-dataset
Tuberculosis dataset: https://www.kaggle.com/datasets/tawsifurrahman/tuberculosis-tb-chest-xray-dataset
Covid-19 dataset: https://www.kaggle.com/datasets/pranavraikokte/covid19-image-dataset

BibTeX

@InProceedings{ZhuZiq_LKA_MICCAI2025,
        author = {Zhu, Ziquan and Lu, Si-Yuan and Huang, Tianjin and Liu, Lu and Liu, Zhe},
        title = {{LKA: Large Kernel Adapter for Enhanced Medical Image Classification}},
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15965},
        month = {September},
        pages = {402--412}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The main contribution of this paper is the design of the Large Kernel Adapter (LKA), a parameter-efficient fine-tuning method that enhances medical image classification by expanding the effective receptive field. Unlike existing PEFT methods that overlook the importance of capturing long-range spatial context, LKA incorporates a channel-wise large kernel convolution between a down-projection and up-projection, effectively adapting pre-trained models to the anatomical variability and low contrast of medical images. With minimal increase in parameters, LKA achieves up to 3.5% improvement in top-1 accuracy over 11 state-of-the-art PEFT methods across five medical datasets, and often outperforms full fine-tuning.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) Novel Method Design: The proposed Large Kernel Adapter (LKA) introduces channel-wise large kernel convolution into the adapter framework, effectively expanding the receptive field while maintaining parameter efficiency.
    2) Strong Empirical Evaluation: Extensive experiments on five medical datasets and three model backbones demonstrate consistent and significant performance gains over 11 state-of-the-art PEFT methods.
    3) Comprehensive Ablation Study: The paper provides a careful analysis of kernel sizes, integration positions, and bottleneck widths, offering clear evidence supporting the design choices.
    4) High Parameter Efficiency: LKA achieves superior performance with minimal increase in trainable parameters compared to baseline adapters.
    5) Clinical Relevance: The method is validated on diverse and practical medical image classification tasks, supporting its potential real-world applicability.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    1) Limited Novelty in Adapter Framework: The adapter structure largely follows existing designs [5], and large kernel techniques have been explored in prior works such as ChannelNets and large kernel attention.
    2) Lack of Evaluation Beyond Medical Imaging: The method is only evaluated on medical datasets, with no evidence of generalizability to other vision tasks.
    3) No Analysis of Computational Cost or Inference Speed: The paper lacks analysis of runtime, memory, or inference speed, despite claiming parameter efficiency.
    4) No Detailed Clinical Impact or Interpretability Discussion: There is no exploration of clinical impact or model interpretability, which are critical in medical AI.
    5) Kernel Size Selection is Empirical: The kernel size is chosen empirically, with no theoretical justification or adaptive mechanism.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper proposes a simple yet effective module, the Large Kernel Adapter (LKA), which enhances the receptive field in adapter-based parameter-efficient fine-tuning for medical image classification. The motivation is well-grounded in the unique challenges of medical imaging, such as low contrast and high anatomical variability, and the proposed solution is easy to integrate into existing architectures while maintaining parameter efficiency. The authors conduct extensive experiments across multiple datasets and backbones, showing consistent improvements over 11 strong baselines. Additionally, the paper includes detailed ablation studies that provide useful insights into kernel size, adapter placement, and bottleneck width. However, the work lacks deeper theoretical analysis to support its empirical findings, and the generalizability of the method beyond the medical domain is not explored. Some visual explanations, such as ERF visualizations, could also be clearer. Despite these limitations, the strong empirical performance and practical value of the proposed method justify a Weak Accept recommendation.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a novel parameter-efficient method named LKA by integrating large-kernel convolution within adapters. This method expands the effective receptive field to capture critical features, such as anatomical structures. Extensive ablation studies and comparisons with other methods demonstrate the superior performance and effectiveness of the model.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed methodology is both novel and intuitively understandable. It introduces only a small number of additional parameters to the pre-trained model while yielding significant improvements in accuracy.
    • The paper presents a thorough analysis, including ablation studies on kernel size, adapter placement, and comparisons across different backbone models. It also benchmarks against various baselines to validate the effectiveness of the approach.
    • In-depth discussions on the effective receptive field contribute to a clearer understanding of the model’s internal workings.
    • The use of multiple diverse datasets demonstrates the model’s robustness and generalizability across different domains within medical imaging.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The mechanism of the LK-Conv module is not clearly explained, which may hinder understanding for readers unfamiliar with the architecture. Additional formulas or illustrative figures would help clarify how LK-Conv operates.
    • Key dataset details are missing. To fully understand the implementation, information such as dataset size, data dimensions, and descriptive characteristics should be included. Furthermore, the procedure for splitting the data into training and testing sets is not clearly described.
    • Several implementation details—such as the number of training epochs and other protocol specifications—are absent. This lack of information hinders reproducibility. Moreover, the evaluation protocol is insufficiently described, particularly regarding how many times the experiments were repeated to assess performance variability.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • Providing more detail in the figure that illustrates LK-Conv would enhance the reader’s understanding of the proposed method.
    • Additional information on implementation and datasets would strengthen the paper’s reproducibility; including the source code would be particularly beneficial.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although the proposed method is novel, easy to follow, and supported by comprehensive experiments, the paper lacks certain implementation and dataset details necessary for full reproducibility (releasing the code would also be needed). Additionally, a clearer explanation of the method, through more visualizations or mathematical formulations, would further enhance its understandability.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    Motivated by the observation that medical images require a large receptive field, the authors propose a new adapter for fine-tuning neural networks. The adapter enlarges the receptive field through a large kernel size and is verified on six publicly available medical image datasets. According to the experimental results, the proposed adapter outperforms several previous adapters when applied to large models such as Swin-L and ViT-L for medical image classification. In the ablation studies, the article compares the impact of different kernel sizes, adapter positions, and bottleneck widths, as well as the use of CW-Conv versus D-Conv. Moreover, by increasing the bottleneck width of the vanilla adapter, it rules out the possibility that the performance gain comes merely from the extra parameters introduced by the large kernel.
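
    The parameter-matching control described above can be made concrete with simple arithmetic. The sketch below assumes bias-free projections and a hidden width of 1024 (the ViT-L embedding size); the bottleneck width and kernel size are hypothetical values chosen for illustration and are not taken from the paper.

```python
import math


def vanilla_adapter_params(d, r):
    """Down-projection (d x r) plus up-projection (r x d), biases ignored."""
    return 2 * d * r


def lka_params(d, r, k):
    """Vanilla adapter plus one k x k kernel per bottleneck channel."""
    return vanilla_adapter_params(d, r) + r * k * k


def dense_conv_adapter_params(d, r, k):
    """Same layout, but with a full (non-channel-wise) r -> r convolution."""
    return vanilla_adapter_params(d, r) + r * r * k * k


def matched_bottleneck(d, r, k):
    """Smallest vanilla bottleneck width with at least as many parameters."""
    return math.ceil(lka_params(d, r, k) / (2 * d))


# Hypothetical settings: d is the ViT-L width, r and k are illustrative only.
d, r, k = 1024, 64, 31
print(vanilla_adapter_params(d, r))        # 131072
print(lka_params(d, r, k))                 # 192576
print(dense_conv_adapter_params(d, r, k))  # 4067328
print(matched_bottleneck(d, r, k))         # 95
```

    Under these assumed values, a vanilla adapter would need its bottleneck widened from 64 to 95 to match the channel-wise variant, whereas a dense convolution of the same kernel size would cost roughly twenty times more parameters.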

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main innovation in this article is applying a large kernel size during the fine-tuning of large-scale image models to obtain information from a larger field of view. The article has a high degree of completeness, and a relatively large number of ablation studies have been carried out.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The novelty of using a large kernel size to obtain a larger field of view in medical image processing is limited. See, for example, Shuxiao Ma et al., “A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity,” Brain Sci. 2022, 12(12), 1633, https://doi.org/10.3390/brainsci12121633.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper's innovation lies in bringing the wider field of view afforded by large kernels to adapter design for fine-tuning large-scale image models. Moreover, the authors have verified the approach on multiple large-scale models and several medical datasets.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A




Author Feedback

N/A




Meta-Review

Meta-review #1

  • Your recommendation

    Provisional Accept

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A


