Abstract

Transfer learning, by leveraging knowledge from pre-trained models, has significantly improved the performance of downstream tasks. However, as deep neural networks continue to scale, full fine-tuning poses substantial computational and storage challenges in resource-constrained environments, limiting its practical adoption. To address this, parameter-efficient fine-tuning (PEFT) methods have been proposed to reduce computational complexity and memory requirements by updating only a small subset of parameters. Among them, matrix decomposition-based approaches such as LoRA have shown promise, but often struggle to fully capture the high-dimensional structural characteristics of model weights. In contrast, high-order tensors offer a more natural representation of neural network parameters, enabling richer modeling of multi-dimensional interactions and higher-order features. In this paper, we propose tCURLoRA, a novel fine-tuning method based on tensor CUR decomposition. By stacking pre-trained weight matrices into a third-order tensor and applying tensor CUR decomposition, our method updates only the compressed tensor components during fine-tuning, thereby substantially reducing both computational and storage costs. Experimental results show that tCURLoRA consistently outperforms existing PEFT approaches on medical image segmentation tasks. The source code is publicly available at: https://github.com/WangangCheng/t-CURLora.
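The core operation described above, stacking weight matrices into a third-order tensor and factorizing it with tensor CUR under the t-product, can be sketched as follows. This is an illustrative NumPy reconstruction, not the authors' released implementation; the `t_product` and `tensor_cur` helpers, the slice selection, and the slice-wise pseudoinverse are assumptions based on the standard t-product framework.

```python
import numpy as np

def t_product(A, B):
    # t-product of A (n1 x n2 x n3) and B (n2 x l x n3): transform along
    # the third mode with the FFT, multiply matching frontal slices,
    # then transform back with the inverse FFT.
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.empty((A.shape[0], B.shape[1], A.shape[2]), dtype=complex)
    for k in range(A.shape[2]):
        Cf[:, :, k] = Af[:, :, k] @ Bf[:, :, k]
    return np.fft.ifft(Cf, axis=2).real

def tensor_cur(A, row_idx, col_idx):
    # Tensor CUR sketch: C collects selected lateral slices, R selected
    # horizontal slices, U their intersection; A is approximated by the
    # t-product C * pinv(U) * R, with the Moore-Penrose inverse taken
    # slice-wise in the Fourier domain.
    C = A[:, col_idx, :]
    R = A[row_idx, :, :]
    U = A[np.ix_(row_idx, col_idx)]
    Uf = np.fft.fft(U, axis=2)
    Upf = np.stack([np.linalg.pinv(Uf[:, :, k]) for k in range(U.shape[2])],
                   axis=2)
    U_pinv = np.fft.ifft(Upf, axis=2).real
    return t_product(t_product(C, U_pinv), R)
```

With all rows and columns selected the approximation is exact (A * A† * A = A slice-wise); in the spirit of the paper, only the compressed components (e.g., the small intersection tensor U) would be updated during fine-tuning while the rest stays frozen — which components are trainable here is an assumption, not a statement of the authors' exact scheme.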

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0014_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/WangangCheng/t-CURLora

Link to the Dataset(s)

N/A

BibTex

@InProceedings{HeGua_tCURLoRA_MICCAI2025,
        author = { He, Guanghua and Cheng, Wangang and Zhu, Hancan and Cai, Xiaohao and Yu, Gaohang},
        title = { { tCURLoRA: Tensor CUR Decomposition Based Low-Rank Parameter Adaptation and Its Application in Medical Image Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15975},
        month = {September},
        pages = {583--593}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The study adapts different parameter-efficient fine-tuning (PEFT) strategies to medical image segmentation, reducing computational complexity and storage requirements by freezing most of the pretrained model parameters. The main contributions are two-fold: (1) a novel fine-tuning method based on tensor CUR decomposition (tCURLoRA), which addresses a core limitation of vanilla fine-tuning methods such as LoRA, namely that simple matrix decomposition fails to capture the structural characteristics of model weights. The proposed method alleviates this by concatenating pretrained parameter matrices and applying tensor CUR decomposition, updating only lower-order tensor components during fine-tuning and thereby capturing higher-order features and multi-dimensional interactions. And (2) a benchmark applying eight PEFT strategies to medical image segmentation tasks, using one segmentation model pretrained on one dataset and transferred to three downstream tasks with arbitrarily shrunk sample sizes.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The presented work is well motivated, and although the method is theoretically complex, the authors made it easy to understand the outcomes of the study and the “take-home-message” is clear. Moreover, benchmarking eight PEFT strategies in medical segmentation is a relevant contribution to the community.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    There are several major concerns:

    • The most crucial aspect is the evaluation scheme: The selection of 5 samples from 2 downstream tasks and 10% of the other dataset is not explained and does not make much sense to me, e.g. the former would define a 5-shot task. (“In the fine-tuning phase, we randomly selected 5 training samples for the EADC-ADNI and LPBA40 datasets, while 10% of UPENN-GBM samples were used for training, with the rest for testing.”)
    • A plot of the PEFT strategies’ performance against data ratio would demonstrate performance gains across different sample-size scenarios. Results for the full datasets are missing.
    • The results are presented without standard deviations.
    • The proposed work uses only one segmentation model for the tCURLoRA method. More segmentation models need to be compared to provide unbiased results for employing tCURLoRA in medical segmentation tasks. Moreover, 90 million parameters is nowadays quite common model complexity; what about models with larger parameter counts?
    • The methods section is not comprehensive: e.g., what is [] in the last equation of Sec 2.2 where C is derived? What is C†, and what is the ifft operation? Could the authors please provide more information about efficiency in terms of training and inference duration, for better comparison?

    Minor issues:

    • Contributions 1 and 2 should be fused
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Flaws in experimental design (see major weaknesses), lack of comparison with other segmentation models.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have clarified my questions and have shown that they considered all aspects and concerns raised in my review. They mention that - due to space limitations - they were not able to include all those aspects in the paper, but will release them together with the code, which is fine as well. Accordingly, I would change my final rating to “accept”.



Review #2

  • Please describe the contribution of the paper

    The paper offers a new method for parameter-efficient training of transformer-based models by introducing tensor CUR decomposition-based low-rank parameter adaptation. In my understanding, this paper extends CURLoRA and tackles the limitations of that approach by concatenating weight matrices into a 3D tensor for fine-tuning. Experiments on segmentation have shown that the method outperforms other competing methods.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The proposed method is well formulated with relevant equations and figures to support the reader’s understanding
    • The application is novel as it offers parameter-efficient training with few parameters while providing state-of-the-art performance on segmentation
    • Outperforming full fine-tuning while training a significantly lower number of parameters makes the method interesting and potentially strong for other applications
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The explanation of the methodology seems confusing to the reader. The paper does not fully explain the tensor decomposition and its strength over CURLoRA and other methods in terms of how it benefits training costs. While it was shown to use less memory, there is no direct indication of how training is optimized (e.g., FLOPS, training time, etc.).
    • The number of parameters being trained seems to be slightly higher than with the CURLoRA method, but the performance gains outweigh this
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    • While the method looks compelling, I think a few additional experiments could be added to improve it, such as clearly showing and justifying the use of tCURLoRA. This could be FLOPS, training time, or other training-efficiency metrics; or it could be more informative details persuading readers that the method is not only accurate performance-wise but also, for example, easier to apply and significantly less demanding of computing resources during training.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is interesting and offers new prospects for parameter-efficient training in medical imaging. But the paper’s organization and more convincing details about the method are needed for presentation to the research community.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    My comments were addressed, especially about training efficiency metrics. I hope they will be added to the paper. So I evaluate this paper as to be accepted.



Review #3

  • Please describe the contribution of the paper

    This paper proposed tCURLoRA, a parameter-efficient fine-tuning method that concatenates pre-trained weight matrices into a 3D tensor. The method uses tensor CUR decomposition to extract key substructures and updates only lower-order tensor components during fine-tuning, reducing computational and storage overhead while maintaining performance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The primary technical contribution lies in its novel tensor-based weight sharing approach. This is the first work to propose stacking weight matrices into a 3D tensor structure and applying CUR decomposition, departing from traditional matrix-based LoRA methods. This innovation is particularly interesting because it enables parameter sharing across layers, potentially capturing cross-layer relationships that were previously unexplored. The approach provides a fresh perspective on how weight matrices in different layers might be interrelated, opening new possibilities for parameter-efficient fine-tuning.

    The method demonstrates impressive resource efficiency, which has significant practical implications. By reducing the number of parameters that need to be updated during fine-tuning and maintaining a lower memory footprint compared to standard LoRA, the approach achieves faster training times. This efficiency is not merely an incremental improvement but represents a meaningful step toward making model fine-tuning more accessible to institutions with limited computational resources, particularly relevant in medical settings where high-end hardware may not always be available.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The fundamental theoretical foundation lacks rigorous justification. While the paper proposes stacking weight matrices into a 3D tensor, it doesn’t adequately explain why weight matrices from different layers should share a common structure suitable for tensor decomposition. Similar concerns about parameter sharing and weight matrix relationships have been discussed in previous works on neural network compression and adaptation. The paper needs to provide mathematical proof or empirical evidence showing why CUR decomposition is specifically suitable for capturing cross-layer patterns.

    Additionally, crucial ablation studies are missing - there’s no systematic analysis of how different tensor ranks affect performance, no investigation of alternative tensor decomposition methods (like Tucker or CP decomposition), and no exploration of how the method performs under different base model architectures.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. Table 1 should add in-line citations for the different previous methods.
    2. The experimental details are confusing: the training data selection lacks proper justification, with no explanation for why only 5 samples were used for EADC-ADNI and LPBA40 or why 10% was chosen for UPENN-GBM, and no comparative experiments with different sample sizes were conducted. The hyperparameter settings are inadequately described, missing crucial information about tensor operations such as rank selection criteria, CUR decomposition parameters, and compression ratio determination, while also lacking ablation studies on these parameters and justification for the choice of 1000 epochs.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an innovative tensor-based weight sharing approach with promising resource efficiency, demonstrating competitive performance on medical image segmentation tasks while using only 2.98% of full fine-tuning parameters. However, the work suffers from significant theoretical weaknesses, lacking rigorous justification for weight matrix stacking and missing crucial ablation studies on tensor ranks and alternative decomposition methods. Additionally, the experimental details are inadequately described, with unclear training data selection criteria and insufficient hyperparameter settings, making it challenging to reproduce and validate the method’s practical utility.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    While the rebuttal addresses some concerns, it fails to provide sufficient theoretical justification for the tensor-based approach, lacks rigorous validation of hyperparameter choices, and offers limited discussion of medical applications, making the paper’s contribution unclear in terms of both engineering innovation and theoretical advancement.




Author Feedback

We appreciate the reviewers’ valuable feedback. [R1, R2, R3] Due to the space limit, some details the reviewers raised, including more experiments and ablation studies for different parameters, may be missing from our first submission. The later code release will incorporate these details that cannot be added to the paper due to the space limit.

Q1. [R1, R3] The selection of 5 samples from 2 downstream tasks and 10% of the other dataset is not explained.
A: Since hippocampus segmentation is relatively well-defined, with fixed positioning and small shape variations, we selected 5 samples for the low-data regime. For brain tumor segmentation, which has larger shape and position variations, we used 10% of the dataset (15 samples) to better reflect its complexity. These settings were chosen to evaluate our method’s performance in resource-constrained scenarios.

Q2. [R1, R3] Comparative experiments with different sample sizes.
A: We previously conducted experiments on the EADC-ADNI dataset with varying training sample sizes. Results indicate that PEFT methods perform better than full fine-tuning with smaller samples, while the performance gap narrows as the sample size increases. When the sample size exceeds 20, full fine-tuning becomes better. In particular, the comparative performance of tCURLoRA against other PEFT methods is consistent with what is reported in our manuscript.

Q3. [R1, R3] How does the method perform under different segmentation models?
A: We selected the representative transformer-based model UNETR for fine-tuning experiments with tCURLoRA, validating its robustness across three datasets. Our method is also applicable to other models, including larger ones, and we plan to extend the comparisons in future work.

Q4. [R1, R2] Provide more training efficiency metrics.
A: Our method includes 2.683M trainable parameters, with a training time of 495 ms/epoch and a memory usage of 11.72 GB, ranking second just behind CURLoRA. To further enhance efficiency, the trade-off between tensor rank reduction and segmentation accuracy could be optimized.

Q5. [R2, R3] The methodology lacks clear justification for stacking weight matrices into a 3D tensor and using tCUR decomposition to capture cross-layer patterns.
A: Transformer layers share a common structure, performing self-attention-based feature extraction. Stacking their parameter matrices into a 3D tensor preserves low-rank properties. Tensor decomposition, particularly tCUR, effectively leverages this structure. Unlike Tucker and CP decompositions, tCUR introduces tensor products that efficiently model cross-layer interactions, extending information flow across height, width, and layers, beyond the simple pairwise interactions of matrix multiplication.

Q6. [R2, R3] How do different tensor ranks affect performance and training efficiency?
A: Based on our evaluation of different tensor ranks on the EADC-ADNI dataset, the findings indicate: i) tCURLoRA is robust to rank variations, with only a 0.14 accuracy difference, compared to methods like LoRA, which show differences > 0.5; and ii) tCURLoRA’s lowest accuracy (84.81) surpasses the best accuracy of other methods (84.64). This robustness enables us to reduce the tensor rank, enhance training efficiency, and still maintain superior segmentation accuracy.

Q7. [R1] The results are presented without standard deviations.
A: Upon review, we found that tCURLoRA also performs well in terms of standard deviations. For example, the standard deviations of the Dice scores for tCURLoRA across the three datasets are 2.98, 3.02, and 9.73, all smaller than those of LoRA, i.e., 3.74, 3.06, and 11.17.

Q8. [R1] What is [] in the last equation of Sec 2.2? What is U† and what is the ifft operation?
A: U† represents the Moore-Penrose inverse of the tensor U, and ifft denotes the inverse Fast Fourier Transform. The notation [] specifies the default number of points in the transformation. This follows the conventions in [13], ensuring clarity and alignment with established notations.
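The U† and ifft notation clarified in Q8 can be made concrete with a small sketch: under the t-product, the Moore-Penrose inverse of a third-order tensor is commonly obtained by pseudo-inverting each frontal slice in the Fourier domain and transforming back with the inverse FFT. This snippet follows the usual t-product conventions and is an illustrative assumption, not the authors' code.

```python
import numpy as np

def t_pinv(U):
    # Moore-Penrose inverse of a 3rd-order tensor under the t-product:
    # FFT along the third mode, matrix pseudoinverse per frontal slice,
    # then the inverse FFT (the "ifft" operation from Sec 2.2).
    Uf = np.fft.fft(U, axis=2)
    Upf = np.stack([np.linalg.pinv(Uf[:, :, k]) for k in range(U.shape[2])],
                   axis=2)
    # Conjugate symmetry of the FFT of a real tensor makes the result real.
    return np.fft.ifft(Upf, axis=2).real
```

For the "tubal identity" tensor (identity matrix in the first frontal slice, zeros elsewhere), t_pinv returns the tensor unchanged, matching the matrix intuition that pinv(I) = I.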




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


