Abstract

Multi-modal pre-trained models efficiently extract and fuse features from different modalities with low memory requirements for fine-tuning. Despite this efficiency, their application in disease diagnosis is under-explored. A significant challenge is the frequent occurrence of missing modalities, which impairs performance. Additionally, fine-tuning the entire pre-trained model demands substantial computational resources. To address these issues, we introduce Modality-aware Low-Rank Adaptation (MoRA), a computationally efficient method. MoRA projects each input to a low intrinsic dimension but uses different modality-aware up-projections for modality-specific adaptation in cases of missing modalities. Practically, MoRA integrates into the first block of the model, significantly improving performance when a modality is missing. It requires minimal computational resources, using less than 1.6% of the trainable parameters of the full model. Experimental results show that MoRA outperforms existing techniques in disease diagnosis, demonstrating superior performance, robustness, and training efficiency. The code link is: https://github.com/zhiyiscs/MoRA.
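The mechanism the abstract describes, a frozen pre-trained weight, one shared low-rank down-projection, and a separate up-projection per modality, can be sketched roughly as follows. This is a minimal NumPy illustration of the idea; the class and parameter names are my own, not the authors' released code.

```python
import numpy as np

class MoRALinear:
    """Sketch of a Modality-aware Low-Rank Adaptation (MoRA) layer.

    Per the abstract: all inputs share one low-rank down-projection A,
    but each modality gets its own up-projection B_m, so the adapted
    weight for modality m is W + B_m @ A. Names are illustrative.
    """

    def __init__(self, d_in, d_out, rank, modalities, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) * 0.02  # frozen pre-trained weight
        self.A = rng.standard_normal((rank, d_in)) * 0.02   # shared down-projection
        # One up-projection per modality, zero-initialized (as in LoRA),
        # so the layer starts out identical to the pre-trained one.
        self.B = {m: np.zeros((d_out, rank)) for m in modalities}

    def forward(self, x, modality):
        # Base path plus the modality-aware low-rank update.
        return self.W @ x + self.B[modality] @ (self.A @ x)

    def trainable_params(self):
        # Only A and the B_m matrices are trained; W stays frozen.
        return self.A.size + sum(b.size for b in self.B.values())
```

With a 768-dimensional layer and rank 4, `A` plus two `B_m` matrices amount to well under 2% of the frozen weight's parameters, consistent in spirit with the abstract's sub-1.6% figure for the whole model.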

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1536_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/zhiyiscs/MoRA

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Shi_MoRA_MICCAI2024,
        author = { Shi, Zhiyi and Kim, Junsik and Li, Wanhua and Li, Yicong and Pfister, Hanspeter},
        title = { { MoRA: LoRA Guided Multi-Modal Disease Diagnosis with Missing Modality } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15003},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposes a method called Modality-aware Low-Rank Adaptation (MoRA) to improve the performance and robustness of pre-trained models for disease diagnosis when the data is modality-incomplete in the training and testing sets, while reducing the computational resources required for fine-tuning. The method achieves state-of-the-art performance and robustness compared with other fine-tuning methods under missing modalities.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The paper is well written. The idea of MoRA is easy to understand. 2) The experimental results of MoRA are better than the state-of-the-art methods. 3) The ablation study is convincing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) The novelty of MoRA is limited, as it is a simple extension of LoRA. 2) Are the splits of training, validation, and testing random? I think cross-validation would be more reasonable, and means and standard deviations should be provided in the experiments. 3) Comparisons of the computational resources are not given. 4) Code is not provided.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    1) Use cross-validation in the experiments. 2) Provide comparisons of the computational resources. 3) Provide the code.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written. The experiments are convincing. The novelty is limited.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The authors introduce a computationally efficient method named Modality-aware Low-Rank Adaptation (MoRA) to address the problem of missing modalities in real-world diagnostic scenarios while saving computational resources. Experimental results demonstrate that MoRA surpasses existing techniques in disease diagnosis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The problem of fine-tuning pre-trained models with missing modalities for disease diagnosis is interesting.
    2. The manuscript is well written, and the proposed method is easy to implement.
    3. The experimental results are promising.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. How is the performance of this method in extreme cases, such as 100% image+0% text and 0% image+100% text?
    2. More metrics are needed to comprehensively evaluate the classification performance, such as accuracy, sensitivity, specificity, etc.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    See the main strengths and weaknesses above.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    See the main strengths and weaknesses above.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This work introduces a computationally efficient method named Modality-aware Low-Rank Adaptation (MoRA) to deal with the missing-modality issue.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This work introduces multi-modal pre-trained models to disease diagnosis and proposes MoRA to improve performance and robustness when the data is modality-incomplete in the training and testing sets.
    2. It achieves state-of-the-art performance and robustness compared with other fine-tuning methods with missing modalities.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. There are some typos in Section 3.4, e.g., “with65% image-modality”.
    2. In the table, why do you choose the missing rate for text and images between 30% and 65%? How about 0%?
    3. The ablation study only reports results for the ODIR dataset. How about the CXR dataset? Do you get similar findings or conclusions for CXR?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    see above

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The overall results need clearer explanation and more experiments to support them.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We sincerely thank all reviewers for their valuable comments. We are grateful that our work was recognized as well written (R3, R4), with promising results (R3, R4, R6), an interesting problem (R4), and a convincing ablation study (R3). We address the concerns one by one as follows:

  • Are the splits of training, validation, and testing random? Cross-validation recommended (R3-Q2) The dataset was randomly divided with a split ratio of 0.8, 0.1, and 0.1. This setting is consistent with the experimental settings in the baseline papers (MAPs and MSPs). However, we acknowledge that cross-validation provides a more robust evaluation method.
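The random 0.8/0.1/0.1 split described above can be sketched as follows; the function name and fixed seed are illustrative assumptions, not the authors' code.

```python
import random

def split_dataset(indices, seed=0):
    """Shuffle sample indices and split them 0.8/0.1/0.1 into
    train/validation/test, as described in the rebuttal."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    n = len(idx)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```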

  • The comparisons of the computational resources (R3-Q3) Thank you for the comment. We compare the computational resources required for our method and other competing methods below. The resource comparison shows that our method is more efficient than competing methods in terms of both memory usage and training time.

Dataset  Method  Memory   1000 training steps
ODIR     MAP     13.0 GB  1.82 h
ODIR     MSP     12.1 GB  1.85 h
ODIR     MoRA    11.6 GB  1.59 h
CXR      MAP     14.4 GB  1.71 h
CXR      MSP     12.4 GB  1.75 h
CXR      MoRA    12.2 GB  1.58 h

  • Code not provided (R3-Q4) We plan to open-source our code.

  • What would be the performance in extreme cases? Why do you choose the missing rate for text and images between 30% and 65%? (R4-Q1, R6-Q2) We followed the experimental settings used in the baseline papers (MAPs and MSPs). In our study, Figure 2 illustrates several extreme cases where the missing rate is exceptionally high (e.g., 0.9). For even more extreme scenarios, such as 100% image and 0% text or 0% image and 100% text, fine-tuning is not feasible; only evaluation of the pre-trained model is possible, which makes it a zero-shot problem.

  • Do you get similar findings or conclusions for the ablation study on CXR? (R6-Q3) We observed similar results for the CXR dataset; however, we did not present these results due to page limitations.




Meta-Review

Meta-review not available, early accepted paper.
