Abstract
In the task of disease prediction, medical data with different modalities can provide much complementary information for disease diagnosis. However, existing multi-modal learning methods tend to focus on learning a shared representation across modalities, without fully exploiting the complementary information from multiple modalities. To overcome this limitation, we propose a novel Multi-modal Graph Disentangled Representation (MGDR) approach for the brain disease prediction problem. Specifically, we first construct a modality-specific graph for each modality and employ a Graph Convolutional Network (GCN) to learn node representations. Then, we learn the common information across different modalities and the private information of each modality by developing a disentangled representation model of modalities. Moreover, to remove possible noise from the private information, we employ a contrastive learning module to learn a more compact representation of private information for each modality. Also, a new Multi-modal Perception Attention (MPA) module is employed to integrate the feature representations of the private information of multiple modalities. Finally, we integrate both common and private information for disease prediction. Experiments on both the ABIDE and TADPOLE datasets demonstrate that our MGDR method achieves the best performance when compared with some recent advanced methods.
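The first step of the abstract (one graph per modality, then a GCN over each graph) can be illustrated with a minimal sketch. It is not the authors' implementation: the k-nearest-neighbour construction on cosine similarity, the function names `knn_graph` and `gcn_layer`, the modality names, and the random features are all assumptions made for illustration. Only the propagation rule is the standard symmetric-normalized GCN layer.

```python
import numpy as np

def knn_graph(features, k):
    """Symmetric k-NN adjacency from cosine similarity (assumed construction)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)            # never pick a subject as its own neighbour
    idx = np.argsort(-sim, axis=1)[:, :k]     # k most similar subjects per row
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(len(features)), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # symmetrize the directed k-NN edges

def gcn_layer(adj, x, weight):
    """One GCN propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

# Hypothetical data: 6 subjects, two modalities with different feature sizes.
rng = np.random.default_rng(0)
modalities = {
    "modality_a": rng.normal(size=(6, 8)),
    "modality_b": rng.normal(size=(6, 5)),
}

hidden = {}
for name, x in modalities.items():
    adj = knn_graph(x, k=2)                   # one graph per modality
    w = rng.normal(size=(x.shape[1], 4))      # untrained weights, illustration only
    hidden[name] = gcn_layer(adj, x, w)       # node representations, shape (6, 4)
```

Each modality ends up with its own node representations of the same width, which is the form a later common/private disentangling step would need as input.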
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2494_paper.pdf
SharedIt Link: https://rdcu.be/dV1Om
SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72069-7_29
Supplementary Material: N/A
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{Jia_MGDR_MICCAI2024,
author = { Jiang, Bo and Li, Yapeng and Wan, Xixi and Chen, Yuan and Tu, Zhengzheng and Zhao, Yumiao and Tang, Jin},
title = { { MGDR: Multi-Modal Graph Disentangled Representation for Brain Disease Prediction } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
year = {2024},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15002},
month = {October},
pages = {302 -- 312}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper introduces the Multi-modal Graph Disentangled Representation (MGDR) method, which contributes by integrating the Multi-Graph Representation Learning (MGRL) module to capture dependencies among multi-modal subjects for context-aware representation. In addition, it employs the Disentangled Representation of Modalities (DRM) to effectively extract common and private information for brain disease diagnosis. Experimental results show its superiority over state-of-the-art methods on two datasets.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Novel Methodology: This paper introduces a novel Multi-modal Graph Disentangled Representation (MGDR) method, which integrates both Multi-Graph Representation Learning (MGRL) and Disentangled Representation of Modalities (DRM). This method offers a new perspective on capturing dependencies among multi-modal subjects for context-aware representation.
Innovative Use of Multi-modal Data: This paper demonstrates an original way to leverage multi-modal data for improved prediction accuracy. This innovative method enhances the interpretability of the model’s predictions and sheds light on the underlying mechanisms of brain diseases.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
Lack of Comparative Analysis: While this paper claims superiority over other state-of-the-art methods, it lacks a comprehensive comparative analysis with existing approaches in the field of multi-modal brain disease prediction. Without such analysis, it is challenging to assess the true extent of the proposed method’s performance improvement.
Limited Explanation of Novelty: Although this paper introduces the MGDR method as novel, it does not provide a detailed explanation of what aspects of the method are truly innovative compared to existing approaches. Without a clear delineation of novelty, it is difficult for readers to discern the unique contributions of the proposed method.
Insufficient Clinical Feasibility Demonstration: While this paper mentions brain disease prediction as the application domain, it lacks a thorough demonstration of the clinical feasibility of the proposed method.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
(1) “In other words, they generally exploit the common information of modalities, failing to fully consider the supplementary information in the private information of each modality.” What is the meaning of “the supplementary information in the private information of each modality” in this sentence?
(2) SVD can be used to decompose a matrix into a more concise representation, which helps extract the main features from the data. A discussion should be added of the parameters that arise in SVD, such as the number of principal components to retain. In addition, after using SVD, it is necessary to explain how the resulting more compact representation of common information is related to the original information and how it is utilized in subsequent steps.
(3) The use of the Dirichlet energy function lacks an appropriate reference.
(4) The contrastive learning module is not reflected in Figure 1.
(5) Please explain why there are no classification results for the MAFGN method on ABIDE.
(6) I consider that it would be advantageous to compare with advanced multi-modal fusion classification methods, rather than solely focusing on GNN-based methods.
(7) In the DRM module, which involves a self-attention mechanism, it is possible to visualize and analyze the biomarkers corresponding to features with high attention scores and their clinical significance.
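Point (2) asks how many components an SVD should retain. One common heuristic, shown here only as a sketch and not as the paper's procedure, keeps the smallest rank whose singular values capture a chosen fraction of the squared spectral energy. The function name `truncated_svd`, the `energy` threshold, and the synthetic matrix are assumptions for illustration.

```python
import numpy as np

def truncated_svd(matrix, energy=0.9):
    """Keep the smallest rank whose singular values hold `energy` of the squared spectrum."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # First index whose cumulative energy reaches the threshold, converted to a rank.
    rank = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    compact = u[:, :rank] * s[:rank]          # compact representation, one row per sample
    approx = compact @ vt[:rank]              # projection back to the original feature space
    return rank, compact, approx

# Hypothetical input: a nearly rank-2 matrix (20 samples, 10 features) plus small noise.
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 10))
data += 0.01 * rng.normal(size=data.shape)

rank, compact, approx = truncated_svd(data, energy=0.99)
rel_error = np.linalg.norm(data - approx) / np.linalg.norm(data)
```

The link between the compact and original representations is explicit here: `compact @ vt[:rank]` reconstructs the input, and the relative Frobenius error is bounded by the square root of the discarded energy fraction.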
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Reject — could be rejected, dependent on rebuttal (3)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Lack of Clear Explanation for Problem to be Solved: This paper inadequately explains the issue it aims to address, particularly regarding the problem of multi-modal fusion. The failure to fully consider the supplementary information within the private information of each modality may result in an incomplete understanding of the multi-modal data and could potentially hinder the effectiveness of the proposed method in capturing the complete range of information from each modality.
Insufficient Discussion on SVD Parameters and Interpretation: While this paper mentions the use of SVD to obtain a more concise representation of data, it lacks discussions on important parameters such as the number of principal components to retain.
Incomplete Presentation of Results: This paper does not provide classification results of the MAFGN method on the ABIDE dataset, and no explanation is given for this omission. In addition, this paper lacks analysis of visualizing biomarkers corresponding to featu
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #2
- Please describe the contribution of the paper
- The authors propose a multi-graph representation learning approach that fully exploits the dependencies of subjects to learn context-aware feature representation for each subject.
- A multi-modal disentangled representation pipeline by considering both common information of modalities and private information of each modality is utilized to better guide disease prediction.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The multi-modal disentangled representation pipeline for learning both common and private information of each modality has the potential for improving the diagnostic performance.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
The description of the experimental settings is unclear, for example the hyperparameters of Eq. 10 and the implementation of the single-modal-based methods.
The proposed method exhibits a higher standard deviation than others, notably MAFGN and MMGL. A detailed analysis of this is necessary.
This work utilizes multi-modal data for representation learning, yet it lacks experiments using only single-modal data.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
The description of the experimental settings is unclear. For instance, Inception GCN, MLP, and similar single-model methods are mentioned, but it is not explained how these were implemented to process multi-modal data in this study.
The proposed method exhibits a higher standard deviation than others, notably MAFGN and MMGL. A detailed analysis of this is necessary.
This work utilizes multi-modal data for representation learning, yet it lacks experiments using only single-modal data. The proposed multi-modal strategy has several limitations: 1) the prediction stage requires multi-modal data as input; 2) it does not consider the contribution of each modality; 3) without comparing to single-modal methods, the performance improvement claim lacks support.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Weak Accept — could be accepted, dependent on rebuttal (4)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The multi-modal disentangled representation pipeline, which learns both common and private information of each modality, is intriguing and may enhance diagnostic performance for AD or ASD. However, additional experiments are necessary. The authors should provide more detailed analysis of the experimental results. Furthermore, diagnostic tasks such as AD vs. MCI, and MCI vs. NC, are more critical than conducting direct three-class classification.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Review #3
- Please describe the contribution of the paper
Authors propose MGDR, a novel approach for brain disease prediction using multi-modal medical data. MGDR integrates individual modality graphs with a GCN to disentangle common and private information, refining the latter through contrastive learning. Additionally, a Multi-modal Perception Attention module enhances feature integration. This method outperforms advanced counterparts on ABIDE and TADPOLE datasets, validating its efficacy without extra parameters.
- Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1-Novelty: MGDR’s use of multi-graph representation and disentangled representation of modalities is innovative. This dual approach allows the model to effectively exploit both the common and private information from multi-modal data, which is crucial for improving the accuracy of disease prediction.
2-High performance: The model demonstrates superior performance over existing state-of-the-art methods in terms of multiple evaluation metrics on two different datasets. This proves the novelty of the model for the brain disease prediction task.
3-Comprehensive Evaluation: Alongside quantitative evaluations, the ablation studies demonstrate the contribution of each component of the MGDR model, proving the effectiveness of integrating disentangled common and private information for disease prediction.
- Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
1-Complexity: The model’s complexity, with multiple layers of graph representation and disentangled representation, might lead to increased computational demands. This could limit its applicability in real-time or low-resource settings.
2-Lack of Comparative Analysis Against Non-Graph-Based Methods: The paper focuses on comparing MGDR with other graph-based and multi-graph-based methods. A comparison with advanced non-graph-based methods could provide a clearer picture of MGDR’s positioning within the broader landscape of disease prediction methodologies.
3-Generalizability: While MGDR shows excellent results on the ABIDE and TADPOLE datasets, its performance on other datasets or across different types of brain diseases has not been discussed. This raises questions about its generalizability to other conditions or more diverse patient populations.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Do you have any additional comments regarding the paper’s reproducibility?
N/A
- Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
1- Given the intricate architecture of the MGDR model, which integrates multiple layers of graph representation and disentangled modalities, could you elaborate on the computational complexity of this model and how it impacts the model’s performance in real-time or resource-constrained environments?
2- The model’s performance with varying numbers of input modalities and their subtypes presents an intriguing area of investigation. Could you discuss how the MGDR model’s performance might be affected when the number of multi-modal inputs is altered? Specifically, what are the observed outcomes when the inputs are from different subtypes of modalities such as functional/structural/morphological brain networks?
3- In the evaluation of the MGDR model, the selection of the ABIDE and TADPOLE datasets has been pivotal. To enhance the transparency and robustness of the study, it would be beneficial to provide a rationale for choosing these specific datasets over other available options. Clarification on this matter in the revised version of the paper would contribute significantly to understanding the scope and applicability of the model across different datasets and brain disease types.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making
Accept — should be accepted, independent of rebuttal (5)
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents a novel method that is at the core of the MICCAI conference.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed
N/A
- [Post rebuttal] Please justify your decision
N/A
Author Feedback
To Reviewer #1
Q1: Inception GCN, MLP, and similar single-model methods are mentioned, but it is not explained how these were implemented to process multi-modal data.
A1: Inception GCN uses multi-modal data to construct a single graph and utilizes GCN to aggregate multi-modal features. The MLP method first concatenates the multi-modal features and then feeds them into a multi-layer perceptron for prediction.
Q2: The proposed method exhibits a higher standard deviation than others.
A2: Our model is evaluated with 10-fold cross-validation. Due to the uneven distribution of features across the different training and testing sets, adding SVD may result in some fluctuations in the model’s performance.
Q3: This work utilizes multi-modal data for representation learning, yet it lacks experiments using only single-modal data.
A3: Thanks for your suggestions. We will further enrich our evaluation by incorporating a comparison with some single-modal methods.
To Reviewer #5
Q1: What is the meaning of “the supplementary information in the private information of each modality” in this sentence?
A1: Here, “the supplementary information” refers to the complementary information of each modality. For example, MRI provides information on the structural characteristics of the brain, while PET provides information on the metabolism and function of the brain.
Q2: It is necessary to explain how the resulting more compact representation of common information is related to the original information and how it is utilized in subsequent steps.
A2: Yes. Using SVD, we can obtain a more compact representation of common information. We will add a discussion of the proposed method w.r.t. the SVD parameters in the revision.
Q3: Why there are no classification results of the MAFGN method on ABIDE, please explain.
A3: The MAFGN paper does not report performance on ABIDE, and the code of MAFGN has not been released. We will try to re-implement MAFGN on ABIDE in our future work.
Q4: I consider that it would be advantageous to compare with advanced multi-modal fusion classification methods, rather than solely focusing on GNN-based methods.
A4: We will include more advanced non-GNN methods in future experiments.
Q5: It does not provide a detailed explanation of what aspects of the method are truly innovative.
A5: First, we propose a novel disentangled representation pipeline for multi-modal data learning that exploits both the common information of modalities and the private information of each modality. Second, we employ a new contrastive learning module to learn a more compact representation of private information for each modality.
Q6: It lacks a thorough demonstration of the clinical feasibility of the proposed method.
A6: We apologize for the unclear claim here. Our proposed method is currently evaluated on several public datasets and we report the prediction performance on these datasets.
To Reviewer #7
Q1: Could you elaborate on the computational complexity of this model?
A1: We will provide the details of the computational complexity of our model in the revision. We cannot provide them in the rebuttal because the rules of the rebuttal stage penalize including any new experiments.
Q2: Could you discuss how the MGDR model’s performance might be affected when the number of multi-modal inputs is altered?
A2: Yes. We can discuss the model’s performance w.r.t. the number of multi-modal inputs. When using only the fMRI modality, the performance is not very good. Adding data from the other three modalities one by one shows a significant improvement in performance.
Q3: It would be beneficial to provide a rationale for choosing these specific datasets (ABIDE and TADPOLE) over other available options.
A3: (1) They are publicly available and challenging in the field of brain disease prediction. (2) Both of them contain multi-modal data. (3) They cover different types of brain diseases, effectively evaluating the model’s applicability.
Meta-Review
Meta-review #1
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper integrates graph representation and disentangled representation learning for brain disease prediction from multi-modal data. The proposed methodology incorporates some intriguing ideas and was compared with multiple GCN-based approaches on two public benchmarks. However, most of the compared methods date from before 2022, which may not be up-to-date. Additionally, as pointed out by Reviewer #, more non-GCN-based comparison methods should be included. Given the large variation in Table 2, p-values should be reported to verify if the performance improvement is statistically significant. This work is on the borderline due to the flaws in experimental validation. I am slightly inclined to accept it.
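The request for p-values could be met in several ways. As one hedged illustration, the sketch below runs an exact two-sided paired sign-flip permutation test on per-fold scores from the same cross-validation splits. The choice of test, the function name `paired_sign_flip_pvalue`, and the placeholder scores are assumptions; none of the numbers come from the paper.

```python
from itertools import product

def paired_sign_flip_pvalue(scores_a, scores_b):
    """Exact two-sided paired permutation test on per-fold score differences.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so every sign assignment is enumerated (2**n cases, cheap for 10 folds).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = abs(sum(s * d for s, d in zip(signs, diffs)))
        if permuted >= observed - 1e-12:      # small tolerance for float rounding
            extreme += 1
    return extreme / total

# Placeholder per-fold accuracies for two methods (made-up values, not results).
placeholder_method_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.79]
placeholder_method_b = [0.78, 0.80, 0.81, 0.77, 0.80, 0.79, 0.79, 0.81, 0.78, 0.77]

p_value = paired_sign_flip_pvalue(placeholder_method_a, placeholder_method_b)
print(f"two-sided p-value: {p_value:.4f}")
```

The test is only valid if both methods are evaluated on identical folds, since it relies on pairing the scores fold by fold.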
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors’ rebuttal responded to most of the concerns. Discussion of the computational complexity, the number of multi-modal inputs, and the high standard deviation will be included in the revision, and non-graph-based and single-modal comparisons will be investigated further in future experiments. Some concerns remain unresolved, e.g., the number of principal components in SVD.
Strength: This paper introduces a novel Multi-modal Graph Disentangled Representation (MGDR) method, which integrates both Multi-Graph Representation Learning (MGRL) and Disentangled Representation of Modalities (DRM). This method offers a new perspective on capturing dependencies among multi-modal subjects for context-aware representation.
Weakness: Lack of comparative analysis against non-graph-based methods and single-modal methods.