Abstract

Alzheimer’s disease (AD) is an irreversible neurodegenerative disease for which early diagnosis is crucial to improving prognosis and delaying progression. Leveraging multimodal PET images, which can reflect various biomarkers such as Aβ and tau protein, is a promising approach to AD diagnosis. However, because of the high cost and practical constraints of PET imaging, multimodal data are often incomplete. To address this problem, we propose a Graph-embedded latent Space Learning and Clustering framework, named Graph-SLC, for multiclass AD diagnosis under incomplete multimodal data scenarios. The key idea is to leverage all available subjects, including those with incomplete modality data, to train a network that projects subjects into latent representations. These latent representations not only exploit the complementarity of different modalities but also exhibit separability among different classes. Specifically, Graph-SLC consists of three modules: a multimodal reconstruction module, a subject-similarity graph embedding module, and an AD-oriented latent clustering module. The multimodal reconstruction module generates subject-specific latent representations that comprehensively incorporate information from different modalities, guided by all available modalities. The subject-similarity graph embedding module then enhances the discriminability of the latent representations by ensuring that neighborhood relationships between subjects are preserved in them. The AD-oriented latent clustering module promotes the separability of multiple classes by constraining subject-specific latent representations within the same class to fall in the same cluster. Experiments on the ADNI dataset show that our method achieves state-of-the-art performance in multiclass AD diagnosis. Our code is available at https://github.com/Ouzaixin/Graph-SLC.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1165_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1165_supp.pdf

Link to the Code Repository

https://github.com/Ouzaixin/Graph-SLC

Link to the Dataset(s)

https://adni.loni.usc.edu/data-samples/access-data/

BibTex

@InProceedings{Ou_AGraphEmbedded_MICCAI2024,
        author = { Ou, Zaixin and Jiang, Caiwen and Liu, Yuxiao and Zhang, Yuanwang and Cui, Zhiming and Shen, Dinggang},
        title = { { A Graph-Embedded Latent Space Learning and Clustering Framework for Incomplete Multimodal Multiclass Alzheimer’s Disease Diagnosis } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15007},
        month = {October},
        pages = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper introduces a graph-based latent space learning and clustering method for multimodal data in Alzheimer’s Disease (AD). It comprises three main components: the reconstruction module, the subject-similarity graph embedding module, and the AD clustering module.

    The paper demonstrates the performance of utilizing various modalities in the results and compares them with methods from existing literature. Additionally, an ablation study is conducted to justify the inclusion of each component of the model.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper introduces an architecture consisting of three components: multimodal reconstruction, subject similarity graph embedding, and a clustering module for downstream classification tasks.

    2. The selection of a modality shared layer and modality-specific layer is appropriate for multimodal studies, enabling the downstream classification task to leverage multimodal information effectively.

    3. The paper provides a diagnosis of its modality choices in Table 2, compares its method with other literature in Table 3, and justifies the selection of model components through an ablation study.

    4. Performance evaluation is conducted on a large dataset from ADNI, ensuring the robustness and scalability of the proposed method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The method of determining subject similarity by taking the negative averaged absolute difference between scans may present issues, particularly for Tau-PET data. The absolute value of scans may not be meaningful, especially considering the different value distributions in the cerebellum, which serves as a reference region for Tau-PET and Abeta-PET scans. Additionally, proper preprocessing steps for PET scans are missing in the experiment description, and the chosen similarity metric may not be suitable due to the inherent noise in PET scans.

    2. The justification for choosing hyper-parameters lambda1 and lambda2 as 1 and 0.1, respectively, is required. Understanding the rationale behind these choices would provide insights into their impact on the model’s performance and optimization process.

    3. The discussion in section 3.3 regarding the imputation-based method seems insufficient. If the author perceives the synthetic PET images as noisy and wishes to evaluate their impact on classification results accurately, a more controlled experimental setup is warranted. In this scenario, a single model should be utilized to ensure consistency across other conditions, including parameter size and network structure. The primary variation in the experimental setup should focus solely on whether synthetic PET images are used or not. By maintaining uniformity in all other conditions, the impact of synthetic PET images on classification results can be accurately assessed.

    4. In the results comparing with other literature methods (Table 3), it is essential to control the network size or ensure that they are in the same range. Additionally, the size of the trainable parameters for each method should be described to provide a comprehensive understanding of the model complexities and computational requirements.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    While the model may outperform certain methods in the literature, if the classification error remains substantial when utilizing multiple modalities, it is crucial for the authors to investigate the source of this error. One potential avenue for exploration is visualizing the latent space of Alzheimer’s Disease (AD) and Mild Cognitive Impairment (MCI) subjects using the multimodal data. This visualization can provide insights into how the model represents and distinguishes between different disease states, potentially shedding light on areas where the model struggles or misclassifies. Also, what are the potential uses of this method beyond classification?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well organized, and its results are compared with other methods in the literature. However, the classification accuracy is still very low. The subject similarity metric is not a proper choice. The PET data preprocessing may not be appropriate, judging from the description.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The method for addressing missing data in multimodality is appealing. However, the improvement over other methods is not significant.



Review #2

  • Please describe the contribution of the paper

    The paper proposes a latent space learning method to handle the missing modality problem in AD diagnosis. The framework consists of a multimodal reconstruction module to synthesize each modality from the shared subject-specific representation, a subject-similarity graph embedding module that uses cross-subject relationships to guide representation learning for missing modalities, and a clustering module to facilitate better separation of classes. The method was evaluated on ADNI and demonstrated superior performance when compared with other methods handling missing inputs.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The method is relatively novel. The idea of using the cross-subject similarity to guide the feature learning of missing modality is new. The idea of distribution consistency loss in the latent clustering module also seems to be new, but not easy to follow.
    • The evaluation is solid and thorough.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The method section is not well written and is hard to follow. For example, in 2.1, how exactly was the representation z obtained? Is it an energy-based model? Moreover, in 2.3, how were the probability density distributions of subjects and representations obtained? f^ is not introduced. And the distribution consistency loss is not well motivated.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The model architecture was not introduced.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    As in weakness above.

    Was the proposed method significantly better than other methods?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Mainly the clarity issue.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose a graph-based latent space learning approach that addresses the challenge of missing data in multimodal diagnosis, eliminating the requirement for paired data in incomplete datasets. Additionally, they develop a subject similarity graph to capture the integrity relationship within the neighborhood. Furthermore, they introduce a latent space clustering method to enhance the separability between classes. They compare their results with state-of-the-art approaches, and the results demonstrate promise.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The novelty lies in introducing loss functions for multimodal reconstruction, including the graph embedding loss and distribution consistency loss.
    2. Efficiency is demonstrated in eliminating the need for paired data in incomplete datasets, in comparison with traditional missing data imputation methods.
    3. The paper provides performance results of the approach with various combinations of modalities and comparison with state-of-the-art methods.
    4. Applicable to clinical settings where multimodal data are common with limited paired data.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The utilization of spectral clustering in AD-oriented latent clustering is not novel.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    There is no mention of reproducibility in the paper. Additionally, there is no provided access date for the ADNI dataset.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    There is a lack of clarity in Fig 1 regarding how subject-specific latent representations are utilized both for creating reconstructed multimodal data and for generating subject similarity graph embeddings simultaneously. The relationship between the reconstructed multimodal data and the other modules is ambiguous. Minor comments: The ADNI dataset is not properly referenced, and the access date is missing. It is encouraged to provide access to the code for reproducibility purposes.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper introduces novelty by proposing three distinct modules within the graph-embedded latent space representation framework, each accompanied by unique loss functions. Furthermore, it conducts comparisons with state-of-the-art methods and includes an ablation study, demonstrating the effectiveness of the proposed approach. Additionally, the paper efficiently addresses the challenge of eliminating paired data in multimodal settings, particularly crucial in medical applications.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Accept — should be accepted, independent of rebuttal (5)

  • [Post rebuttal] Please justify your decision

    Same decision as before rebuttal.




Author Feedback

We thank reviewers (R3, R4, R5) for their constructive comments and acknowledgment of our novel method in addressing the challenges of incomplete multimodal learning tasks.

Q1: Model details and reproducibility (R3, R4, R5) A1: Each subject is assigned a randomly initialized, trainable subject-specific latent representation. This latent representation is then optimized by a multimodal reconstruction module to integrate information from all available modalities, and further refined by a subject-similarity graph embedding module and AD-oriented clustering to enhance diagnostic capacity. The multimodal reconstruction module consists of four layers of upsampling and residual blocks. The first two layers follow the sequence {Conv3D-Upsample-Residual-Upsample-Residual} with parameters shared across modalities, while the last two layers follow {Upsample-Residual-Upsample-Residual-Conv3D} with modality-specific parameters. We will release all code and subject ID lists upon acceptance.
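To make the idea in A1 concrete, the following is a minimal sketch of decoding one subject's latent representation through a shared stage and then modality-specific stages, with the reconstruction error computed only over the modalities that were actually acquired. It is not the authors' implementation: the scalar weights stand in for the Conv3D/upsampling/residual layers, the mean absolute error is an assumed loss form, and the names `decode`, `masked_reconstruction_loss`, `mod_a`, `mod_b`, and `mod_c` are hypothetical.

```python
from typing import Dict, List, Optional


def decode(latent: List[float], shared_weight: float,
           modality_weights: Dict[str, float]) -> Dict[str, List[float]]:
    """Toy decoder: one stage shared by all modalities, then one stage per modality."""
    # Shared stage (stands in for the layers with shared parameters).
    shared = [shared_weight * v for v in latent]
    # Modality-specific stage (stands in for the modality-specific layers).
    return {name: [w * v for v in shared] for name, w in modality_weights.items()}


def masked_reconstruction_loss(recon: Dict[str, List[float]],
                               scans: Dict[str, Optional[List[float]]]) -> float:
    """Mean absolute error, averaged over the modalities that are available."""
    errors = []
    for name, target in scans.items():
        if target is None:  # missing modality: contributes no loss term
            continue
        pred = recon[name]
        errors.append(sum(abs(p - t) for p, t in zip(pred, target)) / len(target))
    if not errors:
        raise ValueError("subject has no available modality")
    return sum(errors) / len(errors)


# One subject with a two-dimensional latent and a missing second modality.
latent = [1.0, 2.0]
recon = decode(latent, shared_weight=2.0,
               modality_weights={"mod_a": 1.0, "mod_b": 0.5, "mod_c": 3.0})
scans = {"mod_a": [2.0, 6.0], "mod_b": None, "mod_c": [6.0, 12.0]}
loss = masked_reconstruction_loss(recon, scans)
```

Because the missing modality is skipped rather than imputed, a subject with any subset of modalities can still contribute a gradient to its own latent representation, which is the property the rebuttal relies on.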

Q2: Distribution consistency loss (R3, R4) A2: To improve class separability in the latent space, the AD-oriented clustering module uses a distribution consistency loss to preserve the cluster structure of subjects in their subject-specific latent representations. Specifically, we apply a spectral clustering algorithm to the latent representations to obtain the clustering results. Then, we estimate the probability density distributions of the original subject space and the latent space using kernel density estimation. Finally, a distribution consistency loss between these two distributions guides the update of the latent representations. The improvement in Table 4 (from 61.88 ACC to 64.43 ACC) validates the effectiveness of the proposed module.
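The density-matching step in A2 can be illustrated with a small sketch. The rebuttal does not state the kernel, the bandwidth, or the exact form of the consistency loss, so everything below is an assumption: a Gaussian kernel, a density evaluated at each subject and normalized to sum to one, and a KL divergence as the discrepancy. The spectral clustering step is omitted, and the function names `kde_density` and `consistency_loss` are hypothetical.

```python
import math
from typing import List, Sequence


def kde_density(points: Sequence[Sequence[float]], bandwidth: float) -> List[float]:
    """Gaussian kernel density estimate at each point, normalized to sum to 1.

    The kernel's constant factor is dropped because it cancels in the
    normalization.
    """
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            squared_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-squared_dist / (2.0 * bandwidth ** 2))
        densities.append(total / len(points))
    norm = sum(densities)
    return [d / norm for d in densities]


def consistency_loss(p: Sequence[float], q: Sequence[float]) -> float:
    """KL divergence KL(p || q) between two discrete distributions (assumed loss form)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Three subjects: coordinates in the original space and in a 1-D latent space.
subject_space = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
latent_space = [[0.0], [0.1], [3.0]]

p_subject = kde_density(subject_space, bandwidth=1.0)
p_latent = kde_density(latent_space, bandwidth=1.0)
loss = consistency_loss(p_subject, p_latent)
```

The loss is zero when the two normalized densities coincide and positive otherwise, so minimizing it with respect to the latent coordinates pulls the latent density toward the one estimated in the original space.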

Q3: Classification error and performance (R4, R5) A3: The classification error has two main causes: (1) a large amount of irrelevant information misleading the classifier, and (2) significant similarities between some MCI subjects and those with AD or NC. To address the first issue, we employ a latent space learning approach to project the different modalities of each subject into a subject-specific latent representation, thereby reducing the generation of irrelevant information. For the second issue, our method introduces a subject-similarity graph embedding module and an AD-oriented clustering module to enhance class separability. The notable improvement over the second-best approach, namely an increase of 2.14% in ACC and 3.03% in F1S, validates the effectiveness of our model.

Q4: The chosen similarity metric (R5) A4: All PET images are preprocessed using a standard pipeline, involving (1) intensity normalization based on the cerebellum and (2) smoothing with a Gaussian kernel to reduce inherent noise. While cerebellar distributions may vary, this preprocessing ensures comparability among PET images. Additionally, as the chosen similarity metric assesses global similarity, the cerebellum’s influence on computing relative neighborhood relationships is minimal due to its small proportion. Future optimization efforts will target disease-relevant regions.
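As an illustration of A4, the sketch below normalizes each scan by the mean uptake in a reference region and then scores pairs of scans with the negative mean absolute difference that Review #1 describes. It is a simplified stand-in, not the authors' pipeline: scans are flattened lists rather than 3D volumes, the Gaussian smoothing step is omitted, and the names `normalize_by_reference`, `similarity`, and `nearest_neighbors` are hypothetical.

```python
from typing import List, Sequence


def normalize_by_reference(voxels: Sequence[float],
                           reference_mask: Sequence[bool]) -> List[float]:
    """Divide every voxel by the mean uptake inside the reference region."""
    reference = [v for v, inside in zip(voxels, reference_mask) if inside]
    if not reference:
        raise ValueError("reference mask is empty")
    reference_mean = sum(reference) / len(reference)
    return [v / reference_mean for v in voxels]


def similarity(scan_a: Sequence[float], scan_b: Sequence[float]) -> float:
    """Negative mean absolute difference: 0 for identical scans, lower otherwise."""
    return -sum(abs(a - b) for a, b in zip(scan_a, scan_b)) / len(scan_a)


def nearest_neighbors(scans: Sequence[Sequence[float]], index: int, k: int) -> List[int]:
    """Indices of the k scans most similar to scans[index]."""
    scored = [(similarity(scans[index], s), j)
              for j, s in enumerate(scans) if j != index]
    scored.sort(reverse=True)
    return [j for _, j in scored[:k]]


# The last two voxels play the role of the reference (cerebellar) region.
mask = [False, False, True, True]
raw = [[2.0, 4.0, 2.0, 2.0],   # subject 0
       [4.0, 8.0, 4.0, 4.0],   # subject 1: same pattern, doubled intensity
       [1.0, 4.0, 1.0, 1.0]]   # subject 2: different pattern
scans = [normalize_by_reference(s, mask) for s in raw]
neighbors_of_0 = nearest_neighbors(scans, index=0, k=1)
```

In this toy case subjects 0 and 1 differ only by a global intensity scale, so they become identical after reference-region normalization and subject 1 is ranked as the nearest neighbor of subject 0. Without normalization the raw absolute differences would instead favor subject 2.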

Q5: Hyper-parameters and competitive models (R5) A5: We give equal training importance to multimodal reconstruction and neighborhood relationship learning (lambda1 set to 1). For lambda2, we set a smaller value, as the classification task converges more easily than the reconstruction task. Related research has shown that using synthetic PET can enhance performance, although its effectiveness is constrained by noise. Thus, in Section 3.3, our focus shifts to comparing different types of methods for the incomplete multimodal diagnosis task. We ensure that the methods compared in Table 3 are implemented either using released code or following the descriptions in the original papers, and that all network sizes are within the same range. This consistency guarantees a fair comparison of performance metrics. We will add these details.
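For orientation, the weighting discussed in A5 can be written as a weighted sum of loss terms. This form is an assumption inferred from the rebuttal and from the values quoted in Review #1 (lambda1 = 1, lambda2 = 0.1); the symbol names are hypothetical, and the third term stands for whichever loss lambda2 weights, which the rebuttal associates with the classification task.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{rec}}
  + \lambda_1 \, \mathcal{L}_{\text{graph}}
  + \lambda_2 \, \mathcal{L}_{\text{aux}},
\qquad \lambda_1 = 1, \quad \lambda_2 = 0.1
```

Under this reading, lambda1 = 1 puts the reconstruction and neighborhood-preservation terms on an equal footing, while lambda2 = 0.1 down-weights the term that is said to converge more easily.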




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper presents a novel and effective graph-based latent space learning approach for handling missing data in multimodal Alzheimer’s Disease (AD) diagnosis. This innovative method introduces unique loss functions for multimodal reconstruction and efficiently manages incomplete datasets without requiring paired data, demonstrating superior performance compared to state-of-the-art methods. Strengths from reviewers include (1) the method’s applicability to clinical settings; (2) the solid evaluation on the ADNI dataset and the novel use of cross-subject similarity and distribution consistency loss for feature learning; (3) the solid evaluation and appropriate modality layers, supported by a comprehensive comparison with existing methods and an ablation study. Despite some concerns about the subject similarity metric and preprocessing steps, the strengths in the novel methodology and robust evaluation outweigh the weaknesses. Thus, I would suggest accepting this paper.



