Abstract
Asymptomatic neurocognitive impairment (ANI) is an early stage of HIV-associated neurocognitive disorder. Recent studies have investigated magnetic resonance imaging (MRI) for ANI analysis, but most of them rely on a single modality, neglecting the complementary information available from multiple MRI modalities.
The few existing multimodal MRI fusion studies usually suffer from "modality laziness", where dominant modalities suppress weaker ones due to misalignment and scale disparities, limiting fusion efficacy. To address these issues, we propose Uncertainty-aware Multimodal MRI Fusion (UMMF), a novel framework integrating structural MRI, functional MRI, and diffusion tensor imaging for ANI identification. UMMF employs modality-specific encoders with an uncertainty-aware alternating unimodal training strategy to reduce modality dominance and enhance feature extraction.
Moreover, a random network prediction method is designed to estimate uncertainty weights for each modality, enabling robust uncertainty-aware fusion that prioritizes reliable modalities.
Extensive experiments demonstrate UMMF's superior performance over state-of-the-art methods, achieving significant improvements in prediction accuracy. Additionally, our approach can help identify critical brain regions associated with ANI, offering potential clinical biomarkers for early intervention. Our code is available at https://github.com/IsaacKingCzg/IK_MICCAI25_UMMF.
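The abstract mentions a "random network prediction" method for estimating per-modality uncertainty weights but does not spell it out. Below is a minimal, hypothetical sketch in the spirit of random network distillation: a fixed, randomly initialized target network and a trained predictor, with the prediction error serving as an uncertainty score. The class and parameter names (RandomNetworkUncertainty, feat_dim, hidden_dim) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RandomNetworkUncertainty(nn.Module):
    """RND-style uncertainty score for one modality (illustrative sketch, not the paper's exact design)."""
    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Fixed, randomly initialized target network: never updated.
        self.target = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, hidden_dim))
        for p in self.target.parameters():
            p.requires_grad_(False)
        # Predictor network: trained to mimic the target on in-distribution features.
        self.predictor = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, hidden_dim))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Per-sample squared prediction error; a larger error indicates higher uncertainty.
        with torch.no_grad():
            target_out = self.target(features)
        pred_out = self.predictor(features)
        return ((pred_out - target_out) ** 2).mean(dim=-1)
```

In such a scheme the predictor is fit by minimizing this error on training features, and at inference the per-modality errors can be turned into fusion weights (e.g., a softmax over negative uncertainties).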
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2517_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/IsaacKingCzg/IK_MICCAI25_UMMF
Link to the Dataset(s)
N/A
BibTex
@InProceedings{CheZig_UncertaintyAware_MICCAI2025,
author = { Chen, Zige and Qin, Haonan and Wang, Wei and Zhou, Zhongkai and Zhao, Chen and Fang, Yuqi and Shan, Caifeng},
title = { { Uncertainty-Aware Multimodal MRI Fusion for HIV-Associated Asymptomatic Neurocognitive Impairment Prediction } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15974},
month = {September},
page = {633 -- 643}
}
Reviews
Review #1
- Please describe the contribution of the paper
The authors propose a new way to perform three-way modality merging to predict HIV-Associated Asymptomatic Neurocognitive Impairment. Specifically, the authors combine sMRI, DTI, and fMRI data into a single fusion framework for ANI prediction.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The authors provide a good visualization of the ANI-associated brain regions they find with their model. Providing these interpretations is important to place the work into the context of other ANI works, and verify that the learned features are valid. The authors should be commended for the attempted ablation study, although as discussed below, I don’t find the experimental conditions very convincing.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1) Only three-fold validation with such a small dataset can lead to poor statistical estimates of performance.
2) The authors only use a single dataset that is extremely small and not open source. Although the prediction task is quite specific, I don't think the methodological choices are tested thoroughly enough with just this single dataset.
3) The authors do not explain their architectural choices well: why are the architectures for the sMRI and DTI data different, and why did the authors use this specific GNN for the fMRI data?
4) Did the authors look at gradient cancellation in the shared layer for their alternating training? Since each modality updates the shared layer independently, noisy gradients may be introduced into the updates of the shared layer.
5) For Equation 4, both I and phi are undefined.
6) Although the authors list many baselines in Table 1, they do not cite important works in this field (https://pmc.ncbi.nlm.nih.gov/articles/PMC8001877/) or use their method as a baseline.
7) The authors do not do a thorough hyperparameter search, which may heavily affect performance on such a small dataset. Nor do the authors provide enough information to reproduce their work.
Some small grammar/spelling mistakes or inconsistencies:
- Page 4: "esitmating" -> "estimating"
- Page 4: "… employs random network prediction method" -> "… employs the random network prediction method"
- Page 4: "For modality m, its data is donated as" -> use a different word than "donated"
- Page 5: "… while g is a shared single-layer perceptron …": does this mean a perceptron with one hidden layer or just a single linear layer?
- Page 5: "… denote the trainable parameters" -> "denotes"
- Page 8: "… we introduced a Uncertainty aware" -> "an"
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The authors present a method that is potentially impactful, but the current evaluation does not make me confident that their methodological choices actually improve performance; the reported improvements could be spurious (the dataset they use is small) or due to baseline models without optimal hyperparameters.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper proposes an uncertainty-aware multimodal MRI fusion framework for HIV-associated asymptomatic neurocognitive impairment prediction. It combines three different modalities with uncertainty estimation, which is of good interest.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- It combines sMRI, fMRI, and DTI for ANI prediction
- It introduces uncertainty estimation to improve robustness.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The experiments are validated on only one dataset, which is weakly persuasive.
- The lack of careful parameter reporting makes reproducibility questionable.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper integrates multimodal data with uncertainty estimation for ANI prediction, and the experiments achieve good results with reasonable analysis. The paper has a certain novelty, but it lacks detailed experimental settings, such as the configuration of each modality encoder and the details of the comparison methods.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
This paper proposes an uncertainty-aware multimodal fusion framework for ANI diagnosis using fMRI, sMRI, and DTI. To address the modality laziness problem often encountered in the joint optimization of multimodal learning, the authors introduce an alternating unimodal training strategy. In contrast to existing methods that treat all modalities equally, the paper further proposes an uncertainty-based weighting mechanism to adaptively adjust the contribution of each modality based on its estimated uncertainty.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper is well-organized and generally easy to follow. The methodology is clearly explained. Each contribution is well motivated by gaps in existing studies. The experimental validation is thorough and supports the effectiveness of the method.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
One concern is that the three modality-specific encoders are actually trained independently, and fusion occurs only at the logit level. This late fusion strategy may limit the model's ability to capture complementary information across modalities. Could this affect performance?
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
One minor comment: If space permits, it would be helpful to directly annotate in Table 1 which modality or modalities are used by each comparison method
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper is well-organized and well-motivated.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The comparison with ASFF does not necessarily suggest that the late fusion strategy is superior to mid-level fusion, as ASFF only uses two modalities, whereas your method uses three. In fact, among the ten comparison methods, only one (Masked GNN) incorporates all three modalities, making it the only fair baseline for direct comparison with your proposed method. It may be helpful to include more multi-modal baselines that leverage all three modalities in future work to make your experimental results more convincing. Besides, I also agree with other reviewers that the small dataset size is a concern.
Author Feedback
We thank all reviewers for their constructive feedback, such as "good results and reasonable analysis" (R1), "good visualization" (R2), and "well-organized; methodology is clearly explained; experimental validation is thorough" (R3). We have addressed the main concerns below and will release the source code later.
- Dataset size and model effectiveness (R1Q1, R2Q1, R2Q2)
In our study on HIV-associated ANI, gathering sMRI, DTI, and fMRI from the same patient is extremely challenging. Note that our dataset is currently the largest worldwide to include all three modalities for ANI patients. We used 3-fold cross-validation, repeated 5 times, to reduce partitioning bias. Our model is evaluated with multiple metrics (AUC, ACC, F1, SEN, SPE, PRE) in Table 1, and the experimental results demonstrate its superiority and robustness over existing works, despite the dataset size. Future work will explore more diverse datasets for further model evaluation.
- Architectural choices for different modalities (R2Q3)
- For sMRI and DTI: Considering that sMRI provides high-resolution 3D anatomical images, we use 3D CNNs to capture its spatial patterns. DTI, however, is treated as a brain connectome after preprocessing, and thus we use a graph-based method (BrainNetCNN) to extract topological features through specialized layers (e.g., node2edge, edge2graph layers).
- For fMRI: fMRI naturally captures both the spatial structure and temporal dynamics of the brain, and is usually preprocessed as a graph (ROIs as nodes, connectivity as edges). Here, we use a GIN and a Transformer to capture its spatial and temporal patterns, respectively. We chose the GIN for its WL-test-level expressiveness and its lightweight MLP-based design suited to small-sample datasets (as in our case), and the Transformer for its ability to capture long-range temporal dependencies (see the sketch below).
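For readers unfamiliar with GIN, here is a minimal sketch of the GIN update rule on a dense connectivity matrix, written in plain PyTorch. It illustrates the general operator only, not the authors' encoder; the names (GINLayer, eps, num_rois) are assumptions.

```python
import torch
import torch.nn as nn

class GINLayer(nn.Module):
    """Minimal GIN-style update: h_v' = MLP((1 + eps) * h_v + sum over neighbors of h_u)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))  # learnable epsilon from the GIN formulation
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(), nn.Linear(out_dim, out_dim))

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_rois, in_dim); adj: (num_rois, num_rois) functional connectivity graph.
        aggregated = adj @ node_feats  # sum over neighbor features, weighted by connectivity
        return self.mlp((1.0 + self.eps) * node_feats + aggregated)
```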
- Shared-layer gradient cancellation (R2Q4)
In this work, we apply the Recursive Least Squares-based correction from [1] before each shared-layer update to orthogonalize the gradient against previous modality features, mitigating gradient cancellation and preserving all learned cross-modal information. We will include these details in the final version.
[1] Multimodal representation learning by alternating unimodal adaptation (CVPR 2024)
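To make the intent of this correction concrete, the following toy sketch removes from a shared-layer gradient its component lying in the span of previously seen modality feature directions via a least-squares projection. The actual method in [1] uses a recursive least-squares formulation, so this is only an illustrative assumption; the function name project_out is hypothetical.

```python
import torch

def project_out(grad: torch.Tensor, prev_feats: torch.Tensor) -> torch.Tensor:
    """Remove from `grad` (shape (d,)) its component in the column span of `prev_feats` (shape (d, k)).

    Illustrative stand-in for the RLS-based correction in [1]: the corrected gradient is
    orthogonal to previously accumulated feature directions, so a new modality's update
    does not overwrite what earlier modalities wrote into the shared layer.
    """
    # Least-squares coefficients of the gradient on the stored feature directions.
    coeffs = torch.linalg.lstsq(prev_feats, grad.unsqueeze(-1)).solution  # (k, 1)
    return grad - (prev_feats @ coeffs).squeeze(-1)
```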
- Missing citation of a competing method [2] (R2Q6)
Thank you for highlighting [2], but its focus differs from ours. Specifically, [2] targets HIV-associated neurocognitive impairment with obvious cognitive symptoms, while our study addresses the much more challenging asymptomatic stage. Although [2] is not included as a baseline, our competing method (ASFF) outperforms the SVM used in [2] (AUC: 55.65 ± 3.96%), and our approach further surpasses ASFF (Table 1), indirectly demonstrating superiority over [2].
[2] DOI: 10.1007/s13365-020-00930-4
- Hyperparameter settings and model reproducibility (R1Q2, R2Q7)
We conducted thorough hyperparameter tuning: training epochs: 70, optimizer: Adam, batch size: 6, learning rate: 6×10⁻⁴, λ in Eq. 4: 0.0001. Encoder configurations are detailed in Sec. 2.2. To ensure fair comparison, all baseline methods were also carefully tuned for optimal performance. Due to space limits, their details are omitted but will be included along with the source code in the revised version.
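For readers aiming to reproduce this setup, the stated hyperparameters translate into roughly the following training configuration. The dict keys and the use of torch.optim.Adam are assumptions about how these settings would be wired up; λ is kept as a generic loss coefficient because Eq. 4 is not reproduced here.

```python
import torch

# Hyperparameters reported in the authors' rebuttal.
CONFIG = {
    "epochs": 70,
    "batch_size": 6,
    "learning_rate": 6e-4,
    "lambda_eq4": 1e-4,  # coefficient λ in Eq. 4 of the paper
}

def build_optimizer(model: torch.nn.Module) -> torch.optim.Optimizer:
    # Adam with the reported learning rate; other Adam defaults are assumptions.
    return torch.optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])
```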
- Late fusion strategy (R3)
Although we use late fusion, a shared MLP is used to capture cross-modal complementarity at the feature level. Moreover, compared with methods using early (MaskGNN) and middle (ASFF) fusion, our approach achieves superior results. Future work will explore combinations of different fusion strategies.
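The following is a schematic, hypothetical sketch of how per-modality logits could be combined under uncertainty-based weighting (a softmax over negative uncertainties). It illustrates the general late-fusion idea discussed above, not the paper's exact formulation; the function name and tensor shapes are assumptions.

```python
import torch

def uncertainty_weighted_fusion(logits: dict[str, torch.Tensor],
                                uncertainties: dict[str, torch.Tensor]) -> torch.Tensor:
    """Fuse per-modality logits, down-weighting modalities with high estimated uncertainty.

    logits[m]:        (batch, num_classes) class logits from modality m's encoder.
    uncertainties[m]: (batch,) per-sample uncertainty score for modality m.
    """
    modalities = list(logits.keys())
    # Higher uncertainty -> smaller weight; softmax over modalities, per sample.
    neg_unc = torch.stack([-uncertainties[m] for m in modalities], dim=-1)   # (batch, M)
    weights = torch.softmax(neg_unc, dim=-1)                                 # (batch, M)
    stacked = torch.stack([logits[m] for m in modalities], dim=-1)           # (batch, C, M)
    return (stacked * weights.unsqueeze(1)).sum(dim=-1)                      # (batch, C)
```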
- Minor revisions
- Symbol definitions (R2Q5): In Eq. 4, φ and I are typos; they should be ψ and N. Function g is a single fully connected (linear) layer.
- Grammar (R2): We will correct all grammatical issues in the final version.
- Modality annotation (R3): We will annotate the modalities used by each method in Table 1.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper proposes an uncertainty-aware multimodal fusion framework for the prediction of HIV-associated asymptomatic neurocognitive impairment (ANI) using sMRI, DTI, and fMRI. The authors introduce modality-specific encoders, alternating unimodal training, and uncertainty-based weighting, aiming to address modality imbalance and fusion challenges in low-sample multimodal learning. While the dataset size is limited and the paper lacks public code at submission, the methodological design is sound, and the rebuttal adequately addresses reviewers’ main concerns. I support acceptance.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A