Abstract
Brain decoding is a pivotal topic in neuroscience, aiming to reconstruct stimuli (e.g., images) from brain activity (e.g., fMRI). However, existing methods rely on subject-specific modules and flatten 3D voxel grids, limiting generalization and discarding spatial information. To address these issues, we propose MindLink, a scalable cross-subject brain decoding framework designed to link multiple subjects into a single model by extracting subject-invariant features while preserving the spatial structure of 3D fMRI data. This is achieved by parcellating 3D fMRI into standardized cubic patches processed by a 3D Vision Transformer for informative representations. Domain adversarial training enhances cross-subject generalizability by extracting subject-agnostic features within a single model structure. We also introduce a two-level alignment strategy that effectively bridges fMRI and stimulus image embeddings through instance-level consistency and flexible token-level matching. MindLink achieves comparable or even better performance than state-of-the-art methods on the NSD dataset with a constant parameter size across subjects, and demonstrates strong adaptability to new subjects.
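To make the cubic-patching step concrete, here is a minimal sketch of splitting a 3D fMRI volume into non-overlapping cube tokens for a 3D Vision Transformer; the volume shape, cube size, and embedding width are illustrative assumptions, not values from the paper:

```python
# Minimal sketch of 3D cube patching for a ViT-style fMRI encoder.
# The volume shape (80x96x80), cube size (8), and embedding width
# (256) are illustrative assumptions, not values from the paper.
import torch
import torch.nn as nn

class CubePatchEmbed(nn.Module):
    def __init__(self, cube=8, in_ch=1, dim=256):
        super().__init__()
        # A strided 3D convolution with kernel == stride is equivalent
        # to splitting the volume into non-overlapping cubes and
        # linearly projecting each one.
        self.proj = nn.Conv3d(in_ch, dim, kernel_size=cube, stride=cube)

    def forward(self, x):                    # x: (B, 1, D, H, W)
        x = self.proj(x)                     # (B, dim, D/c, H/c, W/c)
        return x.flatten(2).transpose(1, 2)  # (B, num_cubes, dim)

vol = torch.randn(2, 1, 80, 96, 80)  # batch of 3D fMRI volumes
tokens = CubePatchEmbed()(vol)       # (2, 1200, 256) cube tokens for the ViT
```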
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/5263_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
NSD dataset: https://naturalscenesdataset.org
BibTex
@InProceedings{JunSun_MindLink_MICCAI2025,
author = { Jung, Sungyoon and Lee, Donghyun and Kim, Won Hwa},
title = { { MindLink: Subject-agnostic Cross-Subject Brain Decoding Framework } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15971},
month = {September},
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper presents MindLink, a method for linking multiple subjects into a single model by extracting subject-invariant features and preserving the spatial structure of 3D fMRI. The method uses standardized cubic patches and domain adversarial training to improve cross-subject generalizability. A two-level alignment strategy bridges fMRI and image embeddings through instance-level consistency and flexible token-level matching. Experimental results on NSD show that MindLink achieves comparable performance across subjects and adapts well to new subjects.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
(1) The paper introduces MindLink, which enables linking multiple subjects within a single model to extract subject-invariant features while preserving the spatial structure of 3D fMRI.
(2) The proposed two-level alignment strategy bridges fMRI and image embeddings through instance-level consistency and flexible token-level matching.
(3) The method achieves competitive or superior performance on NSD, while maintaining a constant parameter size across subjects and demonstrating strong adaptability to new subjects.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
(1) The key concept of subject-agnostic learning requires further clarification. Specifically, how does it differ from methods like MindEye2 and UMBRAE, which use subject-specific layers to map representations into a shared space? What are the concrete advantages of adopting a subject-agnostic approach? A stronger validation would be to integrate the proposed method into the MindEye2/UMBRAE framework by replacing their encoders with the proposed one. If this results in performance improvements, it would provide solid evidence supporting the benefits of the subject-agnostic strategy. Additionally, visualizing the representations of different subjects using UMAP for all three methods would further highlight their differences and show the advantages.
(2) The paper lacks critical implementation details, such as the structures of the 3D fMRI encoder, decoder, fMRI projector, and subject classifier. Providing these details is essential for ensuring reproducibility and clarity.
(3) The proposed pipeline is complex and introduces training challenges. The method employs multiple loss functions, increasing the difficulty of balancing them during training. The absence of a dynamic balancing mechanism to adjust these losses could also hinder stable convergence (one possible mechanism is sketched after this list).
(4) The experiment on New Subject Adaptation needs to include a comparison with existing methods, especially MindEye2, to better assess its effectiveness.
(5) The paper claims that MindLink extracts “subject-invariant features while preserving the spatial structure of 3D fMRI.” However, there is no clear experimental evidence to support these claims. To validate this, the following experiments are needed:
- Subject-invariance: perform representation similarity analysis (RSA) or UMAP visualizations across different subjects to demonstrate that the extracted features are indeed consistent across individuals (a minimal sketch of such a check also follows this list).
- Spatial structure preservation: provide attention or activation maps to show that the model maintains spatial relationships in the fMRI data. Evaluating spatial correlation metrics before and after transformation could also help substantiate this claim.
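One possible dynamic balancing mechanism of the kind alluded to in weakness (3) is homoscedastic-uncertainty weighting (Kendall et al., CVPR 2018); this is a generic sketch, not part of MindLink itself:

```python
# Hedged sketch of homoscedastic-uncertainty loss weighting
# (Kendall et al., CVPR 2018), one generic dynamic balancing
# mechanism. Not part of MindLink; the loss names are placeholders.
import torch
import torch.nn as nn

class UncertaintyWeighting(nn.Module):
    def __init__(self, num_losses=4):
        super().__init__()
        # One learnable log-variance per loss term.
        self.log_vars = nn.Parameter(torch.zeros(num_losses))

    def forward(self, losses):
        total = 0.0
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])
            # Each term is down-weighted by its learned variance,
            # with a regularizer preventing the variance from growing
            # without bound.
            total = total + precision * loss + self.log_vars[i]
        return total

# usage sketch: total = weigher([l_recon, l_align, l_adv, l_local])
```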
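And a minimal sketch of the subject-invariance check suggested in weakness (5); random features stand in for real encoder outputs, so every name here is a placeholder:

```python
# Sketch of a UMAP subject-invariance check: embed pooled fMRI
# features from several subjects and see whether points cluster by
# subject (good invariance = mixed clusters). Random features stand
# in for real encoder outputs.
import numpy as np
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
feats, labels = [], []
for sid in range(4):             # e.g., NSD subjects 01, 02, 05, 07
    z = rng.normal(size=(200, 256))   # placeholder for encoder features
    feats.append(z)
    labels.append(np.full(200, sid))

emb = umap.UMAP(n_components=2, random_state=0).fit_transform(np.vstack(feats))
plt.scatter(emb[:, 0], emb[:, 1], c=np.concatenate(labels), cmap="tab10", s=4)
plt.title("fMRI features colored by subject id")
plt.show()
```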
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper demonstrates good potential, but additional experiments and analyses are required to further substantiate its claims and enhance its overall presentation.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
The paper introduces MindLink, a cross-subject brain decoding framework designed to reconstruct visual stimuli from fMRI data without relying on subject-specific modules. Its key components are: 1) preserving the spatial structure of 3D fMRI data via standardized cubic patching with 3D Vision Transformers; 2) domain adversarial training to extract subject-invariant features and support cross-subject generalization in a single shared model; 3) a two-level alignment strategy (instance-level and token-level) that aligns fMRI embeddings with image embeddings from pretrained vision models, enabling image reconstruction via a Stable Diffusion model without further fine-tuning.
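For readers unfamiliar with component 2), the standard mechanism for domain adversarial training is a gradient reversal layer (Ganin et al., JMLR 2016); a minimal sketch follows, with the feature width and subject count as assumptions rather than MindLink's actual configuration:

```python
# Minimal sketch of domain adversarial training via a gradient
# reversal layer (Ganin et al., JMLR 2016). Feature width (256) and
# subject count (4) are assumptions, not MindLink's actual settings.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        # Flip (and scale) the gradient so the encoder learns to fool
        # the subject classifier, stripping subject-identifying cues.
        return -ctx.lam * grad, None

subject_head = nn.Sequential(nn.Linear(256, 128), nn.ReLU(),
                             nn.Linear(128, 4))

def subject_adversarial_loss(features, subject_ids, lam=1.0):
    # Cross-entropy on reversed features: minimized by the head,
    # effectively maximized by the encoder feeding it.
    logits = subject_head(GradReverse.apply(features, lam))
    return nn.functional.cross_entropy(logits, subject_ids)
```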
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) The authors target a significant limitation in current brain decoding methods—poor generalization across subjects—and propose a practical solution for scalability and applicability in broader settings. 2) Instead of flattening the voxel grid, which is common in existing work, the paper preserves 3D spatial structure via cubic patching and processes it with a 3D Vision Transformer, which is a thoughtful architectural design choice. 3) The proposed method maintains a constant parameter size across subjects and performs competitively, even outperforming some larger models on specific metrics. This supports claims of scalability and efficiency. 4) The inclusion of new subject adaptation experiments (both quantitative and qualitative) adds practical value, demonstrating generalization ability with limited target subject data.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1) While the paper states that it is the first to perform single-model cross-subject fMRI reconstruction, similar directions have been explored in prior work, such as [Xiong et al. 2024]. The proposed cube patching approach is a reasonable and effective way to preserve spatial structure across subjects. However, other established standardization strategies also exist, such as surface-based mappings (e.g., the unit disk mapping from [Xiong et al.], the HCP pipeline by [Benson et al. 2018]) or aligning to a common template like the averaged brain model (the fsaverage brain already preprocessed by NSD [1]). A discussion of these alternatives, or a comparison via ablation, would have strengthened the justification for the chosen strategy.
2) Although cross-subject generalization is presented as a core contribution, the current experiments are conducted on the same four NSD subjects for both training and testing (01, 02, 05, 07). This is consistent with protocols in prior work (e.g., MindBridge [21]), but limits the strength of the generalization claim. A key value proposition of cross-subject models is the ability to transfer to new, unseen subjects. Demonstrating performance on an unseen subject—whether within NSD or from another dataset such as NOD [Gong et al. 2023]—would offer much stronger support for the model’s practical generalizability and potential clinical applicability. While such an evaluation may be beyond the current scope, it represents an important direction for validating subject-agnostic approaches.
3) In terms of methodology, the paper builds on a number of well-established techniques—including domain adversarial training, masked voxel modeling, and pretrained diffusion-based image generation. While the integration is well executed, the novelty lies more in system design than in the introduction of new algorithmic principles. This limits the contribution from a foundational machine learning perspective. That said, demonstrating strong generalization to unseen subjects—as mentioned above—would significantly elevate the contribution and help differentiate it from prior work like [21] or [23].
4) Since the cube patching and the subject classifier are more novel than the model's other components, it remains unclear how much of the performance gains stem from these particular choices versus the overall architecture. A clearer comparison with conventional representations (e.g., flattened voxels or spatial averaging) would help isolate the impact of this design. Additionally, applying cube patching to prior single-subject methods could help contextualize its benefits and clarify whether the full MindLink framework is necessary to realize the observed improvements.
5) The model includes several moving components and requires careful tuning and training. However, the paper does not mention plans for code or model release. This may limit reproducibility and make it difficult for other researchers to validate or extend the work.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper tackles an important challenge in brain decoding and presents a well-designed system that integrates spatially-aware fMRI processing, adversarial training, and multi-level alignment for cross-subject reconstruction. The results are promising, and the approach is technically solid.
However, the main concerns include limited novelty (similar cross-subject frameworks have been proposed) and a lack of validation on unseen subjects, which weakens the generalization claims. Key design choices like cube patching are not compared against other established alternatives, and the contribution of individual components is not fully disentangled. Finally, the absence of code or a reproducibility discussion is a concern given the model's complexity.
These limitations collectively lead to a weak reject recommendation, though I believe the work has strong potential with further validation and clarification.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors' feedback convincingly addressed most of the questions from me and the other reviewers, leading me to give an accept as my final decision.
Review #3
- Please describe the contribution of the paper
1) A cross-subject brain decoding framework called MindLink is proposed to link multiple subjects into a single model by extracting subject-invariant features while preserving the spatial structure of 3D fMRI data. 2) Through domain adversarial training, features unrelated to subject identity are extracted within a single model structure, which enhances the ability to generalize across subjects. 3) The two-level alignment strategy learns instance-level consistency and flexible token-level matching, effectively connecting fMRI and stimulus image embeddings.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) MindLink innovatively combines domain-adversarial training with 3D spatial preservation to extract subject-invariant fMRI features, effectively addressing generalization challenges. 2) By integrating visual experience database retrieval into the decoding process, the proposed method enriches low-level and high-level visual representations, providing more comprehensive insights into brain activity. 3) The spatial-structure retention mechanism is an innovative design: it strictly maintains the 3D topological characteristics of fMRI while extracting subject-invariant features, jointly optimizing the interpretability of the neural representations and cross-subject performance.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1) The description of the model is not specific enough. Although the authors give a specific optimization function and mathematical model, it is not detailed, presumably due to space limitations. There is also some confusion; for example, what is the specific difference between the MSE used for the reconstruction loss and the MSE loss? 2) The discussion of the experiments is not detailed enough. For example, the final objective function contains four loss terms, but λ₃ is set small compared to the other three; will it still contribute to the model? The authors do not discuss the four hyperparameters.
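For reference, the final objective presumably takes a weighted-sum form like the following; the term names are inferred from the author feedback below (which also corrects the small weight in question to λ₄ = 0.01), and the exact formulation in the paper may differ:

```latex
\mathcal{L}_{\text{total}}
  = \lambda_1 \mathcal{L}_{\text{recon}}
  + \lambda_2 \mathcal{L}_{\text{align}}
  + \lambda_3 \mathcal{L}_{\text{adv}}
  + \lambda_4 \mathcal{L}_{\text{local}},
  \qquad \lambda_4 = 0.01 .
```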
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This article presents a novel and biologically inspired approach to brain decoding that significantly improves the accuracy and interpretability of visual stimulus reconstruction from fMRI data. The framework aligns with theoretical psychological principles and demonstrates superior performance over existing methods. Its potential applications in neuroscience and related fields are substantial, making it a valuable contribution to the field.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The method in the original manuscript is reasonable and the experiments are relatively sufficient. Moreover, during the rebuttal, the authors provided sufficient responses and supplementary explanations for most of the points raised in the reviews. I think it can be accepted.
Author Feedback
[All] Reproducibility A) We will publicly release our code upon acceptance as some of the details were left out due to page limit.
[R1] Confusion between MSE-based losses A) Both losses use MSE but serve different purposes—one for reconstruction, the other for instance-level alignment. We will rename L_mse as L_align.
[R1, R3] Ablation study A) (1) Token-level alignment loss (L_local): There was a typo in our paper: λ₄, not λ₃, is set to 0.01. The reviewer's point still holds, as this value is smaller than the others. We set λ₄ to a small value due to larger gradients from L_local, but despite its small weight, it remains essential; removing it (λ₄ = 0) caused severe performance drops: PixCorr: 0.019, SSIM: 0.228, Alex(2): 51.6%, Alex(5): 51.1%, Incep: 50.4%, CLIP: 49.7%, EffNet-B: 0.986, SwAV: 0.710. (2) Spatial encoding (cube patching): Replacing 3D cube input with flattened voxels led to substantial performance degradation: PixCorr: 0.216, SSIM: 0.326, AlexNet(2): 91.2%, AlexNet(5): 96.6%, InceptionV3: 91.9%, CLIP: 92.4%, EffNet-B: 0.685, SwAV: 0.375. To further support this, attention maps from our 3D ViT-based fMRI encoder exhibit consistent spatial locality, suggesting that the model learns and preserves meaningful spatial structure. These results were left out due to the page limit but will be included in the revision.
[R1,R2, R3] Lack of implementation details A) We will release the code for reproducibility. Meanwhile, the fMRI encoder is a 16-layer Transformer, the fMRI decoder is an 8-layer Transformer, the fMRI projector is a 4-layer Perceiver, and the subject classifier is a 2-layer MLP. We used a 50-epoch warm-up phase with only the reconstruction loss, which stabilized the encoder before introducing other objectives. Loss weights were tuned based on gradient magnitudes, and this combination led to stable convergence across training runs. We appreciate the reviewer’s suggestion and will consider dynamic balancing in future work.
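A sketch of the components named above follows; only the layer counts come from the author feedback, while the hidden width, head count, and Perceiver internals are illustrative assumptions:

```python
# Sketch of the stated MindLink components. Only the layer counts
# (16/8/4/2) come from the author feedback; width, heads, and the
# Perceiver internals are assumptions.
import torch
import torch.nn as nn

dim, heads = 256, 8

def enc_layer():
    return nn.TransformerEncoderLayer(dim, heads, batch_first=True)

fmri_encoder = nn.TransformerEncoder(enc_layer(), num_layers=16)
# Encoder-style blocks used here as a stand-in for the 8-layer decoder
# that reconstructs masked cube tokens.
fmri_decoder = nn.TransformerEncoder(enc_layer(), num_layers=8)

class PerceiverBlock(nn.Module):
    # One cross-attention block; four of these approximate the
    # "4-layer Perceiver" projector (internals assumed).
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(),
                                nn.Linear(dim * 4, dim))

    def forward(self, latents, tokens):
        latents = latents + self.attn(latents, tokens, tokens)[0]
        return latents + self.ff(latents)

fmri_projector = nn.ModuleList([PerceiverBlock() for _ in range(4)])
subject_classifier = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(),
                                   nn.Linear(128, 4))  # 2-layer MLP, 4 subjects
```

Under the 50-epoch warm-up described above, the three non-reconstruction losses would presumably just be disabled until epoch 50.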
[R2] Subject-agnostic learning clarification A) The novelty of our subject-agnostic learning lies in explicitly enforcing subject invariance through domain adversarial training, beyond simply mapping data to a shared space. This approach not only improves performance (Table 2, rows 2 and 3) but also enables a single model architecture regardless of the number of subjects, offering strong scalability and parameter efficiency. Our subject-agnostic strategy combines encoder design, standardized input, and a subject-adversarial loss, and cannot be validated through encoder replacement alone. We have t-SNE figures showing that subject-wise clustering is alleviated when the subject loss is applied; these were left out due to the page limit but will be included in the revision.
[R3] Standardization strategy A) We are aware of prior work using surface-based mapping strategies [Xiong et al. ISBI 2024], [Gu et al. MIDL 2023]. As a preliminary experiment, we applied Gu’s method to MindLink but observed worse results: PixCorr: 0.176, SSIM: 0.295, Alex(2): 82.5%, Alex(5): 89.1%, Incep: 84.2%, CLIP: 84.3%, EffNet-B: 0.794, SwAV: 0.456. We attribute this to substantial information loss during surface mapping: ~16k voxels within the nsdgeneral ROI are reduced to 3.8k vertices in the fs_LR 32k surface space. As our model removes subject differences via domain adversarial training, additional alignment steps (e.g., fsaverage) become redundant and may cause unnecessary information loss. Therefore, we adopt a minimal preprocessing strategy to preserve input integrity.
[R3] Cross-subject generalization A) The cross-subject generalization experiment, already included in Sec. 3.3 (New Subject Adaptation), compares a model trained on subjects 01, 02, and 05 and fine-tuned on unseen subject 07, to a model trained solely on subject 07 (Table 3, Fig 3). As the available data from the new subject decreases, the performance gap between the two settings increases—demonstrating the model’s generalization ability in low-data regimes.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A