Abstract

Traditional multi-modal medical image fusion methods typically employ a hierarchical feature fusion strategy. However, due to inconsistencies among features at different scales, these approaches often introduce unanticipated deformations during the fusion process. Such deformations accumulate through successive registration steps and ultimately result in oscillatory distortions at the fine-detail level. To address this challenge, we propose a progressive image reconstruction framework that is guided by multi-scale deformation fields. Specifically, the input images are first mapped into feature spaces at multiple scales and a deformation field prediction strategy is employed to generate multiple deformation fields that capture both local and global transformation trends simultaneously. Notably, the deformation fields generated across all scales possess the intrinsic capability to directly perform image registration. This capability eliminates the need for sequential propagation of registration outcomes and effectively mitigates cumulative deformation issues. In the image reconstruction phase, we adopt a progressive coarse-to-fine strategy, leveraging multi-scale deformation fields to achieve accurate structure restoration and fusion. Extensive experimental results demonstrate that the proposed method significantly enhances image alignment accuracy and fusion quality across multiple datasets, offering an efficient and robust solution for multi-modal medical image processing.
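The central operation in the abstract, applying a dense deformation field to an image, can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `warp`, the (dy, dx) channel order, the use of backward warping with bilinear interpolation, and the border clamping are all choices made here for the example.

```python
import numpy as np

def warp(image, field):
    """Backward-warp a 2D image with a dense displacement field.

    image: (H, W) array.
    field: (2, H, W) array of (dy, dx) displacements in pixels (assumed layout).
    Output pixel (y, x) samples the input at (y + dy, x + dx) using bilinear
    interpolation; sample positions are clamped to the image border.
    """
    h, w = image.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(ys + field[0], 0, h - 1)
    sx = np.clip(xs + field[1], 0, w - 1)
    y0 = np.floor(sy).astype(int)
    x0 = np.floor(sx).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = sy - y0  # fractional offsets used as interpolation weights
    wx = sx - x0
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

# Toy usage: a field that samples one pixel to the right everywhere.
img = np.arange(16.0).reshape(4, 4)
shift_right = np.zeros((2, 4, 4))
shift_right[1] = 1.0
shifted = warp(img, shift_right)
```

Because each scale is described as yielding a complete field, any single field (resampled to image resolution) could in principle be passed to a function like this without first composing it with the fields from other scales.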

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1799_paper.pdf

SharedIt Link: https://rdcu.be/eHwMQ

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04937-7_5

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LonNuo_BiMSRec_MICCAI2025,
        author = { Long, Nuoer AND Yang, Kaiwen AND Xie, Xinyu AND Yu, Zitong AND Tan, Tao AND Sun, Yue},
        title = { { BiMSRec: A Progressive Image Reconstruction Framework for Medical Image Fusion Guided by Multi-Scale Deformation Fields } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        pages = {46--55}
}

Reviews

Review #1

Please describe the contribution of the paper

The paper proposes an image fusion network, BiMSRec, that achieves high-quality medical image fusion by coordinating a registration subnetwork and a fusion subnetwork. For registration, multiple levels of ‘optic flow’ are calculated in two directions to enable more accurate registration. Fusion is performed progressively across multiple scales, cooperating with the registration result at each level. The network reaches good overall performance compared with other models, and the effectiveness of the bidirectional registration field calculation is demonstrated by an ablation study.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The bidirectional registration mapping helps the network to have more knowledge for higher level registration, with good performance on test data. The paper is well written and organized.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The network structure is not very innovative. Bidirectional and multi-level registration have been applied in previous work.
2. The HOFA design is questionable: the flows for the two registration directions (flow_i and flow_{11-i}) are both computed with the same image (F_{Bi}). It makes little sense to warp an image with the deformation field for the opposite task (warping A with the field for B->A), so the authors should provide a clear justification.
3. The loss function of the fusion part is only vaguely described. The authors should give a mathematical expression, as they do for the registration loss.
4. In the quantitative comparisons in Tables 1 and 2, BiMSRec does not reach the top SSIM most of the time, although SSIM is a very important performance indicator. The selection of metrics also needs more explanation: SD is not a popular choice, and intuitively, shouldn’t a smaller SD be better?
5. I would recommend ‘warping field’ or ‘deformation field’ instead of ‘optic flow’, since the current wording misleadingly suggests that this paper is based on the optic flow algorithm.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(3) Weak Reject — could be rejected, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Although the authors propose a network structure with decent performance, the innovation of this network is limited, and the performance is not at an outstanding level.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

The authors addressed my questions well. They distinguished the innovation of their method from existing ones and gave a clearer explanation of the function of bidirectional registration. I am now less skeptical and feel that this work has good innovation and a clearly designed structure.

Review #2

Please describe the contribution of the paper

A fusion method based on multi-scale optical flow is presented. The method consists of a feature extraction network, an optical flow registration network, and a network that reconstructs the fused image using the obtained optical flow maps. The method is evaluated on an open dataset.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Seemingly robust method. Sound approach, due to the multi scale and feature extraction based approach.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Data handling is not clear - are the obtained results from data independent of the data used for method development/training? It is not clear how the competing methods are optimized (hyperparameters and training)?
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

More information on data usage and implementation of competing methods is needed.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

Accept
[Post rebuttal] Please justify your final decision from above.

My assessment agreed to a large extent with the other reviewers, and my comments have been taken into consideration.

Review #3

Please describe the contribution of the paper

The main contribution of the paper is methodological and comprises two components, M2FReg and PFRecon, which together allow bidirectional, multi-scale, and multimodal optical flow registration for accurate alignment. The paper showcases the advantages of the novel method by demonstrating its improved robustness compared to conventional nonlinear registration methods on a variety of modalities.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

I believe the experimental setup is well done, in particular the inclusion of an ablation study that helps justify and demonstrate the effectiveness of the individual components of the model (e.g. forward vs. reverse flow contributions to the performance). The experiments suffice to convince readers that the bidirectional and multiscale mechanisms contribute to the performance. In addition, the experimental results showcasing the robustness and accuracy of the method across different modalities help demonstrate its versatility and wide application potential.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

I believe an external validation set would help make an argument for the robustness of the method to site differences. There isn’t any explanation for the interpretability of the proposed method. Also, given that the authors note that separate registration and fusion methods in the past may have performance and computational complexity disadvantages to their proposed method, a discussion about computational cost and efficiency of the proposed model would be beneficial.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I think the explanation of the proposed method and the experimental setup are sound and well thought out. The results section shows good performance.
Reviewer confidence

Not confident (1)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We thank all reviewers for their helpful and thorough feedback. Our responses are included below:

Q1 Data, hyperparameters and training, external validation, computational cost (R1, R3)
A1: Thank you for the feedback. We strictly followed the KPSFusion [1] protocol to keep the training and test sets fully independent, with training/test splits of CT–MRI: 144/20, PET–MRI: 194/55, and SPECT–MRI: 260/77. Test data were never used during training. In a journal extension, we will incorporate multi-center and protocol-diverse data for external validation. The hyperparameters and training are already discussed in Section 3.1 on page 7. We will add content about computational cost in the revision.
[1] Tang et al. “Rethinking the necessity of image fusion in high-level vision tasks: A practical infrared and visible image fusion network based on progressive semantic injection and scene fidelity.” Information Fusion 99, 101870 (2023)

Q2 Bidirectional/multi-level registration (R4)
A2: The bidirectional and multi-level aspects of our method differ fundamentally from previous work. Previous multi-level methods typically divide the network into several levels, each of which predicts only a “deformation component.” These components must be stacked level by level to form the final deformation field; a single deformation component alone cannot achieve accurate registration. In our method, we independently predict a complete deformation field at each scale (level), not just a local sub-component. Drawing inspiration from ensemble learning, we assign weights to the complete deformation fields at different scales. Small-scale fields are better at aligning details, while large-scale fields excel at global consistency. This dynamic weighting and fusion retains the multi-level strategy’s sensitivity to different information granularities. In previous methods, bidirectionality means swapping the roles of the fixed and moving images (A moving → B fixed, B moving → A fixed) and learning the two deformations separately. In our method, “forward” (flow_i) denotes the large-to-small-scale feature extraction path and “backward” (flow_{11-i}) denotes the small-to-large-scale path. A complete deformation field is independently predicted at each scale in both paths, and these fields are finally integrated through a fusion module.
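The weighting of complete per-scale deformation fields described in A2 can be illustrated with a small NumPy sketch. Everything specific here is an assumption made for the example and is not taken from the paper: the helper names `upsample_field` and `fuse_fields`, nearest-neighbour upsampling, and the use of one softmax-normalised scalar weight per scale (the rebuttal only says the weighting is dynamic).

```python
import numpy as np

def upsample_field(field, factor):
    """Nearest-neighbour upsample a (2, h, w) displacement field by an integer factor.

    Displacements are multiplied by the factor so that they remain expressed
    in full-resolution pixel units.
    """
    up = np.repeat(np.repeat(field, factor, axis=1), factor, axis=2)
    return up * factor

def fuse_fields(fields, logits):
    """Weighted sum of complete deformation fields predicted at several scales.

    fields: list of (2, h_s, w_s) arrays, each a full (not residual) field;
            the largest one defines the output resolution.
    logits: one score per scale (hypothetical), turned into weights by softmax.
    """
    logits = np.asarray(logits, dtype=float)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    full_h = max(f.shape[1] for f in fields)
    full_w = max(f.shape[2] for f in fields)
    fused = np.zeros((2, full_h, full_w))
    for f, wgt in zip(fields, weights):
        fused += wgt * upsample_field(f, full_h // f.shape[1])
    return fused, weights

# Three scales that all encode the same 2-pixel horizontal shift at full resolution.
fields = []
for size, dx in [(8, 2.0), (4, 1.0), (2, 0.5)]:
    f = np.zeros((2, size, size))
    f[1] = dx
    fields.append(f)
fused, weights = fuse_fields(fields, logits=[0.3, 0.0, -0.5])
```

Since every input is already a complete field, the fused result is a convex combination of usable registrations; this differs from residual multi-level schemes, where the per-level components have to be accumulated before the result is meaningful.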

Q3 HOFA design (R4)
A3: In the registration setup of this paper, only image B undergoes a random elastic affine transformation to generate a perturbed image, which leaves it spatially misaligned with the original image A. The registration process is therefore essentially a process of “restoring image B”; image A is never transformed. Both the forward flow_i (large-to-small scale) and the backward flow_{11-i} (small-to-large scale) restore the distorted image B so that it is aligned with image A.
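The setup in A3 can be made concrete at the level of pixel coordinates. The sketch below models only the affine part of the perturbation (the elastic component is omitted), and the helper `affine_map`, the parameter ranges, and the image size are hypothetical choices for illustration. It shows that the ideal "restoring" transformation for B is simply the inverse of the perturbation, while A's coordinates are never touched.

```python
import numpy as np

def affine_map(points, matrix, shift, centre):
    """Apply p -> matrix @ (p - centre) + centre + shift to points of shape (2, N)."""
    return matrix @ (points - centre) + centre + shift

h, w = 32, 32
ys, xs = np.meshgrid(np.arange(h, dtype=float), np.arange(w, dtype=float), indexing="ij")
grid = np.stack([ys.ravel(), xs.ravel()])          # (2, H*W) pixel coordinates (y, x)
centre = np.array([[(h - 1) / 2], [(w - 1) / 2]])  # transform about the image centre

# Random small rotation, scaling and translation (assumed ranges).
rng = np.random.default_rng(0)
angle = rng.uniform(-0.1, 0.1)
scale = rng.uniform(0.95, 1.05)
matrix = scale * np.array([[np.cos(angle), -np.sin(angle)],
                           [np.sin(angle), np.cos(angle)]])
shift = rng.uniform(-2.0, 2.0, size=(2, 1))

# Perturbation applied to the coordinates of image B only.
perturbed = affine_map(grid, matrix, shift, centre)
perturb_field = (perturbed - grid).reshape(2, h, w)  # dense displacement field

# The ideal restoring transformation is the inverse affine map.
inv = np.linalg.inv(matrix)
restored = affine_map(perturbed, inv, -inv @ shift, centre)
restore_error = np.abs(restored - grid).max()
```

In this toy setting `restore_error` is zero up to floating-point rounding. A registration network has to estimate the restoring field from the image pair alone, which is the task both the forward and the backward paths are described as solving for B.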

Q4 Loss function (R4)
A4: Thank you for your valuable suggestions. We will provide detailed mathematical formulas for the fusion loss and registration loss in the revised manuscript.

Q5 Metrics (R4)
A5: There are significant structural differences between modalities, for example when fusing CT and MRI. When SSIM reaches 1, the fusion has failed. An excessively high SSIM may therefore be due to redundant enhancement or interference in the fused image [2], and SSIM needs to be combined with other metrics to jointly measure performance. The other metrics are commonly used in image fusion and have been adopted by various previous image fusion methods. SD (standard deviation) here is not the traditional mathematical standard deviation but measures the dispersion of image pixel values. A higher value indicates higher image contrast and more prominent details, such as edges and textures.
[2] Fang et al. “A cross-modal image fusion method guided by human visual characteristics.” arXiv:1912.08577 (2019).
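The argument in A5 can be checked on toy data. The sketch below makes two assumptions that the rebuttal does not confirm: SD is taken to be the root-mean-square deviation of pixel intensities from the image mean, and SSIM is computed in a simplified single-window (global) form with the usual constants, not the sliding-window version used in practice. The arrays named `ct` and `mri` are random stand-ins, not real images.

```python
import numpy as np

def sd(image):
    """Dispersion of pixel intensities about the image mean (assumed definition of SD)."""
    image = np.asarray(image, dtype=float)
    return float(np.sqrt(np.mean((image - image.mean()) ** 2)))

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (a simplification of windowed SSIM)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(0)
ct = rng.random((16, 16))    # stand-in for one modality
mri = rng.random((16, 16))   # stand-in for a structurally different modality

copy_of_ct = ct.copy()       # a "fusion" that ignores the second modality
average = 0.5 * (ct + mri)   # a naive fusion that blends both

ssim_copy_ct = global_ssim(copy_of_ct, ct)
ssim_copy_mri = global_ssim(copy_of_ct, mri)
sd_ct, sd_average = sd(ct), sd(average)
```

Here `ssim_copy_ct` equals 1 up to rounding precisely because the output carries nothing from the second modality, which matches the authors' point that a perfect SSIM against one source signals failed fusion. The naive average has a lower SD than either input, illustrating why a higher SD is read as better-preserved contrast.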

Q6 Optic flow misleading (R4)
A6: Thank you for the suggestion. To avoid confusion, we will replace all instances of “optical flow” with “deformation field” in the revised manuscript to more accurately reflect the method used in this paper.

Meta-Review

Meta-review #1

Your recommendation

Invite for Rebuttal
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A
After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

Meta-review #2

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

The paper proposes a bidirectional, multi-scale registration and fusion framework for multimodal medical image fusion. The method integrates deformation field estimation at different scales with a progressive fusion mechanism, achieving strong results across CT–MRI, PET–MRI, and SPECT–MRI datasets. The main contributions lie in how complete deformation fields are estimated independently at multiple scales and fused using learned weights, as well as the combined registration–fusion strategy that demonstrates robustness across modalities.

Reviewer 1 found the approach sound and the results promising, although initially sought more clarity regarding dataset partitioning and optimization procedures for competing methods. These were addressed clearly in the rebuttal, leading to an upgraded recommendation. Reviewer 3 appreciated the structured evaluation and ablation studies, noting strong performance and practical applicability, though called for additional insight into computational efficiency and interpretability. Reviewer 4 initially raised concerns about the novelty of the architecture, potential confusion in the use of optical flow terminology, and metric selection. However, the rebuttal provided detailed clarifications regarding the unique aspects of the proposed multi-scale and bidirectional design, justified the metric choices (particularly the interpretation of SSIM and SD in a fusion context), and agreed to revise terminology to better reflect the method.

Overall, the authors have demonstrated both a technically robust method and the ability to address criticism in a constructive and well-reasoned way. While the architecture may not be radically innovative, it is thoughtfully composed, effective, and evaluated with care. Following the reviewers’ scores and a careful check by this AC, the AC recommends acceptance.

Meta-review #3

After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

Accept
Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

N/A

BiMSRec: A Progressive Image Reconstruction Framework for Medical Image Fusion Guided by Multi-Scale Deformation Fields

Author(s):