Abstract
Medical Hyperspectral Imaging (MHSI) has emerged as a promising tool for enhanced disease diagnosis, particularly in computational pathology, offering rich spectral information that aids in identifying subtle biochemical properties of tissues. Despite these advantages, effectively fusing spatial-dimensional and spectral-dimensional information from MHSIs remains challenging because of their inherent high dimensionality and spectral redundancy. To address these challenges, we propose a novel spatial-spectral omni-fusion network for hyperspectral image segmentation, named Omni-Fuse. We introduce abundant cross-dimensional feature fusion operations, including (1) a cross-dimensional enhancement module that refines both spatial and spectral features through bidirectional attention mechanisms; (2) a spectral-guided spatial query selection that selects the most spectral-related spatial feature as the query; and (3) a two-stage cross-dimensional decoder that dynamically guides the model’s attention towards the selected spatial query. Despite its numerous attention blocks, Omni-Fuse remains efficient in execution. Experiments on two microscopic hyperspectral image datasets show that our approach significantly improves segmentation performance compared with state-of-the-art methods, with over 5.73% improvement in DSC. Code available at: https://github.com/DeepMed-Lab-ECNU/Omni-Fuse.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2362_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{ZhaQin_OmniFusion_MICCAI2025,
author = { Zhang, Qing and Pei, Guoquan and Wang, Yan},
title = { { Omni-Fusion of Spatial and Spectral for Hyperspectral Image Segmentation } },
booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15960},
month = {September},
pages = {474 -- 484}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper presents an innovative method for the fusion of multispectral and hyperspectral images, effectively integrating both spatial and spectral information to perform segmentation on hyperspectral images (HSI). The proposed algorithm is structured in multiple stages, each with specific tasks, such as the Mamba block, Swin transformation, Cross-Dimensional Feature Enhancement, Spatial-Spectral Decoder, and Mask Refinement.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strengths of the study are highlighted as follows:
- The paper is clearly written and effectively uses the included figures to support its arguments.
- The experimentation is comprehensive and provides comparisons with a wide range of state-of-the-art methods.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The following areas for improvement have been identified:
- The text is largely composed of blocks previously published in the state-of-the-art literature, which limits the level of innovation presented in the work.
- In relation to the previous point, the authors state: “we are the first to achieve abundant multi-dimensional feature fusion for MHSIs”, a claim that could be considered somewhat ambiguous since the literature contains numerous methods for image fusion across different spectral ranges. Moreover, this aspect is not thoroughly evaluated or analyzed in the text, as the main focus is primarily on segmentation.
- Additionally, the paper suggests that the proposed method helps mitigate redundancy in neighboring wavelengths, but the evaluation mainly focuses on the method’s ability to enhance class separability, without directly addressing the reduction of redundancy.
- While the results obtained are promising, they show considerable similarity between them, and no statistical tests were performed to evaluate their significance (consulting the authors’ guidelines).
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
I recommend highlighting the differences between the blocks taken from state-of-the-art methods and the proposed approach, emphasizing the innovations presented.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
My decision is mainly based on the limitations outlined in the weaknesses section.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
Based on the response provided in the rebuttal, I believe the authors have adequately addressed the reviewers’ comments. However, I suggest that the revised version explicitly specify the type of statistical analysis performed, as well as include a comparison for each of the reported metrics.
Additionally, it would be important for the authors to clarify how the spectral redundancy analysis was conducted. While the rebuttal states that “The proposed CFE block effectively reduces the spectral redundancy from 0.5755 to 0.46328 on MDC and 0.6450 to 0.5154 on GPCC,” it does not detail the methodology used to obtain these values, which is essential to validate the effectiveness of the proposed block.
Furthermore, I recommend clarifying the specific innovation introduced in the designed blocks, as the current presentation may suggest that the approach is mainly a combination of previous works within the same research line. This distinction is particularly relevant in the context of the conference.
Despite these points, and considering the responses provided in the rebuttal, I find it appropriate to recommend the acceptance of the paper for presentation at the conference.
Review #2
- Please describe the contribution of the paper
The authors propose a novel spatial-spectral omni-fusion network for hyperspectral image segmentation, which fully fuses the spatial-spectral information via comprehensive attention mechanisms.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) Two branches for spatial and spectral information, respectively. 2) Spectral information is used to guide spatial features. 3) A two-stage cross-dimensional decoder that dynamically guides the model’s attention towards the selected spatial query.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The writing needs work; for example, 1) the phrasing “Though [6]” is inappropriate, and 2) for X_{spec} \in R^{BHW*S}, the authors do not explain what B is.
- What is the motivation for using CNN in the spatial branch and Mamba in the spectral branch?
- How is X_{spec} flattened to T_{prispec}?
- In the Spatial-Spectral Decoder, since T_{spa}^{'} is derived from T_{spa}, cross-attention between the two seems unreasonable.
- The authors should show the result of using only spatial information.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
see weaknesses.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The method fuses spatial and spectral information and achieves state-of-the-art (SOTA) performance.
Review #3
- Please describe the contribution of the paper
This paper presents Omni-Fuse, a spatial-spectral fusion network for medical HSI segmentation. It employs bidirectional attention for cross-dimensional feature enhancement and spectral-guided spatial selection to reduce redundancy. The method demonstrates superior performance (5.73% DSC improvement) with comprehensive experiments. While comparisons with recent spectral-specific methods could be expanded, the work offers significant technical advancement and clinical potential.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The proposed Omni-Fuse network introduces an innovative bidirectional attention mechanism for effective spatial-spectral feature fusion in medical HSI segmentation, coupled with a spectral-guided selection strategy to reduce redundancy. Its coarse-to-fine decoder architecture demonstrates superior performance (5.73% DSC improvement) through comprehensive validation on both public and private datasets, supported by thorough ablation studies and clear visualizations that enhance interpretability. The well-documented methodology ensures reproducibility while advancing the field with both technical novelty and clinical applicability.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The bidirectional attention and two-stage decoder, while effective, are not novel (similar to Transformer/Mask2Former designs), with insufficient differentiation from prior work. The paper lacks theoretical justification for why bidirectional attention optimally fuses spatial-spectral features, missing mathematical/visual support. Comparisons with recent Transformer/Mamba-based methods (e.g., SpectralFormer, Mamba-UNet) are absent. Ablation studies omit critical variants (e.g., multi-head vs. deformable attention), limiting architectural insights.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
While the paper presents a technically sound approach with promising results (5.73% DSC improvement), its incremental novelty over existing attention-based architectures and lack of thorough comparisons/analysis limit its impact. The work meets baseline acceptability but requires stronger differentiation from prior art and expanded validation to demonstrate significant contribution.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have sufficiently addressed my previous comments and clarified the issues raised. I support acceptance of the paper.
Author Feedback
We thank all reviewers and the AC for their constructive feedback. We design an omni-fusion framework for MHSI segmentation. Unlike previous work, we treat the spatial and spectral dimensions as two modalities and design a series of cross-dimensional fusion blocks distributed throughout the network. Below, we address the major concerns one by one.
1) @R1 @R2 @R3: Novelty.
1.1) We argue that our method is novel. Existing methods perform fusion across different spectral ranges but fail to capture their subtle changes. In this paper, we treat the spatial and spectral dimensions as two modalities to comprehensively analyze the full spectral profiles of different pathological regions and their inter-regional differences. The primary feature extraction block is not our contribution: it uses Mamba from MDN [11] for spectral features and a simple CNN for spatial features. We then design multiple feature fusion blocks throughout the network to enhance spatial-spectral representation learning: (1) a cross-dimensional feature enhancement (CFE) block refines spatial and spectral features by incorporating complementary information from the other modality, (2) a feature selection block identifies the most representative features based on the spectral features to serve as decoder queries, and (3) a two-stage spatial-spectral decoder progressively transforms the compacted features into a coarse segmentation mask in cooperation with the enhanced spatial and spectral features. Unlike Mask2Former, our decoder differs in both input and structure.
1.2) SpectralFormer achieves a DSC of 82.68% on MDC and 79.47% on GPCC, and Mamba-UNet achieves a DSC of 81.63% on MDC and 77.53% on GPCC, further demonstrating Omni-Fuse’s superiority.
2) @R1: Redundancy. The proposed CFE block reduces the spectral redundancy from 0.5755 to 0.46328 on MDC and from 0.6450 to 0.5154 on GPCC, indicating its ability to suppress redundant information across adjacent spectral bands.
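The rebuttal does not state how these redundancy values were computed. As a minimal sketch of one possible proxy, assuming (purely for illustration) that redundancy is measured as the mean absolute Pearson correlation between adjacent spectral bands, the calculation could look like the following. The function names and toy data are hypothetical and are not taken from the paper or its code.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant band has no defined correlation; treat as 0
    return cov / math.sqrt(vx * vy)


def adjacent_band_redundancy(bands):
    """Mean |correlation| between neighbouring bands.

    `bands` is a list of S bands, each a flat list of pixel or feature values.
    ASSUMPTION: illustrative proxy only, not the metric used in the rebuttal.
    """
    if len(bands) < 2:
        raise ValueError("need at least two bands")
    scores = [abs(pearson(bands[i], bands[i + 1])) for i in range(len(bands) - 1)]
    return sum(scores) / len(scores)


# Toy data (hypothetical): three perfectly correlated bands vs. uncorrelated neighbours.
redundant = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [1.5, 2.5, 3.5, 4.5]]
diverse = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0], [1.0, 2.0, 3.0, 4.0]]
print(adjacent_band_redundancy(redundant), adjacent_band_redundancy(diverse))
```

Under this proxy, a lower score after a fusion block would mean that neighbouring bands carry less duplicated information, which is the direction of change the rebuttal reports.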
3) @R1: Significance evaluation. Statistical significance tests on the DSC metric show that all p-values are below 0.05 on both datasets (QSQL-FL [6] vs. Omni-Fuse: p-value = 0.0463), indicating that our model’s improvements over existing SOTA methods are statistically significant.
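The rebuttal does not name the statistical test that was used. As a hedged illustration of how a paired comparison of per-image DSC values can be carried out, the sketch below implements an exact two-sided sign-flip permutation test. The choice of test, the function name, and the DSC values are all assumptions made for this example; they are not the authors' procedure or data.

```python
from itertools import product


def paired_sign_flip_pvalue(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis that both methods perform equally, each paired
    difference is equally likely to be positive or negative, so all 2**n sign
    patterns are enumerated. Only practical for small n.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:  # small tolerance for float round-off
            extreme += 1
    return extreme / 2 ** len(diffs)


# Hypothetical per-image DSC values for two models on the same 8 test images.
dsc_model_a = [0.86, 0.84, 0.88, 0.83, 0.87, 0.85, 0.89, 0.82]
dsc_model_b = [0.82, 0.81, 0.85, 0.80, 0.83, 0.80, 0.86, 0.79]
p_value = paired_sign_flip_pvalue(dsc_model_a, dsc_model_b)
print(p_value)
```

A paired test is a natural fit when both models are evaluated on the same test images, because it compares per-image differences rather than two independent score distributions.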
4) @R2 @R3: Ablation.
4.1) Bidirectional attention. Fig. 3 presents t-SNE visualizations of features enhanced by CFE with the bidirectional attention mechanism, showing improved discrimination between positive and negative regions.
4.2) Deformable attention (DA) vs. multi-head attention (MHA). We replace DA in CFE with MHA, resulting in a DSC drop of 3.8% on MDC and 4.35% on GPCC. This is because DA effectively models spatial dependencies by focusing on a sparse set of key regions.
4.3) Pure spatial vs. hyperspectral. We evaluate Omni-Fuse using a pseudo-color image (420nm, 495nm, and 625nm mapped to RGB) to represent pure spatial information. The DSC decreases by 9.74% on MDC and 8.21% on GPCC, highlighting the advantage of hyperspectral images and the proposed omni-fusion strategy.
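The pseudo-color construction used in the pure-spatial ablation can be sketched as picking, for each target wavelength, the nearest available band. The snippet below is a minimal illustration under stated assumptions: the channel order (625 nm to R, 495 nm to G, 420 nm to B) is assumed, since the rebuttal only lists the three wavelengths, and the band grid and helper names are hypothetical.

```python
def nearest_band(wavelengths_nm, target_nm):
    """Index of the band whose centre wavelength is closest to target_nm."""
    return min(range(len(wavelengths_nm)),
               key=lambda i: abs(wavelengths_nm[i] - target_nm))


def pseudo_color_pixel(spectrum, wavelengths_nm, rgb_targets_nm=(625.0, 495.0, 420.0)):
    """Map one pixel's spectrum to an (R, G, B) triple.

    ASSUMPTION: 625 nm -> R, 495 nm -> G, 420 nm -> B. The rebuttal states only
    that these three wavelengths are mapped to RGB, not the channel order.
    """
    return tuple(spectrum[nearest_band(wavelengths_nm, t)] for t in rgb_targets_nm)


# Hypothetical sensor with 61 bands from 400 nm to 700 nm in 5 nm steps.
wavelengths = [400.0 + 5.0 * i for i in range(61)]
# Toy spectrum whose value at each band equals the band index.
spectrum = [float(i) for i in range(61)]
rgb = pseudo_color_pixel(spectrum, wavelengths)
print(rgb)
```

Applying the same lookup to every pixel yields a three-channel image that discards all other bands, which is why it serves as a spatial-only baseline in this kind of ablation.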
5) @R3: Details.
5.1) We revise “Though [6]” to “Though FL [6]”.
5.2) B in R^{BHWS} is the batch size.
5.3) To flatten X_{spec} to T_{prispec}, we first merge the batch (B) and spectral (S) dimensions and apply a depthwise convolution. Then, we rearrange the features from (BS)L_{spec}HW into (BHW)SL_{spec}, i.e., T_{prispec}.
5.4) In the Spatial-Spectral Decoder, T_{spa}^{'} is derived from both T_{spa} and T_{spec} by calculating their relevance matrix in Equation (1). The top N_q indices in the matrix represent the most informative spatial-spectral regions and are selected as queries. They are then used in a staged cross-attention process with the enhanced spectral and spatial features as key/value. Spectral-based attention captures fine spectral differences, and spatial-based attention models context-aware features. The “selection + spectral focusing + spatial fusion” strategy drives spectral-spatial attention in stages, enabling efficient, discriminative, and context-aware features for MHSI segmentation.
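The rearrangement in 5.3 and the top-N_q query selection in 5.4 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the depthwise convolution is omitted, the merged (BS) axis is assumed to be batch-major, and because Equation (1) is not reproduced on this page, a dot-product relevance followed by a per-token maximum is used as a stand-in. All function names are hypothetical.

```python
import numpy as np


def flatten_spectral(x, batch, bands):
    """Rearrange (B*S, L, H, W) features into (B*H*W, S, L) tokens.

    ASSUMPTION: the merged first axis is batch-major, i.e. index = b * S + s.
    """
    bs, length, height, width = x.shape
    if bs != batch * bands:
        raise ValueError("first axis must equal batch * bands")
    x = x.reshape(batch, bands, length, height, width)  # split (BS) into B, S
    x = x.transpose(0, 3, 4, 1, 2)                      # -> B, H, W, S, L
    return x.reshape(batch * height * width, bands, length)


def select_queries(t_spa, t_spec, n_q):
    """Pick the n_q spatial tokens most relevant to the spectral tokens.

    t_spa: (N, C) spatial tokens; t_spec: (M, C) spectral tokens.
    ASSUMPTION: relevance is a plain dot product and each spatial token is
    scored by its maximum relevance; the paper's Equation (1) may differ.
    """
    relevance = t_spa @ t_spec.T                         # (N, M) relevance matrix
    score = relevance.max(axis=1)                        # best match per spatial token
    top = np.argsort(-score, kind="stable")[:n_q]        # indices of the top n_q scores
    return top, t_spa[top]


# Toy shapes: B=2, S=3, L=4, H=2, W=2.
x = np.arange(2 * 3 * 4 * 2 * 2).reshape(6, 4, 2, 2)
tokens = flatten_spectral(x, batch=2, bands=3)

t_spa = np.array([[1.0, 0.0], [0.0, 3.0], [2.0, 2.0], [0.0, 0.0]])
t_spec = np.array([[1.0, 0.0], [0.0, 1.0]])
top_idx, queries = select_queries(t_spa, t_spec, n_q=2)
print(tokens.shape, top_idx)
```

After the rearrangement, each of the B*H*W rows holds the full spectral sequence of one pixel location, so sequence models can scan along the S axis independently per location.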
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The paper proposes Omni-Fuse, a novel network for medical hyperspectral image segmentation. First-round reviews questioned the novelty of its bidirectional attention blocks. After rebuttal, all reviewers agreed that concerns about novelty, ablation studies, and architectural justifications were addressed and upgraded to “Accept.” Because these issues are resolved and the method demonstrates clear empirical improvements, I recommend acceptance.