Abstract
Accurate segmentation of myocardial lesions from multi-sequence cardiac magnetic resonance imaging is essential for cardiac disease diagnosis and treatment planning. However, achieving optimal feature correspondence is challenging due to intensity variations across modalities and spatial misalignment caused by inconsistent slice acquisition protocols. We propose CAA-Seg, a composite alignment-aware framework that addresses these challenges through a two-stage approach. First, we introduce a selective slice alignment method that dynamically identifies and aligns anatomically corresponding slice pairs while excluding mismatched sections, ensuring reliable spatial correspondence between sequences. Second, we develop a hierarchical alignment network that processes multi-sequence features at different semantic levels, i.e., local deformation correction modules address geometric variations in low-level features, while global semantic fusion blocks enable semantic fusion at high levels where intensity discrepancies diminish. We validate our method on a large-scale dataset comprising 397 patients. Experimental results show that our proposed CAA-Seg achieves superior performance on most evaluation metrics, with particularly strong results in myocardial infarction segmentation, representing a substantial 5.54% improvement over state-of-the-art approaches. The code is available at https://anonymous.4open.science/r/CAA-Seg-2025.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2655_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{GaoYif_AComposite_MICCAI2025,
author = { Gao, Yifan and Rui, Shaohao and Su, Haoyang and Xiang, Jinyi and Wu, Lianming and Wang, Xiaosong},
title = { { A Composite Alignment-Aware Framework for Myocardial Lesion Segmentation in Multi-sequence CMR Images } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15960},
month = {September},
pages = {2 -- 12}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper introduces a novel framework for multi-sequence CMR segmentation, with a selective slice alignment approach and a hierarchical alignment network. This framework effectively integrates heterogeneous cardiac imaging sequences and achieves state-of-the-art performance in myocardium segmentation, myocardial edema delineation, and myocardial infarction segmentation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper addresses the inherent differences in acquisition protocols and spatial distribution of multi-sequence CMR by designing a phased alignment and fusion scheme, tackling challenges encountered when integrating multimodal images in clinical practice. The integration of a selective slice alignment approach and a hierarchical alignment network provides a new perspective;
- Validation on a dataset of 397 patients shows good performance improvements, particularly in myocardial infarction segmentation;
- The paper’s visualizations show good segmentation results for key challenges in segmenting myocardial lesions, such as small lesion size.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- In the introduction, the author first raises the issue of anatomical mismatches caused by sampling differences, and then discusses artifacts caused by resampling. However, based on the text, resampling distortion is presented as an independent challenge rather than an alternative approach introduced to solve the alignment problem, leading to confusion in causal relationships and unclear writing logic.
- The paper’s writing is not coherent enough and needs improvement in writing flow;
- Key concepts such as “composite alignment” are merely mentioned without clear definitions and explanations, making it difficult to accurately understand;
- The key mathematical symbols used in the paper (such as ϕₖ) lack clear definitions, and the basis for parameter settings (such as λ, window size w, parameters N and M) is unclear, with insufficient in-depth discussion of the role of these parameters in the overall model;
- This paper has limited discussion of existing work, especially lacking specific analysis of the problems and limitations of current mainstream methods (such as MyoPS). The paper fails to clearly explain the core improvements and practical advantages of the proposed method compared to existing technologies, resulting in insufficient demonstration of innovation and necessity;
- The ablation study in the paper does not deeply explore the impact of key parameters on model performance and robustness. For example, there is a lack of systematic experimental analysis of different λ values, window sizes, etc., which limits understanding of the model;
- The authors do not discuss situations where the proposed method performs poorly or fails, limiting awareness of the method’s scope of application and potential limitations;
- Although the challenges pointed out in the introduction are clear, and the method section proposes selective slice alignment and hierarchical alignment, there is a lack of sufficient argumentation as to why these structures can specifically address the aforementioned problems. The connection between method design motivation and theoretical foundation is weak;
- The authors use deformable convolution and pixel-wise modulation to address misalignment issues, but lack explanation. Similarly, the specific role of cross-attention in high-level feature fusion is not fully analyzed, making the rationality of the entire method seem insufficient.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- The paper fails to clearly describe how the proposed method improves upon or differs from existing approaches such as MyoPS. This weakens the justification of both its novelty and necessity;
- While the method incorporates components such as deformable convolution, pixel-wise modulation, and cross-attention, the paper lacks clear motivation for these design choices;
- Several important concepts, such as composite alignment, are introduced without clear definitions, and key symbols and parameters (e.g., ϕₖ, λ, w) are used without adequate explanation or details of their settings;
- The overall writing flow and logic require improvement. Some transitions between problem description and method design are abrupt, and certain parts of the text are unclear or confusing to follow;
- The ablation study is limited in depth, offering little insight into how key parameters influence performance or robustness. The paper does not discuss failure cases or limitations of the method, which restricts understanding of its applicability boundaries.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have addressed most of my concerns and provided reasonable justification for their design choices. While deeper ablations would be preferred, their acknowledgments and code release support reproducibility. Despite some issues in clarity, the work presents valuable contributions with clinical potential.
Review #2
- Please describe the contribution of the paper
This manuscript proposes a composite alignment-aware framework for myocardial lesion segmentation in multi-sequence CMR images, which consists of a selective slice alignment method and a hierarchical alignment network. The proposed method is evaluated on a private dataset and shows superior performance.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) This manuscript proposes a method that combines a selective slice alignment technique with a hierarchical alignment network. 2) It conducts experiments on a large-scale dataset. 3) The manuscript has a clear structure and is easy to read.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1) There are several studies on multi-sequence CMR alignment. Could the authors elaborate on the advantages of their proposed method compared to these existing approaches? 2) The description of the ablation experiment method is not complete. What does baseline refer to? 3) Why did the authors introduce MvMM in the ablation experiment, and what is the significance of its introduction? 4) The manuscript does not specify the loss function. 5) The size of the sliding window is an important parameter. How is it set? Please provide an explanation. 6) Please provide more information about Ptask. 7) Could the authors verify that the proposed method achieves alignment?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
1) The description of the ablation experiment method is not complete. 2) Experiments are conducted only on a private dataset to validate the effectiveness of the proposed method. 3) Important parameters are not explained in detail.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have addressed the concerns.
Review #3
- Please describe the contribution of the paper
The authors propose a novel two-stage framework for myocardial lesion segmentation from multi-sequence CMR images, explicitly addressing cross-sequence spatial misalignment and intensity variation challenges. They conduct comprehensive experiments on a large-scale in-house dataset of 397 patients and achieve significant performance improvements over previous approaches.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed CAA-Seg methodology, which comprises selective slice alignment and a hierarchical alignment network, is well structured.
- Authors’ use of selective slice alignment is a novel, practical solution to avoid unnecessarily forced alignment that can cause interpolation artefacts.
- HA-Net’s combination of low-level geometric correction and high-level semantic fusion is methodologically sound and backed by solid reasoning.
- Authors have benchmarked against many baselines and the ablation study is informative and shows the contribution of each component.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- One of my main questions is what happens in cases where no reliable slice pairs can be found. Does the method gracefully handle patients with highly irregular acquisition protocols, and if so, how?
- The proposed framework assumes access to three CMR sequences (LGE, T1m, T2m). How would the authors’ method perform if one of the sequences is missing or of poor quality?
- Since deformable convolutions and cross-attention blocks can be memory-intensive, what are the typical inference time, GPU memory consumption, and additional computational cost compared to the standard method?
- The regularization term R(ϕ) is mentioned but not detailed. What specific form of regularization do the authors apply?
- I believe the writing can be improved: some sentences are too long and could be made crisper. For example, “General frameworks were evaluated using only LGE images due to performance degradation with multi-sequence input, while cardiac-specific approaches utilized multi-sequence data with their dedicated fusion mechanisms” could become “General frameworks were tested with only LGE images because they perform poorly with multi-sequence inputs. In contrast, cardiac-specific methods used all sequences with their own fusion strategies.”
- The dataset is said to be “IRB-approved, large-scale in-house,” but the authors do not describe any demographics, lesion distribution, or whether data imbalance (between infarction and edema cases) exists. This could be crucial to understanding performance, especially any data imbalance.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper offers a strong technical contribution with real potential for clinical translation. Selective slice alignment, in particular, addresses a very practical issue often overlooked in deep learning-based CMR studies. This framework is generalizable and could benefit other multi-sequence or multi-modal segmentation tasks. However, there are a few questions that I need answered, including computational overhead, missing sequences, and the dataset description.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
I read the rebuttal, and the authors have addressed my concerns. Given the page limit on the main paper, the inability to add explanations in supplementary material, and the limitations of the rebuttal phase, I am satisfied with the given answers.
Author Feedback
We warmly thank the reviewers for their constructive comments and recognition of our method’s novelty (R1,R2,R3), well-structured methodology (R2), and good performance (R3). We will incorporate clarifications into the updated version. Key responses are summarized below:
Advantages of CAA-Seg and Explicit Alignment (R1&R3): Our framework’s primary advantage is its explicit and robust handling of inter-sequence alignment. Unlike methods that force alignment of all slices or use generic fusion, our SSA stage dynamically identifies and aligns only anatomically corresponding slice pairs, crucially preventing artifact propagation from mismatched regions. This alignment then enables our HA-Net to effectively fuse features.
- Clarification of Methodological Details (R1&R3):
- Ablation baseline & MvMM (R1): Baseline refers to a naive alignment approach in which images are directly registered after resampling. MvMM was included as a strong alignment method to benchmark SSA.
- Ptask (R1): Ptask is a learned task embedding concatenated with the bottleneck features. It enables the task-aware controller to modulate features based on the input task ID.
- “Composite alignment” (R3): This refers to our two-stage approach: SSA for robust inter-sequence geometric correspondence, followed by HA-Net for global-local feature alignment.
- Loss function (R1): Standard nnU-Net (Dice + CE) loss was used.
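To make the "Dice + CE" answer concrete, the following is a minimal pure-Python sketch of a combined soft Dice and cross-entropy loss. It is an illustration only, not the nnU-Net implementation: the function name `dice_ce_loss`, the flat per-pixel input layout, the smoothing constant `eps`, and the equal weighting of the two terms are all assumptions made for this example.

```python
import math
from typing import List


def dice_ce_loss(probs: List[List[float]], labels: List[int], eps: float = 1e-5) -> float:
    """Cross-entropy plus (1 - mean soft Dice) over a flat list of pixels.

    probs[i][c] is the predicted probability of class c at pixel i,
    labels[i] is the ground-truth class index at pixel i.
    (Hypothetical layout chosen for illustration.)
    """
    n_classes = len(probs[0])

    # Cross-entropy: mean negative log-probability of the true class.
    ce = sum(-math.log(max(p[y], 1e-12)) for p, y in zip(probs, labels)) / len(labels)

    # Soft Dice per class, then averaged over classes.
    dice_sum = 0.0
    for c in range(n_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for y in labels if y == c)
        dice_sum += (2.0 * intersection + eps) / (pred_mass + true_mass + eps)
    mean_dice = dice_sum / n_classes

    return ce + (1.0 - mean_dice)


# A confident correct prediction should score lower than an uncertain one.
labels = [0, 1, 1]
good = dice_ce_loss([[0.9, 0.1], [0.1, 0.9], [0.2, 0.8]], labels)
poor = dice_ce_loss([[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]], labels)
```

The Dice term counteracts class imbalance, which matters when lesions occupy few pixels, while the cross-entropy term supplies a per-pixel gradient.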
Mathematical Symbols and Parameter Sensitivity (R3): ϕk represents the composite transformation (rigid, affine, deformable) optimized for the k-th slice. The ‘sliding window w’ (R1&R3) is a conceptual term for the dynamic search range [j_k−1,N−M+k], which adapts to the data and is not a hyperparameter. For the regularization R(ϕk) and λ (R2&R3), we used the default SyNRA settings of the ANTs package. Therefore, SSA introduces no new tunable parameters.
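The dynamic search range can be illustrated with a small sketch. The sketch rests on several assumptions that the rebuttal does not confirm: indices are 1-based, M is the number of slices in the sequence being matched and N the number in the reference (N ≥ M), matches are strictly increasing so the lower bound is taken as the index right after the previous match, and the best candidate is picked greedily from a precomputed dissimilarity matrix. The names `search_range`, `greedy_match`, and `cost` are hypothetical.

```python
from typing import List, Sequence


def search_range(k: int, prev_match: int, n_ref: int, n_mov: int) -> range:
    """Candidate reference indices (1-based) for the k-th moving slice.

    The upper bound N - M + k leaves enough reference slices for the
    remaining M - k moving slices. The lower bound (prev_match + 1) is an
    assumption: it enforces strictly increasing matches.
    """
    lower = prev_match + 1
    upper = n_ref - n_mov + k
    return range(lower, upper + 1)


def greedy_match(cost: Sequence[Sequence[float]]) -> List[int]:
    """Pick, for each moving slice, the cheapest reference slice in its range.

    cost[k][j] is a dissimilarity between moving slice k and reference
    slice j (0-based storage); returned indices are 1-based.
    """
    n_mov, n_ref = len(cost), len(cost[0])
    matches: List[int] = []
    prev = 0  # 0 means no slice has been matched yet
    for k in range(1, n_mov + 1):
        candidates = search_range(k, prev, n_ref, n_mov)
        best = min(candidates, key=lambda j: cost[k - 1][j - 1])
        matches.append(best)
        prev = best
    return matches


# Two moving slices matched against four reference slices.
example_cost = [
    [0.9, 0.1, 0.5, 0.7],
    [0.8, 0.6, 0.2, 0.3],
]
example_matches = greedy_match(example_cost)
```

Because the range is derived from the previous match and the slice counts, it shrinks or grows per patient without any window size to tune, which is consistent with the statement that it is not a hyperparameter.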
Motivation of HA-Net Components (R3): The HA-Net hierarchically addresses residual misalignments and intensity variations post-SSA. For low-level features, where local geometric errors dominate, deformable convolutions correct spatial discrepancies, and pixel-wise modulation adjusts intensities. For high-level features, with reduced intensity variations, cascaded cross-attention fuses semantic relationships, ensuring targeted corrections at appropriate semantic depths.
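As a rough illustration of the high-level fusion step, the sketch below implements plain scaled dot-product cross-attention in pure Python, where tokens from one sequence query the tokens of another. It is not HA-Net's block: the learned projections, multiple heads, and cascading described in the paper are omitted, and the function names and the list-of-vectors layout are assumptions made for this example.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: List[Vector], keys: List[Vector], values: List[Vector]) -> List[Vector]:
    """Each query attends over all keys and returns a weighted sum of values.

    queries would come from one sequence (e.g. LGE features) and
    keys/values from another (e.g. a mapping sequence); this pairing is
    hypothetical and chosen only to illustrate the mechanism.
    """
    dim = len(keys[0])
    fused: List[Vector] = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


# One query that closely matches the first key pulls in mostly the first value.
fused = cross_attention(
    queries=[[1.0, 0.0]],
    keys=[[10.0, 0.0], [0.0, 10.0]],
    values=[[1.0, 0.0], [0.0, 1.0]],
)
```

Since every query can attend to every key, attention of this kind does not require exact pixel correspondence, which fits the rebuttal's argument for using it at semantic depths rather than on low-level features.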
Verification of Alignment (R1): Our SSA is designed for reliable inter-sequence alignment. Figure 1 illustrates our successful alignment results and the consequent improvements (Table 1).
Clarity in Introduction of Challenges (R3): The primary challenge is anatomical misalignment. Attempting to fix this via naive resampling leads to: 1) resampling distortion and 2) failure to resolve gross underlying anatomical mismatches. We appreciate the feedback on writing. We will revise the manuscript for the updated version to improve the logical flow.
- Robustness, Limitations, and Dataset (R2&R3):
- Handling unreliable slices (R2): If SSA cannot find reliable T1/T2 mapping correspondences, CAA-Seg defaults to using only the LGE sequence. This provides a robust fallback, as LGE itself is essential. Our task-aware controller is designed for such flexibility, enabling the model to process inputs ranging from all three modalities down to LGE-only, thus ensuring adaptability to varied clinical data quality.
- Computational Cost (R2): Inference times per case: CAA-Seg: 0.71s (1.8GB GPU), nnU-Net: 0.52s (1.1GB), UMamba: 0.62s (1.3GB). Our method is slightly more intensive due to the explicit alignment and hierarchical fusion, but it remains competitive.
- Dataset Details (R2): Our dataset (397 patients): all have Myo/ME annotations, 208 have MI. Full statistics will be in a journal extension due to page limits.
- Failure cases (R3): Performance may degrade with extreme misalignments or very small/indistinct MI lesions. We will discuss this and add examples in a journal version.
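For readers weighing the "slightly more intensive" claim, the quoted figures can be turned into relative overheads with a few lines of arithmetic. The sketch uses only the numbers stated in the rebuttal; the dictionary layout and the helper name `relative_overhead` are assumptions for illustration.

```python
# Per-case figures quoted in the rebuttal: (inference seconds, GPU memory in GB).
reported = {
    "CAA-Seg": (0.71, 1.8),
    "nnU-Net": (0.52, 1.1),
    "UMamba": (0.62, 1.3),
}


def relative_overhead(method: str, reference: str) -> tuple:
    """Return (time %, memory %) by which `method` exceeds `reference`."""
    t_method, g_method = reported[method]
    t_ref, g_ref = reported[reference]
    return (100.0 * (t_method / t_ref - 1.0), 100.0 * (g_method / g_ref - 1.0))


for ref in ("nnU-Net", "UMamba"):
    time_pct, mem_pct = relative_overhead("CAA-Seg", ref)
    print(f"CAA-Seg vs {ref}: +{time_pct:.1f}% time, +{mem_pct:.1f}% memory")

# Share of the 397 patients that carry MI annotations (208 per the rebuttal).
mi_fraction = 208 / 397
print(f"Patients with MI annotations: {100 * mi_fraction:.1f}%")
```

On these figures CAA-Seg needs roughly 37% more time and 64% more GPU memory than nnU-Net, and about half of the cohort (roughly 52%) has MI annotations.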
We would like to express our gratitude to the reviewers for your valuable suggestions!
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This paper introduces a phased alignment and fusion scheme to address key challenges in multi-sequence CMR segmentation, particularly those arising from differences in acquisition protocols and spatial distributions. The proposed approach combines a novel selective slice alignment strategy with a hierarchical alignment network, offering both a technically robust and clinically relevant solution. The work represents a significant technical contribution with strong potential for clinical translation. The authors have addressed all reviewer comments thoroughly, and the paper is therefore recommended for acceptance.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A