Abstract

Dynamic MR images possess various transformation symmetries, including the rotation symmetry of local features within the image and along the temporal dimension. Utilizing these symmetries as prior knowledge can facilitate dynamic MR imaging with high spatiotemporal resolution. Equivariant CNNs are an effective tool for leveraging symmetry priors. However, current equivariant CNN methods fail to fully exploit these symmetry priors in dynamic MR imaging. In this work, we propose a novel framework of Spatiotemporal Rotation-Equivariant CNN (SRE-CNN), spanning from the underlying high-precision filter design to the construction of the temporal-equivariant convolutional module and imaging model, to fully harness the rotation symmetries inherent in dynamic MR images. The temporal-equivariant convolutional module enables exploitation of the rotation symmetries in both spatial and temporal dimensions, while the high-precision convolutional filter, based on a parametrization strategy, enhances the utilization of the rotation symmetry of local features to improve the reconstruction of detailed anatomical structures. Experiments conducted on highly undersampled dynamic cardiac cine data (up to 20X) have demonstrated the superior performance of our proposed approach, both quantitatively and qualitatively.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0893_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zhu_SRECNN_MICCAI2024,
        author = { Zhu, Yuliang and Cheng, Jing and Cui, Zhuo-Xu and Ren, Jianfeng and Wang, Chengbo and Liang, Dong},
        title = { { SRE-CNN: A Spatiotemporal Rotation-Equivariant CNN for Cardiac Cine MR Imaging } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15007},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The work proposes a spatiotemporal rotation-equivariant convolutional layer. The proposed method was validated for cardiac CINE MR image reconstruction with the aim to improve morphological delineation.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The work proposes an interesting concept which has not been studied so far in the context of dynamic cardiac MR reconstruction. The authors achieved qualitatively sharper images especially at the myocardial to blood pool border.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The authors do not clearly present the spatiotemporal rotation-equivariant filters. The basic idea is nicely illustrated in Fig. 1, but the derivation leading to Equation 3, especially the breakdown into the individual layers, is not clear.
    2. Why are there input, intermediate and output equivariant filters?
    3. What is the purpose of the cyclic channel shift?
    4. Please reference the following related works for dynamic MR DL reconstruction: https://doi.org/10.1038/s41598-020-70551-8 https://doi.org/10.1002/mrm.28917
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Please provide a link to the source code.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. Please state clearly that the experiments were performed on retrospectively subsampled data.
    2. Please cite the references based on their order of appearance.
    3. Please clarify if individual subjects were used for training, validation and test.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method may have merit; however, in its current state it is poorly presented and difficult to comprehend.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The authors have addressed the aspects of novelty, but the method is still challenging to comprehend. Source code will be provided to improve reproducibility. Further experimental data would need to be provided.



Review #2

  • Please describe the contribution of the paper

    In this manuscript, the authors apply equivariant CNNs to improve dynamic MRI by utilizing the spatial and temporal rotational symmetry in dynamic MR images. Specifically, a temporal-equivariant convolutional module is proposed to preserve the global equivariance of the rotation symmetry, and a high-precision filter parametrization strategy based on 1D and 2D Fourier series expansion is introduced to improve the reconstruction of detailed features in dynamic MR imaging.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    1) The authors have developed a novel temporal-equivariant convolutional module that links two spatial-equivariant convolutional layers, thereby preserving global equivariance to rotational symmetry across the 2D+t CNNs. 2) A high-precision filter parametrization strategy, utilizing 1D and 2D Fourier series expansion, is introduced to enhance the precision and accuracy of the convolutional filter representation. 3) Experimental results demonstrate that the proposed architecture can improve performance on highly undersampled dynamic cardiac cine data.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    1) An ablation study should be conducted to illustrate the effectiveness of the proposed module and strategy. 2) Existing works [4,8] have applied equivariant CNNs to MR image reconstruction tasks; therefore, the novelty of this manuscript is limited.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please conduct an ablation study to illustrate the effectiveness of the proposed module and strategy.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Although this paper is not the first to use equivariant CNNs for MRI reconstruction tasks, its authors propose to utilize temporal equivariance in dynamic MR images and achieve detailed reconstruction through Fourier expansion, resulting in significant improvements. This paper has certain reference significance for research on dynamic MR images.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors propose an unrolled iterative neural network to accelerate cardiac CINE MRI, which alternates between a data consistency term and a spatiotemporal rotation-equivariant CNN (SRE-CNN), which has been introduced by Xie et al. in [25]. The method exploits anatomical similarities under different rotation transformations. Retrospectively undersampled data were used to train/validate/test the proposed method. Results were compared to those obtained with L+S and other unrolled MRI reconstruction networks, such as MoDL and DL-ESPIRiT. The authors conclude that the proposed method has superior performance, both quantitatively and qualitatively.
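The alternation between a learned block and a data consistency step mentioned above can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions that are not taken from the paper: single-coil Cartesian sampling, a hard data consistency step that overwrites sampled k-space locations, and a placeholder `denoiser` callable standing in for the CNN block. The function names `data_consistency` and `unrolled_recon` are hypothetical.

```python
import numpy as np


def data_consistency(image, measured_kspace, mask):
    """Hard data consistency (assumed form): keep measured k-space samples."""
    kspace = np.fft.fft2(image, norm="ortho")  # FFT over the last two axes
    kspace = np.where(mask, measured_kspace, kspace)
    return np.fft.ifft2(kspace, norm="ortho")


def unrolled_recon(measured_kspace, mask, denoiser, num_iters=5):
    """Alternate a learned block (placeholder `denoiser`) with data consistency."""
    image = np.fft.ifft2(measured_kspace, norm="ortho")  # zero-filled start
    for _ in range(num_iters):
        image = denoiser(image)
        image = data_consistency(image, measured_kspace, mask)
    return image


# Toy 2D+t example: 4 frames of 8x8 complex images, random sampling mask.
rng = np.random.default_rng(0)
truth = rng.standard_normal((4, 8, 8)) + 1j * rng.standard_normal((4, 8, 8))
mask = rng.random((4, 8, 8)) < 0.3
measured = np.fft.fft2(truth, norm="ortho") * mask

# A trivial scaling stands in for the network; a real model would be trained.
recon = unrolled_recon(measured, mask, denoiser=lambda x: 0.9 * x, num_iters=3)
```

Whatever the learned block does, the final data consistency step guarantees that the reconstruction agrees with the acquired samples; only the unsampled k-space locations are filled in by the network.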

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. I believe that the main novelty is the use of a spatiotemporal rotation-equivariant CNN-based block within the unrolled MRI reconstruction model and its application to cardiac CINE MRI. The method is interesting, and the concept seems suitable for the proposed application.
    2. The proposed method was compared with other DL methods for MRI reconstruction, including MoDL and DL-ESPIRiT. From what I understand, the proposed method is based on DL-ESPIRiT, i.e., it is an unrolled neural network where the CNN block is replaced by an SRE-CNN block. This makes it possible to directly investigate the impact of the SRE-CNN on the reconstruction.
    3. Results outperform state-of-the-art DL methods for MRI reconstruction in terms of PSNR and SSIM for acceleration factors up to R = 20.
    4. Fully sampled CINE data were collected for the study.
    5. Overall, the manuscript is nicely structured, and the novelty is clearly stated.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Quantitative left ventricular (LV) function assessment: The suitability of the reconstructed images for quantitative cardiac function assessment was not investigated. CINE MRI is used to assess left ventricular function, and thus it is essential to evaluate the ability of the proposed method to estimate functional parameters (for example, end-diastolic volume, end-systolic volume and left ventricular ejection fraction) compared to the reference. Figure 3 shows a single slice and a single cardiac phase. Further comparisons are required to evaluate image quality.
    2. Dataset:
      2.1 A relatively small CINE dataset consisting of 29 subjects was used. It is not clear why a larger number of cases was not used, for example, from the UK Biobank dataset, since, as indicated in Figure 2, the inputs of the unrolled reconstruction model are retrospectively undersampled images.
      2.2 It is not clear where the 800 2D-t datasets come from. If data were acquired from 29 healthy subjects and, for each subject, 10 to 13 SAX slices were obtained, then how can there be 800 2D-t datasets for training, 30 for validation and 118 for testing? Was the training performed coil by coil or on coil-combined images? Does the method handle complex-valued data?
      2.3 It is unclear whether slices from the same patient were mixed in the training/validation/test sets.
      2.4 Given the relatively small dataset, cross-validation should be used to assess the performance of the method.
      2.5 Even though CINE MRI data were specifically collected for this study, the proposed method was not tested on prospectively undersampled data.
  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The authors provide details about the acquired data and proposed model in the paper, but do not provide code or data. Parts of the code can be obtained from other publications. The work does not seem straightforward to reproduce.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Overall, the paper introduces an interesting method to accelerate CINE imaging and it is well written. However, I have a few suggestions to improve the paper:

    1. In the paper it reads: “The loss function is constructed as l1 differences in pixel-wise sense”. Please provide more detail on the loss function to facilitate reproducibility.
    2. Perform a cross-validation study to assess the performance of the proposed method.
    3. Report the training time of the different DL models and comment.
    4. Indicate if the differences between the methods are/are not statistically significant.
    5. Comment on the use of data augmentation strategies versus the proposed method.
    6. The error maps in Fig. 3 do not have a scale, please add.
    7. I suggest that the authors compare the proposed method with related work, such as A. Kofler, et al, “Spatio-Temporal Deep Learning-Based Undersampling Artefact Reduction for 2D Radial Cine MRI With Limited Training Data,” in IEEE Transactions on Medical Imaging, vol. 39, no. 3, pp. 703-717, March 2020, doi: 10.1109/TMI.2019.2930318.
    8. Please provide more model and experimental details. How many iterations were used for the other DL methods? What were the parameters?
    9. What were the parameters for L+S? The method should perform well for at least R=12.
    10. Indicate the clinical motivation behind the work. For example, does the proposed method enable acquisitions in a single breath-hold?
    11. What is the filter size p and how was it selected? Comment on how this parameter affects the results.
    12. Define the acronym SOTA: state-of-the-art (SOTA).
    13. Check references; for example, on page 7, MoDL should be [1] instead of [15].
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper provides an interesting solution for accelerated CINE MRI. The method is reasonably well described, and results are promising. However, the paper would benefit from cross-validation performance analysis, quantitative analysis of LV function and statistical analysis.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank all reviewers for the constructive comments.

1. Novelty (R4): The proposed method is different from existing methods [4,8] in the following aspects: 1) Methods [4,8] directly applied a conventional equivariant CNN (ECNN) to improve the model’s robustness to drifts in scale of the whole objects caused by the variability of patient anatomies, but they lack specialized designs for reconstructing fine image details. To tackle this, we designed a temporal-equivariant convolution module in the ECNN framework to exploit rotation symmetry in images, significantly improving the detailed anatomical structure in reconstructed images (see Fig.3). 2) The myocardial edge region (crucial for clinical diagnosis of heart diseases) is evidently better reconstructed by our method, but not by [4,8]. 3) Our method is specially designed for dynamic imaging, while [4,8] are not suitable for such scenarios.

  1. Method (R3): As ECNN has been well explained in previous works [23,25], we only described the novel parts in detail due to the page limit. We will release our code for reproducibility. Equivariant CNN is designed based on the observation that when the relative orientation of an image and a convolution kernel changes, the extracted features will be different. 1) To ensure feature robustness to input image rotation, the input equivariant layer creates a set of weight-sharing convolution kernels by copying and rotating a kernel. For instance, in Fig.2b, when the input is rotated 120° clockwise, the features that were originally extracted by the red-box convolution kernel are now extracted by the green-box kernel for orientational consistency. 2) After multiple convolutional layers, the extracted features are no longer robust to image rotation, as CNNs are not inherently equivariant to rotation. The intermediate equivariant layer is designed to address this, i.e., to maintain the relative orientation with the original input and allow the kernel to continuously track the matching features through 2D group rotations and channel-wise cyclic shift operations on the convolutional kernels. As shown in Fig.2b, when the input is rotated 120°, the channel position of the pink convolution kernel (originally convolved with the red-box feature map) needs to be shifted to match the corresponding relative orientation and convolve with the green-box feature map. 3) The output equivariant layer reduces the additional channels introduced by the rotation group. The combination of the input, intermediate, and output layers achieves the utilization of the spatial symmetry prior depicted in Fig.1. 4) A temporal-equivariant layer is specially designed for dynamic imaging. Specifically, the 1D convolution kernel is extended using a group similar to that used in the other equivariant convolution layers, but without rotating the 2D plane. Then, a cyclic shift is performed on the channels of the 1D convolution to ensure the global equivariance of the network. This extended set of 1D convolution kernels jointly extracts the temporal symmetry priors (shown in Fig.1) in dynamic images.
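The behaviour of the input and output equivariant layers described in this reply can be checked numerically with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: it uses the four-fold rotation group (90° steps) because `np.rot90` is exact on the pixel grid, whereas the Fig.2b example in the reply uses 120° steps; it does not reproduce the Fourier-series filter parametrization or the temporal-equivariant layer; and mean pooling over the rotation channels is an assumed choice for the output layer. The names `correlate2d_valid`, `lifting_conv_c4`, and `group_pool` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation of a single-channel image."""
    windows = sliding_window_view(image, kernel.shape)  # (H-k+1, W-k+1, k, k)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def lifting_conv_c4(image, kernel):
    """Input equivariant layer (assumed C4): one kernel, four rotated copies.

    Channel r holds the response to the kernel rotated by r * 90 degrees.
    """
    return np.stack(
        [correlate2d_valid(image, np.rot90(kernel, r)) for r in range(4)]
    )


def group_pool(features):
    """Output layer (assumed mean pooling): collapse the rotation channels."""
    return features.mean(axis=0)


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

features = lifting_conv_c4(image, kernel)
features_rot = lifting_conv_c4(np.rot90(image), kernel)

# Rotating the input rotates each feature map and cyclically shifts the
# rotation channels by one position.
expected = np.rot90(np.roll(features, 1, axis=0), k=1, axes=(1, 2))
shift_ok = np.allclose(features_rot, expected)

# After pooling over the rotation channels, the output simply rotates with
# the input, so the channel shift no longer needs to be tracked.
pool_ok = np.allclose(group_pool(features_rot), np.rot90(group_pool(features)))
print(shift_ok, pool_ok)
```

The cyclic shift in `expected` is the same bookkeeping that an intermediate equivariant layer has to perform on its kernels: a feature found in one rotation channel before the input is rotated appears in the neighbouring channel afterwards.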

  2. Ablation Study (R4): DL-ESPIRiT (R2plus1D) is exactly our backbone, so the comparison with it serves as an ablation study; it shows that embedding the rotation symmetry prior helps SRE-CNN clearly outperform R2plus1D. Further ablation studies are omitted due to the page limit.

  3. Dataset, more experiments (R1, R3): No slices from the same patient were mixed in training/validation/test sets. Data augmentation using rigid transformation-shearing was applied to produce 800-30-118 (training/validation/test) samples. The network is trained with coil-combined images. Complex-valued data are converted into two real-valued channels. We tested our model on the OCMR and MICCAI Challenge datasets, which have raw k-space data, showing SRE-CNN’s excellent generalization ability. Due to the page limit, we cannot provide more visualization of the prospective study and left ventricular function assessment, but we will explore them in depth in our journal work.
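The conversion of complex-valued data into two real-valued channels mentioned in this reply can be sketched as follows. The channel-first layout and the function names `complex_to_channels` and `channels_to_complex` are assumptions made for illustration; the reply does not specify how the channels are arranged.

```python
import numpy as np


def complex_to_channels(x):
    """Stack real and imaginary parts along a new leading channel axis."""
    return np.stack([x.real, x.imag], axis=0)


def channels_to_complex(c):
    """Inverse mapping: rebuild the complex array from the two channels."""
    return c[0] + 1j * c[1]


# Toy coil-combined cine series: 4 frames of 8x8 complex images.
rng = np.random.default_rng(0)
cine = rng.standard_normal((4, 8, 8)) + 1j * rng.standard_normal((4, 8, 8))

channels = complex_to_channels(cine)      # shape (2, 4, 8, 8), real-valued
restored = channels_to_complex(channels)  # shape (4, 8, 8), complex-valued
```

The mapping is lossless, so a real-valued network can process the two channels and its output can be converted back to a complex image before any k-space operation.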




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Justification regarding novelty was addressed by the authors during the rebuttal period. The proposed temporal-equivariant convolution may contribute to various dynamic imaging tasks in MRI.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



