Abstract

Most existing deep learning-based medical image registration methods estimate a single-directional displacement field between the moving and fixed image pair, resulting in registration errors when there are substantial differences between the to-be-registered image pairs. To solve this issue, we propose a symmetric normalization network to estimate the deformations in a bi-directional way. Specifically, our method learns two bi-directional half-way displacement fields, which warp the moving and fixed images to their mean space. Besides, a symmetric magnitude constraint is designed in the mean space to ensure precise registration. Additionally, a deformation-inverse network is employed to obtain the inverse of the displacement field, which is applied to the inference pipeline to compose the final end-to-end displacement field between the moving and fixed images. During inference, our method first estimates the two half-way displacement fields and then composes one half-way displacement field with the inverse of another half. Moreover, we adopt a multi-level strategy to hierarchically perform registration, for gradually aligning images to their mean space, thereby improving accuracy and smoothness. Experimental results on two datasets demonstrate that the proposed method improves registration performance compared with state-of-the-art algorithms. Our code is available at https://github.com/QingRui-Sha/HSyN.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3036_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: N/A

Link to the Code Repository

https://github.com/QingRui-Sha/HSyN

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Sha_Hierarchical_MICCAI2024,
        author = { Sha, Qingrui and Sun, Kaicong and Xu, Mingze and Li, Yonghao and Xue, Zhong and Cao, Xiaohuan and Shen, Dinggang},
        title = { { Hierarchical Symmetric Normalization Registration using Deformation-Inverse Network } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper describes an approach for symmetric image registration, where each image is mapped to a half-way space where the similarity loss is calculated. The displacement fields are inverted using a learned deformation inversion network. This framework is posed with a multi-level architecture to allow more flexible warping.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Breaking the transformation into two steps and composing these leads to more flexible transformations, which are shown to be beneficial in terms of accuracy. The results indicate the method can generate accurate registration with a fairly low level of folding in brain MRI and chest CT datasets.

    The ablation study indicates that all the developed components contribute to the overall performance of the method.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I’m not certain what equation 3 is doing: the displacement field is being warped by an inverse mapping based on the same displacement field (it’s not clear if some form of parallel transport, e.g. [1] is used to correct for rotations) - the inverse of the same displacement field is then added to this result - geometrically - what does this result in?

    There is an assumption in equation 1 that displacement fields can be meaningfully added together, rather than composed as transformations. It also doesn’t follow how, geometrically, this loss leads to the desired result: is the voxelwise sum of displacements meaningful in some way? This is very different from penalising the magnitude of the displacement fields and trying to make them equivalent.

    The lack of an explanation for the hyper-parameter choice is surprising. It would also be beneficial to elaborate on the number of parameters in the network and how this compares to previous works.

    Refs: [1] Lorenzi, Marco, Nicholas Ayache, and Xavier Pennec. “Schild’s ladder for the parallel transport of deformations in time series of images.” Biennial international conference on information processing in medical imaging. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Limited detail of the model architecture and no mention of code

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The definition of \phi only appears in the bottom of figure 2, which made it quite difficult to follow.

    MSE in equation 4 should just be the equation for mean squared error?

    Could the deformation inverse network be pretrained? as in ref 19?

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The geometric interpretation of some parts of the loss function does not seem reasonable, but the authors can perhaps clarify something about this in their rebuttal? The improvement in accuracy likely stems from the increased flexibility in estimating two displacement fields, which is not a bad idea at all, but some mention should be made of how, in the limit of more fields, this becomes similar to LDDMM-type approaches, which aren’t mentioned.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I think the authors did a sufficient job in answering the reviewers’ criticisms.



Review #2

  • Please describe the contribution of the paper

    Formulates pairwise registration in terms of the composition of a half-way deformation and the inverse of another. These original half-way warps are estimated using a neural network, as is the inverse of one of the deformations.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The work considers inverse consistency when registering images of two individuals.

    • This gives better DSC (and TRE when appropriate) than the baseline methods, and less folding (negative Jacobians) than most of them.

    • The ablation studies showed that the proposed ideas (multi-scale registration and inverse-consistent formulation) were of benefit.
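The folding measure mentioned above can be illustrated with a minimal 1D sketch. It assumes a displacement field `u` sampled on an integer grid and approximates the Jacobian of the mapping x + u(x) by forward differences; the function name `folding_fraction` is hypothetical and is not taken from the paper or its code.

```python
def folding_fraction(u):
    """Fraction of grid intervals where phi(x) = x + u(x) is not strictly increasing."""
    # 1D Jacobian by forward differences: d(phi)/dx ~= 1 + (u[i + 1] - u[i]).
    jac = [1.0 + (u[i + 1] - u[i]) for i in range(len(u) - 1)]
    # A non-positive Jacobian means the mapping folds (or collapses) at that interval.
    folded = sum(1 for j in jac if j <= 0.0)
    return folded / len(jac)


smooth = [0.0, 0.2, 0.4, 0.2]   # gentle displacement, no folding expected
folded = [0.0, 2.0, 0.0, 0.0]   # grid point 1 overtakes grid point 2

print(folding_fraction(smooth))
print(folding_fraction(folded))
```

In 3D the same idea uses the determinant of the 3x3 Jacobian matrix at each voxel, and the reported number is usually the percentage of voxels with a non-positive determinant.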

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    • The role of pairwise registration between subjects was not especially well motivated. How does it fit into pre-operative planning? Is it for multi-atlas labelling? Its role in follow-up studies would be a within-subject task, but there was no validation based on this (although I would expect a similar benefit from the proposed model).

    • I’m not convinced that a NN is needed to compute an inverse deformation.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    No mention is made of code availability in the submission, and there is little said about the actual network architectures used.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    • Not every problem is best solved by neural networks (“When all you have is a hammer…”). While it is nice to know that a U-Net can be used to tackle the problem, inverse warps can be obtained in other ways too.

    • I’m not sure I understand the $L_{smc}$ (or $L_{mc}$) loss in equation 1. Why penalise the sum of the displacements? Why not the sum of squares or the sum of the absolute values? This would make $\mathbf{u}_{m_f}^{(0.5)}$ more similar to $-\mathbf{u}_{f_m}^{(0.5)}$. If this is what was intended, then why not use a single $\mathbf{u}^{(0.5)}$ and penalise $L_{sim}(I_m \circ (\text{id} + \mathbf{u}^{(0.5)}), I_f \circ (\text{id} - \mathbf{u}^{(0.5)}))$?

    • The manuscript refers to “geodesic path”, where this may not be appropriate because the warps do not actually follow geodesics. Obtaining geodesics involves using one of the diffeomorphic deformation frameworks, where the shortest distance will minimise the energy of a series of small deformations, which are composed together to obtain a large deformation. In practice, this is sometimes done using a variational approach (the original LDDMM algorithm), but can also be done using the equations for a dynamical system that lead to a geodesic (geodesic shooting approaches).

    • A minor cosmetic point is that it is nicer to see functions written in \mathrm font.

    • Say “composes”, rather than “composites”.
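To make the question about penalising the sum of the two half-way displacements concrete, the sketch below compares three candidate penalties on toy 1D fields. It is only an illustration under stated assumptions: the exact form of equation 1 is not reproduced on this page, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def signed_sum_penalty(a, b):
    """Mean of a + b with signs kept: opposite errors at different voxels can cancel."""
    return mean([x + y for x, y in zip(a, b)])


def abs_sum_penalty(a, b):
    """Mean of |a + b|: zero only when a equals -b at every voxel."""
    return mean([abs(x + y) for x, y in zip(a, b)])


def squared_sum_penalty(a, b):
    """Mean of (a + b)^2: also zero only when a equals -b at every voxel."""
    return mean([(x + y) ** 2 for x, y in zip(a, b)])


cases = {
    "antisymmetric": ([1.0, -2.0], [-1.0, 2.0]),     # a == -b everywhere
    "cancelling":    ([2.0, -2.0], [0.0, 0.0]),      # a != -b, but the signed mean is 0
    "both negative": ([-3.0, -3.0], [-3.0, -3.0]),   # signed mean becomes very negative
}

for name, (a, b) in cases.items():
    print(name, signed_sum_penalty(a, b), abs_sum_penalty(a, b), squared_sum_penalty(a, b))
```

Only the absolute and squared variants are bounded below by zero and vanish exactly when the two fields are pointwise negatives of each other; a signed sum can be driven arbitrarily low by making both fields negative.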

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript contains a few nice ideas, the proposed method seems to work fairly well, and the description of how it works is not terrible.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    The rebuttal has not changed my opinion about this work.

    Unless the square brackets in equation 1 mean “take the absolute value”, then I still have reservations about the use of $L_{smc}$. What’s to stop both $u_{m_f}^{(0.5)}$ and $u_{f_m}^{(0.5)}$ from both going strongly negative?

    Inverse warps can easily be generated from warps that were not generated using scaling and squaring. You could simply push an identity transform using the deformation to its new location, and divide this by a pushed field of ones. The pushing would be by the adjoint (transpose) of the usual warping procedure. See e.g. Davatzikos et al., 2001, “Voxel-based morphometry using the RAVENS maps: methods and validation using simulated longitudinal atrophy”, for a simple explanation of how the pushing might work.
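The push-and-normalise idea described above can be sketched in 1D as follows. This is a hedged illustration, not the reviewer's or the authors' implementation: it assumes linear (two-neighbour) splatting on an integer grid, and the name `push_inverse` is hypothetical.

```python
import math


def push_inverse(u):
    """Approximate the inverse displacement of phi(x) = x + u(x) by splatting.

    Each source coordinate x is pushed to y = x + u(x) with linear weights,
    alongside a field of ones; their ratio estimates phi^{-1}(y).
    """
    n = len(u)
    pushed_identity = [0.0] * n
    pushed_ones = [0.0] * n
    for x in range(n):
        y = x + u[x]
        i = math.floor(y)
        t = y - i
        # Distribute the sample over the two grid points surrounding y.
        for j, w in ((i, 1.0 - t), (i + 1, t)):
            if 0 <= j < n and w > 0.0:
                pushed_identity[j] += w * x
                pushed_ones[j] += w
    # Grid points that received no mass are holes; their displacement is left at 0.
    return [
        pushed_identity[j] / pushed_ones[j] - j if pushed_ones[j] > 0.0 else 0.0
        for j in range(n)
    ]


forward = [1.5] * 10           # constant shift by 1.5 grid units
inverse = push_inverse(forward)
print(inverse[2:])             # interior values; the first two points are holes
```

For a constant shift the recovered inverse is exact wherever mass lands. For spatially varying fields the result is only an approximation whose quality depends on the smoothness of the deformation.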



Review #3

  • Please describe the contribution of the paper

    The submission proposes a hierarchical symmetric registration network that performs symmetric spatial normalisation/alignment. The method is evaluated on the OASIS MRI brain and a chest CT dataset and demonstrates high performance.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The motivation and description of the method are clear, and relating the approach to ANTs SyN makes sense. Applying the method to more than MRI inter-subject brain registration is a big plus, since the clinical impact and technical challenge for chest CTs are much higher. Formulating inverse consistency in a multi-level (hierarchical) framework is useful and, to my knowledge, new for learning-based registration (but widely employed for optimisation-based techniques).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    Important related work is missing, in particular the MICCAI 2023 paper on inverse consistency by construction for multi-step deep registration. The use of a soft constraint rather than directly enforcing symmetry / inverse consistency by design is questionable, and its potential impact should be verified in an ablation study. The motivation for training a network to learn an inverse rather than simply using the negative scaling-and-squaring approach is not clearly conveyed.
    While the numerical results are promising, the hyper-parameter optimisation strategy is not detailed. A comparison with SotA on some public benchmark, e.g. Learn2Reg (LungCT/NLST), DIRLAB (COPD) or Lung250M-4B, is missing.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The description of the chest CT dataset is largely missing. Why isn’t a public dataset, such as Learn2Reg-NLST, used? The OASIS dataset seems to be used as prepared for Learn2Reg (Hering et al.), yet no citation to the summary paper is provided. No code is provided or promised (edit: the authors promised release of code in rebuttal!). The equations are overall clear, yet many network details are missing, and hyper-parameters, e.g. the window size for LNCC, are not mentioned. It is not reproducible in this form.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Iglesias and Greer (references below) proposed inverse consistency by design, e.g. by using the scaling-and-squaring method in combination with a dual use of the symmetric network, and obtain inverses by negating the velocities, i.e. v(x) = g[F(x),M(x);\theta(x)] - g[M(x),F(x);\theta(x)]; T(x) = V[v(x)] and T^-1(x) = V[-v(x)]. This is much more intuitive than training another network that merely estimates such an inverse given a forward transform as input, and it guarantees inverse consistency as a hard constraint. The authors should discuss and ideally experimentally validate their choice.

    Eq. 3 only explains how a “one-step” inverse-consistent composite is obtained; however, the main contribution is multi-step in three hierarchies. Greer pointed out that special care needs to be taken to perform such a multi-step inverse consistency appropriately, hence a detailed discussion would be advisable. Furthermore, recent approaches that performed well on the Learn2Reg Challenge (Siebert) included a computational procedure for enforcing inverse consistency based on two asymmetric transforms without an additional loss or network.

    While it is good to use a second dataset, and chest-CT alignment is more challenging than MRI-brain registration, the evaluation on the in-house dataset is slightly disappointing. Very limited details are provided on, e.g., the number and placement of manual landmarks, the volumetric changes in lung respiration, and the tuning of hyper-parameters for the compared approaches. The visual results are very small, and improvements in alignment are not clear. It would be advisable to add experiments for NLST or DIRLAB-COPD to make the results more convincing. A numerical evaluation of the achieved inverse consistency is missing. Furthermore, details on memory, compute resources, and model parameters for the compared methods are missing.
    References:
    Greer, H., et al. Inverse consistency by construction for multistep deep registration. MICCAI
    Siebert, H., et al. Fast 3D registration with accurate optimisation and little learning for Learn2Reg 2021. MICCAI Workshop
    Iglesias, J. E. (2023). EasyReg: A ready-to-use deep learning tool for symmetric affine and nonlinear brain MRI registration. Scientific Reports
    Hering, A., et al. Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning. IEEE Transactions on Medical Imaging, 42(3), 697-712.
    Falta, F., et al. Lung250M-4B: a combined 3D dataset for CT-and point cl
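The inverse-consistency-by-construction recipe quoted in this review (an antisymmetric velocity v, then T = V[v] and T^-1 = V[-v]) can be sketched in 1D. Everything here is an assumption made for illustration: `toy_g` is a hypothetical stand-in for a network g, `integrate` is a plain scaling-and-squaring loop with linear interpolation and border clamping, and none of it is taken from the cited implementations.

```python
def interp(field, x):
    """Linear interpolation of a 1D field on an integer grid, clamped at the borders."""
    x = min(max(x, 0.0), len(field) - 1.0)
    i = min(int(x), len(field) - 2)
    t = x - i
    return (1.0 - t) * field[i] + t * field[i + 1]


def integrate(v, steps=4):
    """Scaling and squaring: approximate the displacement of the flow of velocity v."""
    # Scaling: start from a small displacement v / 2^steps.
    u = [vi / (2 ** steps) for vi in v]
    # Squaring: compose the current map with itself, doubling the integration time.
    for _ in range(steps):
        u = [u[x] + interp(u, x + u[x]) for x in range(len(u))]
    return u


def symmetric_velocity(g, fixed, moving):
    """v = g(F, M) - g(M, F); swapping the inputs negates v by construction."""
    a = g(fixed, moving)
    b = g(moving, fixed)
    return [ai - bi for ai, bi in zip(a, b)]


def toy_g(first, second):
    # Hypothetical stand-in for a learned network; any function of two images works.
    return [0.5 * f - 0.25 * s for f, s in zip(first, second)]


fixed = [1.0, 2.0, 3.0, 4.0]
moving = [2.0, 2.0, 2.0, 2.0]
v_fm = symmetric_velocity(toy_g, fixed, moving)
v_mf = symmetric_velocity(toy_g, moving, fixed)
print(v_fm, v_mf)                 # v_mf is the negation of v_fm

forward = integrate(v_fm)         # displacement of T
backward = integrate(v_mf)        # displacement of T^-1, from the negated velocity
print(forward, backward)
```

Because the velocity is antisymmetric in its inputs, the forward and backward transforms are flows of v and -v, which are inverses of each other in the continuous setting. On a discrete grid the property holds only up to interpolation error.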

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I currently see many potential flaws that make a rebuttal absolutely necessary. In particular the missing benchmark comparisons on public lung data and the unclear contribution over the recent strongly related MICCAI 2023 paper reduce the otherwise very good impression of the work. Since I cannot select borderline in the system I chose weak accept, but would like to note that I do not recommend early acceptance without a rebuttal.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • [Post rebuttal] Please justify your decision

    I recognise the efforts the authors made to clarify some shortcomings of the initial submission in the rebuttal, and I am more inclined towards acceptance. The promise of releasing the source code is helpful as it will enable the use of public benchmarks (Learn2Reg, DIRLAB-COPD) to better put the approach in research context. While I now understand the reasoning for the “inverting network” compared to scaling-and-squaring (improved accuracy), those findings should somehow be added in a final version (and not just by giving references). I would, however, stand by my point that Greer (MICCAI 2023) did propose a similar mid-way transform within their two-step inverse-consistent approach, but am confident the authors will address this in an accepted version of the paper.




Author Feedback

We appreciate positive feedback from reviewers R1, R3, and R4, i.e., highlighting clear motivation (R3), high registration performance (R1, R4), significant clinical impact (R3), and flexible transformations (R4). Our code will be made publicly available. Below, we organize the main questions (Q) and corresponding answers (A).

Q1: Necessity of using neural networks to compute the inverse displacement field (R1, R3, R4). A1: Scaling and squaring the negative velocity field is faster for calculating the inverse displacement field than the iterative method. However, registration accuracy with a potential velocity field is lower than without a velocity field [6, 11, 15, 21]. Therefore, we propose a velocity field-independent method to compute the inverse displacement field with neural networks. We rigorously assess the accuracy of the inverse deformation field through qualitative and quantitative experiments. Besides, our deformation-inverse network can be pre-trained.

Q2: Design of symmetric magnitude constraint loss (L_smc) (R1). A2: The proposed L_smc, a soft constraint, encourages alignment of anatomy to the middle of the deformation manifold for symmetry. Penalizing the sum of the absolute values or squares of displacement fields falls short of symmetry requirements. Another alternative constraint involves using a single displacement field with its negative counterpart. This hard constraint led to diminished performance in our experiments, likely because it limits the flexibility of the transformation.

Q3: Difference with mentioned work (R3). A3: Siebert proposed an Adam-based instance optimization registration without a symmetry requirement. Iglesias and Greer’s approaches involve inverse-consistent registration between image pairs. Our approach symmetrically and progressively registers image pairs to an intermediate space. Namely, our registration network does not include inverse consistency. The independent deformation-inverse network handles inverse consistency and is used in the inference stage to obtain the final inverse-consistent displacement field. Therefore, we choose ANTs (SyN) and SYMNet as the benchmarks. We will discuss differences between our framework and the above-mentioned work in the final paper.

Q4: Geometric interpretation of composed transform (R4). A4: Eq. 3 composes two mutually-inverse displacement fields defined in different coordinate spaces, resulting in an identity field. Geometrically, for the composition process of two displacement fields $u_1$ and $u_2$, firstly, according to the registration field of the first transformation, denoted as $\phi_1=Id+u_1$, we know that the displacement of the second transformation in the first coordinate system should be $u_2 \circ \phi_1$. Then, the composed displacement field is the summation of the two displacements, i.e., $u_1 + u_2 \circ \phi_1$. In Eq. 1, two fields, both defined in the same coordinate space, can be directly summed, aiming to facilitate symmetric registration.
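The composition rule stated in A4, $u_1 + u_2 \circ \phi_1$ with $\phi_1 = Id + u_1$, can be checked numerically with a small 1D sketch. It assumes linear interpolation on an integer grid with border clamping; the helper names `interp` and `compose` are hypothetical and are not taken from the released code.

```python
def interp(field, x):
    """Linear interpolation of a 1D field on an integer grid, clamped at the borders."""
    x = min(max(x, 0.0), len(field) - 1.0)
    i = min(int(x), len(field) - 2)
    t = x - i
    return (1.0 - t) * field[i] + t * field[i + 1]


def compose(u1, u2):
    """Displacement of applying phi_1 then phi_2: u1(x) + u2(x + u1(x))."""
    return [u1[x] + interp(u2, x + u1[x]) for x in range(len(u1))]


n = 16
# phi_1(x) = 1.5 * x, so u1(x) = 0.5 * x.
u1 = [0.5 * x for x in range(n)]
# The exact inverse is phi_1^{-1}(y) = y / 1.5, so u2(y) = -y / 3.
u2 = [-y / 3.0 for y in range(n)]

composed = compose(u1, u2)
# Points with 1.5 * x <= n - 1 stay inside the grid and should map back to themselves.
print([round(c, 6) for c in composed[:11]])
```

Composing a field with its exact inverse gives a (near) zero displacement wherever the intermediate point stays inside the grid, which matches the statement that Eq. 3 yields an identity field. Points pushed beyond the border are affected by the clamping assumption and do not cancel.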

Q5: Selection of hyper-parameters, network details, implementation, and reproducibility (R1, R3, R4). A5: We train multiple networks with different weights of losses. We select the networks and weights to optimize Dice scores on the validation set and report the results on the test set. Additional details can be found in the open-source code.

Q6: Selection of datasets and registration pair (R1, R3). A6: Our study incorporated results from both in-house datasets and OASIS, while excluding NLST due to page limit. These two datasets specifically address intra-subject registration for follow-up studies and inter-subject registration for multi-atlas labeling, thus comprehensively evaluating our method.

Q7: Missing discussions and citations (R3, R4). A7: We will include those missing references and discussions in the final paper.

Q8: Issues of presentation, formatting, and grammar (R1, R3, R4). A8: We appreciate all these suggestions, which will be included in the final paper.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    After rebuttal, all reviewers agree to acceptance.

    Even nicer, all reviewers give constructive feedback which I feel can be used to improve the first version of the paper

    Please use this to Improve the C.R. There is good insight to discuss at the conference here.

    Congratulations




Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    All reviewers recommended weak accept for this paper after authors’ rebuttal



