Abstract

Deep learning methods for unsupervised registration often rely on objectives that assume a uniform noise level across the spatial domain (e.g. mean-squared error loss), but noise distributions are often heteroscedastic and input-dependent in real-world medical images. Thus, this assumption often leads to degradation in registration performance, mainly due to the undesired influence of noise-induced outliers. To mitigate this, we propose a framework for heteroscedastic image uncertainty estimation that can adaptively reduce the influence of regions with high uncertainty during unsupervised registration. The framework consists of a collaborative training strategy for the displacement and variance estimators, and a novel image fidelity weighting scheme utilizing signal-to-noise ratios. Our approach prevents the model from being driven away by spurious gradients caused by the simplified homoscedastic assumption, leading to more accurate displacement estimation. To illustrate its versatility and effectiveness, we tested our framework on two representative registration architectures across three medical image datasets. Our method consistently outperforms baselines and produces sensible uncertainty estimates. The code is publicly available at https://voldemort108x.github.io/hetero_uncertainty/.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/1085_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/1085_supp.pdf

Link to the Code Repository

https://voldemort108x.github.io/hetero_uncertainty/

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Zha_Heteroscedastic_MICCAI2024,
        author = { Zhang, Xiaoran and Pak, Daniel H. and Ahn, Shawn S. and Li, Xiaoxiao and You, Chenyu and Staib, Lawrence H. and Sinusas, Albert J. and Wong, Alex and Duncan, James S.},
        title = { { Heteroscedastic Uncertainty Estimation Framework for Unsupervised Registration } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15002},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents a new method to calibrate the similarity loss used by image registration models by correcting for the assumption of uniform noise levels made by the SSD similarity metric.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The proposed feature would be useful if easy to plug into existing models for deformable image registration.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. Poor comparison to existing literature, e.g. missing mention of custom or learnt similarity metrics, as well as other methods for the calibration of the similarity loss, e.g. virtual decimation [1].
    2. Evaluation: i. Lack of evidence that the comparison to baseline methods is fair, e.g. are all hyperparameters kept constant when training with and without the proposed feature? ii. No details on the statistical significance testing; iii. No discussion of the impact of the proposed feature on the output transformation smoothness;
    3. Discussion of the uncertainty estimates is missing some common sense indicators of good uncertainty calibration, e.g. is it higher in regions with smaller intensity gradient magnitudes? Are the registration error and the uncertainty magnitude correlated?

    [1] I. Simpson, J. A. Schnabel, A. Groves, J. Andersson, and M. Woolrich. “Probabilistic inference of regularisation in non-rigid registration”. In: NeuroImage 59.3 (2012).

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Abstract: not all common similarity metrics make the assumption of uniform noise levels, e.g. LCC or MI.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Reject — should be rejected, independent of rebuttal (2)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper touches on an important subject but does not deliver a method or a result that would justify its presentation at the conference.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposed a probabilistic heteroscedastic noise modeling framework to reduce the influence of regions with high uncertainty for unsupervised image registration. The displacement field and noise variance are estimated, and an adaptive-exponentiated relative SNR weighting strategy is proposed for loss calibration.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper aims to deal with the uniform noise assumption in unsupervised registration. It analyzed the disadvantage of conventional noise modeling in unsupervised registration and proposed a probabilistic unsupervised registration using data-driven estimation of heteroscedastic uncertainty. This paper utilized separate objectives for the displacement and noise variance, with highly collaborative information exchange. It addressed the undersampling issue using the exponentiated relative SNR. The beta-NLL objective is optimized for the variance. Experimental results show statistical improvements over the baselines while providing sensible uncertainty maps.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The motivation for separating displacement and noise variance estimation should be discussed in detail. The collaborative information exchange should also be explained further. Moreover, the characteristics of the exponentiated relative SNR and beta-NLL require more explanation. The presentation should be improved.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. What does it mean “this is in contrast to our proposed data-driven SNR-based weighting scheme”?
    2. The presentation requires improvement. For example, “For 3D echo data, we show quantitative results in Table 3 and qualitative results in the Supp. Mat. Note: Improvements come solely from smoother optimization and does not incur additional complexity during inference.” What do the authors want to express?
    3. Why does Fig. 3 show that “This corroborates the validity of our heteroscedastic noise assumption (Eq. 2) and shows the effectiveness of our proposed variance estimator”?
    4. Discussion of Fig.3 is not clear. The authors should rewrite it clearly.
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper’s motivation is novel. The conventional method generally optimizes both displacement and variance simultaneously. Perfectly estimated variance might result in undersampling of higher-intensity regions. Separating the estimation of displacement and variance is beneficial for improving registration. This paper deals with the above issue and validates its effectiveness on different datasets.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The authors tackle varying noise levels (heteroscedasticity) in medical image registration, highlighting inaccuracies from assuming uniform noise distribution. They introduce two modules:

    1. Displacement Estimator: Estimates alignment between images.
    2. Variance Estimator: Predicts input-specific noise levels, recognizing noise heterogeneity.

    Contributions:

    1. Identifying limitations of simplistic heteroscedastic noise modeling.
    2. Introducing a data-driven probabilistic framework for uncertainty estimation.
    3. Proposing a novel strategy using adaptive SNR weighting for loss calibration.
    4. Validating the method’s effectiveness across diverse datasets.
  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The strengths of the paper are as follows:

    1. The paper is very well structured.
    2. The approach is novel.
    3. Heteroscedastic uncertainty estimation is still an open and challenging area in registration, which is discussed in this paper.
    4. The evaluation section clearly shows the advantages of the proposed method.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    I strongly recommend that the authors provide their code for public access.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The submission lacks sufficient information for reproducibility, particularly in terms of code availability. Access to the codebase is crucial for other researchers to replicate the results and validate the proposed method. Therefore, I strongly recommend that the authors provide their code for public access. Making the code available will not only enhance the reproducibility of the study but also facilitate further research and collaboration in the field.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Adaptive Weighting Strategy: To effectively use the predicted noise levels, they introduce a novel strategy based on the relative exponentiated signal-to-noise ratio (SNR). This strategy adjusts the weight given to each region during training based on its SNR, ensuring that regions with higher SNR contribute more to the learning process.

    Collaborative Training Strategy: They emphasize the importance of collaborative training, where the two estimators improve each other through information exchange and loss calibration. This ensures that the displacement estimator learns to adapt to the varying noise levels predicted by the variance estimator.

    Validation: The effectiveness and versatility of their proposed framework are validated through extensive experiments using two neural network architectures and three cardiac datasets. They demonstrate consistent improvements over baseline methods while providing meaningful uncertainty maps reflecting spatially varying noise levels.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper provides a comprehensive discussion on heteroscedastic uncertainty estimation, an area that remains open and challenging within registration. Through thorough examination and analysis, the paper contributes significantly to the understanding of this complex aspect of registration methods.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We are delighted that the reviewers found our motivation “novel” (R1) and that our method “contributes significantly” (R3). We address all reviewer comments below and will incorporate all feedback in the next revision. We will open-source the code upon acceptance at the earliest time.

R1: Clarification on motivation. Separating displacement and variance estimation is necessary because they cannot be optimized with a single objective. A joint objective from heteroscedastic variance estimation leads to undersampling higher-intensity regions as analyzed in Sec. 3 of the main text, degrading performance (Tables 1, 3; Fig. 2). Our method allows separate objectives for each estimation by alternating optimization during training, where performance improvements are validated in Table 1.

Intuitions on exponentiated SNR. The exponentiated term gamma balances the trade-off between full SNR weighting (gamma=1, MSE as the data term) and constant weighting (gamma=0). We select gamma=0.5 from our experiments in Table 4.
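The role of gamma described above can be illustrated with a minimal sketch. This is not the paper's exact formulation: the per-pixel SNR definition (squared signal over variance), the normalization by the image mean, and the function name `relative_snr_weights` are all assumptions made for illustration. It only shows how an exponent in [0, 1] interpolates between constant weights and full relative-SNR weights.

```python
import numpy as np

def relative_snr_weights(signal, variance, gamma=0.5, eps=1e-8):
    """Hypothetical per-pixel weights: (SNR / mean SNR) ** gamma.

    The SNR definition and the normalization are assumptions for this
    sketch, not taken from the paper.
    """
    snr = signal ** 2 / (variance + eps)   # assumed per-pixel SNR
    rel = snr / (snr.mean() + eps)         # SNR relative to the image mean
    return rel ** gamma                    # gamma=0: constant, gamma=1: full

# Toy example: three pixels with equal noise variance but different signal.
signal = np.array([1.0, 2.0, 3.0])
variance = np.ones(3)

w_const = relative_snr_weights(signal, variance, gamma=0.0)
w_half = relative_snr_weights(signal, variance, gamma=0.5)
w_full = relative_snr_weights(signal, variance, gamma=1.0)
```

With gamma=0 every pixel receives the same weight; increasing gamma toward 1 progressively favors high-SNR pixels, and gamma=0.5 sits between the two extremes.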

Clarification on “this is in contrast to …”. AdaReg and AdaFrame don’t explicitly model noise variance, whereas ours learns the noise variance from the data.

Clarification on “Improvements … during inference”. One advantage of our method is that it maintains the same computational complexity as the baseline during testing, requiring only the forward pass of the displacement model. The improved performance results from our proposed collaborative learning strategy during training.

Clarification on Fig. 3. In column 3 of Fig. 3, labeled “Estimated Noise Variance,” red indicates higher uncertainty. Note that we assume the fixed image is a noisy observation of the reconstructed image (Eqn. 2). These red areas accurately correspond to locations with high discrepancies between the fixed and reconstructed frames (see columns 1, 2), validating our heteroscedastic noise formulation and variance estimator optimization.

R2: Poor comparison to literature. We disagree. The suggested virtual decimation paper has key differences from our work and thus cannot be regarded as a comparison; nonetheless, we are happy to cite it. The uncertainty estimates provided by their work focus on the epistemic uncertainty (i.e. model parameter uncertainty) rather than the aleatoric uncertainty (i.e. uncertainty intrinsic in the data) characterized by our heteroscedastic noise assumption.

We have also conducted extensive experiments to compare our approach with existing literature. We detailed two lines of work in Sec. 2: (a) heteroscedastic uncertainty estimation and (b) adaptive weighting schemes. We implemented NLL, beta-NLL, AdaFrame, and AdaReg as baseline frameworks on two registration architectures (Voxelmorph, Transmorph) across three different datasets (ACDC, CAMUS, 3D Echo). Our proposed approach consistently outperforms them.

Implementation details. The network architecture and optimization hyperparameters were kept constant across all experiments.

Details on the significance testing. We used scipy.stats.ttest_rel to compute the p-values between ours and the second-best method by conducting a paired t-test.
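As an illustration of the paired test described above, the following sketch computes the paired t statistic by hand and, when SciPy is installed, cross-checks it with `scipy.stats.ttest_rel`. The per-case scores are synthetic values chosen for the example and are not results from the paper; the helper name `paired_t_statistic` is hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(a, b):
    """t statistic of a paired (dependent-samples) t-test on per-case scores."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Synthetic per-case scores (illustrative only, not from the paper).
ours = [0.80, 0.82, 0.79, 0.85, 0.81]
second_best = [0.78, 0.81, 0.77, 0.82, 0.80]

t_manual = paired_t_statistic(ours, second_best)

# Optional cross-check: ttest_rel returns the same statistic plus a
# two-sided p-value. Skipped silently if SciPy is not available.
try:
    from scipy.stats import ttest_rel
    result = ttest_rel(ours, second_best)
    t_scipy, p_value = result.statistic, result.pvalue
except ImportError:
    t_scipy, p_value = None, None
```

A paired test is the natural choice here because both methods are evaluated on the same test cases, so the per-case differences carry the signal rather than the two marginal score distributions.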

Deformation smoothness. To evaluate smoothness, we compute the percentage of pixels with a negative Jacobian determinant. In ACDC, we achieve 0.24% compared to 0.29% for the vanilla Voxelmorph, demonstrating smoother deformations.
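The folding metric mentioned above can be sketched for a 2D displacement field as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the field layout `(2, H, W)` in pixel units, the use of `numpy.gradient` finite differences, and the function name `negative_jacobian_fraction` are all choices made for this example.

```python
import numpy as np

def negative_jacobian_fraction(disp):
    """Fraction of pixels where the deformation x + u(x) folds.

    disp: array of shape (2, H, W); disp[0] is the row displacement and
    disp[1] the column displacement, both in pixels (assumed layout).
    """
    # Finite-difference derivatives along rows (axis 0) and columns (axis 1).
    d0_dr, d0_dc = np.gradient(disp[0])
    d1_dr, d1_dc = np.gradient(disp[1])
    # Jacobian of phi(x) = x + u(x) is I + grad(u); take its 2x2 determinant.
    det = (1.0 + d0_dr) * (1.0 + d1_dc) - d0_dc * d1_dr
    return float(np.mean(det < 0))

# Identity transform: no folding anywhere.
identity = np.zeros((2, 8, 8))

# A field that reverses the column axis (u_col = -2 * col) folds everywhere.
folded = np.zeros((2, 8, 8))
folded[1] = -2.0 * np.arange(8)[None, :]

frac_identity = negative_jacobian_fraction(identity)
frac_folded = negative_jacobian_fraction(folded)
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the figures quoted above.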

Lack of estimated uncertainty evaluation. We disagree. In Sec. 5.2, we provided a detailed analysis including common sense indicators for both quantitative (sparsification error, Fig. 3 right) and qualitative (Fig. 3 left) evaluations of our uncertainty estimates.
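For readers unfamiliar with sparsification error, the sketch below shows one common way such a metric is computed: pixels are removed in order of decreasing uncertainty, the mean error of the remaining pixels is tracked, and the resulting curve is compared with an oracle that removes pixels in order of decreasing true error. The removal fractions, the averaging of the gap, and the function names are assumptions for illustration and may differ from the protocol used in the paper.

```python
import numpy as np

def sparsification_curve(error, ranking, fractions):
    """Mean error of the pixels left after removing the top-ranked fraction."""
    order = np.argsort(-ranking)           # highest ranking value removed first
    sorted_err = error[order]
    n = len(error)
    return np.array([sorted_err[int(round(f * n)):].mean() for f in fractions])

def sparsification_error(error, uncertainty, fractions=None):
    """Average gap between the uncertainty-ranked curve and the oracle curve."""
    if fractions is None:
        fractions = np.linspace(0.0, 0.9, 10)   # assumed removal fractions
    by_uncertainty = sparsification_curve(error, uncertainty, fractions)
    oracle = sparsification_curve(error, error, fractions)
    return float(np.mean(by_uncertainty - oracle))

# Synthetic per-pixel errors (illustrative only).
rng = np.random.default_rng(0)
err = rng.random(100)

perfect = sparsification_error(err, err)     # uncertainty ranks like the error
inverted = sparsification_error(err, -err)   # uncertainty ranks in reverse
```

A value near zero means the uncertainty ranks pixels almost as well as the true error would; larger values indicate that high uncertainty is a poor predictor of high error.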

Comparison to robust losses. We acknowledge that not all objectives assume uniform noise, which is why we state “often rely” in our abstract. Additionally, in Table 1, we compare our method with NCC and MI across two registration architectures and two datasets, and improve over both.

R3: Code availability. We will open-source the code after acceptance.




Meta-Review

Meta-review #1

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The authors’ work addresses a significant gap in existing methods by challenging the uniform noise assumption and offers a substantial improvement over baseline models. The clear experimental validation across various datasets further underscores the robustness and potential of the approach. While the paper could benefit from more detailed explanations in certain areas, such as the mechanisms of collaborative information exchange and the specifics of the SNR and beta-NLL strategies, the overall contribution has merit.

  • What is the rank of this paper among all your rebuttal papers? Use a number between 1/n (best paper in your stack) and n/n (worst paper in your stack of n papers). If this paper is among the bottom 30% of your stack, feel free to use NR (not ranked).



