Abstract

Velocity field estimation, or motion tracking, is the key to characterizing tissue function in ultrasound imaging. Current velocity field estimation remains challenging in cross-range motion tracking because ultrasound is less sensitive in this dimension. In addition, there is no uniform framework for different imaging schemes, such as linear array with rectangular scanning, phased array with sector scanning, and matrix array with volumetric scanning. This paper proposes a uniform multi-mode fused framework for tissue velocity field estimation. This framework integrates multiple modes of pair-wise optical flows, Doppler, and speckle consistency in ultrasound to improve the accuracy of cross-range velocity estimation. Furthermore, the uniform framework is adapted to different arrays and imaging schemes for various application scenarios. Extensive in-silico experiments on homemade and public datasets demonstrate the effectiveness of the proposed framework and show that our method outperforms a window-based method and an energy function optimization-based method. In particular, our method improves the accuracy of cross-range velocity estimation by 8.84%, 19.21%, and 10.94% in three cross-sectional views of the public cardiac dataset when compared with the energy function optimization-based method.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2308_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LiHai_AUniform_MICCAI2025,
        author = { Li, Hailong and Wang, Liansheng and Chen, Yinran},
        title = { { A Uniform Multi-mode Fused Framework for Velocity Field Estimation in Ultrasound Imaging } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15967},
        month = {September},
        page = {54 -- 64}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose a multi-modal method for dense motion estimation in ultrasound imaging and demonstrate its efficacy in silico against established (although dated) baselines. To the best of my knowledge, this is the first time such a multi-modal approach has been proposed, making this interesting and relevant for the community. However, the paper lacks clarity and detail, making it (in my opinion) unsuitable for publication.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • a novel approach to motion using multi-modal inputs
    • adapted to multiple probe types
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Substantial lack of clarity, which makes it impossible to reproduce the experiments or results.
    • The method is somewhat ad hoc, tied to a certain type of acquisition that is only possible with research programmable equipment, but detail on that equipment is missing.
    • The results are described superficially, with no indication of confidence intervals or statistical significance.
    • The baselines used are a bit dated. Authors should look into more recent motion estimation methods, particularly those using deep learning such as EchoTracker or CoTracker. A registration-based approach would also be a relevant baseline in this case.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    The methods section lacks an overview section or description of the different modules and how they interact; as a result, it is very difficult to get the big picture of the approach and specifically to understand the adaptation step to different probe / acquisition types. For instance, the authors go directly into describing the modified Doppler estimation, which uses a custom beamforming / transmission sequence; this would require a research / programmable system, which is not described anywhere. Only on the next page (4), in the “Brightness Compensation” subsection, do the authors mention plane or diverging waves, and even then it is not clear which one they are using. Possibly (not clear in the text) the authors are using Plane Wave Imaging, which would run at a much higher frame rate than standard ultrasound, significantly easing the challenge of motion estimation. Another example of missing experimental detail is Table 1. The authors list the probes, but “Vermon”, for example, is a manufacturer that makes probes for different vendors. L11-4v and P42-v seem to be very specific probe types, but no details are given on the system they belong to (possibly a Verasonics).

    The in-silico data is somewhat limited and not representative of clinical cases (cardiac imaging). I understand that the available cardiac datasets (CAMUS, EchoNet-Dynamic, etc.) do not have the non-standard acquisitions needed by the authors, but even a couple of real cases acquired with their system would have boosted the relevance of the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (2) Reject — should be rejected, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The lack of clarity and detail is beyond what can reasonably be discussed in the rebuttal phase: the paper needs an overall description of the methods and a high-level description of how the components interact; a description of the acquisition system and acquisition modes in enough detail to be reproducible (perhaps publishing the Field II simulations alongside); enough information to regenerate the synthetic dataset; and a justification of how this data is relevant to a real scenario.

    In addition, the baselines used for comparison are quite dated and the results are covered very superficially, without confidence intervals or statistical significance tests.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    My main concerns were clarity, lack of detail for reproducibility, unsuitable baseline choice, and superficial description of the results. The authors have made an effort in their rebuttal to address the confidence intervals in their results; however, they have avoided the points about detail (equipment used, etc.) and baseline comparison.

    Because they made an effort, I would be happy to move from a reject to a weak reject.



Review #2

  • Please describe the contribution of the paper

    This paper presents a novel motion tracking framework that avoids traditional B-mode tracking methods. Instead, it introduces a low-level, physics-inspired approach based on a round-trip emission image formation scheme. This scheme simultaneously estimates a Doppler field and computes pairwise optical flow between image pairs. The proposed optical flow estimation is particularly sensitive to cross-range motion, owing to a dedicated 1D motion correction strategy.

    The framework is designed to be adaptable across multiple ultrasound acquisition geometries, including Cartesian (rectangular) arrays, polar sector probes, and 3D volumetric sectors, demonstrating its flexibility and broad applicability.
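As an illustration of why combining Doppler with optical flow can help cross-range estimation, the sketch below stacks the two kinds of linear constraints at one pixel and solves them by least squares. It is a generic textbook-style formulation written for this page, not the authors' method: the function name `fuse_velocity`, the weight `w_doppler`, the neighbourhood layout, and the sign and angle conventions are all assumptions.

```python
import numpy as np

def fuse_velocity(Ix, Iz, It, doppler, beam_angle, w_doppler=1.0):
    """Least-squares estimate of (vx, vz) at one pixel (illustrative only).

    Ix, Iz, It : spatial and temporal intensity derivatives sampled over a
                 small neighbourhood (1-D arrays), used in the optical-flow
                 constraint Ix*vx + Iz*vz + It = 0.
    doppler    : measured velocity component along the beam direction.
    beam_angle : steering angle in radians, measured from the range (z) axis
                 (assumed convention).
    w_doppler  : hypothetical weight balancing the two kinds of constraints.
    """
    Ix, Iz, It = (np.asarray(a, dtype=float) for a in (Ix, Iz, It))
    # Optical-flow rows: one linear equation per neighbourhood sample.
    A_of = np.stack([Ix, Iz], axis=1)
    b_of = -It
    # Doppler row: projection of the velocity onto the beam direction.
    A_d = w_doppler * np.array([[np.sin(beam_angle), np.cos(beam_angle)]])
    b_d = w_doppler * np.array([doppler])
    A = np.vstack([A_of, A_d])
    b = np.concatenate([b_of, b_d])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v  # array([vx, vz])
```

With a beam along the range axis (`beam_angle = 0`), the Doppler row constrains only `vz`, so the cross-range component `vx` has to come from the optical-flow rows; steered beams add a partial constraint on `vx` as well.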

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is scientifically sound and introduces a novel and conceptually distinct approach to motion tracking in ultrasound imaging. Unlike conventional methods that rely directly on B-mode data or predefined tracking schemes, this work first reconstructs two complementary modalities—namely, a Doppler-like field and pairwise optical flow fields—before deriving motion estimates. To the best of my knowledge, this two-stage image formation strategy is original and has not been previously explored in the ultrasound motion tracking literature.

    The proposed method is evaluated on two datasets, and the results are both quantitatively and qualitatively convincing. The performance gains observed suggest that the intermediate representations contribute meaningfully to improved motion estimation. Overall, the approach is innovative and shows strong potential for further development and clinical translation.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The authors did not acknowledge that several prior studies have successfully leveraged highly realistic simulated or augmented datasets to train deep learning-based motion tracking models. These earlier approaches have demonstrated that training on such data can lead to robust and generalizable models, especially when paired with careful domain adaptation or validation strategies.

    Moreover, the evaluation in the paper would have been significantly strengthened by including results on real-world clinical data. While such data typically lack ground truth motion fields, surrogate clinical metrics—such as Global Longitudinal Strain (GLS)—can serve as valuable proxies for assessing the physiological plausibility and accuracy of the predicted motion tracks. Incorporating GLS-based validation would have provided a more clinically meaningful measure of performance and helped assess the method’s translational potential.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In my opinion, this paper does not suffer from any major flaws. The manuscript is clearly written, logically structured, and accessible to both specialists and readers with general expertise in medical image analysis. The mathematical formulations are sound and well-motivated, and the methodology is presented with sufficient rigor and clarity.

    Moreover, the proposed approach is original and addresses a relevant problem in ultrasound motion tracking using a novel low-level framework. The experimental results, although limited to two datasets, are convincing and support the authors’ claims. Given its conceptual novelty, solid technical foundation, and potential impact, this paper makes a meaningful contribution to the field and merits acceptance.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a framework for cross-range velocity estimation in ultrasound B-mode imaging for elastography. The paper unifies Doppler imaging and optical flow across different imaging modes, including linear array, phased array, and volumetric scanning. The results on numerical and publicly available datasets show the efficacy of the approach.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is very well written and easy to follow. Although concise, it covers the related background work on ultrasound elastography and Doppler imaging techniques.

    2. The methods introduced for motion correction, brightness compensation, optical flow constraints, etc. are unified nicely via convex loss functions which can be optimized iteratively.

    3. The overall framework is nicely unified and is flexible enough to support different constraints based on the geometry of the scan such as linear, phased array and volumetric imaging.

    4. The method considerably improves the nRMSE and coefficient of determination (R^2) over the comparison methods including window-based cross-correlation and energy function based optimization methods.

    5. The ablation studies conducted on numerical phantoms showcase the efficacy and benefits of different components
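For readers unfamiliar with the two metrics named in point 4, the sketch below computes nRMSE and R^2 for an estimated velocity field against a ground-truth field. The normalization is an assumption: here the RMSE is divided by the range of the ground truth, whereas the paper may normalize differently (for example by the maximum or the mean). The function names are hypothetical.

```python
import numpy as np

def nrmse(estimate, truth):
    """RMSE divided by the range of the ground truth (assumed normalization)."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    rmse = np.sqrt(np.mean((estimate - truth) ** 2))
    return rmse / (truth.max() - truth.min())

def r_squared(estimate, truth):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    ss_res = np.sum((truth - estimate) ** 2)
    ss_tot = np.sum((truth - truth.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Both functions accept arrays of any shape, so they can be applied to the cross-range and range components separately.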

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The paper does not give many of the details needed for reproducibility, including the equalization of the image to a quasi-constant brightness image. I feel that the authors could give more details here. Furthermore, the authors could clarify how exactly the optimization problem for calculating the optimal velocity is solved. If possible, the authors could make their code open-source to ensure reproducibility.

    2. A comparison with a suitable deep learning-based velocity estimation method is missing; it is needed to quantify the performance gains of the method accurately.

    3. The paper does not report the computational time or efficiency of the approach, nor does it compare them with those of other methods.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    I feel that the authors could compute strain properties, such as the elastic modulus of the tissue, based on the velocity to further highlight the clinical utility of the approach, for instance in shear wave elastography.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I feel that the paper proposes a useful and novel contribution to image cross-range velocity in B-mode US imaging. The framework unifies across different US probes including linear array, phased array and 3D volumetric imaging. The method could be clinically adopted to such different settings and the constraints introduced for computing the optimal velocity improve the cross-range velocity estimation and showcase their utility. Furthermore, the method compares strongly against the baseline window-based and energy based optimization methods.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have adequately addressed all the minor concerns that were raised. Furthermore, the authors have promised to make their code open-source upon acceptance. I feel that the paper presents a valuable contribution to Ultrasound motion tracking.




Author Feedback

We greatly appreciate the reviewers’ insightful and constructive comments to improve the quality of our paper. Particularly, we are greatly encouraged as all the reviewers highly recognized the innovativeness of our method.

The overlapping comments. Q1: Comparison with deep learning methods. A1: There are several deep learning methods for ultrasound motion estimation, such as RAFT-USENet and PWC-Net for linear array, EchoTracker and CoTracker for phased array, and NeuralCMF for matrix array. However, we find difficulties in performing such comparisons. Firstly, reproducing or adapting the abovementioned networks is non-trivial since they either work in RF-data or B-mode domain. Secondly, for the methods that only provide networks without training data, we cannot provide sufficient data for reproduction. Lastly and most importantly, a unified deep learning method that can be adapted to different imaging schemes is currently missing. Adopting different networks for different imaging schemes in the comparison may bias the take-home message of this paper, making it long and tedious.

Q2: Details regarding reproduction. A2: We apologize for the lack of detailed parameters regarding reproducibility. In this version, we struggled with the limited space since we wanted to clarify the uniform framework and its adaptations while presenting as many results as possible. After substantial compression, we had to maintain the completeness of the methodology by sacrificing some details, such as the probe configurations (pitch and bandwidth), imaging parameters (maximal steering angle of plane/diverging wave), and delay-and-sum beamforming. We would like to provide all the details and make the code public upon acceptance.

Reviewer#1 Q1: The clinical data. A1: The STRAUS dataset was derived from high-precision electromechanical simulations grounded in real clinical data. This dataset has demonstrated strong similarity to authentic clinical ultrasound acquisitions. Currently, we are making homemade clinical datasets for the in-vivo validations.

Reviewer#2 Q1: The applicable equipment for the method. A1: Our method applies to multiple types of equipment for the following reasons. (i) The only difference between the round-trip sequence and other conventional sequences is the activation order of the plane/diverging waves. For an ultrasound system that allows wide-beam transmission and modified authorization, the round-trip sequence is easy to implement. (ii) The uniform framework also works for an ultrasound system equipped only with a sequential plane/diverging wave sequence, with the following modifications. First, the modified Doppler field is replaced with conventional multi-angle Doppler fields corresponding to the beam angles. Second, the pair-wise OF fields have the same time intervals. Such modifications are theoretically feasible. However, further experiments are required to fully validate this feasibility.

Q2: Confidence intervals of the results. A2: We apologize for not providing the confidence intervals for the results in this version. In fact, we independently repeated the experiments ten times and recorded the standard deviations for Table 2. For example, the cross-range nRMSE (%) of our method is 9.75 ± 0.09 for LAX #1, 12.6 ± 0.35 for LAX #2, and 14.26 ± 0.49 for SAX, which still outperforms the other methods.
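The reported means and standard deviations can be turned into approximate 95% confidence intervals for the mean, as sketched below. This is an illustration added to this page, not a computation from the rebuttal. It assumes the ± values are sample standard deviations over the ten repetitions and that the run-to-run variation is roughly normal, so a Student-t interval with 9 degrees of freedom applies. The function name `ci95_half_width` is hypothetical.

```python
import math

# Two-sided 95% Student-t critical value for 9 degrees of freedom (n = 10 runs).
T_975_DF9 = 2.262

def ci95_half_width(std, n=10, t_crit=T_975_DF9):
    """Half-width of the 95% confidence interval of the mean."""
    return t_crit * std / math.sqrt(n)

# Cross-range nRMSE (%) as reported in the rebuttal: (mean, standard deviation).
reported = {"LAX #1": (9.75, 0.09), "LAX #2": (12.6, 0.35), "SAX": (14.26, 0.49)}

for view, (mean, std) in reported.items():
    hw = ci95_half_width(std)
    print(f"{view}: {mean:.2f} +/- {hw:.2f} (95% CI, assuming n = 10)")
```

Because the half-width scales with 1/sqrt(n), these intervals are narrower than the reported standard deviations; they describe uncertainty in the mean over repetitions, not variation across datasets or subjects.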

Q3: Transmitted beams and challenges in motion estimation. A3: Plane waves were used for the linear array, whereas 2D and 3D diverging waves were used for the phased array and matrix array, respectively. Even with high-frame-rate imaging, ultrasound motion estimation is still challenging for the following reasons. (i) Plane/diverging waves are limited in lateral resolution, beam penetration, and SNR of echo signals due to the lack of transmit focusing. (ii) For rapidly moving events, such as valve motion or the ejection of blood flow, the frame rate is still insufficient even with plane/diverging waves.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The paper presents a framework for motion estimation in ultrasound imaging. Reviewers 1 and 3 express more favorable views and acknowledge the originality and technical soundness of the framework. However, Reviewer 2 raises substantial concerns regarding clarity, reproducibility, and baseline comparisons, noting that essential methodological and acquisition details are lacking and that the evaluation setup relies on outdated baselines. These issues were only partially addressed in the rebuttal, leaving key weaknesses unresolved. Therefore, I agree with Reviewer 2 and recommend rejection in its current form.


