Abstract

Ultrasound Computed Tomography (USCT) has emerged as a cutting-edge imaging modality, offering quantitative acoustic parameter maps to enhance disease diagnosis. Full-waveform inversion (FWI), a mainstream reconstruction method, enables high-resolution imaging of the speed of sound (SOS) from USCT measurements. However, its strong sensitivity to the initial model and the anatomical distortions caused by cycle-skipping artifacts significantly hinder its application in complex clinical scenarios. In this paper, we propose P2INR-FWI, a polar coordinate-based implicit neural representation framework with structural prior, to achieve unsupervised, subject-specific SOS reconstruction. Departing from conventional Cartesian coordinate-based neural representations, our method introduces a polar coordinate encoding mechanism aligned with the geometry of the USCT ring array, which substantially accelerates convergence and improves reconstruction accuracy. Furthermore, we develop a reflected signal-derived structural prior extraction method to guide the reconstruction process toward clinically critical regions, thereby enabling fine-structure restoration. Experiments conducted on numerical phantom, breast-mimicking phantom, and in vivo data demonstrate that our method outperforms traditional approaches in both reconstruction quality and quantitative metrics, without requiring additional regularization constraints.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2163_paper.pdf

SharedIt Link: https://rdcu.be/eHwNq

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04937-7_40

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{WanZes_P2INRFWI_MICCAI2025,
        author = { Wang, Zesong AND Yan, Weicheng AND Liu, Zhaohui AND Yuchi, Ming AND Qiu, Wu},
        title = { { P2INR-FWI: an Implicit Neural Representation Method for Speed of Sound Image Reconstruction in Ultrasound Computed Tomography } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15961},
        month = {September},
        page = {420 -- 430}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes a novel method called P2INR-FWI, a full waveform inversion framework based on polar-coordinate implicit neural representation. By integrating Radial-Angular Coordinate Embedding (RACE) and structural priors, the framework aims to achieve unsupervised and individualized reconstruction of speed of sound (SOS). Targeting the circular sensor array geometry of ultrasound computed tomography (USCT), the method introduces a polar coordinate encoding mechanism that significantly accelerates convergence and improves reconstruction accuracy. In addition, anatomical prior information is extracted from reflected signals to guide the reconstruction process towards clinically relevant regions, enhancing the recovery of fine structures. Experimental results demonstrate that P2INR-FWI outperforms traditional methods in terms of reconstruction quality and quantitative metrics across numerical simulations, breast phantoms, and in vivo data—all without requiring additional regularization constraints. This approach provides more reliable imaging support for breast disease diagnosis, effectively mitigates the cycle-skipping issue commonly seen in conventional full waveform inversion, and reduces dependence on the initial model.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The author addresses an important and potentially high-impact problem. The methodology is well-justified and clearly presented. The proposed RACE based on polar coordinates, combined with structural priors, enables the reconstruction of SOS in an unsupervised and individualized manner. This approach effectively alleviates the cycle skipping problem commonly encountered in traditional Full Waveform Inversion (FWI). Both simulation and real-world experiments demonstrate significant improvements compared to the proposed baseline, while also reducing the need for large datasets, thereby lowering the cost of dataset creation.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Encoding Method: The author mentions that the proposed method requires more time to converge (7.1 seconds vs. 5.6 seconds for traditional FWI). However, to my knowledge, previous studies have shown that using hash encoding can effectively accelerate the convergence process, potentially achieving better results in even less time than traditional FWI. Additionally, since the SIREN network is already capable of learning high-frequency components, I am uncertain about the necessity of applying Fourier encoding before feeding inputs into the network. While it may offer slight improvements, it could also further reduce the processing speed. This seems to be a trade-off, and I believe the author should explain the rationale behind this design choice.

Filter: From the author’s pipeline, I noticed that both the observed and predicted data are filtered before computing the loss. In my opinion, this step might be unnecessary, as applying a filter could increase the under-determined nature of the problem. Since the raw data is already the observed data, further processing may not be needed. The author’s experiments also support this view, as the improvement brought by the filtering step is minimal—only a 0.064 reduction in RMSE.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

Overall, I believe this method is innovative in the field of ultrasound sound speed imaging, and the results clearly demonstrate its advantages over traditional approaches. Based on my experience, I would suggest that the authors consider comparing their method with supervised approaches to further validate its effectiveness. In addition to the improvements mentioned earlier, the authors might explore additional refinements that could further enhance the performance of the method. Lastly, it could be worthwhile to consider embedding the prior directly into the Implicit Neural Representation (INR), rather than incorporating it solely in the loss function. This might lead to further improvements, as in my view, the speed of sound in water—even within the same temperature range—can vary slightly in different regions. Allowing the INR to optimize for such subtle variations could potentially yield better reconstruction quality.
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This method is innovative in the field, and the experiments have demonstrated that it outperforms some traditional algorithms.
Reviewer confidence

Somewhat confident (2)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

The paper proposes an implicit neural representation (INR)-based full waveform inversion method for USCT, aimed at high-resolution SOS reconstruction. The innovations of this method are twofold: First, it proposes the Radial-Angular Coordinate Embedding (RACE) module to encode the spatial features of circular transducer arrays. Second, it incorporates structural priors into the INR FWI inversion process.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

The proposed RACE module is designed to better extract features from circular transducer arrays. Additionally, by incorporating structural prior constraints, it can more effectively avoid the cycle-skipping phenomenon.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

The paper incorporates structural priors into the INR FWI process. However, since the input of the INR method is discrete spatial data points, integrating structural priors is challenging. The paper does not clearly explain how to incorporate prior structural guidance.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

FWI under the circular array condition is relatively simple because the data contains transmitted waves. However, can we further discuss how to perform FWI in ordinary linear array synthetic aperture ultrasound?
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper effectively proposes solutions to the identified problems and achieves better results compared to existing methods.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The primary contribution of this paper is the development of an implicit neural representation (INR)-based method for reconstructing speed-of-sound (SoS) maps in ultrasound computed tomography (UCT).
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A notable strength of the proposed model is its effective use of the INR framework to estimate SoS maps in an unsupervised manner. This approach also addresses limitations associated with Cartesian coordinate-based reconstructions. The method is thoroughly validated using numerical phantoms, a breast-mimicking phantom, and in vivo data, demonstrating both feasibility and robustness.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

There are some representational and methodological details that require clarification. Specifically, the manuscript should describe the data acquisition system (e.g., whether it is a custom-built or commercialized device) and explain the strategy for sampling spatio-temporal positions during training. Additionally, the reconstruction time appears to be equivalent to the model’s training time, which may pose a limitation for practical deployment.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not provide sufficient information for reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper presents a novel application of INR to SoS reconstruction in UCT, which, to the best of my knowledge, is the first attempt in this context. The method is technically sound and experimentally validated with real-world data, supporting its potential impact. While a few implementation details could be better elaborated, the contribution is solid and relevant to the field.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We appreciate the reviewers’ thoughtful feedback and suggestions. All raised points have been carefully considered: substantive comments are addressed below, and editorial suggestions will be incorporated in the final version.

Reproducibility (R2, R3): To ensure full reproducibility, we will release all code associated with this work as open source upon publication.

Data Acquisition System (R3): We use a custom-built setup consisting of a 0.22 m-diameter circular array with 512 evenly spaced sensors. Each element operates at a center frequency of 500 kHz with a 400 kHz bandwidth. Data are acquired at a sampling rate of 25 MHz. We sincerely apologize for the earlier typographical error listing 2048 sensors; the system in fact employs 512 sensors, and we will correct this in the final version.

Spatio-Temporal Sampling Strategy (R3): During training, we discretize time and space to satisfy both the stability and resolution requirements of our finite-difference forward model. Specifically, the time step Δt must obey the Courant–Friedrichs–Lewy (CFL) stability condition, which bounds Δt in terms of Δx and max(c), and the grid spacing Δx must satisfy the spatial resolution criterion Δx < min(c) / (10f), where c is the sound-speed field and f is the highest signal frequency. In our simulations (using a 500 kHz source), we set a 592 × 592 spatial grid over a 0.23 m domain and 3000 time steps over 0.2 ms. For real-data experiments, the time step is fixed by the sampling rate (Δt = 1/fs = 1/(25 MHz)), and we employ an 896 × 896 spatial grid to maximize accuracy without unduly increasing computational cost.

Encoding Method (R1): We attempted to feed raw, unencoded coordinates directly into SIREN, but this resulted in significantly slower and less stable convergence. Our current Fourier encoding parameters were chosen empirically through iterative tuning to strike the best balance between accuracy and speed. Since the forward-modeling step of FWI dominates total runtime, the added overhead from Fourier encoding is negligible.
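
For readers unfamiliar with the idea, the following is a minimal sketch of what Fourier-encoding polar coordinates before a sine-activated layer could look like. It is not the authors' RACE module, whose details are not given on this page. The function names, the number of frequencies, the octave spacing of the radial frequencies, the use of integer angular harmonics, and the value `omega0=30.0` are all assumptions made for illustration.

```python
import math


def to_polar(x, y, ring_radius):
    """Map Cartesian (x, y), centred on the ring array, to (r_norm, theta)."""
    r_norm = math.hypot(x, y) / ring_radius  # 0 at the centre, 1 on the ring
    theta = math.atan2(y, x)                 # angle in [-pi, pi]
    return r_norm, theta


def encode(r_norm, theta, n_radial=4, n_angular=4):
    """Fourier features of polar coordinates (hypothetical layout).

    Radial terms use octave-spaced frequencies. Angular terms use integer
    harmonics, so theta and theta + 2*pi produce the same features.
    """
    feats = []
    for k in range(n_radial):
        w = (2.0 ** k) * math.pi
        feats += [math.sin(w * r_norm), math.cos(w * r_norm)]
    for m in range(1, n_angular + 1):
        feats += [math.sin(m * theta), math.cos(m * theta)]
    return feats


def sine_layer(inputs, weights, biases, omega0=30.0):
    """One SIREN-style layer: sin(omega0 * (W x + b)); omega0 is assumed."""
    out = []
    for row, b in zip(weights, biases):
        pre = sum(w * v for w, v in zip(row, inputs)) + b
        out.append(math.sin(omega0 * pre))
    return out


# Usage: a point on a ring of radius 0.11 m (half the stated 0.22 m diameter).
r_norm, theta = to_polar(0.0, 0.11, ring_radius=0.11)
features = encode(r_norm, theta)
hidden = sine_layer(features, [[0.01] * len(features)] * 2, [0.0, 0.1])
```

Using integer harmonics for the angle keeps the features continuous across the seam where `atan2` jumps between -π and π, which is one plausible reason to treat the angular coordinate differently from the radial one.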
To further reduce iteration time, we have optimized the image-patch rearrangement and stitching routines, bringing our per-iteration cost close to that of conventional FWI. We have also experimented with hash encoding: although it has delivered modest speed gains, it has degraded reconstruction quality. We plan to explore refined or hybrid hash-based schemes in future work, but any additional acceleration will remain limited by the cost of the forward simulation itself.

Filter (R1): We employ filtering as part of our multi-frequency FWI approach, decomposing the signal into distinct bands to mitigate cycle-skipping. Compared to conventional FWI, the multi-frequency method delivers markedly better reconstruction quality, confirming the benefit of this step. Although our method under single-band conditions already produces excellent results in simulations, real-data experiments on both phantom and in vivo datasets show that the multi-frequency strategy yields smoother and more stable convergence. Consequently, we report only multi-frequency FWI reconstructions for those cases.

Structural Prior Integration (R2): We embed structural priors directly in the loss function. First, we extract the outer contour of the region of interest from reflected-signal data to generate a binary mask that labels each spatial sample as foreground or background. We then constrain the predicted background sound-speed values toward known empirical values by applying a penalty term on background points.
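
The masked background penalty described above might be sketched as follows. This is an illustration under stated assumptions rather than the authors' loss: the squared-error form, the weight, the reference value of 1500 m/s for water, and the names `background_penalty` and `total_loss` are all hypothetical, and the data-misfit term is passed in as a plain number instead of being computed by a forward simulation.

```python
def background_penalty(pred_sos, mask, background_sos=1500.0):
    """Mean squared deviation from a reference speed over background points.

    pred_sos: flat list of predicted sound speeds (m/s).
    mask: flat list of 0/1 labels, 1 = foreground, 0 = background.
    background_sos: assumed reference speed for the coupling medium.
    """
    total, count = 0.0, 0
    for speed, label in zip(pred_sos, mask):
        if label == 0:  # only background samples are penalised
            total += (speed - background_sos) ** 2
            count += 1
    return total / count if count else 0.0


def total_loss(data_misfit, pred_sos, mask, weight=1e-3, background_sos=1500.0):
    """Waveform misfit plus the weighted background penalty (assumed form)."""
    return data_misfit + weight * background_penalty(pred_sos, mask, background_sos)


# Usage: two background samples and one foreground sample.
pred = [1500.0, 1510.0, 1400.0]
mask = [0, 0, 1]
penalty = background_penalty(pred, mask)          # foreground value is ignored
loss = total_loss(2.0, pred, mask, weight=0.01)
```

Because the penalty only touches samples labelled as background, the foreground values remain driven by the waveform misfit alone, which matches the description of the prior as guidance and not as a constraint on the tissue region.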

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

All three reviewers recommend accepting the paper in its current form. The work deserves recognition for using implicit neural representations (INR), a recent neural network technique, to effectively mitigate the cycle-skipping phenomenon, a long-standing challenge in UCT. While INR is not entirely new to the MICCAI community, its application in ultrasound computed tomography remains limited and would benefit from greater exposure.

There is also consensus on the paper’s novelty, particularly in how it integrates new modules into the INR framework to improve reconstruction accuracy and accelerate convergence. The experimental results are convincing, supported by validations across simulation, phantom, and in vivo data, which strengthens its potential for clinical translation.

Based on the above, I recommend provisional acceptance. Before final submission, please address the minor issue raised by R3 regarding the disclosure of the acquisition system, and improve reproducibility as suggested by all reviewers.

P2INR-FWI: an Implicit Neural Representation Method for Speed of Sound Image Reconstruction in Ultrasound Computed Tomography

Author(s):