Abstract

Ultrasound imaging is widely applied in clinical practice, yet ultrasound videos often suffer from low signal-to-noise ratios (SNR) and limited resolutions, posing challenges for diagnosis and analysis. Variations in equipment and acquisition settings can further exacerbate differences in data distribution and noise levels, reducing the generalizability of pre-trained models. This work presents a self-supervised ultrasound video super-resolution algorithm called Deep Ultrasound Prior (DUP). DUP employs a video-adaptive optimization process of a neural network that enhances the resolution of given ultrasound videos without requiring paired training data while simultaneously removing noise. Quantitative and visual evaluations demonstrate that DUP outperforms existing super-resolution algorithms, leading to substantial improvements for downstream applications.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3415_paper.pdf

SharedIt Link: https://rdcu.be/eHwPu

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04947-6_8

Supplementary Material: Not Submitted

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTex

@InProceedings{CheChu_Blind_MICCAI2025,
        author = { Chen, Chu AND Cui, Kangning AND Cascarano, Pasquale AND Tang, Wei AND Piccolomini, Elena Loli AND Chan, Raymond H.},
        title = { { Blind Restoration of High-Resolution Ultrasound Video } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15962},
        month = {September},
        page = {77 -- 87}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper proposes Deep Ultrasound Prior (DUP), a self-supervised framework for ultrasound video super-resolution that enhances spatial resolution and removes noise without requiring paired training data. DUP is inspired by Deep Image Prior (DIP) and adapts it to the ultrasound video domain.
Experiments show that DUP outperforms compared approaches.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The proposed Weight Inheritance (WIn) mechanism is new. It leverages temporal continuity in US videos. By reusing the CNN parameters from previous frames, the model achieves faster convergence and enhances temporal consistency for video restoration.
- DUP incorporates a combination of total variation (TV) and higher-order (HO) regularization terms to strike a balance between preserving anatomical details and suppressing noise/artifacts. This dual regularization design improves SR quality.
- The paper is quite easy to follow and extensive ablation studies were conducted.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The contribution of the work is incremental. The core idea is largely based on existing Deep Image Prior (DIP)-style approaches and follows similar principles as prior video extensions such as Deep Video Prior (DVP) and Recursive Deep Prior Video (RDPV). The work is built upon known ideas rather than introducing a fundamentally new optimization or self-supervised learning paradigm for blind SR.
- The paper does not clearly explain the key differences between the proposed approach and existing methods like DVP and RDPV.
- The DUP method requires per-video optimization and up to 3000 iterations per frame. It may not be practical in real-time settings.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The overall contribution feels incremental, as the core formulation closely follows existing DIP methods and prior video extensions such as DVP and RDPV. However, the proposed approach demonstrates some merit through extensive experiments and solid empirical results. Overall, this feels like a borderline paper to me.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

This paper proposes a self-supervised framework named Deep Ultrasound Prior (DUP) for ultrasound video super-resolution. The DUP eliminates the need for paired training data and incorporates a Weight Inheritance (WIn) strategy and dual regularization to accelerate convergence and achieve information sharing while removing noise and artifacts. Quantitative experiments demonstrate the effectiveness of the proposed framework.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The major strengths are:
1. A self-supervised framework that eliminates the need for paired training data for ultrasound video super-resolution. The proposed framework achieves accelerated convergence and is insensitive to noise.
2. Quantitative experiments demonstrate that the proposed framework outperforms existing SR techniques while remaining robust.
3. The proposed framework achieves the lowest error when evaluated on EF prediction from restored cardiac videos, which demonstrates its potential to improve clinical diagnostic accuracy.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The main weaknesses are:
1. The authors claim that the WIn strategy can improve convergence speed; however, the ablation study contains no discussion or visualization of convergence speed.
2. In the ablation study in Table 1, DUP without HO and TV underperforms the baseline DIP (w/o WIn), and no justification is given for this phenomenon.
3. There is no ablation study on the impact of the hyperparameters of the two regularizations, HO and TV. Are the current hyperparameters the most suitable ones? In addition, different \sigma values for preventing the CNN from undergoing lazy training are not tested. These crucial hyperparameters are certainly related to performance and should be studied.
4. In the first row of Fig. 4, the DUP 2x result contains a mouse icon, which is a misleading error.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Please refer to the major strengths and weaknesses.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

This paper investigates ultrasound video super-resolution in a self-supervised fashion. The authors bring the deep image prior approach to medical ultrasound scenarios and design a more practical training strategy to improve both restoration and downstream results.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. This paper adopts the idea of deep image prior to address the lack of paired ultrasound data and improve ultrasound video super-resolution.
2. They propose two regularization terms to adapt DIP to ultrasound scenarios.
3. Downstream tasks are also evaluated to validate its effectiveness.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. While the authors argue that DIP’s ability to fit fine details can also lead to overfitting, where noise and artifacts are misinterpreted, they should also provide visual evidence to validate this statement.
2. The authors should explain Eq. 4 in more detail. Why is the higher-order term R2 effective against piece-wise constant artifacts? Please give a straightforward clarification.
3. The proposed WIn strategy is simple: it just reuses the shared weights when training on consecutive frames. The authors should explain its technical contributions and implementation details further.
4. As Table 1 shows, the DIP baseline (w/o WIn) outperforms the DUP baseline. Please analyze this result. In addition, the improvement from HO does not seem significant for either method.
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

This paper is well organized and provides good experimental analysis.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

We are grateful for the acceptance of our paper and the reviewers’ valuable feedback. Below, you will find the detailed response to each comment.

R1 (Incremental contribution, relation to prior work). We acknowledge that the core framework builds upon existing DIP principles. However, our key innovations include: 1. Weight Inheritance (WIn); 2. Dual Regularization; 3. Clinical Validation in Ultrasound Videos. Although DUP, DVP, and RDPV are all DIP-based video processing methods, there are fundamental differences between DUP and DVP/RDPV. DVP trains on only a few key frames and applies the trained network to infer the other input frames. However, the key-frame strategy assumes temporal stability, low noise, and reusable priors, conditions rarely met in ultrasound imaging. US videos are dominated by noise and stochastic artifacts caused by acoustic interference, and exhibit rapid anatomical deformations and probe movements. These patterns vary spatially and temporally, even between consecutive frames. For US videos, frame-specific optimization is more effective at addressing noise, motion, and non-stationary content. Additionally, DVP does not address super-resolution. While RDPV applies frame-by-frame training with a similar weight-reuse strategy, it uses neither the dual regularization, which is specifically designed to reduce US artifacts, nor the Lanczos downsampler that DUP uses to model the degradation process, which aligns with real-world US image degradation characteristics involving sensor noise, speckle patterns, and motion artifacts. Lanczos approximates these physical limitations more accurately than linear interpolators, as it introduces controlled high-frequency suppression without over-smoothing [1]. We will emphasize this in the final version.
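
For readers unfamiliar with the Lanczos downsampler mentioned above, the following is a minimal 1D sketch of the standard Lanczos kernel and an integer-factor downsampling step. It is not the authors' implementation: the support a = 3, the border clamping, the weight normalization, and the function names are all assumptions made for illustration.

```python
import math


def lanczos_kernel(x: float, a: int = 3) -> float:
    """Standard Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def lanczos_downsample_1d(signal, factor: int = 2, a: int = 3):
    """Downsample a 1D signal by an integer factor (illustrative sketch).

    The kernel is stretched by `factor` so that it acts as a low-pass
    filter before decimation. Border handling (clamping) is an assumption.
    """
    n = len(signal)
    out = []
    for j in range(n // factor):
        # Centre of output sample j, expressed in input coordinates.
        center = (j + 0.5) * factor - 0.5
        lo = int(math.floor(center - a * factor)) + 1
        hi = int(math.floor(center + a * factor))
        acc = 0.0
        wsum = 0.0
        for i in range(lo, hi + 1):
            w = lanczos_kernel((i - center) / factor, a)
            idx = min(max(i, 0), n - 1)  # clamp indices at the borders
            acc += w * signal[idx]
            wsum += w
        out.append(acc / wsum)  # normalise so constant signals are preserved
    return out
```

Because the weights are normalized, a constant input stays constant after downsampling, while content above the new sampling rate is attenuated by the stretched kernel.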

R1 (Optimizations) DUP requires 3,000 iterations only for the first frame; subsequent frames use early stopping after 1,000 iterations. DUP is not intended for real-time processing; rather, it exploits the adaptiveness of unsupervised learning and serves as a preprocessing step for US videos to improve quality and enhance the accuracy of downstream algorithms and diagnosis.
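
As a rough worked example of the iteration budget, the sketch below compares a fixed 3,000 iterations per frame (the DIP protocol described in this feedback) with 3,000 iterations for the first frame and 1,000 for each later frame. Treating early stopping as a fixed count of exactly 1,000 iterations is an assumption, as are the function names and the 100-frame video length.

```python
def dip_iterations(num_frames: int, per_frame: int = 3000) -> int:
    """Total iterations when every frame is optimised from scratch."""
    return num_frames * per_frame


def dup_iterations(num_frames: int, first: int = 3000, later: int = 1000) -> int:
    """Total iterations when only the first frame uses the full budget.

    Assumes every later frame stops at exactly `later` iterations.
    """
    if num_frames <= 0:
        return 0
    return first + (num_frames - 1) * later


if __name__ == "__main__":
    frames = 100  # hypothetical video length
    print(dip_iterations(frames))  # 300000
    print(dup_iterations(frames))  # 102000
```

Under these assumptions, the total cost for a long video approaches one third of the fixed-budget cost.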

R2/R3 (Iteration Efficiency, Performance) The term “convergence speed” in our work refers to the number of iterations required per frame during video processing. Since DIP lacks the WIn strategy and early-stopping mechanisms, we followed prior DIP protocols and fixed its iterations to 3,000 per frame for all frames in the comparison experiments. For DUP, the first frame also uses 3,000 iterations. However, subsequent frames leverage WIn and reduce iterations by at least 1,000 per frame. The observation about DIP’s performance, “sacrificing iterations,” is insightful. Indeed, DIP’s results rely on exhaustive per-frame optimization. DUP achieves comparable results with significantly fewer iterations, and superior performance when dual regularization is present. Due to space limits, our ablation focused on WIn and dual regularization, the core contributions of our work. The hyperparameter analysis is valuable. However, we observed low sensitivity and plan to explore it further in future work.
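
The intuition behind fewer iterations with inherited weights can be illustrated with a toy warm-start experiment. This is not the DUP network: a single scalar parameter stands in for the CNN weights, each "frame" is a slowly changing scalar target, and the learning rate, tolerance, and target values are hypothetical.

```python
def fit(target: float, init: float, lr: float = 0.1,
        tol: float = 1e-3, max_iters: int = 3000):
    """Gradient descent on 0.5 * (w - target)**2.

    Returns the fitted parameter and the number of update steps used.
    """
    w = init
    for it in range(max_iters):
        grad = w - target
        if abs(grad) < tol:  # simple early-stopping criterion
            return w, it
        w -= lr * grad
    return w, max_iters


def process_video(targets, inherit: bool):
    """Fit each frame in order; optionally start from the previous solution."""
    w = 0.0
    counts = []
    for t in targets:
        start = w if inherit else 0.0  # inherit weights or restart from scratch
        w, used = fit(t, start)
        counts.append(used)
    return counts


if __name__ == "__main__":
    frames = [10.0, 10.2, 10.1, 10.3]  # hypothetical, temporally smooth targets
    print(process_video(frames, inherit=False))
    print(process_video(frames, inherit=True))
```

When consecutive targets are close, the inherited starting point is already near the new optimum, so every frame after the first needs fewer steps than a cold start. How well this carries over to a real network on noisy ultrasound frames is exactly what the reviewers ask the authors to demonstrate.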

R2 (Figure Correction) The “mouse icon” error in Fig. 4 was a visualization oversight. We will correct this in the final version.

R3 (Overfitting Issue of DIP, HO Term) The issue of DIP overfitting to noise and artifacts has been extensively discussed in prior works [2]. Due to space constraints, we did not elaborate on this phenomenon in our manuscript. However, in future work, we will include visual evidence to explicitly demonstrate how DUP mitigates overfitting in ultrasound-specific scenarios. The HO term encourages smoothness of the solution, and one can tune \lambda_2 to shrink or expand the homogeneous regions in the restored image, as explored in prior works [3].
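
To make the role of a higher-order term concrete, the sketch below compares an anisotropic total variation penalty with a second-difference penalty on a smooth ramp and on a staircase version of it. The paper's Eq. 4 is not reproduced on this page, so the discrete form used here, the function names, and the test images are assumptions chosen only to illustrate the general idea.

```python
def tv(img) -> float:
    """Anisotropic total variation: sum of absolute first differences."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(img[y][x + 1] - img[y][x])
            if y + 1 < h:
                total += abs(img[y + 1][x] - img[y][x])
    return total


def second_order(img) -> float:
    """Sum of absolute second differences along rows and columns."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            if 0 < x < w - 1:
                total += abs(img[y][x - 1] - 2 * img[y][x] + img[y][x + 1])
            if 0 < y < h - 1:
                total += abs(img[y - 1][x] - 2 * img[y][x] + img[y + 1][x])
    return total


if __name__ == "__main__":
    ramp = [[float(x) for x in range(5)] for _ in range(4)]   # smooth gradient
    stairs = [[0.0, 0.0, 2.0, 2.0, 4.0] for _ in range(4)]    # piecewise constant
    print(tv(ramp), tv(stairs))                      # 16.0 16.0
    print(second_order(ramp), second_order(stairs))  # 0.0 24.0
```

Both images have the same total variation, so TV alone cannot prefer the smooth ramp over the staircase. The second-order penalty is zero on the ramp and positive on the staircase, which is why weighting it by \lambda_2 alongside TV discourages piece-wise constant artifacts.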

[1] DOI: 10.1007/10704282_23

[2] DOI: 10.1109/CVPR.2018.00984

[3] DOI: 10.1137/120867068

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A

Blind Restoration of High-Resolution Ultrasound Video

Author(s):