Abstract

Ultrasound image segmentation plays a critical role in medical-assisted diagnosis but suffers from inherent limitations, such as high noise, artifacts, and morphological diversity. Existing methods struggle to generalize with small-sample data due to feature contradictions from varying acquisition angles, limiting multi-center clinical use. To address these issues, we propose a dual prior-guided two-stage segmentation framework. In the first stage, the prior classification of small-sample data guides domain adaptation pretraining on large-scale datasets, employing dynamic class balancing to mitigate data distribution bias. The second stage features a multi-level feature fusion architecture with three core modules. First, we design a Multi-branch Convolutional Parallel Attention (MCPA) module that extracts contextual features via parallel dual attention to adaptively select multi-scale features. Next, we propose a Multi-scale Fusion Dilated Convolution (MFDC) module that enhances the encoder’s capability to capture lesion boundaries across different receptive fields through hierarchical dilated convolutions. Finally, we introduce an Enhanced Feature Decoding (EFD) module in the decoder, embedding a cross-layer compensation mechanism that uses shallow high-resolution features to recover lost spatial details. Furthermore, we propose an interactive dual-stream architecture that bridges the prior-guided classification and segmentation tasks, where complementary features are fused through cross-task attention to optimize holistic semantic consistency and robustness. Experiments on a public dataset demonstrate our method’s superiority over mainstream approaches, and ablation studies validate the effectiveness of each component, providing a solution for high-precision, highly applicable small-sample ultrasound image segmentation. Code is available on GitHub: https://github.com/notchXie/DPGS-Net.
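To make the multi-scale dilated design mentioned above concrete, the following is a minimal PyTorch sketch of a hierarchical dilated-convolution block in the spirit of the MFDC module. The channel count, dilation rates, and residual fusion are illustrative assumptions, not the authors' released implementation (see the code repository for that).

```python
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    """Illustrative multi-scale dilated-convolution block (MFDC-style sketch).

    Parallel 3x3 convolutions with increasing dilation rates enlarge the
    receptive field without downsampling; their outputs are concatenated
    and fused back to the input channel count. The dilation rates and the
    1x1 fusion are assumptions for illustration only.
    """

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.fuse(multi_scale) + x  # residual connection preserves details


if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)          # e.g. an encoder feature map
    print(DilatedFusionBlock(64)(feats).shape)  # torch.Size([1, 64, 32, 32])
```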

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4164_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/notchXie/DPGS-Net

Link to the Dataset(s)

N/A

BibTex

@InProceedings{ZhaWei_DPGSNet_MICCAI2025,
        author = { Zhang, Weijie and Xie, Lingfeng and Zeng, Kun and Luo, Xiaonan and Gong, Yongyi},
        title = { { DPGS-Net: Dual Prior-Guided Cross-Domain Adaptive Framework for Ultrasound Image Segmentation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15966},
        month = {September},

}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper proposed a novel segmentation framework to generalize with small training data. The framework includes two stages: in Stage I, the target-domain images were clustered and divided into multiple classes based on image features, and the target-test set was selected so that all minority classes were well represented. The base model was pretrained on the source dataset and fine-tuned on the target-train set using a segmentation loss. In Stage II, a second fine-tuning was performed with feature fusion from a classification model. The cross-task feature fusion allows dynamic adjustment in the decoder to enhance segmentation. Three different modules were included in the model: MCPA, MFDC, and EFD. These modules enhance segmentation performance.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper introduced a novel way to treat the domain-adaptation problem to improve segmentation performance on ultrasound images, which are highly heterogeneous and noisy.
    2. The authors compared their methods with many existing methods and showed superior performance with a considerably large improvement, even without the domain adaptation pretraining.
    3. An ablation study was done to compare the proposed pre-training method with general pre-training, and the results proved the effectiveness of the proposed method.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. It seems to me that Ref [26] is also a well-performing domain-adaptation framework that addresses the issue of thyroid ultrasound heterogeneity. Both methods are also using the same TN3K dataset as source data, but the authors did not address the strength of their framework over Ref [26].
    2. The 3 model flowcharts are confusing. Fig. 3 shows the network details for training Stage II, and I believe the MCPA module refers to the lilac block in the ‘multi-level feature fusion segmentation network’? If so, there is no need for the Stage II blocks in Fig. 2; combining Fig. 3 into the right side of Fig. 2 would make the information clearer. Besides, I suppose Fig. 4 refers to the MCPA module; noting that on the figure would be helpful. Since the MFDC, MCPA, and EFD modules are all sub-modules of the Stage II network, putting them in the same figure and the rest of the training framework in a separate figure would be helpful. The current figures require a lot of back-and-forth to understand.
    3. The authors have not proved the necessity of the MFDC and EFD modules. Although the ablation study partially shows the necessity of the MCPA module, it is not clear how important the module is as the bottleneck.
    4. Details: 1) The authors have not mentioned what ‘surface position information’ means in the text. 2) The fusion between classification and segmentation in MCPA was not clearly depicted in Fig. 4, if I understood the figures correctly. There was no depiction of the segmentation and classification features, and it is not clear what the ‘Feature Input’ refers to, even though the text has some description of this in Sec 2.3.
    5. Notations: 1) It is not clear what E_pos in Sec 2.1 means.
  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The manuscript needs some work to clarify the different modules and the training process. The authors should put more work into the flowcharts and the method section to make sure readers understand which part of the flowchart they are referring to in the text. Justification of the necessity of the 3 newly introduced modules is needed to prove the effectiveness of the additions.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    N/A

  • [Post rebuttal] Please justify your final decision from above.

    N/A



Review #2

  • Please describe the contribution of the paper

    This paper proposes a dual prior-guided two-stage segmentation framework that combines domain adaptation pretraining strategies with a multi-level feature fusion architecture and a multi-task dual-stream feature interaction mechanism. This framework addresses feature contradictions, insufficient cross-domain generalization, and data distribution bias in small-sample datasets of ultrasound image segmentation tasks.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. A general domain adaptation pretraining strategy is proposed, guided by prior classification knowledge and leveraging large-sample datasets of the same category to address feature contradictions and domain adaptation challenges in small-sample scenarios.

    2. A multi-level feature fusion segmentation network is designed, integrating MCPA, MFDC, and EFD modules to significantly enhance feature extraction and segmentation accuracy.

    3. A dual-stream interaction mechanism is constructed for prior classification and segmentation tasks, adaptively guiding the segmentation network through cross-task feature fusion to optimize robustness and generalization.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. This paper describes using the surface positional prior information to construct a prior classification task. However, it does not elaborate on how the classification labels are specifically generated and how the consistency of these labels is ensured. These aspects need to be explained in more detail.

    2. The comparative experiments lack comparisons with classic ultrasound image segmentation networks proposed in the past two years.

    3. The loss function in this paper is relatively complex, involving several weight parameters. However, there is not much elaboration on these aspects.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
    1. This paper describes using the surface positional prior information to construct a prior classification task. However, it does not elaborate on how the classification labels are specifically generated and how the consistency of these labels is ensured. These aspects need to be explained in more detail.

    2. The comparative experiments lack comparisons with classic ultrasound image segmentation networks proposed in the past two years.

    3. The loss function in this paper is relatively complex, involving several weight parameters. However, there is not much elaboration on these aspects.

    4. The segmentation results should include standard deviation values to further demonstrate the superiority of the proposed method.

    5. It is recommended that the used datasets and code be open-sourced to improve the reproducibility of the proposed method.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The proposed method in this paper is interesting, but it still lacks detailed elaboration in certain aspects.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have essentially addressed my questions. Overall, the paper is innovative and I find it acceptable. However, I still hope it will be open-sourced.



Review #3

  • Please describe the contribution of the paper

    This paper proposes DPGS-Net, a dual prior-guided two-stage framework for ultrasound image segmentation. The framework consists of: (1) a prior classification-guided domain adaptation pretraining strategy that leverages larger datasets to address feature contradictions and data scarcity in small-sample scenarios, and (2) a multi-level feature fusion architecture that integrates three modules (MCPA, MFDC, EFD) within a U-shaped network, enhanced by a dual-stream feature interaction mechanism for both classification and segmentation tasks. The authors evaluate their approach on the DDTI thyroid nodule dataset, showing superior performance over existing methods.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The paper addresses a significant clinical challenge in ultrasound image segmentation, focusing specifically on the problems of noise artifacts, limited training data, and feature contradictions caused by acquisition variations.
    • The domain adaptation pretraining strategy is well-designed and effectively leverages larger datasets (TN3K) to improve performance on small-sample scenarios (DDTI), using prior classification knowledge as guidance.
    • The proposed approach shows significant quantitative improvements over state-of-the-art methods, achieving 86.52% Dice compared to 81.95% for DeepLabv3+ and 80.03% for TRFE+. These gains are consistent across multiple metrics (IoU, Precision, Recall, MPA).
    • The ablation studies are comprehensive and clearly demonstrate the contribution of each component to the overall performance, with domain adaptation pretraining providing the most significant improvement (+4.21% Dice over the baseline with multi-task learning).
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The novelty of the individual components is limited; the primary contribution appears to be in the integration of existing techniques (attention mechanisms, dilated convolutions, multi-task learning, domain adaptation) rather than fundamentally new methodological advances.
    • The evaluation is limited to a single dataset (DDTI), which constrains the authors’ claims about cross-domain generalization and “universal” applicability.
    • The overall architecture is quite complex, with multiple specialized modules, raising questions about whether this complexity is necessary or whether simpler approaches could achieve similar results. The computational cost and inference time are not discussed. Combined with the previous point (only one dataset), this raises concerns that the solution overfits to this particular data.
    • The paper claims to address “feature contradictions” but does not thoroughly explain or quantify this concept, making it difficult to assess how effectively the method addresses this specific challenge.
  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents a comprehensive approach to ultrasound image segmentation that shows meaningful improvements over state-of-the-art methods. The domain adaptation pretraining strategy is particularly effective and could be valuable to the MICCAI community. The multi-level feature fusion architecture and dual-stream interaction mechanism are well-designed, and the ablation studies provide convincing evidence that each component contributes to the overall performance. However, concerns about the limited evaluation scope, the complexity of the approach, and the incremental novelty prevent a stronger recommendation. Hyperparameters needed to reproduce the work, such as those of the loss function, are missing, and there is no mention of sharing the source code.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I keep my recommendation to accept the paper. The authors’ answers were satisfactory.




Author Feedback

The authors sincerely thank the reviewers for recognizing the contributions of our work and for their constructive suggestions. We will release all the code and documents on GitHub upon paper acceptance (R1-Q9; Rn-Qm denotes Reviewer #n’s Question m). In the following, we provide detailed point-by-point responses to the reviewers’ comments:

  1. Annotation Accuracy and Reliability (R1-Q7.1, Q10.1) The classification labels were established based on medical knowledge of thyroid morphology. They were independently annotated by two experts, one serving as the master annotator and the other as validator. Inter-rater agreement was assessed with the kappa coefficient as follows: (1) if a sample’s kappa falls below the cutoff, the sample is sent back to both experts to reach consensus; (2) otherwise, the master expert’s label is used.
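As a concrete illustration of this consensus rule, the sketch below uses Cohen’s kappa (via scikit-learn) to flag low-agreement cases for joint re-annotation. The per-case grouping, the example labels, and the 0.6 cutoff are illustrative assumptions, not values reported in the paper.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-case annotations: each case maps to the class labels the
# two experts assigned to its images (values are for illustration only).
master =    {"case_01": [0, 1, 1, 2], "case_02": [1, 1, 0, 0]}
validator = {"case_01": [0, 1, 2, 2], "case_02": [0, 2, 1, 0]}

KAPPA_CUTOFF = 0.6  # assumed threshold; the rebuttal does not state the exact cutoff

final_labels, needs_consensus = {}, []
for case, m_labels in master.items():
    kappa = cohen_kappa_score(m_labels, validator[case])
    if kappa < KAPPA_CUTOFF:
        needs_consensus.append(case)    # send back to both experts for consensus
    else:
        final_labels[case] = m_labels   # keep the master expert's labels

print("accepted:", final_labels)
print("returned for consensus:", needs_consensus)
```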

  2. Method Comparison(R1-Q7.2,Q10.2; R3-Q7.2,Q7.3) As presented in Table 1, we conducted a comprehensive comparison between our proposed method and several SOTA approaches, including DAC-Net (2024), BPAT-UNet (2023), and TRFE+ (2023), across which our method consistently demonstrated superior performance.

  3. Loss Function (R1-Q7.3, Q10.3) We optimized our total loss function based on [2, 10, 24] as follows: L_total = λ1·L_seg + λ2·L_CE, where the segmentation loss is L_seg = μ1·L_BCE + μ2·L_Dice, with hyperparameters μ1 = 0.4, μ2 = 0.6, λ1 = 0.5, λ2 = 0.5 (Section 2.3, Loss Function). Since λ1 = λ2 = 0.5, rescaling by a factor of 2 simplifies this to L_total = μ1·L_BCE + μ2·L_Dice + L_CE.
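A minimal PyTorch-style sketch of this combined objective is shown below; the soft-Dice formulation and the example tensor shapes are generic assumptions rather than the authors’ released code.

```python
import torch
import torch.nn.functional as F

def dice_loss(seg_logits, seg_target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities (a generic formulation,
    not necessarily the exact variant used in the paper)."""
    prob = torch.sigmoid(seg_logits)
    inter = (prob * seg_target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + seg_target.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

def total_loss(seg_logits, seg_target, cls_logits, cls_target,
               mu1=0.4, mu2=0.6, lam1=0.5, lam2=0.5):
    """L_total = lam1 * (mu1*L_BCE + mu2*L_Dice) + lam2 * L_CE."""
    l_seg = (mu1 * F.binary_cross_entropy_with_logits(seg_logits, seg_target)
             + mu2 * dice_loss(seg_logits, seg_target))
    return lam1 * l_seg + lam2 * F.cross_entropy(cls_logits, cls_target)

# Example shapes: seg_logits/seg_target (B, 1, H, W) with targets in {0, 1},
# cls_logits (B, num_classes), cls_target (B,) with integer class indices.
seg_logits = torch.randn(2, 1, 64, 64)
seg_target = torch.randint(0, 2, (2, 1, 64, 64)).float()
cls_logits, cls_target = torch.randn(2, 4), torch.randint(0, 4, (2,))
print(total_loss(seg_logits, seg_target, cls_logits, cls_target))
```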

  4. Definitions of Surface Position Information, Feature Contradictions, and E_pos (R2-Q7.4.1, Q7.5; R3-Q7.4) · Surface Position Information: refers to the morphology of the thyroid gland in an ultrasound image acquired at a specific probe angle. · Feature Contradictions: inconsistencies in feature representations induced by morphological variability related to different positioning; as shown in Fig. 1, the same lesion’s varying morphologies cause contradictions in the feature space. · E_pos: a spatial position feature extraction operation used to derive the spatial distribution feature F_pos from a target domain image x_jt.

  5. Comparison with SHAN (Ref [26]) SHAN [26] achieves good results without data augmentation by assuming an elliptical nodule shape prior, but this limits generality. In contrast, our approach uses a classification-prior-guided pretraining strategy to reduce output-label discrepancy, improving domain adaptation and feature consistency, especially for small-sample and irregularly shaped nodules.

  6. Optimization and Explanation of Illustrations (R2-Q7.2, Q7.4.2) We acknowledge the reviewer’s suggestion regarding the clarity of visual representations and will revise the layout and accompanying descriptions accordingly to improve readability. The feature input to the MCPA module (Figure 4) is formed by concatenating the segmentation network’s bottleneck feature F_bottleneck with the classification network’s high-dimensional feature F_cls. Additional technical details are provided in Section 2.3.1.
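A minimal sketch of this cross-task input is given below, assuming F_bottleneck is a 4D feature map and F_cls is a classification feature vector broadcast to the same spatial size; the channel widths and the SE-style channel attention stand-in for MCPA are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class CrossTaskFusion(nn.Module):
    """Sketch of feeding concatenated segmentation/classification features
    into an MCPA-style attention block. Channel sizes, the broadcast of the
    classification vector, and the channel attention are assumptions."""

    def __init__(self, seg_ch=512, cls_ch=256):
        super().__init__()
        fused = seg_ch + cls_ch
        self.attn = nn.Sequential(               # simple channel-attention stand-in
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(fused, fused // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(fused // 4, fused, 1), nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(fused, seg_ch, 1)  # project back to the decoder width

    def forward(self, f_bottleneck, f_cls_vec):
        b, _, h, w = f_bottleneck.shape
        f_cls = f_cls_vec.view(b, -1, 1, 1).expand(-1, -1, h, w)
        fused = torch.cat([f_bottleneck, f_cls], dim=1)   # concatenate the two streams
        return self.proj(fused * self.attn(fused))

f_bottleneck = torch.randn(2, 512, 16, 16)  # segmentation bottleneck feature
f_cls_vec = torch.randn(2, 256)             # classification high-level feature
print(CrossTaskFusion()(f_bottleneck, f_cls_vec).shape)  # torch.Size([2, 512, 16, 16])
```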

  7. Ablation Experiments and Method Innovation (R2-Q7.3; R3-Q7.3, Q7.1) We appreciate the reviewer’s interest in our framework. While some components build on existing methods, our core contribution lies in task-driven optimization for clinical ultrasound, addressing small-sample data and feature inconsistencies through a classification-prior-guided pretraining strategy and a dual-prior-guided segmentation framework. Each module is purposely designed to tackle challenges such as noise, vague boundaries, and limited annotations. As reported in Table 2, removing any module (e.g., MCPA) significantly reduces performance; the MFDC and EFD modules were validated in the same way. Our model achieves 64.12 FPS on an RTX 4090, comparable to current SOTA models.

We thank the reviewers again and will revise the paper accordingly to further improve it.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Reject

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper is a really borderline case, with one WR and two WAs prior to the rebuttal process. After reading the paper, the reviews, and the rebuttal, I am inclined toward rejection of this work based on several concerns. First, the novelty of this approach is marginal, if any, and the authors do not explain how the proposed method differs from existing approaches. Furthermore, the empirical validation is not convincing, with only one dataset employed for evaluation and many compared methods being outdated (e.g., UNet, FCN, DeepLab, UNet++, TransUNet), which gives a wrong impression of a large/comprehensive evaluation. I am also concerned about the high number of hyperparameters just in the loss function, which may hamper generalizability to more datasets (this remains unknown). Thus, although this work has some merits, I believe it falls a bit below the acceptance threshold for MICCAI.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper proposes a simple regularization method that mitigates over-concentrated attention in MIL. Please try to revise the camera ready paper according to the promise in the rebuttal.


