Abstract
PET-CT lesion segmentation is challenging due to noise sensitivity, small and variable lesion morphology, and interference from physiological high-metabolic signals. Current mainstream approaches use a single network to segment lesions from multiple cancers, treating all cancer types as one task. However, this overlooks the unique characteristics of different cancer types. Considering the specificity and similarity of different cancers in terms of metastatic patterns, organ preferences, and FDG uptake intensity, we propose DpDNet, a Dual-Prompt-Driven network that incorporates specific prompts to capture cancer-specific features and common prompts to retain shared knowledge. Additionally, to mitigate information forgetting caused by the early introduction of prompts, prompt-aware heads are employed after the decoder to adaptively handle multiple segmentation tasks. Experiments on a PET-CT dataset with four cancer types show that DpDNet outperforms state-of-the-art models. Finally, based on the segmentation results, we calculated MTV, TLG, and SUVmax for breast cancer survival analysis. The results suggest that DpDNet has the potential to serve as a valuable tool for personalized risk stratification, supporting clinicians in optimizing treatment strategies and improving outcomes. We plan to make the code publicly accessible.
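As a rough illustration of the dual-prompt idea described above, the following minimal PyTorch sketch shows how a cancer-specific prompt and a shared common prompt could be fused with the encoder bottleneck before decoding. Module names, channel counts, and prompt shapes are assumptions for illustration only; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualPromptBottleneck(nn.Module):
    """Hypothetical sketch of the dual-prompt idea: one learnable prompt per
    cancer type plus a shared common prompt are fused with the encoder
    bottleneck before decoding. Layer names and shapes are illustrative
    assumptions, not the authors' implementation."""

    def __init__(self, num_cancers: int, channels: int, spatial=(4, 8, 8)):
        super().__init__()
        d, h, w = spatial
        # Cancer-specific prompts and one common prompt, randomly initialized.
        self.specific_prompts = nn.Parameter(torch.randn(num_cancers, channels, d, h, w))
        self.common_prompt = nn.Parameter(torch.randn(1, channels, d, h, w))
        # Fuse (bottleneck + specific prompt + common prompt) back to `channels`.
        self.fuse = nn.Conv3d(3 * channels, channels, kernel_size=1)

    def forward(self, bottleneck: torch.Tensor, cancer_id: int) -> torch.Tensor:
        b = bottleneck.shape[0]
        specific = self.specific_prompts[cancer_id].unsqueeze(0).expand(b, -1, -1, -1, -1)
        common = self.common_prompt.expand(b, -1, -1, -1, -1)
        return self.fuse(torch.cat([bottleneck, specific, common], dim=1))

# Example: bottleneck features for the fourth cancer type (cancer_id=3).
block = DualPromptBottleneck(num_cancers=4, channels=32)
out = block(torch.randn(2, 32, 4, 8, 8), cancer_id=3)  # -> (2, 32, 4, 8, 8)
```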
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2627_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/XinglongLiang08/DpDNet
Link to the Dataset(s)
N/A
BibTex
@InProceedings{LiaXin_DpDNet_MICCAI2025,
author = { Liang, Xinglong and Huang, JiaJu and Han, Luyi and Zhang, Tianyu and Wang, Xin and Gao, Yuan and Lu, Chunyao and Cai, Lishan and Tan, Tao and Mann, Ritse},
title = { { DpDNet: A Dual-Prompt-Driven Network for Universal PET-CT Segmentation } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15965},
month = {September},
pages = {163 -- 173}
}
Reviews
Review #1
- Please describe the contribution of the paper
The authors propose a dual-prompt strategy along with prompt-aware segmentation heads to capture both shared and cancer-specific patterns, thereby enhancing the detection of small targets. The proposed model exhibits strong generalizability in breast cancer survival analysis, underscoring its potential for risk stratification.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Introducing a dual-prompt framework with adaptive segmentation heads to capture both shared and cancer-specific patterns, thereby enhancing the recognition of small lesions.
- The model’s strong performance in breast cancer survival analysis demonstrates its potential for broader clinical applications.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The proposed model is developed based on UniSeg, however, the manuscript lacks a thorough comparison highlighting the differences and advantages of this design choice.
- The reason for not selecting the strongest baseline, 3DUX-Net, as the backbone for universal training is not provided. At the very least, the authors should have adopted STU-Net-Base, which outperforms the utilized STU-Net-S.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The lack of a clear description of the novelty and insufficient motivation for the experiments are the primary reasons for my decision to give a “Weak Reject”.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
This reviewer appreciates the authors’ efforts in addressing the raised concerns. However, the use of a weak backbone (STU-Net Small) significantly limits the potential impact of the work on the community. In 3D medical segmentation, performance typically takes precedence over efficiency. Moreover, STU-Net Base is not prohibitively large for training and inference. Providing upper-bound results using a strong backbone is essential to fully demonstrate the effectiveness of the proposed methods.
Review #2
- Please describe the contribution of the paper
The paper proposes DpDNet, a method for PET-CT lesion segmentation that incorporates dual learnable prompts. Compared to a standard UNet, DpDNet introduces prompt-aware heads and utilizes both cancer-specific prompts, to capture cancer-specific patterns, and a common prompt, to capture shared characteristics across different cancer types. The authors validate the method on segmentation tasks across four cancer types (lung, lymphoma, melanoma, and breast cancer), showing that DpDNet outperforms both existing non-prompt-based and prompt-based methods. Also, survival analysis was conducted using parameters derived from the segmentation, showing that DpDNet's output led to the best survival prediction among the compared methods.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Disentanglement between cancer types: DpDNet’s main strength lies in its ability to capture both cancer-specific patterns and universal cancer characteristics, which improves generalizability. This is further illustrated in Fig. 2, where each cancer type forms a clear cluster, while the common feature is located in the middle.
The comparison of the model (DpDNet) with existing non-prompt- and prompt-based methods is sound.
The survival analysis benchmark demonstrates DpDNet’s superiority over other segmentation models when used for downstream tasks.
Comprehensive ablation study: Table 2 provides a detailed ablation study, showing that each proposed component contributes to the overall segmentation improvement.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Lack of statistical tests: No statistical significance tests are provided for the results.
The order of feature elements in the input of f_{2i} may vary across cancer types, which could lead to suboptimal feature transformation. A lightweight attention mechanism might be more appropriate in this case.
The use of Softmax in the Gated Fusion Module is not well justified. What is the motivation behind applying Softmax to F_{ini} and then multiplying it by F_{ini}? And how can this be beneficial for the model?
As the number of cancer types increases, the prompt-aware heads will require more parameters, potentially impacting scalability.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The clinical and technical motivations are clearly defined, and the gap addressed by the paper is well explained.
The methodology figure is clear and informative. However, the notation “N x” in the Gated Fusion Module is ambiguous: it could mean “N x” stages of convolution and Softmax, or something else. The same ambiguity applies to the “N x” notation in the Prompt-Aware Heads.
The quantitative and qualitative results as well as the ablation study are solid.
No statistical significance tests are provided for the results.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have replied to all the comments and have agreed to update the manuscript for better clarity.
Review #3
- Please describe the contribution of the paper
This paper proposes DpDNet, a dual-prompt-driven network designed for universal PET-CT lesion segmentation across multiple cancer types. The method introduces:
- Cancer-specific prompts to encode features unique to each cancer type (e.g., lung, breast, melanoma, lymphoma),
- A common prompt to capture shared patterns across cancers,
- Prompt-aware segmentation heads with multi-scale branches and channel attention for better task adaptability and fine detail capture. The model achieves state-of-the-art segmentation accuracy and is further validated via survival analysis on a large breast cancer dataset, demonstrating its clinical utility.
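As a purely speculative sketch of what a prompt-aware head with multi-scale branches and channel attention could look like (kernel sizes, the SE-style attention, and all names are assumptions; the actual DpDNet heads may differ):

```python
import torch
import torch.nn as nn

class PromptAwareHead(nn.Module):
    """Speculative sketch of a prompt-aware head with multi-scale branches and
    channel attention, as summarized by the reviewer. Kernel sizes and the
    SE-style attention are assumptions, not the authors' exact design."""

    def __init__(self, in_ch: int, num_classes: int = 2):
        super().__init__()
        # Parallel multi-scale branches over the decoder features.
        self.branch1 = nn.Conv3d(in_ch, in_ch, kernel_size=1)
        self.branch3 = nn.Conv3d(in_ch, in_ch, kernel_size=3, padding=1)
        # Squeeze-and-excitation style channel attention over the fused branches.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(2 * in_ch, in_ch // 2, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv3d(in_ch // 2, 2 * in_ch, kernel_size=1), nn.Sigmoid(),
        )
        self.classifier = nn.Conv3d(2 * in_ch, num_classes, kernel_size=1)

    def forward(self, decoder_features: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.branch1(decoder_features),
                           self.branch3(decoder_features)], dim=1)
        return self.classifier(fused * self.attn(fused))

# One head per cancer type; the head matching the input's cancer label is used.
heads = nn.ModuleList([PromptAwareHead(in_ch=16) for _ in range(4)])
logits = heads[0](torch.randn(1, 16, 32, 64, 64))  # -> (1, 2, 32, 64, 64)
```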
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Separates cancer-specific and shared representations, addressing the heterogeneity and correlation across cancer types effectively. Achieves the best segmentation performance across multiple datasets (AutoPET + private breast cancer), outperforming both prompt-based and non-prompt baselines. Clearly demonstrates the contribution of each proposed component (specific prompt, common prompt, and PA-heads) through quantitative improvements.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Uses only four cancer types, with the breast cancer dataset being private; generalizability to unseen or rarer cancers is unclear. The paper lacks detail on how the prompts are initialized, updated, or regularized during training, which may impact reproducibility. While CT and PET are both present, there is little discussion or ablation on how fusion across modalities contributes to performance. While informative, the survival analysis does not include prospective or causal validation and is limited by the lack of pathology/clinical ground truth.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall, DpDNet is a thoughtfully designed, modular, and clinically relevant segmentation framework that pushes the frontier of universal PET-CT analysis by using dual prompts and prompt-aware heads. It successfully handles cancer heterogeneity and demonstrates potential for survival analysis applications. While the work would benefit from broader dataset validation and clarification on prompt mechanics, it makes a strong contribution to multimodal, prompt-driven medical segmentation research. I suggest weak accept.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors address most of my concerns.
Review #4
- Please describe the contribution of the paper
This paper makes three main contributions: (1) It proposes a method to advance deep learning tasks from cancer detection to cancer recognition using the proposed dataset. (2) It improves task-specific user interfaces through the application of prompt engineering techniques. (3) It shows the extended clinical applicability of the proposed method through a clinical evaluation like survival prediction and risk stratification.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
One of the major strengths of this work is the inclusion of clinical feasibility evaluation, which distinguishes it from many other studies. In breast cancer analysis, performing personalized risk stratification and survival prediction represents a crucial step toward clinical relevance, especially in PET-CT research. Going beyond conventional deep learning performance metrics to assess clinical feasibility adds significant value and highlights the practical impact of the proposed approach.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
While this paper provides a comprehensive analysis of training and evaluation, extending even to the clinical level, one notable weakness is that the three best-performing segmentation models used for PET quantification do not show meaningful differences in quantitative metrics such as MTV, TLG, and SUVmax in Table 3.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(6) Strong Accept — must be accepted due to excellence
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Quantitative analysis from a clinical perspective is crucial in molecular fusion imaging such as PET-CT. This paper serves as a valuable contribution that guides deep learning research toward real-world clinical applications.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
Thank you all for the valuable feedback!
Q1: Statistical comparisons. (R1) A1: All comparison results yielded p < 0.05 (t-test), which will be included in the revision.
Q2: Feature ordering consistency. (R1) A2: We believe there may be a misunderstanding. Order consistency has been ensured in our design. Specifically, each cancer type has its own gated fusion module, as Fig. 1 shows (“N x” denotes the number of modules, matching the number of cancer types), which always receives prompts from the other cancer types in a fixed order. This ensures consistent feature transformation. We will highlight it in the revision and explore lightweight attention in future work.
Q3: Softmax. (R1) A3: We acknowledge that the original formula was incorrect. In our implementation, the Softmax is applied to F_{ini}, and the resulting weights are used to compute a weighted sum over (F_{can1}, …, F_{cani-1}, F_{cani+1}, …, F_{canN}), not F_{ini} itself. The Softmax provides normalized, learnable attention weights, allowing the model to adaptively highlight relevant cancer features. We will correct this formula and Fig. 1 in the revision.
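One possible reading of the corrected gating described in A3 is sketched below. Only the softmax-weighted sum over the other cancers' features follows the rebuttal text; the 1x1 convolution with global pooling that produces one weight per other cancer type, and the residual combination, are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Sketch of one gated fusion module (one per cancer type, the 'N x' in
    Fig. 1): softmax weights derived from F_ini select a weighted sum over the
    other cancers' features. Projection details are assumptions."""

    def __init__(self, channels: int, num_other: int):
        super().__init__()
        # Map F_ini to one logit per *other* cancer type (assumed mechanism).
        self.to_logits = nn.Conv3d(channels, num_other, kernel_size=1)

    def forward(self, f_ini: torch.Tensor, f_others: list) -> torch.Tensor:
        # Global-average-pool the logits so softmax yields num_other scalar weights.
        logits = self.to_logits(f_ini).mean(dim=(2, 3, 4))            # (B, num_other)
        weights = F.softmax(logits, dim=1)                            # normalized attention
        stacked = torch.stack(f_others, dim=1)                        # (B, num_other, C, D, H, W)
        fused = (weights[:, :, None, None, None, None] * stacked).sum(dim=1)
        return f_ini + fused                                          # residual combination (assumed)

gf = GatedFusion(channels=32, num_other=3)
x = torch.randn(2, 32, 4, 8, 8)
out = gf(x, [torch.randn(2, 32, 4, 8, 8) for _ in range(3)])  # -> (2, 32, 4, 8, 8)
```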
Q4: Parameters. (R1) A4: A single PA-head has 0.018M parameters. Therefore, with N cancer types, the total is (15.12 + 0.018 x N) M. If N = 10, the total is 15.30M, still far smaller than nnU-Net (31M) and others.
Q5: Differences and advantages. (R2) A5: (1) UniSeg uses modality-level prompts for PET-CT, while our model introduces finer, cancer-specific prompts. (2) A common prompt was introduced to capture shared features across cancer types. It improves performance over UniSeg (Baseline + S-Prompt, Table 3), and its central position in Fig. 2 supports its role in representing shared features. (3) We designed prompt-aware heads to mitigate early prompt-induced forgetting in UniSeg and improve task adaptability. (4) DpDNet achieved 2.7% higher Dice than UniSeg (p < 0.05), with similar computation and parameters. We will highlight these differences and advantages in the revision.
Q6: Baseline selection. (R2, Meta) A6: We had considered the impact of different backbone scales, which is why we opted for the scalable STU-Net architecture. Using STU-Net-B as the backbone of DpDNet led to only a 0.6% DSC gain but incurred a 4× increase in parameters (58M vs. 15M) and computation (548Gs vs. 138Gs). 3DUX-Net, though strong, is even more costly, with 11× the computation (1509Gs). To balance performance and efficiency, we selected STU-Net-S, which still yielded the best performance. We will clarify this choice in the revision and evaluate other backbones in future work.
Q7: MTV, TLG, and SUVmax. (R3) A7: The lack of meaningful differences is likely because MTV, TLG, and SUVmax are insensitive to boundary variations. Nonetheless, DpDNet still achieved the best survival prediction via MTV (p < 0.05). We will explore more precision-demanding tasks, such as the detection of subtle metastases, which is important for diagnosis.
Q8: Generalizability. (R5, R1) A8: To improve diversity, we have already collected data for two additional cancer types (ovarian and esophageal), which will be included in our journal work. We aim to develop models for these common cancers (the top 10 cancers account for 63% of all cases), while data collection for rare cancers remains challenging.
Q9: Details. (R5, Meta) A9: We will release the code, as stated in the abstract's last sentence. The prompts are randomly initialized from a standard normal distribution. They are treated as learnable parameters and are updated together with the rest of the network (using SGD with lr = 1e-4, weight decay = 1e-3). No additional regularization is applied.
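A minimal sketch of the training detail stated in A9, with assumed prompt shapes and a stand-in network; only the standard-normal initialization, learnable-parameter treatment, and SGD settings come from the rebuttal.

```python
import torch
import torch.nn as nn

# Prompts initialized from a standard normal distribution and registered as
# learnable parameters, so the optimizer updates them with the rest of the
# network (per A9). Shapes are assumptions.
specific_prompts = nn.Parameter(torch.randn(4, 32, 4, 8, 8))   # one per cancer type
common_prompt = nn.Parameter(torch.randn(1, 32, 4, 8, 8))

model = nn.Conv3d(32, 2, kernel_size=1)  # stand-in for the segmentation network
optimizer = torch.optim.SGD(
    list(model.parameters()) + [specific_prompts, common_prompt],
    lr=1e-4, weight_decay=1e-3,           # values stated in the rebuttal
)
```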
Q10: Modality fusion and clinical limitations. (R5) A10: Further analysis of the individual contributions and fusion of the PET and CT modalities would strengthen the work, and we will include such studies in our journal work. In addition, prospective or causality-aware validation is an important next step, which we plan to explore through clinical collaborations. Patients were clinically diagnosed.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
Reviewers agreed on the strength of the proposed multi-modal framework. I would like to invite the authors to clarify two items from the reviewer comments: 1) the choice of baseline model (R2) and the significance of the performance improvement; 2) please provide more details on how the prompts are initialized, updated, or regularized (R5).
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors have addressed the reviewers' concerns, and the proposed solution is interesting.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
DpDNet presents a dual-prompt framework for PET-CT lesion segmentation that disentangles cancer-specific and shared features and uses prompt-aware heads, achieving top performance across four cancer types and improving breast cancer survival prediction. Reviewers praised its clear methodology, strong empirical results, and thorough ablation and clinical evaluations; the authors' rebuttal provided statistical significance tests, clarified the gating/Softmax and prompt details, and justified the choice of a lightweight STU-Net-S backbone. R2's concern about a stronger backbone is outweighed by this efficiency justification, and most reviewers moved to “Accept” post-rebuttal. Given its novelty and clinical relevance, I recommend accepting the paper. The authors are encouraged to include these clarifications and further address the remaining concerns from R2 in the final version of the manuscript.