Abstract

Organ segmentation in Positron Emission Tomography (PET) plays a vital role in cancer quantification. Low-dose PET (LDPET) provides a safer alternative by reducing radiation exposure. However, the inherent noise and blurred boundaries make organ segmentation more challenging. Additionally, existing PET organ segmentation methods rely on co-registered Computed Tomography (CT) annotations, overlooking the problem of modality mismatch. In this study, we propose LDOS, a novel CT-free ultra-LDPET organ segmentation pipeline. Inspired by Masked Autoencoders (MAE), we reinterpret LDPET as a naturally masked version of Full-Dose PET (FDPET). LDOS adopts a simple yet effective architecture: a shared encoder extracts generalized features, while task-specific decoders independently refine outputs for denoising and segmentation. By integrating CT-derived organ annotations into the denoising process, LDOS improves anatomical boundary recognition and alleviates PET/CT misalignment. Experiments demonstrate that LDOS achieves state-of-the-art performance with mean Dice scores of 73.11% (18F-FDG) and 73.97% (68Ga-FAPI) across 18 organs in 5% dose PET. Our code will be available at https://github.com/yezanting/LDOS.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0387_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/yezanting/LDOS

Link to the Dataset(s)

N/A

BibTex

@InProceedings{YeZan_Self_MICCAI2025,
        author = { Ye, Zanting and Niu, Xiaolong and Han, Xu and Wu, Xuanbin and Lu, Wantong and Lu, Yijun and Sun, Hao and Huang, Yanchao and Wu, Hubing and Lu, Lijun},
        title = { { Self is the Best Learner: CT-free Ultra-Low-Dose PET Organ Segmentation via Collaborating Denoising and Segmentation Learning } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15962},
        month = {September},
        pages = {574--584}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper presents a novel method named LDOS for organ segmentation using ultra-low-dose PET (ultra-LDPET) images. The proposed approach consists of a shared encoder that extracts common features and two dedicated decoders for simultaneous tasks: one for denoising and the other for segmentation. This method was evaluated on datasets involving two different tracers and demonstrated state-of-the-art segmentation performance, achieving superior Dice scores across 18 organs using PET images acquired at only 5% of the standard radiation dose.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors introduce a multi-task learning framework capable of jointly performing image denoising and organ segmentation, thus avoiding a sequential approach that could amplify errors.

    The simultaneous execution of denoising and segmentation tasks potentially enhances feature representation, improving segmentation performance, as evidenced by state-of-the-art Dice scores achieved on ultra-LDPET datasets.

    The method demonstrates clinical feasibility by performing effective organ segmentation at significantly reduced radiation doses (5%), addressing a critical clinical concern regarding patient exposure.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Questionable interpretation of low-dose PET: The authors interpret low-dose PET imaging as a pixel-level masked version of full-dose PET (FDPET), stating specifically, “for a 5% dose PET, the masking ratio k is set to 95%.” However, low-dose PET imaging involves globally reduced counts, not strictly binary masking at a pixel level. Low-dose imaging reflects uniformly lower photon counts leading to increased statistical noise, rather than isolated pixel masking. Thus, this pixel-level masking interpretation is potentially misleading.

    Visualization confusion (Fig. 2): The visualization of feature stacking in Figure 2 is unclear. Initially, features from the encoder are represented horizontally (layer by layer, colored green-yellow-blue), yet the stacked features are shown vertically, creating confusion about feature aggregation.

    Decoder clarification (Section 3.1): Both denoising and segmentation decoders are ambiguously labeled with the symbol “D,” causing uncertainty. Additionally, “D_c” suggests multiple decoders—one per class—yet this is not explicitly clarified in the text. Clearer notation and explicit description would resolve this confusion.

    Unclear data augmentation: The misalignment data augmentation is mentioned but inadequately explained. Detailed descriptions or examples are necessary to clarify its implementation.

    Loss function parameters (w^z and v^z): It is unclear whether parameters w^z and v^z in the loss function are trainable. If they are trainable, their optimization process must be clarified; if fixed, their settings require justification and clear explanation.

    Insufficient dataset and reconstruction details (Section 4.1): The paper lacks explicit details on the number of images for each tracer dataset and the image reconstruction methods. Moreover, subsampling across 300 seconds is not uniformly handled, considering tracer decay and kinetic variations, which might influence results.

    Limited ablation study: The hyperparameters used in the loss function are numerous, yet the paper lacks comprehensive ablation studies or clear justifications for selecting the current configurations.

    Inconsistency between Table 4 and Section 4.4: The results reported in Table 4 differ from explanations provided in Section 4.4 for LDOS without weighting (LDOS/ow) and Fake-FDPET scenarios. Clarification or correction of this discrepancy is essential.

    Typos and minor errors: Minor errors and typos (e.g., “ultra-LPPET” in Section 4.2) require correction for improved readability and professional presentation.

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper presents an intriguing and clinically relevant approach with strong performance in ultra-low-dose PET organ segmentation. The multi-task learning strategy is valuable and potentially impactful for reducing patient radiation exposure. However, significant weaknesses related to conceptual interpretation (low-dose PET masking), methodology clarity (decoder definitions, feature visualization), inadequate experimental detail (data augmentation, reconstruction specifics), and insufficient ablation analyses weaken confidence in the conclusions. Addressing these points through revision and clarification is critical for improving the paper’s impact and scientific rigor.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have addressed most of my questions. I agree that the semantic patterns learned through denoising can support segmentation performance, even though the analogy to “pixel-level masking” is not a strictly accurate statement. For clarity, the denoising and segmentation decoders may be denoted as D_d and D_s, respectively, with the class-specific output of the segmentation decoder indicated using a superscript c, i.e., D_s^c. With the clarifications provided regarding the dataset description, data augmentation strategy, and the definition of “scales” in the context of multi-resolution deep supervision, the presentation of the methodology is now more coherent.

    It would be worthwhile to explore a two-stage training pipeline: (1) pretraining the encoder via the denoising task, and (2) fine-tuning the pretrained encoder for segmentation. This could further highlight the utility of semantic feature transfer between tasks and the advantage of the collaborative training strategy.

    My opinion on this paper lies between acceptance and rejection.



Review #2

  • Please describe the contribution of the paper

    The paper introduces ‘LDOS’ — a framework to segment organs in low-dose PET scans. LDOS consists of one encoder and two decoders: one for denoising and one for segmentation. Both decoders share features and are combined to give the final segmentation masks.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • The method is novel and interesting. Sharing features of denoising and segmentation network is an interesting approach.
    • Methods are clearly described and good to follow.
    • The results underline the success of the proposed method.
    • The article is well written.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • Some details about the dataset are missing. How many images were available? Were they all acquired on one scanner? Was training performed on FDG and FAPI images separately or was one training set containing both tracers used? How was the ground truth acquired? Are these manual segmentations?
    • Please explain more in detail how PET/CT misalignment is included.
    • Please explain in more detail how the total number of ‘scales’ in the training process is defined (I do not understand what you mean with scales).
    • For how many epochs was the model trained? This is especially important as the weight in the loss function contains the number of epochs as a parameter.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The method is interesting. Providing more details on the method and the dataset used will increase the quality of the paper and make it easier to understand. Please also consider changing the size of some figures, as some results are very difficult to see by eye. You could zoom in on some parts to make them more visible.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    I am not totally convinced by the response of the authors. While the contribution is interesting, I feel that the results are not adequately analyzed and limitations are not mentioned. It is not 100% clear what the authors will incorporate in the final manuscript. I especially dislike that the title incorporates ‘CT-free’ while CT images are included in the training process, which is a contradiction for me.



Review #3

  • Please describe the contribution of the paper

    This paper introduces a novel self-supervised learning framework for organ segmentation in ultra-low-dose PET imaging, without relying on CT. Inspired by masked autoencoding, the authors treat ultra-low-dose PET as a masked version of full-dose PET and jointly optimize a denoising and a segmentation network. The dual-decoder architecture enables collaborative learning, addressing noise suppression and semantic feature learning simultaneously. The method demonstrates substantial performance gains across multiple organs and datasets. By challenging the conventional dependency on anatomical imaging and showing that PET alone can be sufficient for accurate and robust organ segmentation in certain settings, this work represents a major step toward CT-free PET segmentation and clinical applicability.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Novel architectural design: The paper introduces a self-supervised dual-decoder framework that jointly learns to denoise and segment ultra-low-dose PET images without relying on anatomical modalities such as CT or MRI. This design is conceptually original, as it leverages PET itself as a masked input, inspired by masked autoencoding, to derive semantic structures.
    • Original use of data: The authors reframe ultra-low-dose PET as a form of information-masked full-dose PET, creating a novel pretext task for self-supervised learning. This approach is an elegant and data-efficient solution to training without CT-based supervision.
    • Strong generalization across tracers and organs: The method is evaluated on both FDG and FAPI PET scans, and across multiple organs, demonstrating consistent improvements and robustness. This generalization capability is critical for clinical viability.
    • Elimination of CT dependence: By showing high segmentation accuracy without using anatomical priors, the work paves the way for more radiation-efficient and modality-independent PET pipelines. This is highly relevant for pediatric and vulnerable populations.
    • Comprehensive ablation study: The authors provide an ablation analysis that clarifies the contribution of each component in the architecture, supporting the effectiveness of their design.
    • Clarity and reproducibility: The manuscript is clearly written, code is shared, and results are well-supported with visual and quantitative evidence.
    • Potential for extension: The modular nature of the proposed framework allows for easy extension to other tasks such as PET lesion detection, attenuation correction, or functional quantification. Its independence from anatomical imaging enhances its flexibility across clinical and research applications.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    While the proposed framework presents a technically elegant approach to CT-free segmentation in ultra-low-dose PET, several critical issues remain underexplored:

    1. Lack of robustness analysis under extreme noise: The model is trained and evaluated on 5% dose PET images, which are expected to contain substantial noise. However, no analysis is provided to assess robustness against such noise, nor are potential failure cases discussed.
    2. No per-organ performance metrics reported: Segmentation results are only reported as averages across all organs. This can obscure poor performance in clinically sensitive or small structures like the pancreas or bladder.
    3. Denoising side effects not discussed: The paper does not examine whether denoising introduces artifacts or over-smoothing, particularly in cases where PET signal is low or ambiguous.
    4. Ambiguity about low-dose data nature: It is unclear whether ultra-low-dose PET scans are real acquisitions or simulated by subsampling. Simulated data may not reflect realistic noise distribution and complexity.
    5. No statistical validation of results: Although the model outperforms baselines, there is no statistical testing (e.g., p-values or confidence intervals) to confirm the significance of these improvements.
    6. Tracer-specific performance unexplored: While both FDG and FAPI datasets are used, the paper does not analyze how the model behaves across tracers. Performance consistency or degradation with respect to tracer-specific uptake is left unexplored.
    7. Ground-truth alignment unaddressed: The segmentation labels are derived from CT scans, but the method for CT–PET registration is not described. Misalignment could impact training quality, especially given that inference is CT-free.
    8. Architectural justification lacking: The dual-decoder structure is interesting, but the rationale for using two decoders versus more standard alternatives like multi-task heads or shared attention is not justified. There’s also no feature analysis to show task collaboration.
    9. Ambiguous CT-free claim: Although the model is presented as CT-free, the segmentation ground-truths are derived from CT-based annotations. This introduces a dependency on CT during training, which contradicts the claim of being CT-free.
    10. Safety and noise implications not discussed: Using 5% dose PET raises questions about clinical reliability and safety. The paper does not address how such noisy input might compromise critical decisions in high-risk patients or sensitive organs.
    11. Lack of uncertainty quantification: The results are reported with means and standard deviations, but no uncertainty quantification is performed (e.g., Monte Carlo dropout, confidence maps), limiting clinical interpretability.
    12. Simplified architectural novelty: Despite being based on nnU-Net, the architecture is not significantly re-engineered. There is limited novelty in the core model beyond the dual-task setup, and no comparative analysis with state-of-the-art task-adaptive networks is provided.
    13. Limited clinical relevance discussion: The paper does not convincingly discuss how this method could be integrated into clinical workflows, or whether the segmentation quality is sufficient for tasks such as treatment planning or diagnostics.
    14. Lack of error visualization: The manuscript lacks qualitative examples of segmentation errors or failure regions, which could be valuable for clinical readers and model developers alike.
    15. Over-optimistic simplicity claim: The pipeline is described as “simple”, but includes multiple pre-processing steps (e.g., residual stacking, masked input generation) that could limit generalization and reproducibility in diverse clinical settings.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    This paper presents a technically interesting and clinically motivated approach to PET-only organ segmentation under ultra-low-dose settings. The joint denoising and segmentation design is compelling and well-aligned with real-world needs in resource-limited or pediatric scenarios. That said, a few technical and conceptual points arose during the review that may help strengthen future versions or the rebuttal:

    • Dose level justification: Could the authors elaborate on why the 5% dose was selected? Was this empirically optimal, or chosen based on prior standards?
    • PET-only supervision vs. CT-derived ground-truth: While the model claims CT-free segmentation, training relies on CT-derived masks. How do the authors view this supervision paradigm in terms of true self-supervised learning?
    • Denoising side effects in low-uptake regions: In regions like the pancreas or small bowel, noise in ultra-low-dose PET can be highly structured. Were there failure cases where denoising impacted anatomical fidelity?
    • Loss weighting and sensitivity: The paper introduces exponential scheduling of certain loss weights (e.g., φ). Was a sensitivity analysis performed to test how robust the method is to this choice?
    • Generalization across tracers: The model is trained on both FDG and FAPI, but how robust is it to unseen tracers? Could performance degrade if applied to, e.g., PSMA or DOTATATE tracers without fine-tuning?
    • Preprocessing and registration details: Further clarification on data preprocessing, intensity normalization, and especially PET–CT alignment (since CT masks are used for training) would enhance reproducibility.
    • Architectural design reasoning: The dual-decoder setup is intriguing. Could the authors share insights on why this design was favored over a single-decoder multi-task output, or an auxiliary loss formulation?

    These are not criticisms but rather opportunities for clarification, reflection, and discussion. Overall, the study is a valuable contribution and I look forward to its evolution and clinical translation in future work.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I have assigned a score of 4 (Weak Accept) because the paper presents a promising and clinically relevant contribution, yet several critical aspects require clarification and strengthening before full endorsement.

    Key strengths that support the paper’s value:

    • The focus on CT-free segmentation in ultra-low-dose PET is timely, practical, and potentially impactful for pediatric and low-resource settings. Reducing reliance on anatomical imaging while maintaining segmentation quality is a step forward in PET workflow simplification.
    • The dual-task architecture (joint denoising + segmentation) reflects a well-thought-out design aligned with clinical constraints in ultra-noisy environments.
    • The method demonstrates generalization across multiple tracers (FDG and FAPI) and anatomical regions, which enhances the potential clinical applicability and robustness of the approach.
    • The paper is generally well-structured and clear in its exposition, and the figures aid comprehension.

    However, I refrain from giving a higher score due to the following unresolved concerns:

    • CT-free claim ambiguity: While the model is PET-only at inference, training relies on CT-derived masks for supervision. This introduces a hidden anatomical dependency, which conflicts with the “CT-free” framing. More honest positioning or alternative self-supervision strategies would strengthen the conceptual integrity.
    • Statistical and robustness analysis is insufficient: The manuscript lacks statistical tests (e.g., p-values, confidence intervals) to confirm performance gains. There is also no systematic evaluation of failure cases, inter-subject variability, or robustness to extreme noise levels (e.g., in 2% dose conditions).
    • No organ-wise performance analysis: Only aggregated metrics are presented. Small, low-uptake, or morphologically variable organs (e.g., gallbladder, bowel) may behave differently and need explicit reporting to assess clinical safety.
    • Architectural decisions lack justification: While the dual-decoder setup is interesting, the paper does not justify it compared to alternatives such as single-decoder multitask heads, residual fusion blocks, or shared representations. No ablation on decoder coupling or feature sharing is provided.
    • Ground-truth alignment not discussed: Given the CT-derived labels, PET–CT registration accuracy is critical. However, the paper does not mention whether alignment errors were corrected or how registration artifacts could affect label quality.
    • No external validation or cross-site testing: Generalization is evaluated across two tracers but only from a single center. It remains unclear whether the method can handle scanner/domain shifts or differences in PET protocol parameters.
    • Loss weighting heuristics not examined: The scheduling of the φ and λ loss terms lacks experimental justification or sensitivity analysis. These could influence convergence or bias learning toward denoising versus segmentation.

    In summary, this work is a valuable contribution and addresses a real clinical challenge. However, in its current form, it raises several conceptual and methodological questions that must be addressed or clarified in the rebuttal. With strong responses and future extensions, this work could evolve into a solid candidate for publication and real-world application. Thus, a score of 4 is appropriate at this stage, with cautious optimism.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The rebuttal successfully clarifies several core concerns raised in the initial review, including the per-organ evaluation metrics, details on low-dose PET simulation, tracer-specific analysis, and the rationale behind the dual-decoder architecture. The authors provide sufficient explanation for the CT-free claim—clarifying that it applies to inference time—and transparently acknowledge the dependency on CT-derived labels during training, while proposing clinically relevant future directions to address this. While some points (e.g., uncertainty quantification, lack of statistical significance testing, and clinical deployment discussion) remain weak or only partially addressed, these limitations do not undermine the core technical contributions of the paper. The method is both technically sound and clinically motivated, offering an innovative yet relatively simple framework for PET-only segmentation under ultra-low-dose conditions. The paper presents a clear methodological innovation—semantic self-denoising to guide segmentation—validated across two distinct tracers (FDG and FAPI) with per-organ performance and realistic low-dose simulation settings. Given the limited availability of public PET datasets and the practical challenges of low-dose imaging, this paper addresses an important and timely clinical problem. Overall, while some methodological aspects could be more thoroughly validated (e.g., uncertainty, clinical workflow readiness), the work merits publication, especially considering its potential as a foundation for future studies on PET-only learning and robust segmentation in low-dose scenarios.




Author Feedback

Thank you for the insightful comments.

Potential Misunderstandings & Clarifications

  1. MAE Analogy (R4 Q1): We liken LDPET to a ‘naturally masked’ FDPET (distinct from MAE’s binary masks). This analogy highlights our core motivation: semantic patterns learned from reconstruction can aid segmentation. MAE-like pretraining is impractical for PET (limited data/labels). LDOS adapts this: self-denoising learns semantic features aiding segmentation, effective on smaller datasets without extensive pretraining. We will revise the ‘pixel-level masking’ description.
  2. Fig2 & Decoder & Table4 (R4 Q2/3/8): Apologies for the unclear parts. Fig2: the horizontal layout shows feature stacking. ‘Dc’ (segmentation): the single decoder’s final layer, which outputs C classes. Table4: the ‘LDPET-LDOS/ow’ entry was misstated; it means LDOS without L1 loss weight attenuation.
  3. Architecture Design & Scope & Evaluation (R5 Q1/5/6/10-15): Our intentionally simple architecture (residual stacking is common; there is no masked input [a misunderstanding]) demonstrates the core contribution’s generalizability: learning LDPET self-denoising patterns to aid segmentation, with no large-scale pretraining needed. The promising results validate this. The 5% dose (an extreme case chosen to validate method robustness via clinical discussion) confirms feasibility. Higher doses improve performance; practical use should balance safety and scanner noise. As this work serves as an initial validation of the core concept, complex architectures/variants and detailed clinical evaluations could be future work on this foundation.
  4. ‘CT-Free’ Claim (R5 Q9): Applies at inference currently (PET-driven Ground Truth [GT] unfeasible in current clinical workflow). Future: clinician-corrected LDOS outputs can serve as PET-derived GT and greatly reduce annotation workload, making PET GT updates feasible. Subsequent LDOS development can lessen CT training reliance, enabling progressive PET-only segmentation.

Dataset and Method Details

  1. Dataset & GT (R3 Q1, R4 Q6): After excluding scans with significant misalignment, we used 52 FDG & 60 FAPI scans from a UIH uEXPLORER (Total-Body) PET/CT to validate our synergistic learning strategy’s effectiveness with small samples, thereby underscoring its design for robust self-learning without reliance on extensive pre-training. GT was obtained from TotalSegmentator, then radiologist-corrected on co-registered CT and resampled to PET resolution. Five-fold cross-validation was performed separately per tracer.
  2. LDPET Simulation & Misalignment (R4 Q4/6, R5 Q4/7): LDPET was simulated by reducing the FDPET (OSEM) scan duration from 300 s (100%) to 15 s (5%). (Simulations may subtly differ from real-world data; we adopted this approach due to real LDPET data scarcity. Time-sampling, a widely accepted simulation method, approximates true data: tracer kinetics/motion effects are minimal in a short 300 s total-body scan, and signal-to-noise ratio correlates with scan time/activity.) Misalignment data augmentation (Ref 12; not a core contribution) was applied to all methods in preprocessing to reduce the influence of GT misalignments. (R3 Q2: Respiratory and patient motion are the main causes; we will clarify.)
  3. Training & Hyperparameters (R3 Q3/4, R4 Q5/7): We trained 500 epochs. ‘Scales’ refer to nnU-Net’s multi-resolution deep supervision. Loss parameters w_z, v_z were fixed (from nnU-Net). Other hyperparameters were optimized via ablation (optimal results are reported).
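The time-sampling simulation described above (300 s → 15 s for a 5% dose) requires list-mode data. As an illustrative stand-in only — not the authors' pipeline — the same Poisson counting statistics can be mimicked on already-histogrammed data by binomial thinning, keeping each recorded event with probability equal to the dose fraction. The function and variable names below are hypothetical:

```python
import random

def simulate_low_dose(counts, dose_fraction=0.05, seed=0):
    """Approximate a low-dose acquisition by binomial thinning of event counts.

    Illustrative sketch: the paper instead shortens the list-mode
    acquisition window before OSEM reconstruction. Thinning each bin's
    events with probability `dose_fraction` reproduces comparable
    Poisson statistics on histogrammed toy data.
    """
    rng = random.Random(seed)
    thinned = []
    for c in counts:
        # each of the c events survives independently with prob. dose_fraction
        kept = sum(1 for _ in range(c) if rng.random() < dose_fraction)
        thinned.append(kept)
    return thinned

full_dose = [200, 1000, 50, 0, 400]      # toy per-bin event counts
low_dose = simulate_low_dose(full_dose)  # roughly 5% of the events survive
```

Note that thinning preserves zero-count bins exactly and never exceeds the original counts, which is why a 5% dose image is noisier but structurally consistent with the full-dose image.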

We have prepared a thorough revision clarifying the ‘masking’ analogy, Fig2, Table4, the ‘CT-free’ (inference-time) claim, and the dataset details.

Other R5 Clarifications

  1. Per-organ metrics (Q2): Per-organ results are shown in Table 1 & Fig 4.
  2. Denoising side effects (Q3): ‘Self-denoising’ learns semantic features aiding segmentation, not directly restoring FDPET. This, with loss weight decay, mitigates over-smoothing.
  3. Architectural Design (Q8): A shared encoder is a proven approach in numerous works. Dual decoders offer dedicated denoising/segmentation paths for more specialized optimization than a single-decoder multi-task/loss design.
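The rebuttal mentions that the denoising (L1) loss weight is attenuated over the 500-epoch training run to mitigate over-smoothing, and a reviewer refers to exponential scheduling of a weight φ. The exact schedule is not given in the source, so the exponential form and the names `w0`/`decay` below are assumptions for illustration:

```python
import math

def denoise_weight(epoch, total_epochs=500, w0=1.0, decay=5.0):
    """Hypothetical exponential attenuation of the denoising-loss weight.

    Assumed form: starts at w0 and decays smoothly toward zero as
    training progresses, so the segmentation objective dominates late
    epochs and the denoising branch cannot over-smooth the features.
    """
    return w0 * math.exp(-decay * epoch / total_epochs)

def total_loss(l_seg, l_denoise, epoch, total_epochs=500):
    # segmentation term keeps full weight; denoising term fades out
    return l_seg + denoise_weight(epoch, total_epochs) * l_denoise
```

With these assumed defaults, the denoising term carries full weight at epoch 0 and less than 1% of it by epoch 500, which matches the stated intent of weight attenuation without claiming this is the authors' actual schedule.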




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    Despite the mixed reviews, the authors have addressed the concerns, and the reviewers agree on the relevance of the topic and the interest in the proposed solution.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


