Abstract
Epidermal growth factor receptor (EGFR) mutation status is crucial for targeted therapy planning in lung cancer. Current identification relies on invasive biopsy and expensive gene sequencing. Recent studies indicate that CT imaging combined with advanced deep learning techniques offers a non-invasive alternative for predicting EGFR mutation status. However, CT scanning parameters, such as slice thickness, vary significantly between scanners and centers, making prediction models highly sensitive to the data type and thus not robust in clinical practice. In this study, we propose the Feature Copy-Paste Network (FCPNet), an innovative and robust model for predicting EGFR mutation status from CT images. First, we propose a novel Feature Copy-Paste Consistency (FCPC) module to exchange information between CT scans with different slice thicknesses and impose a consistency constraint that makes the model more robust. Second, we introduce a Feature Refinement (FR) module to filter redundant features during information fusion, thereby enhancing the accuracy of mutation prediction. Extensive experiments demonstrate the outstanding performance of the FCPC and FR modules. When the trained model is tested on both thin-slice and thick-slice CT images, it achieves at least 2.6% and 2.1% improvements in AUC, respectively, indicating the model's robustness and stability. Our code is available at https://github.com/499huangxingyu/FCPNet.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2645_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/499huangxingyu/FCPNet
Link to the Dataset(s)
N/A
BibTeX
@InProceedings{HuaXin_Feature_MICCAI2025,
author = { Huang, Xingyu and Wang, Shuo and Liu, Chengcai and Sang, Haolin and Wu, Yi and Tian, Jie},
title = { { Feature Copy-Paste Network for Lung Cancer EGFR Mutation Status Prediction in CT images } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15974},
month = {September},
pages = {210 -- 220}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes a novel deep learning model, FCPNet, to predict EGFR mutation status in lung cancer patients using CT images with varying slice thicknesses. To address the challenge of heterogeneity between thin- and thick-slice images, the authors introduce a Feature Copy-Paste Consistency module, which enforces consistency across feature representations, and a Feature Refinement module to suppress redundant features and enhance discriminative ones. The method operates in the feature space, avoiding disruptions to anatomical structure common in image-level augmentation. The approach yields improved robustness and generalization, with up to 2.6% and 2.1% AUC improvements on thin- and thick-slice test sets, respectively, compared to strong baselines.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper addresses an important and practical challenge in medical imaging—robust EGFR mutation prediction from CT scans with varying slice thicknesses—by proposing a solution that directly tackles inter-scan heterogeneity. The proposed FCPNet shows clear performance improvements over existing methods, achieving the highest AUCs on both thin- and thick-slice test sets, which demonstrates its effectiveness across heterogeneous input domains.
The FCPC module is a notable strength, offering a well-motivated extension of image copy-paste to the feature space, thereby preserving anatomical integrity while enabling cross-domain interaction. The paper shows that introducing feature consistency constraints using mutual information loss significantly improves prediction performance (Figure 3a).
The FR module further enhances the model by filtering redundant features using orthogonal projection, which is shown to improve AUCs in ablation studies (Table 2), validating its role in refining the learned representation.
The authors also conduct comprehensive ablation studies, systematically evaluating the contribution of each module and design choice (e.g., loss type, patch size), which strengthens the empirical grounding of their method. Additionally, the model demonstrates robustness under limited supervision, maintaining relatively high AUCs even when trained on just 20% of the data (Figure 4), a crucial consideration for clinical deployment where annotated data is scarce.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The paper would benefit from a stronger and more clinically grounded motivation for addressing the EGFR mutation prediction problem. A clearer link to the potential clinical impact could significantly improve reader engagement.
- The manuscript has several formatting issues. There should be consistent spacing between text and references (e.g., “cancer [5]” instead of “cancer[5]”). In Section 2.1, “loss” should be corrected to “lose,” and the incorrect punctuation “;” should be replaced with a period. Additionally, the caption for Figure 2 could be made more descriptive to aid interpretation.
- The authors state that interaction between image features with varying slice thickness is crucial and not adequately addressed by prior work. However, [14] explicitly describes methods to model the relationship between thin and thick slices and integrate their information. The authors should clarify how their approach offers a distinct or improved handling of this interaction.
- The paper claims that feature-space copy-paste outperforms image-level copy-paste. Since [1] applies an image-level ICP strategy to CT data, a direct comparison with this method would strengthen the argument. Without such comparison, the claim that feature-level copy-paste is superior remains insufficiently supported.
- The concept of feature consistency constraints is central to the proposed method. However, the paper does not clearly explain what these constraints are or how they are formulated. A more detailed explanation would help the reader understand the mechanism and its effect on training.
- The source of feature redundancy, which motivates the use of the Feature Refinement module, is not discussed. Clarifying where the redundancy originates (e.g., modality mixing, repeated features, noise) would better justify the module’s inclusion.
- The sentence, “Here, f(x,y) represents the joint probability, and f(x) and f(y) denote the respective marginal probabilities,” should be paraphrased for improved readability and to avoid redundancy, as this explanation is standard and may not add value in its current form.
- According to Table 1, FCPNet achieves lower sensitivity and specificity scores than some baseline models. This discrepancy is not addressed in the discussion. The authors should provide possible reasons or hypotheses to explain this behavior, especially since AUC alone does not capture all performance aspects.
- The DN-thin model performs well on both thin- and thick-slice test sets, while the reverse does not hold for DN-thick. An explanation for this asymmetry would enhance the reader’s understanding of model generalizability across domains.
- The explanation accompanying Table 2 could be improved. The authors should clarify which variant of DenseNet is used as the baseline and which row represents baseline results. Additionally, they should explain why adding FCPC alone results in a drop in thin-slice AUC, and why the Feature Refinement module yields such a significant improvement.
- For Figure 3, the use of different consistency loss functions (e.g., MSE, NCC) should be discussed more clearly in the main text. Furthermore, the choice of the (180, 180) patch size should be justified beyond empirical performance—explaining the balance between information retention and distortion would provide valuable insight.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Recommendation: Weak Reject.
The paper addresses a clinically relevant and important problem—predicting EGFR mutation status from CT scans while accounting for variability introduced by different scanners and slice thicknesses. This is a critical step toward improving model robustness and generalizability in real-world settings. However, the paper requires substantial revisions to enhance clarity and support the presented claims. In particular, the methodology and justification for key components (such as the feature consistency constraints and the refinement module) need to be more thoroughly explained. The discussion of results should also be expanded to provide deeper insights, especially in cases where performance metrics diverge. For future iterations, I recommend focusing on clearer motivation, better comparative analysis, and improved articulation of the method’s strengths and limitations.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
It is clear from the feedback that the authors understand the weaknesses that were mentioned. However, I still recommend adding the following to the paper: 1) explanation of the lower SP + SN scores, 2) justification of DN-thin vs. DN-thick generalizability, 3) explanation of the effect of FCP and FR in Table 2, and 4) justification of the loss used in Figure 3 and of the patch size. I understand that there is a page limit, but these revisions will strengthen the paper and provide more clarity for the reader.
Review #2
- Please describe the contribution of the paper
The paper describes a novel architecture to predict EGFR status in patients with non-small cell lung cancer that is robust to the use of thin- or thick-slice images. The model shows state-of-the-art performance that is relatively consistent across thin- and thick-slice images, with thin slices having somewhat worse sensitivity. Thin- and thick-slice images, along with pairs with random patches exchanged, are all fed into the network, and a mutual information loss is used to encourage consistency between all the variant input images.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Good experiments are done to compare with state of the art, as well as thorough ablation studies to justify the various added components.
Also, generalization of performance across images with different slice thicknesses is a common problem in many MIC applications, and so the paper addresses an important issue.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The paper is lacking details in how the inference works when only one image (thin or thick slice) is used for input. What happens with the feature refinement module, etc, in that case?
It is hard to interpret some of the results without knowing the class balance.
While the paper covers related work using deep learning methods, little attention is paid to other approaches, such as radiomics, which have reported similar performance in previous works (e.g., https://doi.org/10.1158/0008-5472.CAN-20-0999 and https://doi.org/10.1016/j.lungcan.2019.03.025).
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Generally I like the paper, but think it needs some additional clarifications.
As mentioned, we need to know the class balance in the data set, and we need to also know how patients were separated between the training and test sets (since some patients seem to contribute multiple image pairs).
We would also need to know how regions were chosen for copy-paste. And once a patch is chosen, the region size is listed as, say, (180x180), but these are 3D images, so how is the third dimension being handled?
What model was used for segmentation in the cropping/resizing step?
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Overall I like the paper and think it addresses an important problem. I recommend it should be accepted. I do think there are a few points of clarification necessary, as indicated above. I also think the information on the class balance could affect the interpretation of the results. So I am rating it a 4.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The authors have adequately answered the questions of both myself and the other reviewers. I continue to support acceptance.
Review #3
- Please describe the contribution of the paper
EGFR mutation status is needed for therapy planning, but current methods rely on invasive biopsy and expensive gene sequencing. Several deep learning-based methods have therefore been proposed to predict EGFR mutation status from CT images. However, scanning parameters vary across CT images, especially slice thickness. Previous methods have used domain adaptation (QSNet) and contrastive learning (PLCHNet) to account for these variations.
This paper introduces feature-based data augmentation to account for data variations:
The FCPNet model takes as input a pair of thin- and thick-slice CT images. They first undergo a copy-paste operation in the image space, where a patch of pixels is exchanged between the two images. Next, these two pairs (original and copy-pasted) are passed through an encoder to obtain feature encodings. The feature encodings of all four images are upsampled to the full input size. The upsampled encodings of the copy-pasted pair undergo another copy-paste operation (this time in the feature space) in the same region as the image patches. These copy-pasted features should be consistent with the features of the original image pair, which is enforced using a mutual-information-based consistency loss.
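To make the data flow concrete, here is a minimal sketch of the copy-paste consistency idea just described, written against assumed single-slab 2D inputs and an arbitrary convolutional encoder (neither comes from the paper or the released code). The paper's consistency loss is mutual information; MSE, one of the alternatives the authors compare in Fig. 3, stands in here purely to keep the sketch short.

```python
import torch
import torch.nn.functional as F

def exchange_patch(a, b, y0, x0, s):
    """Swap an s x s patch between two tensors with trailing (H, W) dims."""
    a2, b2 = a.clone(), b.clone()
    a2[..., y0:y0 + s, x0:x0 + s] = b[..., y0:y0 + s, x0:x0 + s]
    b2[..., y0:y0 + s, x0:x0 + s] = a[..., y0:y0 + s, x0:x0 + s]
    return a2, b2

def copy_paste_consistency(encoder, thin, thick, y0=20, x0=20, s=180):
    """thin, thick: (C, H, W) slabs from the same patient (illustrative)."""
    # 1) image-level copy-paste between the thin/thick pair
    thin_cp, thick_cp = exchange_patch(thin, thick, y0, x0, s)
    # 2) encode originals and copy-pasted images; upsample maps to input size
    feats = [F.interpolate(encoder(x.unsqueeze(0)), size=tuple(thin.shape[-2:]),
                           mode="bilinear", align_corners=False)
             for x in (thin, thick, thin_cp, thick_cp)]
    f_thin, f_thick, f_thin_cp, f_thick_cp = feats
    # 3) feature-level copy-paste in the same region undoes the mixing,
    #    so the result should agree with the original-pair features
    f_thin_rec, f_thick_rec = exchange_patch(f_thin_cp, f_thick_cp, y0, x0, s)
    # 4) consistency penalty (MSE stand-in for the paper's MI loss)
    return F.mse_loss(f_thin_rec, f_thin) + F.mse_loss(f_thick_rec, f_thick)

# Toy usage with random slabs and a single conv layer as the "encoder":
enc = torch.nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1)
loss = copy_paste_consistency(enc, torch.randn(1, 512, 512), torch.randn(1, 512, 512))
```

In the authors' setting the copy-paste region spans all slices (Z x 180 x 180, per the rebuttal), and a consistency term of this kind would sit alongside the classification objective during training.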
A feature refinement (FR) module applies orthogonal projection to the encoded features to remove redundancy from the encodings.
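For intuition only, the toy snippet below (with made-up vector sizes) illustrates the orthogonal-projection primitive the FR module builds on: splitting a feature into the component already explained by a reference feature (redundant) and the orthogonal remainder. How FCPNet actually weights or discards these components is defined by the paper and repository, not by this sketch.

```python
import torch

def project(v: torch.Tensor, u: torch.Tensor, eps: float = 1e-8):
    """Decompose v into components parallel and orthogonal to u."""
    u_hat = u / (u.norm() + eps)
    parallel = (v @ u_hat) * u_hat    # part of v already captured by u
    orthogonal = v - parallel         # part of v carrying new information
    return parallel, orthogonal

f_ref = torch.randn(256)                 # hypothetical reference feature
f_new = f_ref + 0.1 * torch.randn(256)   # a nearly redundant variant
_, orth = project(f_new, f_ref)
print(orth.norm() / f_new.norm())        # small ratio -> mostly redundant
```

This picture is consistent with the rebuttal's remark that when the paired inputs are identical the relevant angle is near 0, so refinement leaves the original feature essentially unchanged.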
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper is well-written and easy to follow.
- The results of the proposed model were compared with 5 other models - all trained and tested on the same dataset. The proposed model outperformed the rest on measures of accuracy and AUC.
- The work presents an ablation study comparing performance of the model with and without the FCPC and FR modules. The ablation study is very helpful to assess the utility of the introduced FCPC and FR modules.
- The results present a thorough analysis for the choice of loss function and effect of patch size on the performance of the model.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- I’m a little unclear on the test dataset. The dataset description mentions 1633 thin-thick image pairs for test dataset. Did the model use image pairs as input during test time? If yes, how was the performance reported on thin and thick slices separately? Were the same pairs presented to PLCHNet and DN-co models?
- The comparison models report much higher AUCs on their own datasets in their respective papers. The evaluations need to be performed on multiple community datasets to test the true generalizability of the models (what the introduced FCPC and FR module supposedly help with).
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
If I understood the architecture correctly, the colors in the Figure 1 FCPC module might need some correction. F_A should be fully blue. F_B should be fully orange. F_A' should have an orange patch. F_B' should have a blue patch. F'_A' will be fully orange (maybe highlight PF_B' with a red outline) and F'_B' will be fully blue (highlight PF_A' with a red outline).
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Well-written. Strong evaluation and comparisons. Ablation studies are really helpful to understand the utility! Analysis presented for choice of the architectures.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
R#1 Comment 1: The paper lacks details on inference and the FR module's use with a single image.
Response: If only one image is available at inference, we duplicate it to form an input pair. Since the ICP then operates on identical images, the angle between f_p and f_a'/f_b' is nearly 0 (Fig. 1), so the FR module preserves the original feature f_p.

Comment 2 (Opt. 1): Class balance and dataset details.
Response: There are 3433 patients and 5362 image pairs in total (page 5), with no patient overlap between training and test sets. The training set has 2271 positive (mutation) and 915 negative (normal) pairs; the validation set has 371 positive and 172 negative pairs; the test set has 1195 positive and 468 negative pairs.

Comment 3: The paper lacks discussion of radiomics-based methods.
Response: We reviewed the two referenced studies and will introduce and discuss them in the revised Introduction.

Opt. 2-3: How were 3D copy-paste regions selected, and what model was used for segmentation?
Response: The patch size in 3D copy-paste is Z x 180 x 180, where Z is the number of CT slices for a patient. nnUNet is used for segmentation in preprocessing.

R#2 Comments 1-2, 5, 7: The paper would benefit from a stronger and more clinically grounded motivation. Several formatting issues and vague descriptions should be addressed.
Response: Thanks for the valuable suggestions. We revised f(.) to p(.) on page 5, thoroughly revised the introduction and formatting, and clarified some descriptions using equations following your suggestions.

Comments 3-4: Prior work [14] models consistency between thin and thick slices, and [1] applies image-level copy-paste to CT data. Please clarify how your method differs from them and why it performs better.
Response: [14] aligns thin/thick features only at the prediction level, while our method enables interaction in intermediate layers via ICP and FCP, followed by refinement and consistency constraints, and achieves better performance. [1] only uses image-level copy-paste, also lacks feature-level interaction, and shows worse results in our experiments.

Comment 6: Please clarify the source of feature redundancy that motivates the FR module.
Response: The regions outside the copy-paste area are the same between the ICP image and the original image, which generates redundant features.

Comments 8-9: Table 1 shows FCPNet has lower SEN and SPE than some baselines, and DN-thin generalizes better than DN-thick. Please explain.
Response: Our method shows slightly lower SEN or SPE than some models. However, it shows higher SEN+SPE than the others and achieves the highest ACC and AUC (Table 1). Since thin images contain richer structural details, DN-thin learns fine-grained features that generalize to thick images. Thick images lack such details, so DN-thick performs more poorly on thin images.

Comment 10: Which DenseNet variant is used as the baseline in Table 2? Why does FCP alone reduce thin-slice AUC while FR improves it?
Response: We use DenseNet121 as the baseline (first row in Table 2). Using FCP alone can introduce redundant features that decrease performance. The FR module helps reduce such redundancy and further improves performance.

Comment 11: Please discuss the consistency losses used in Fig. 3 and explain the choice of the 180 x 180 patch size.
Response: We compared three losses and found MI most suitable (Fig. 3), as it better aligns feature distributions. In the CP operation, too large a patch may break the original image structure, while too small a patch causes insufficient exchange. After testing five sizes, we chose 180 x 180 (Fig. 3).

R#3 Comments 1-2: Are image pairs used during testing? How are thin/thick results reported separately? More datasets are needed to verify generalizability, and Fig. 1 should be revised.
Response: When testing on thick images, we duplicate the thick image to form an input pair (and similarly for thin images). We acknowledge the dataset limitation and will include more public datasets in future work. Fig. 1 has been revised accordingly.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A