Abstract
We address image segmentation in the domain-incremental continual learning scenario, a use-case frequently encountered in medical diagnostics where privacy regulations and storage constraints prevent access to historical data. In this scenario, segmentation models must learn to cope with new domains (e.g., differences in imaging protocols or patient populations) while maintaining performance on previously learned domains without full access to past data.
Feature-based replay addresses the privacy concerns by only storing latent feature representations instead of original images. However, existing feature replay approaches have a critical limitation: they sacrifice U-Net skip-connections, which are essential for achieving high segmentation accuracy and fast convergence. This limitation significantly impacts clinical viability, especially when alternatives such as full model retraining or maintaining domain-specific models are available.
Therefore, we propose feature replay with optimized channel-consistent dropout for U-Net skip-connections (FOCUS). FOCUS enables crucial skip-connections in feature replay while respecting privacy and storage constraints, and integrates recent domain generalization techniques based on data augmentation.
Evaluation across two domain-incremental continual MRI segmentation settings demonstrates that FOCUS achieves substantial improvements (up to 21% average DSC) over existing methods, while storing only 0.5% of the original feature information per domain.
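As a reading aid, here is a minimal sketch of what channel-consistent dropout could look like in PyTorch. This is an illustration based on the abstract and the authors' rebuttal, not the released implementation; in particular, interpreting "channel-consistent" as a single spatial Bernoulli mask shared across all channels, and the drop probability of 0.9, are our assumptions.

import torch

def channel_consistent_dropout(feat: torch.Tensor, p: float = 0.9) -> torch.Tensor:
    # feat: (B, C, H, W). One Bernoulli draw per spatial position, shared
    # across all channels, so every kept position retains its full feature
    # vector while most positions are zeroed out entirely. (Assumed reading
    # of "channel-consistent"; not the authors' code.)
    keep = torch.rand(feat.shape[0], 1, *feat.shape[2:], device=feat.device) > p
    return feat * keep  # the same spatial mask broadcasts over the channel dim

Under this reading, the kept positions could be stored sparsely (coordinates plus their channel vectors), which is one plausible way the combination with a small retained feature proportion per domain could arrive at the reported 0.5% storage figure.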
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/0460_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/imigraz/FOCUS/
Link to the Dataset(s)
Prostate: https://liuquande.github.io/SAML/
Hippocampus: http://www.hippocampal-protocol.net/SOPs/index.php
BibTex
@InProceedings{JohSim_FOCUS_MICCAI2025,
author = { Joham, Simon Johannes and Thaler, Franz and Hadzic, Arnela and Urschler, Martin},
title = { { FOCUS: Feature Replay with Optimized Channel-Consistent Dropout for U-Net Skip-Connections } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15973},
month = {September},
pages = {226--236}
}
Reviews
Review #1
- Please describe the contribution of the paper
This paper proposes an approach for domain-incremental continual MRI segmentation.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- Combining domain-balanced feature sampling and global intensity non-linear augmentation (GIN) strengthens domain generalization and stability.
- Demonstrates up to 21% average DSC improvement over SOTA.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The method is a combination of multiple existing modules, without original innovation.
- Why can CCD eliminate domain-specific information?
- The article lacks clarity in its problem definition. While the authors mention multiple aspects—including continual learning, U-Net skip-connections, privacy preservation, and storage efficiency—it remains unclear what the central problem is that the paper aims to solve. The narrative introduces many technical components, but the core research question and the specific limitations of existing methods that the proposed approach addresses are not explicitly or coherently stated.
- The paper lacks a visual comparison of segmentation outputs.
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
- The method is a combination of multiple existing modules, without original innovation.
- The article lacks clarity in its problem definition. While the authors mention multiple aspects—including continual learning, U-Net skip-connections, privacy preservation, and storage efficiency—it remains unclear what the central problem is that the paper aims to solve. The narrative introduces many technical components, but the core research question and the specific limitations of existing methods that the proposed approach addresses are not explicitly or coherently stated.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper proposes a method to improve continual learning in the context of domain-incremental semantic segmentation. The main contributions can be summarized as follows:
- The authors introduced a channel-consistent dropout strategy with a high drop probability, accompanied by feature removal that retains only a small proportion of features from each domain, enabling privacy protection and reduced memory requirements.
- A sampling strategy is proposed where features are sampled uniformly across all past domains and between foreground and background regions, balancing learning across domains and avoiding bias toward any single domain.
- The authors argue that when training on the first domain alone, without additional strong data augmentation, the encoder cannot be assumed to have learned domain-invariant features. As a result, when training on subsequent domains with the encoder weights frozen, as previous works do, a significant domain shift arises, reducing performance. To address this, a stronger augmentation strategy (GIN) is applied during training on the first domain, encouraging the learning of robust domain-invariant features.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
This paper proposes a method to improve continual learning in the context of domain-incremental semantic segmentation. The main contributions can be summarized as follows:
- The authors introduced a channel-consistent dropout strategy with a high drop probability, accompanied by feature removal that retains only a small proportion of features from each domain, enabling privacy protection and reduced memory requirements.
- A sampling strategy is proposed where features are sampled uniformly across all past domains and between foreground and background regions, balancing learning across domains and avoiding bias toward any single domain (a sketch of such sampling follows this list).
- The authors argue that when training on the first domain alone, without additional strong data augmentation, the encoder cannot be assumed to have learned domain-invariant features. As a result, when training on subsequent domains with the encoder weights frozen, as previous works do, a significant domain shift arises, reducing performance. To address this, a stronger augmentation strategy (GIN) is applied during training on the first domain, encouraging the learning of robust domain-invariant features. The method is evaluated on two public domain-incremental MRI datasets, showing significant improvements in segmentation performance over existing approaches. Ablation studies demonstrate the individual contribution of each proposed component.
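To make the sampling idea concrete, a hypothetical sketch follows; the buffer layout and all names here are assumptions for illustration, not taken from the paper.

import random

def sample_replay_batch(buffers, n_samples):
    # buffers: {domain_id: {"fg": [...], "bg": [...]}} holding stored sparse
    # feature entries per past domain (assumed layout). Draw uniformly across
    # domains and, within each domain, evenly from foreground and background.
    per_domain = max(2, n_samples // len(buffers))
    batch = []
    for store in buffers.values():
        half = per_domain // 2
        batch += random.sample(store["fg"], min(half, len(store["fg"])))
        batch += random.sample(store["bg"], min(half, len(store["bg"])))
    random.shuffle(batch)
    return batch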
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The paper does not clearly explain how the features sampled by the Domain Sampling strategy are integrated into the network or combined with skip connections. This aspect currently requires access to the authors’ promised code release for full understanding. A clarification in the manuscript would be helpful.
- In the Results or Experiments section, it would be helpful to specify what proportion of previous domain data each baseline method utilizes (e.g., as hinted by the “PP” and “RSF” columns in Table 3). This would provide clearer context for interpreting performance comparisons.
- The definition of the AVG metric is unclear. Providing an example or explicit formula would help readers better understand the evaluation criteria.
- The first row of the ablation study in Table 3 is ambiguous. Does it refer to training each domain without using any previous domain data? If so, why is the RSF score not zero? If not, please clarify the exact experimental setting used for this row.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper addresses the problem of domain-incremental continual learning in medical image segmentation under privacy and memory constraints, a practically important yet technically challenging scenario. The authors propose FOCUS, a feature replay strategy that introduces channel-consistent dropout to preserve U-Net skip-connections—crucial for segmentation performance—while also applying a carefully designed domain sampling strategy and strong data augmentation (GIN) to tackle domain shift. The proposed method is evaluated on two public MRI segmentation benchmarks and shows strong improvements over prior work.
Weaknesses / Reasons for Weak Accept:
- Clarity in Methodological Details: Several key implementation details, particularly how sampled features are reintegrated with U-Net’s skip-connections, are unclear in the text. This hinders full comprehension of the method without the code release, which is only promised and not yet available.
- Incomplete Baseline Context: The Results section lacks detail on how much prior domain information is retained for each baseline method. For fair comparison—especially under strict privacy constraints—this information is critical.
- Metric Definitions and Experimental Setup Ambiguities: Important aspects like the AVG metric used in evaluation and the interpretation of ablation baselines (e.g., RSF scores in the first row) are insufficiently described, leaving room for confusion.
- Moderate Novelty: While the integration of channel-consistent dropout with replay and augmentation is well executed, each individual component is relatively incremental and may not be considered groundbreaking in isolation.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
This paper addresses the challenge of domain-incremental continual learning to enhance medical image segmentation. The authors propose a feature replay solution with channel-consistent dropout for U-Net skip connections, called FOCUS. This approach respects privacy and storage constraints while integrating recent domain generalization data augmentation techniques. The effectiveness of this method is validated on two MRI segmentation tasks: prostate and hippocampus segmentation, with the authors reporting performance improvements over baseline methods.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors effectively tackle the challenge of domain-incremental continual learning, utilizing feature replay and channel-consistent dropout for U-Net skip connections. The incorporation of domain generalization techniques while maintaining data privacy and minimal storage requirements makes this work particularly compelling.
- The paper is well-written, with a logical structure that facilitates readability and comprehension.
- The methodology is clearly articulated and supported by well-defined equations and explanations.
- The experimental design is robust, and the results are presented with clarity, aiding in the overall understanding of the findings.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The authors should clarify the following points in the methods section: a. After applying channel-consistent dropout (CCD), will some channels still remain available, and if so, how are the values of the dropped channels managed (e.g., are they set to zero)? b. Will the sparse features resulting from this dropout still retain essential information, such as edge or contour information, typically extracted in the initial layers of the network? c. The mechanism for leveraging features extracted from prior model training is slightly unclear. Are these previous features concatenated with current features, or are some current features replaced by these retrieved past features in the channel dimension?
- A crucial point for this method to succeed is the expectation that feature distributions remain somewhat similar across different datasets. How would the method perform in scenarios where there are dramatic shifts in input dataset distributions due to unseen acquisition protocols or changes? Can the authors provide insights on this aspect?
- In Table 1, what method is used in the second-to-last row (no skip, no CCD)? Specifically, is DFP being applied here?
- The results in the last three rows of Table 1 suggest that CCD does not yield significant performance gains. Could the authors comment on this observation?
- What is the distinction between the last two rows in Table 1? Does one utilize skip connections while the other does not?
- Can the authors clarify the difference between the FOCUS method (third last row) and the last row (no CCD, 5% DFP) in Table 1?
- Based on the ablation results in Table 3, does this imply that CCD may not be necessary, given the relatively good overall results in the sixth row of the ablation study (Skip, DBS, and GIN active; CCD disabled; DFP set to 100), compared to FOCUS?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper presents a clear and innovative approach to continual image segmentation learning, effectively maintaining a small percentage of original feature information while ensuring data privacy. The well-executed experiments further enhance its potential contribution. However, the weaknesses, particularly the need for additional clarity in some sections of the methods and results presentation (especially in Table 1), detract from its overall impact. Therefore, I recommend a weak accept, contingent upon the authors addressing these concerns to enhance the paper’s contributions to the field.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
I would like to thank the authors for their thoughtful rebuttal and for clarifying several points. After reviewing the rebuttal and the comments from other reviewers, I appreciate that some previously noted details have been addressed. However, I believe that certain aspects, such as the clarity of the tables and results presentation, could still be improved. Nonetheless, the paper presents a compelling approach to continual image segmentation learning, successfully maintaining a small percentage of original feature information while ensuring data privacy. The well-executed experiments further enhance its potential contribution.
Author Feedback
We thank reviewers for the constructive comments. We will update the paper accordingly. We appreciate that all reviewers acknowledge the large improvements of FOCUS over SOTA.
R1C1: FOCUS combines modules, lacks novelty. We beg to differ. Channel-consistent dropout (CCD) is an original contribution, used for the first time to address storage and privacy issues, while also enabling U-Net skip connections for feature replay. While it is true that our second contribution repurposes domain generalization (GIN) for continual learning (CL), this idea is thoughtfully motivated: The tunnel hypothesis (arXiv:2305.19753) implies that early layers are specialized to the first domain. Effective feature replay requires freezing the model encoder, which impairs the network's plasticity needed to adapt to domain changes. GIN counteracts this by promoting domain-invariant features when training on the first domain. Finally, while the domain-balanced sampling is technically simple, it is not used in related CL work. We show that it improves performance (Tab 3).
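For context, a rough sketch of GIN-style augmentation as commonly described (a shallow convolutional network with freshly randomized weights remaps intensities, and the output is blended with the input); the depth, kernel sizes, normalization, and blending here are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

def gin_augment(x: torch.Tensor) -> torch.Tensor:
    # x: (B, C, H, W). Re-initializing the network on every call yields a new
    # "virtual domain" with the same anatomy but different intensity statistics.
    c = x.shape[1]
    g = nn.Sequential(
        nn.Conv2d(c, 8, kernel_size=3, padding=1), nn.LeakyReLU(),
        nn.Conv2d(8, 8, kernel_size=3, padding=1), nn.LeakyReLU(),
        nn.Conv2d(8, c, kernel_size=3, padding=1),
    ).to(x.device)
    with torch.no_grad():
        y = g(x)
        y = (y - y.mean()) / (y.std() + 1e-6)  # renormalize augmented intensities
        alpha = torch.rand(1, device=x.device).item()
        return alpha * x + (1 - alpha) * y  # blend with the original image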
R1C2: Does CCD eliminate domain-specific information? No. CCD removes patient-specific information from feature maps. This misunderstanding is caused by an error in Sec 2: “(storing feature maps) introduces privacy and storage concerns as they contain domain-specific information”. “domain-specific” should be “patient-specific”.
R1C3: Lack of clarity in problem definition. We state the problem: domain-incremental continual MRI segmentation, and describe the technical challenges: ensuring privacy and limiting storage at the start of Sec 1. Further, we state in Sec 1 that related work removed skip-connections to preserve privacy, which we addressed with CCD. We will clarify this.
R1C4: Lack of visual comparison. A visual comparison is shown in Fig 2.
R2C1,R3C1c: How are features from past domains used? During training, the U-Net is alternately fed data from the current domain and sparse features from the previous domains. Stored sparse features are directly inserted as the output of the first convolution block. We will clarify this.
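A toy sketch of that replay path, to make the insertion point explicit. All module names (enc1, down, up1, etc.) are hypothetical, not the authors' API; the point is only that the stored feature map both feeds the deeper encoder and still supplies the top skip-connection.

import torch

def forward_with_replay(net, x=None, stored_f1=None):
    # Either encode a current-domain image, or start from a stored sparse
    # feature map of a past domain (replay path: first conv block is skipped).
    f1 = net.enc1(x) if stored_f1 is None else stored_f1
    f2 = net.enc2(net.down(f1))
    b = net.bottleneck(net.down(f2))
    d2 = net.dec2(torch.cat([net.up2(b), f2], dim=1))   # skip-connection from f2
    d1 = net.dec1(torch.cat([net.up1(d2), f1], dim=1))  # skip-connection from f1
    return net.head(d1)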
R2C2: Proportions of past domain data used by baselines? All baseline methods are privacy-protecting (PP), storing no past domain data. Instead, the storage of EWC, TED and MiB scales with the number of U-Net model weights. Thus, the relative storage factor (RSF) is not applicable to these methods.
R2C3: Define the AVG metric. We agree that the AVG metric is defined ambiguously. It is defined in the same way as Accuracy in (arXiv:1810.13166), thus tracking average performance over time. We will clarify this.
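A hedged transcription of that cited Accuracy definition in LaTeX, with R_{i,j} the score (here, e.g., DSC) on domain j after training up to domain i, and N the number of domains; this follows arXiv:1810.13166, not a formula stated in the paper itself:

\mathrm{AVG} = \frac{2}{N(N+1)} \sum_{i=1}^{N} \sum_{j=1}^{i} R_{i,j}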
R2C4: Was previous domain data used in Tab 3 first row? Yes, using replay of exclusively bottleneck features. Hence, 0.1 RSF.
R3C1: Do the feature maps after CCD retain edge information, and are channels still active? Results suggest they retain the information needed to effectively train the network; visual interpretation is hard (Fig 1). Channels remain active, see Eq 1.
R3C2: Performance under large distribution shifts? This is addressed by GIN, which exposes the network to large distribution shifts during training.
R3C3,5,6: Explain bottom rows in Tab 1. FOCUS: U-Net with skip connections, with CCD and with 5% of features from previous domains retained during training (DFP). Second-last row: same, but no skips, no CCD, and 100% DFP. Last row: with skips, no CCD, and 5% DFP.
R3C4,7: CCD yields no significant performance gain in Tab 1 and 3. CCD’s role is to limit the required storage and to protect privacy. It is not expected that it increases performance.
MR: Is FOCUS resistant to membership inference attacks (MIA)? Robustness to MIA can be promoted by using dropout during training (DOI:10.1145/3523273). We use aggressive channel-consistent dropout on top of the normal dropout inside convolution blocks, so we have strong reasons to believe FOCUS is resistant. While experimental analysis exceeds the scope of this paper, we will add MIA as an interesting direction for future work.
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
In addition to the reviewers’ comments please clarify to what extent this method is susceptible to membership inference attacks.
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Reject
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
The authors propose a method for domain-incremental continual learning in MRI segmentation by integrating channel-consistent dropout and global intensity augmentation. While Reviewer #1 raised concerns regarding ambiguous methodology and reproducibility issues, Reviewers #2 and #3 offered weak accepts, acknowledging the relevance of the problem and the potential of the proposed solution. I recommend acceptance.
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
In my opinion, the authors addressed all the important concerns and clarified some of the ambiguities in the original manuscript. Furthermore, looking at the scores before and after the rebuttal, I think, overall, the reviewers liked the paper. In that sense, I think the paper deserves acceptance, provided the authors make the changes mentioned in the rebuttal.
As a meta-reviewer who came in after the rebuttal, I have some minor concerns that I hope the authors can address either through the final manuscript or presentation (if accepted) or in future work. These minor concerns are:
- As far as I am aware, AVG, FWT and BWT are “metrics of metrics”. While they have been used with accuracy (as pointed out in the rebuttal) in natural image classification, there is no reason why these metrics could not be applied to other base metrics (like DSC and MASD). I think the explanations would benefit from being clearer.
- Almost all methods (except ccVAE) forget, as evidenced by the negative BWT. I know there is a space restriction for MICCAI but I would have loved a discussion on that.
- While the results suggest that CCD helps to keep information (it’s hard to argue with numbers), it is also true that the majority of features (85–90%) will not be used even before we consider DBS. I am curious how that minimal information actually helps, and I think it is worth exploring this further in future work, for example by focusing on whether a less random approach might work, whether specific features work, and varying the percentage.