Abstract
High-frequency oscillations (HFOs) in intracranial electroencephalography (iEEG) are critical biomarkers for localizing the epileptogenic zone in epilepsy treatment. However, traditional rule-based detectors for HFOs suffer from unsatisfactory precision, producing false positives that require time-consuming manual review. Supervised machine learning approaches have been used to classify the detection results, yet they typically depend on labeled datasets, which are difficult to acquire due to the need for specialized expertise. Moreover, accurate labeling of HFOs is challenging due to low inter-rater reliability and inconsistent annotation practices across institutions. The lack of a clear consensus on what constitutes a pathological HFO further challenges supervised refinement approaches. To address this, we leverage the insight that legacy detectors reliably capture clinically relevant signals despite their relatively high false positive rates. We thus propose the Self-Supervised to Label Discovery (SS2LD) framework to refine the large set of candidate events generated by legacy detectors into a precise set of pathological HFOs. SS2LD employs a variational autoencoder (VAE) for morphological pre-training to learn meaningful latent representations of the detected events. These representations are clustered to derive weak supervision for pathological events. A classifier, trained on real and VAE-augmented data, then uses this supervision to refine detection boundaries. Evaluated on large multi-institutional interictal iEEG datasets, SS2LD outperforms state-of-the-art methods. SS2LD offers a scalable, label-efficient, and clinically effective strategy to identify pathological HFOs using legacy detectors. The code is available at https://github.com/roychowdhuryresearch/SS2LD.
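As a reading aid, here is a minimal sketch of the pipeline the abstract describes; the `vae` interface and `train_classifier` are hypothetical stand-ins, not the released API (see the linked repository for the actual implementation):

```python
# Minimal sketch of SS2LD as described in the abstract. The `vae`
# object (with .encode/.sample) and `train_classifier` are hypothetical
# stand-ins, not the released API.
import numpy as np
from sklearn.cluster import KMeans

def ss2ld(candidate_windows, vae, train_classifier):
    """candidate_windows: time-frequency windows of events flagged by
    legacy rule-based detectors (high recall, many false positives)."""
    # 1. Embed each candidate with the morphology-pretrained VAE.
    z = np.stack([vae.encode(w) for w in candidate_windows])
    # 2. Cluster the latent codes to obtain weak pathological vs.
    #    physiological labels (the paper uses hierarchical 2-means).
    weak_labels = KMeans(n_clusters=2, n_init=10).fit_predict(z)
    # 3. Train a classifier on real plus VAE-generated windows to
    #    refine the weak decision boundary.
    return train_classifier(candidate_windows, weak_labels,
                            augment=vae.sample)
```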
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/1486_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/roychowdhuryresearch/SS2LD
Link to the Dataset(s)
Open iEEG dataset: https://openneuro.org/datasets/ds005398/versions/1.0.1
Zurich iEEG dataset: https://openneuro.org/datasets/ds003498/versions/1.1.1
BibTeX
@InProceedings{ZhaYip_SelfSupervised_MICCAI2025,
author = { Zhang, Yipeng and Ding, Yuanyi and Duan, Chenda and Daida, Atsuro and Nariai, Hiroki and Roychowdhury, Vwani},
title = { { Self-Supervised Distillation of Legacy Rule-Based Methods for Enhanced EEG-Based Decision-Making } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15967},
month = {September},
pages = {480--490}
}
Reviews
Review #1
- Please describe the contribution of the paper
(1) A self-supervised framework (SS2LD) that leverages legacy rule-based detectors to discover pathological HFOs without human labels (2) A VAE-based morphological pre-training approach that learns meaningful latent representations of HFO events (3) A weak supervision discovery method that uses clustering in the latent space to identify pathological events (4) A data augmentation strategy using the VAE’s generative capabilities to enhance classifier robustness (5) Comprehensive evaluation on large multi-institutional datasets demonstrating superior clinical performance
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
(1) Novel self-supervised approach: The method addresses a critical challenge in biomarker discovery without requiring extensive human labeling, which is particularly valuable given the low inter-rater reliability and lack of consensus in HFO annotation. (2) Clinical relevance: The paper targets a significant clinical problem in epilepsy treatment, with direct implications for patient outcomes. The framework is evaluated using clinically meaningful metrics (surgical outcome prediction and classification specificity). (3) Strong technical foundation: The approach combines multiple machine learning techniques effectively, including variational autoencoders, clustering for weak supervision, and data augmentation through generative models. (4) Robust evaluation: The method is validated on large multi-institutional datasets, demonstrating generalization across different clinical settings. (5) Interpretable latent space: The paper provides insightful analysis of the learned latent dimensions, showing how they encode neurophysiologically meaningful features of HFOs, which enhances the method’s clinical interpretability.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
(1) Limited ground truth validation: While the paper uses clinical outcomes as a proxy for evaluation, the lack of direct ground truth for individual HFO events makes it difficult to fully validate the accuracy of the method at the event level. (2) Simplistic clustering approach: The hierarchical k-means clustering with k=2 at each step may oversimplify the complex morphological differences between different types of HFOs. More sophisticated clustering approaches might better capture the nuanced variations in HFO morphology. (3) Potential selection bias: The method of determining which cluster represents pathological HFOs (based on predominant location within the resected region) may introduce circular reasoning, as the evaluation is also based on resection outcomes. (4) Computational considerations: The paper does not thoroughly discuss the computational resources required for inference, which is important for potential clinical deployment. Information about training and inference time would enhance the evaluation of clinical applicability. (5) Limited generalizability assessment: While the paper evaluates the method on two datasets, both are from similar clinical contexts (epilepsy monitoring). The generalizability to other settings or different EEG recording conditions is not explored.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
(1) The paper could benefit from a more detailed discussion of how the proposed method might integrate with existing clinical workflows. What would be the practical steps to implement this approach in a clinical setting? (2) It would be valuable to include a discussion of how the learned latent representations might be used for other clinical tasks beyond HFO classification, such as patient stratification or long-term monitoring. (3) While the paper focuses on epilepsy, it would be interesting to explore how this self-supervised approach might extend to other neurological conditions where EEG biomarkers play a role. (4) The paper mentions the challenge of low inter-rater reliability in HFO annotation. It would be valuable to discuss how the proposed method might be used to improve consensus among clinicians.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
(1) Methodological concerns: While the SS2LD framework presents an interesting self-supervised approach, the core weak supervision method relies on a simplistic hierarchical k-means clustering with k=2 at each step. This may oversimplify the complex morphological differences between HFO types. The decision to label the cluster “predominantly located within the resected region” as pathological introduces circular reasoning, since the evaluation is also based on resection outcomes. (2) Insufficient clinical deployment considerations: Despite the clinical focus, the paper lacks discussion of practical implementation aspects, such as computational requirements, integration with existing workflows, and real-time processing capabilities. These are crucial for translating the method to clinical practice. (3) Marginal performance improvements: While the method outperforms existing approaches, the improvements are modest (e.g., ACC from 0.594 to 0.612 on the Open iEEG dataset). Given the complexity of the proposed pipeline, it’s unclear whether these gains justify the additional computational overhead. (4) Limited ground truth validation: The paper lacks direct validation against expert-annotated ground truth at the individual HFO event level. While using clinical outcomes as a proxy for evaluation is understandable given the challenges in obtaining reliable annotations, this indirect evaluation makes it difficult to fully assess the method’s accuracy in identifying truly pathological HFOs.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
This paper presents an unsupervised method for automatically distilling pathological high-frequency oscillations (HFOs) from the large volume of events detected by legacy HFO detectors in intracranial EEG recordings. The proposed approach is evaluated on multi-institutional datasets and demonstrates superior performance compared to two state-of-the-art pipelines for classifying pathological HFOs.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The strengths of the paper include a well-motivated application, a clear and well-structured methodology, and comprehensive evaluations that effectively demonstrate the advantages of the proposed approach. The use of diverse, multi-institutional datasets further supports the generalizability and robustness of the method.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
A significant structural weakness of the paper is the absence of dedicated Discussion and Conclusion sections. While some interpretation is provided within the Results section, it is not sufficient for a thorough understanding of the implications, limitations, or potential extensions of the work. A clear discussion is especially important for situating the findings within the broader context of related research.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Several methodological details require clarification:
- The use of L_KL is not clearly defined or explained.
- The choice of k = 2 in the k-means clustering is not justified, and it is unclear whether this decision was empirically or theoretically motivated.
- The assumptions underlying the weakly supervised classification of pathological versus physiological HFOs are not sufficiently discussed, which limits the interpretability and reproducibility of the approach.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Although the paper lacks dedicated Discussion and Conclusion sections—which is a notable structural weakness—I do not believe this alone warrants rejection. The motivation is clear, the application is relevant, and the method presents sufficient novelty. It is simple yet effective, and the experimental evaluation is sound and appropriately conducted. I recommend that the authors revise the paper to include a structured discussion section, offering deeper insights into the findings, limitations, and potential implications. With this revision, the paper would better adhere to scientific standards and significantly improve in clarity and impact.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The paper demonstrates strong quality in motivation, methodology, and experiments. The authors have properly addressed the reviewers’ questions in the rebuttal, making acceptance well justified.
Review #3
- Please describe the contribution of the paper
This paper tackles the problem of low precision, resulting from a large number of false positives, in existing rule-based methods for detecting high-frequency oscillations (HFOs) in intracranial EEG. The work is interesting and impactful given that it is validated against post-operative seizure outcomes for real patients. Rule-based biomarker detectors are based on many years of clinical research and have high recall but suffer from low precision; the false positives resulting from these detectors must be removed through manual examination, which is a laborious exercise. The core idea of this paper is to employ self-supervised learning based on VAEs to automatically distill pathological HFOs (by removing false positives).
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The main strengths of this paper are below:
- It is well organised, well written and provides a good overview of the current state-of-the-art in the problem domain.
- Performance is validated against datasets from multiple institutions and the performance metrics take the surgical/clinical outcomes of the patients into account.
- A novel approach is developed to tackle the problem at hand, and the results show improved performance compared to two recent methods: one, [9], based on weakly supervised learning, and one, [22], based on conventional supervised learning.
- Implementation details are provided and seem sufficient in case third parties want to verify or reproduce the results.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
There are a number of weaknesses, although most of these are minor.
- Authors have not explicitly stated whether they used all or a subset of the rule-based clinical detectors ([13], [15], [16] & [20]) in their experiments. Space permitting, I recommend adding this detail in the methods or implementation section.
- Authors have not quantified how much performance improvement was achieved compared to rule-based biomarker detectors. It would have been useful to see how much improvement in precision was obtained by using the proposed approach.
- In Section 2.1 the same symbol ‘e_i’ has been used to denote both a channel event and the event end time. I suggest using different symbols for clarity.
- On the last line of page 3, I recommend adding a symbol denoting the VAE loss function in parentheses; this may be useful for readers who are not very familiar with VAEs. For example, something like “beta on the original VAE loss function (L_KL)…”
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
I recommend acceptance of this paper because it develops an interesting approach for tackling a real-world problem. The experiments and methodology are clearly described and demonstrate the effectiveness of the developed approach. I also appreciate the fact that the authors have added the clinical outcomes of patients into their performance evaluation metric.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We thank the reviewers for their constructive feedback and thoughtful suggestions. We appreciate the recognition of our work’s novelty, clinical relevance, simplicity, and thorough evaluation. As suggested, we will improve clarity by refining notation and explicitly defining the VAE loss. Detailed responses to all weaknesses (W), optional comments (O), and justifications (J) follow.
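For reference, the VAE loss the reviewers ask to have defined is presumably the standard β-VAE objective (our notation, not necessarily the paper's: $x$ an event's time-frequency window, $z$ its latent code):

```latex
\mathcal{L}(\theta,\phi;x)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[-\log p_\theta(x \mid z)\right]}_{\text{reconstruction}}
  + \beta\,\underbrace{D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\middle\|\,p(z)\right)}_{\mathcal{L}_{KL}}
```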
[R1-W1] Choice of Rule-Based Detector: [9] showed that combining the two most widely used detectors [13,20] achieves the best recall of pathological events. We follow this practice in Sec. 3.1.
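How the two detectors' outputs are combined is not spelled out here; a common choice, sketched below under that assumption, is to take the union of their detected intervals and merge overlaps:

```python
# Hedged sketch: combine two detectors' candidate events by taking the
# union of their (start, end) intervals and merging overlaps. Whether
# SS2LD uses exactly this rule is not stated here; see [9] for the
# cited protocol.
def merge_detections(events_a, events_b):
    """events_*: lists of (start, end) times in seconds for one channel."""
    merged = []
    for start, end in sorted(events_a + events_b):
        if merged and start <= merged[-1][1]:   # overlaps the previous event
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(ev) for ev in merged]
```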
[R1-W2] Rule-Based Detectors in Evaluation: Directly using rule-based detectors introduces many artifacts [25]. It is widely accepted (see [22,9]) that unrefined HFOs always perform worse, so we excluded rule-based detectors from our evaluation to focus on competitive SOTA refined biomarkers.
[R2-W1,J4] Ground Truth Validation: There is no computational definition of pathological HFOs [24]. Clinically, spkHFOs are often used as a proxy, but their identification is subjective and lacks consensus (abstract & intro). Thus, surgical outcome prediction remains the most appropriate strategy. SS2LD outperforms the supervised model [22] trained on spkHFO annotations, showing the advantage of SS2LD.
[R2-W2,J1],[R3-O2] Clustering: The choice of k = 2 at each stage is guided by clinical evidence suggesting three primary HFO types (Fig. 1). Despite its simplicity, our hierarchical clustering yields semantically meaningful groupings and highlights the capacity of the VAE to learn HFO morphology.
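Concretely, the two-stage scheme can be read as the following sketch (our interpretation of the rebuttal, not the authors' code): split the latent codes with k = 2, then re-split one child with k = 2, yielding three leaf clusters that match the three primary HFO types.

```python
# Hierarchical k-means with k = 2 at each stage, producing three leaf
# clusters. A sketch under our reading of the rebuttal; which child is
# re-split is an assumption.
import numpy as np
from sklearn.cluster import KMeans

def hierarchical_two_means(z):
    """z: (n_events, latent_dim) array of VAE latent codes -> labels in {0, 1, 2}."""
    top = KMeans(n_clusters=2, n_init=10).fit_predict(z)
    big = int(np.bincount(top).argmax())        # re-split the larger child (assumption)
    labels = np.zeros(len(z), dtype=int)
    sub = KMeans(n_clusters=2, n_init=10).fit_predict(z[top == big])
    labels[top == big] = 1 + sub                # leaves 1 and 2; the other child stays 0
    return labels
```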
[R2-W3,J1] Selection Bias: Pathological cluster labels are assigned only on the training set, avoiding circular reasoning. All evaluations are performed on the held-out test set.
[R2-W4,J2,O1] Computational Considerations: Our VAE uses a lightweight 4-layer encoder and decoder (~8M params), supporting easy deployment in clinical settings. Inference can be completed on standard real-time neuroprosthetic systems in the operating room.
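For scale, a 4-layer convolutional encoder of the kind described might look roughly like this PyTorch sketch (layer widths and the 64×64 input size are illustrative assumptions, not the released ~8M-parameter architecture):

```python
# Illustrative 4-layer VAE encoder over time-frequency patches.
# Widths and the 1x64x64 input are assumptions for scale only.
import torch.nn as nn

class SmallVAEEncoder(nn.Module):
    def __init__(self, latent_dim=16):
        super().__init__()
        self.conv = nn.Sequential(                                    # input: 1x64x64
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),      # -> 32x32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),     # -> 64x16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),    # -> 128x8x8
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(),   # -> 256x4x4
        )
        self.mu = nn.Linear(256 * 4 * 4, latent_dim)       # latent mean
        self.logvar = nn.Linear(256 * 4 * 4, latent_dim)   # latent log-variance

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return self.mu(h), self.logvar(h)
```

A mirrored transposed-convolution decoder completes the autoencoder; with widths in this range the full model stays within single-digit millions of parameters, consistent with lightweight deployment.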
[R2-W5,O2,O3] Generalizability: The VAE has broad applicability beyond epilepsy research, which is just one downstream task. Its pretraining enables morphological understanding of iEEG beyond HFO classification. By analyzing wideband TF plots (10–290 Hz), it captures diverse non-pathological characteristics (Fig. 2); e.g., a specific morphology of hippocampal ripples is directly correlated with the execution of a cognitive task involving episodic memories, and its absence could act as a biomarker for early-stage dementia. Thus, it can support any research that relies on fine-grained iEEG morphology.
[R2-O4] Applicability of SS2LD: With objective prediction, SS2LD can train junior clinicians and serve as a reference tool for borderline cases, potentially improving inter-rater reliability.
[R2-J3] Marginal Improvements: Given class imbalance in outcomes, we argue that F1 score, specificity (an event-based metric), and performance on an external dataset (Zurich) are better robustness indicators. On both datasets, SS2LD achieves a substantially larger margin than prior supervised [22] and weakly supervised [9] works.
[R3] Conclusion: SS2LD introduces a self-supervised approach that outperforms supervised methods while offering explainable decision-making. The VAE pretraining yields a computational latent space that is interpretable via its decoder and could be applicable to various iEEG applications. However, the current model excludes frequency bands outside 10–290 Hz, potentially overlooking informative signals. Future work will extend the evaluation to broader frequency ranges, additional biomarkers beyond HFOs, and more diverse datasets. With these advancements, SS2LD has the potential to be seamlessly integrated into clinical workflows, providing objective, interpretable, and time-efficient pathological HFO predictions to support more effective treatment planning.
We will include a more structured and detailed Conclusion section. Thank you again!
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
This work is novel and the authors have addressed most concerns of the reviewers in their rebuttal.