Abstract
Artificial intelligence has recently shown promise in automated embryo selection for In-Vitro Fertilization (IVF). However, current approaches either address partial embryo evaluation lacking holistic quality assessment or target clinical outcomes inevitably confounded by extra-embryonic factors, both limiting clinical utility. To bridge this gap, we propose a new task called Video-Based Embryo Grading - the first paradigm that directly utilizes full-length time-lapse monitoring (TLM) videos to predict embryologists’ overall quality assessments. To support this task, we curate a real-world clinical dataset comprising over 2,500 TLM videos, each annotated with a grading label indicating the overall quality of embryos. Grounded in clinical decision-making principles, we propose a Complementary Spatial-Temporal Pattern Mining (CoSTeM) framework that conceptually replicates embryologists’ evaluation process. The CoSTeM comprises two branches: (1) a morphological branch using a Mixture of Cross-Attentive Experts layer and a Temporal Selection Block to select discriminative local structural features, and (2) a morphokinetic branch employing a Temporal Transformer to model global developmental trajectories, synergistically integrating static and dynamic determinants for grading embryos. Extensive experimental results demonstrate the superiority of our design. This work provides a valuable methodological framework for AI-assisted embryo selection. The source code is available at https://github.com/RIL-Lab/CoSTeM.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2769_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
N/A
Link to the Dataset(s)
N/A
BibTex
@InProceedings{SunYon_TimeLapse_MICCAI2025,
author = { Sun, Yong and Wang, Yipeng and Shi, Junyu and Zhang, Zhiyuan and Xiao, Yanmei and Zhu, Lei and Jiang, Manxi and Nie, Qiang},
title = { { Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining } },
booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15972},
month = {September},
page = {594 -- 604}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper proposes using ViT transformer encoders for feature extraction for classification of IVF videos. These features are then processed by two branches: a Morphokinetic branch for spatial feature selection and a Morphological branch for temporal feature selection. Additionally, the paper introduces a novel dataset.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The paper delivers a new dataset and proposes a solid baseline for the task of IVF classification.
- As far as we are aware, the authors use stable baselines, drawn from both video recognition in IVF and general video recognition.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- It is not clear how this task is novel compared to existing IVF
- Results are entirely based on their in-house
- The paper tends to over-claim in some parts. While the baseline performs better than prior work, the writing exaggerates the extent of this
- The introduction and abstract are lengthy, whereas other sections, such as the dataset description, could benefit from more depth.
- It is unclear whether results are based on a single validation split, and how the model designs were developed. This raises concerns about potential validation leakage, which is not addressed in the paper.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
Some of the ablation figures could be consolidated into a single table, making space for more detailed information on the dataset. Certain method descriptions could be streamlined to allow for more experimental details, such as preprocessing steps. Sentences like “Embryos classified into the high-quality category exhibit excellent features at all stages and possess the best developmental potential.” add little value (for us) and could be replaced with clearer, more informative content. While the weaknesses may appear to quantitatively outweigh the strengths, this does not fully reflect the contribution of the paper.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The paper introduces a new dataset and proposes a strong baseline that outperforms methods from both medical and natural imaging domains. We believe the dataset has potential value for the community, and the baseline seems to work.
However, the presentation would benefit from clearer descriptions and more comprehensive details, especially regarding the dataset and experimental setup. The lack of clarity around the validation protocol raises concerns about potential leakage, particularly given the extensive ablation studies. These issues should be addressed to ensure the reliability and reproducibility of the findings.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
Based on the content of this paper, we consider the contributions as follows:
- This paper proposes a new task, Video-Based Embryo Grading, which aims to automatically categorize IVF embryos into three grades (poor, fair, and good). In contrast to traditional cell tracking methods used to monitor embryo development, this task focuses on directly assessing embryo quality, thereby enhancing its clinical utility.
- The paper provides a dataset of 2,596 time-lapse videos collected from real-world clinical IVF practices. Each video is paired with a grading label indicating the overall quality of the corresponding embryo.
- This paper proposes a novel framework for evaluating the quality of IVF embryos.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
This paper proposes a spatial-temporal feature complementary network for evaluating the quality of in vitro fertilization embryos. The task defined in the manuscript holds practical significance compared to traditional embryo development modeling, and the valuable dataset constructed in the study provides a certain degree of advancement for the field. Furthermore, this article is in general well-written and easy to follow. The in-depth analysis of the research problem demonstrates the authors’ profound insights into this domain.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The paper mentions the “grading” concept, but does not provide a detailed explanation. Moreover, we hope that the authors incorporate additional state-of-the-art methods in the experimental section, given the novelty of the task.
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Thank you for inviting me to review the paper titled “Time-Lapse Video-Based Embryo Grading via Complementary Spatial-Temporal Pattern Mining” (2769 submission). Given the public release of the dataset by the authors, we consider that this paper makes outstanding contributions to the field and recommend its acceptance for publication. Below, I provide some detailed comments which I hope can help to further improve your manuscript.
- The article proposes a grading task based on time-lapse videos but lacks sufficient analysis of the grading task, failing to clearly explain the specifics of grading. Firstly, the method presented in the article is constructed based on a classification paradigm rather than a grading method. Secondly, the article does not mention relevant research on processing microscopic cell videos using a grading paradigm.
- In the experimental section, the article primarily compares three mainstream action recognition frameworks. However, our studies on cell video research using action recognition frameworks have found that the X3D network performs exceptionally well. Therefore, it is suggested that the authors introduce the X3D network for performance comparison to enhance the comprehensiveness and persuasiveness of the experimental section.
- Reviewer confidence
Somewhat confident (2)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
The authors present a dual-branch framework that incorporates morphological and morphokinetic features for automatic grading of embryos from time-lapse sequences. The dataset used to develop and evaluate the approach will be released to the research community.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The authors present a novel framework for incorporation of spatial and long-range temporal features that emulates clinical embryo evaluation principles with minimal labels (one grade class per video). The ability to indicate the temporal stages used for prediction (from the Temporal Selection Block) and display alignment with a clinically established protocol demonstrates high clinical feasibility. The proposed approach is well-evaluated with comparison to other methods for AI-assisted Embryo Selection as well as another suitable domain (Video Action Recognition). A comprehensive ablation study demonstrates utility of individual components of the proposed approach.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Only the average values are reported for evaluation, so it is not clear if the differences between methods are statistically significant. The Temporal Selection Block in the morphological branch provides valuable insights into the temporal stages most used for embryo grade prediction. However, the temporal stages most important in the morphokinetic branch and the contribution of morphological vs. morphokinetic features remain unknown, limiting insights that would help in clinical adoption.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The authors claimed to release the source code and/or dataset upon acceptance of the submission.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This paper demonstrates clinical utility to existing embryo grading protocols and provides a strong evaluation, comparing to methods developed for the same application and also a transferable established application in the computer vision community. The work is accompanied by a large dataset that will facilitate future research into video-based embryo grading.
- Reviewer confidence
Very confident (4)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
We express sincere gratitude to all reviewers for providing valuable and thoughtful feedback on our work.
R1.1: “The article proposes a grading task based on time-lapse videos but lacks sufficient analysis of the grading task, failing to clearly explain the specifics of grading.” In this work, “grading” refers to classifying embryos into three quality levels (high, medium, and low) based on daily assessments from TLM videos by clinical embryologists (Istanbul consensus). High-quality embryos show consistently good development, while low-quality ones are unsuitable for transfer. The word “Grading” is widely used in reproductive medicine literature to describe such categorical evaluations. We will clarify this terminology and expand the related work section in the final version.
R1.2: “The comparison with X3D is suggested.” We conducted experiments with X3D (Accuracy = 0.8146, Precision = 0.6945, Recall = 0.7184, F1 = 0.7020). While X3D performs competitively, our method (Accuracy = 0.8606, Precision = 0.7746, Recall = 0.7595, F1 = 0.7618) still shows superior results.
R2.1: “Only the average values are reported for evaluation, so it is not clear if the differences between methods are statistically significant.” Our initial experiments showed stable test performance on nearly 800 TLM videos across different random seeds. We agree that statistical significance analysis is important and plan to include multiple runs or cross-validation in future work.
R2.2: “The contribution of morphological vs. morphokinetic features remains unknown, limiting insights that would help in clinical adoption.” Ablation results in Table 2 show that both morphological and morphokinetic features are essential, with the former showing a relatively stronger impact on the final performance.
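As an illustration of the multi-run reporting discussed in R2.1, the following minimal Python sketch summarizes a metric over several random seeds as mean and sample standard deviation. It is not the authors' code; the function name `summarize_runs` and the per-seed accuracies are hypothetical values chosen only for the example and are not results from the paper.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical per-seed accuracies, for illustration only (not from the paper).
example_accuracies = [0.85, 0.86, 0.87]
mean_acc, std_acc = summarize_runs(example_accuracies)
print(f"accuracy = {mean_acc:.4f} +/- {std_acc:.4f}")
```

Reporting the spread alongside the mean lets a reader judge whether a gap between two methods is larger than the seed-to-seed variation.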
R3.1: “It is not clear how this task is novel compared to existing IVF.” The novelty of our task lies in three aspects:
- While most existing IVF studies rely on single-image assessments, our method leverages time-lapse microscopy (TLM) videos to model both static and dynamic developmental patterns over the entire developing process.
- Prior video-based works focus on partial evaluations (e.g., blastocyst formation and developmental stage classification). To obtain a final decision, they may need to train multiple networks. Conversely, we aim to directly predict clinicians’ holistic quality evaluations by analyzing the complete developmental timeline and implicitly integrate all the key evaluation metrics.
- Our three-level grading system (high/medium/low) reflects real-world clinical priorities and offers a finer resolution than traditional binary schemes.
R3.2: “Results are entirely based on their in-house.” Our results are based on an in-house dataset because (1) the task is novel and no public dataset matches our setting, and (2) most related studies are closed-source with no open data or code. We will make the dataset publicly available for evaluation.
R3.3: “The paper tends to over-claim in some parts.” We acknowledge the comments and will revise the writing in the final version. Here, we provide more details about the dataset and experimental setups:
Dataset: Embryos were evaluated daily via TLM videos by embryologists following the Istanbul consensus, with final three-level labels (high/medium/low) derived from aggregated daily assessments like cell number, symmetry, etc., based on clinical transfer criteria.
Experiments: Stratified splitting ensured balanced class distribution and no data leakage. Preliminary tests (~800 videos, multiple seeds) confirmed performance stability, supporting our single-split approach under resource constraints. We will add more details about the dataset and experimental setup and remove redundant parts in the final version.
R3.4: “It is unclear whether results are based on a single validation split…” As detailed in R3.3, stability across multiple random seeds validates this choice under computation resource constraints.
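For readers unfamiliar with the stratified splitting mentioned in R3.3, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation: the function `stratified_split`, the 30% test fraction, and the label counts are hypothetical. It splits at the video level only; the rebuttal does not say whether videos were also grouped by patient.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices into train/test parts, class by class,
    so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random assignment within the class
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical label counts, for illustration only (not the paper's dataset).
labels = ["high"] * 10 + ["medium"] * 20 + ["low"] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=0)
print(len(train_idx), len(test_idx))
```

Because every index is assigned to exactly one part, no video appears in both the training and test sets, and fixing `seed` makes the split reproducible.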
Meta-Review
Meta-review #1
- Your recommendation
Provisional Accept
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A