Abstract

Veriserum is an open-source dataset designed to support the training of deep learning registration for dual-plane fluoroscopic analysis. It comprises approximately 110,000 X-ray images of 10 knee implant pair combinations (2 femur and 5 tibia implants) captured during 1,600 trials, incorporating poses associated with daily activities such as level gait and ramp descent. Each image is annotated with an automatically registered ground-truth pose, while 200 images include manually registered poses for benchmarking. Key features of Veriserum include dual-plane images and calibration tools. The dataset aims to support the development of applications such as 2D/3D image registration, image segmentation, X-ray distortion correction, and 3D reconstruction. Freely accessible, Veriserum aims to advance computer vision and medical imaging research by providing a reproducible benchmark for algorithm development and evaluation. The Veriserum dataset used in this study is publicly available via https://movement.ethz.ch/data-repository/veriserum.html, with the data stored at ETH Zürich Research Collections: https://doi.org/10.3929/ethz-b-000701146.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/2716_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/wjh19990923/veriserum

Link to the Dataset(s)

https://doi.org/10.3929/ethz-b-000701146

BibTex

@InProceedings{WanJin_Veriserum_MICCAI2025,
        author = { Wang, Jinhao and Vogl, Florian and Schütz, Pascal and Ćuković, Saša and Taylor, William R.},
        title = { { Veriserum: A dual-plane fluoroscopic dataset with knee implant phantoms for deep learning in medical imaging } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15972},
        month = {September},
        pages = {645--655}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The paper presents Veriserum, an open-source dual-plane fluoroscopic X-ray dataset focused on knee implant phantoms. It includes ~110,000 images generated using a robotic setup to simulate clinically relevant poses. The dataset provides both automated and manually verified 6-DOF pose annotations and calibration information to support downstream tasks such as 2D/3D image registration and segmentation.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    First open-access dual-plane fluoroscopic dataset of knee implant phantoms. The data is collected using robotically controlled, high-precision hardware, enhancing label quality and repeatability. Inclusion of automated + manually verified pose annotations supports benchmark comparisons.

    Calibration procedures and tools (DISCAL, SICAL) are well-described and made available for reproducibility.

    Dataset size is large (110k+), spanning diverse implant designs and simulated motions.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    This is primarily a dataset paper with no novel algorithm, method, or analysis pipeline introduced. All images are of phantom implants without soft tissue or bone structures, severely limiting generalization to real-world clinical data. The dataset cannot model challenges like occlusions, anatomical variability, or soft tissue deformation.

    Despite being based on motion data, the dataset includes only discrete static frames, making it unsuitable for dynamic tracking or time-series learning, which are critical for many orthopedic imaging tasks. Beyond pose error comparisons (robot vs auto vs manual), the paper does not demonstrate downstream applications such as registration, segmentation, or reconstruction. A dataset claiming broad utility should at least present baseline experiments to support its usefulness.

    Although the dual-plane setup is a new feature, prior datasets such as CAMS-Knee already offer clinically sourced knee implant motion data [1]. The phantom-based setup in this paper cannot match the clinical realism or anatomical diversity of those existing resources.

    [1] Taylor W. R., Schütz P., Bergmann G., et al. A comprehensive assessment of the musculoskeletal system: The CAMS-Knee data set. Journal of Biomechanics, 2017, 65: 32-39.

  • Please rate the clarity and organization of this paper

    Poor

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This paper introduces a dataset, but the contribution is primarily engineering, with no novel scientific insight or algorithmic development. The dataset lacks clinical realism, includes only phantom static images, and offers limited validation of downstream applications. Therefore, I cannot recommend acceptance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Reject

  • [Post rebuttal] Please justify your final decision from above.

    While the authors have provided some clarifications in their rebuttal, I think the following issues remain insufficiently addressed: the dataset lacks clinical realism and anatomical diversity, and its effectiveness for downstream applications has not been adequately validated.



Review #2

  • Please describe the contribution of the paper

    This study introduces a novel dataset comprising dual-plane X-ray images of 10 different knee implant configurations, accompanied by corresponding kinematic data. The dataset is publicly released, which constitutes a significant contribution to the biomechanics and medical imaging research communities. It provides a valuable resource for developing and validating algorithms related to 2D-3D registration, implant tracking, and motion analysis.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. Public Availability of the Dataset: Making this dataset freely available enables reproducibility, benchmarking, and further research in various subfields of medical imaging and orthopedics.
    2. Pose Adjustment Algorithm: The study includes a pose refinement algorithm designed to improve 2D-3D registration accuracy. This is implemented using custom-designed registration software, enhancing the usability and precision of the provided data.
    3. Quality Assurance Metrics: Appropriate quality metrics are used to evaluate the default pose estimates. For instance, normalized cross-correlation (NCC) is employed to assess alignment quality, which helps ensure baseline reliability for downstream applications.
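
    For readers unfamiliar with the metric named in item 3, the following is a minimal sketch of zero-mean normalized cross-correlation between an acquired image and a rendered image. It is not taken from the paper or its repository: the function name `ncc`, the `eps` guard, and the synthetic test images are assumptions made for illustration, and the paper's exact NCC variant (masking, windowing) may differ.

```python
import numpy as np


def ncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two same-sized images.

    Returns a value in [-1, 1]; 1 means the images match up to a positive
    gain and an offset. Hypothetical helper, not the authors' code.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("images must have the same number of pixels")
    a = a - a.mean()  # remove the offset of each image
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:  # at least one image is constant
        return 0.0
    return float(np.dot(a, b) / denom)


# Synthetic stand-ins for an X-ray and two renderings (assumed data).
rng = np.random.default_rng(0)
xray = rng.random((64, 64))
rendered_aligned = 2.0 * xray + 5.0      # same structure, other gain/offset
rendered_unrelated = rng.random((64, 64))

print("aligned:  ", ncc(xray, rendered_aligned))
print("unrelated:", ncc(xray, rendered_unrelated))
```

    Because the means are removed and the result is normalized, the score is insensitive to global brightness and contrast differences between the fluoroscopic image and the rendering, which is one common reason to prefer NCC over a plain pixel difference.
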
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. Limited Discussion of Downstream Applications: The manuscript lacks sufficient reference to the intended downstream use cases of the dataset. A clearer explanation of how the data might be applied in real-world clinical or research scenarios would strengthen its relevance and impact.
    2. Deviation from Clinical-Grade Implants: As stated by the authors, the implants used differ slightly from real clinical-grade counterparts. Depending on the application (e.g., surgical planning, implant wear studies), this deviation could be critical. Including a quantitative or descriptive metric to capture this disparity would be beneficial.
    3. Presence of Robotic Attachment in Images: The implants are shown with a robotic attachment piece that may not be present in clinical scenarios. If this feature affects image interpretation or downstream analysis, further justification or clarification is warranted, particularly in the context of potential clinical applications.
    4. The provided dataset is primarily intended for use by AI-based methods for 2D-3D registration of knee implants. This raises an important question: how does the spatial accuracy of the dataset compare to that of traditional image-based 2D-3D registration algorithms? In other words, is the spatial accuracy of the provided dataset beyond the range of achievable spatial accuracy of a common 2D-3D registration approach? Again, further explanation on the intended downstream applications would help answer such questions.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I have assessed the paper positively due to its significant contribution in making a high-quality dual-plane X-ray dataset of knee implants publicly available—a valuable resource for the biomechanics and medical imaging communities. The inclusion of a pose refinement algorithm and appropriate quality assurance metrics further strengthens the technical rigor of the work.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I have accepted the paper after the rebuttal primarily since I see the value of the public dataset presented and the effort involved. This is more a platform that future research can be built on top of. However, please note that the underlying methodology is rather trivial and therefore from a purely technical perspective, the paper may lack novelty.



Review #3

  • Please describe the contribution of the paper

    The paper presents Veriserum, an open-access database of “110,000 X-ray images of 10 knee implant combinations […] captured during 1,600 trials”. The researchers used a robot to simulate movements during the acquisitions. These movements aim to simulate typical knee movements such as stair descent. The laboratory setup consists of two image-intensifier detectors, which are commonly used in fluoroscopic C-arm systems. In contrast to flat-panel detectors, these detectors suffer from electromagnetic distortion. The authors use a state-of-the-art distortion correction and compute the geometric calibration using an additional calibration phantom. This information is essential for successful 2D/3D registration, and the authors provide it in addition to the acquisitions.

    In general, X-ray datasets are rarely publicly available to researchers, and datasets that provide full distortion correction and geometry calibration in particular are almost not available at all. This makes collecting this data and providing it to the public very valuable for many researchers.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    • Open fluoroscopy datasets are scarce, and the dual-plane dataset presented in this paper in particular adds a significant benefit for researchers.
    • In particular for research related to deep learning and/or 2D/3D registration this extensive collection, including calibration, provides a huge benefit to researchers.
    • The paper is well written, clearly structured, and easy to follow.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    • The dataset contains images from a single detector type only. Images from flat-panel detectors and other vendors would add additional, essential information for deep learning models.
    • The dataset only contains images of implants. For 2D/3D registration as well as deep learning in particular, bone and soft tissue would be highly relevant too, but these are missing from the dataset.
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The authors claimed to release the source code and/or dataset upon acceptance of the submission.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (4) Weak Accept — could be accepted, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    In summary, I believe this dataset represents an important contribution to the scientific community.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    My opinion has not changed after the rebuttal; definitely a valuable data set.




Author Feedback

Rebuttal to Reviewers

We sincerely thank all reviewers for their valuable comments.

Major Issues (Reviewers 1, 2, 3)

  1. Limited discussion of downstream applications: The Veriserum dataset is designed for solving experimental research problems and downstream tasks such as image registration, implant segmentation, and 3D reconstruction for post-op patients [1]. We refined the poses using an automated PyTorch3D-based renderer, exemplifying registration as a key downstream task. The diversity of the poses in the dataset supports data-driven (ML, AI) approaches by improving generalization and preventing overfitting [2, Ref 8&18]. Segmentation masks can be generated via rendering, helping implant segmentation and pose initialization [Ref 2].
    • [1] Mescheder et al., Occupancy Networks, CVPR 2019.
    • [2] Wang et al., Multi-View Point-Based Registration, Engineering, 2021.
  2. Necessity of dual-plane dataset compared to CAMS-Knee: CAMS-Knee suffers from large out-of-plane errors due to its single-plane setup. Our analysis using CAMS-Knee data revealed Z-axis instability. Veriserum, with its dual-plane setup, provides high-precision ground truth, which enables direct analysis of registration accuracy in 3D space. It supports the pre-training of AI-based models under artificial occlusions or noise and reduces the development effort for dual-plane fluoroscope pipelines.

  3. Novelty in methodology and dual-plane renderer: We developed a novel differentiable surface renderer built on PyTorch3D, which is used for automated implant registration (refined pose). This module is compatible with PyTorch, supports gradient-based optimizers, and has potential for future integration into AI-based networks. The open-source code is publicly available as a service to fellow researchers.
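
The out-of-plane argument in item 2 can be made concrete with a small pinhole-projection example. This is an illustrative sketch only and does not come from the rebuttal or the Veriserum code: the source-to-detector distance, the source-to-isocenter distance, the exactly orthogonal second view, and all function names are assumptions. It compares how far a point moves on each detector when it is displaced by 1 mm along the viewing axis of the first plane.

```python
import math

SDD = 1000.0  # assumed source-to-detector distance in mm
D = 500.0     # assumed source-to-isocenter distance in mm


def project(cam_point, sdd=SDD):
    """Pinhole projection of (x, y, depth) in source coordinates."""
    x, y, depth = cam_point
    return (sdd * x / depth, sdd * y / depth)


def to_view_a(p, d=D):
    """View A: source at (0, 0, -d), looking along world +z."""
    x, y, z = p
    return (x, y, z + d)


def to_view_b(p, d=D):
    """View B: source at (-d, 0, 0), looking along world +x."""
    x, y, z = p
    return (z, y, x + d)


def image_shift(p, q, to_view):
    """Distance in mm on the detector between the images of p and q."""
    u0, v0 = project(to_view(p))
    u1, v1 = project(to_view(q))
    return math.hypot(u1 - u0, v1 - v0)


# A feature 10 mm off-axis, displaced 1 mm along the z-axis,
# which is the out-of-plane direction for view A.
before = (10.0, 0.0, 0.0)
after = (10.0, 0.0, 1.0)

shift_a = image_shift(before, after, to_view_a)
shift_b = image_shift(before, after, to_view_b)
print(f"view A shift: {shift_a:.4f} mm")
print(f"view B shift: {shift_b:.4f} mm")
```

Under these assumed distances the displacement changes the image in view A only through a slight change in magnification, while in view B it appears as an in-plane shift that is roughly fifty times larger. This is the geometric reason a second plane constrains the depth component of a pose much better than a single view does.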

Minor Issues

Reviewer 1

  1. Deviation from clinical-grade implants: The implant geometries were adapted from CAD models provided by our collaborator (Zimmer Biomet). We smoothed inside engravings and adjusted non-loading support wings to avoid IP issues while retaining a realistic shape. Implant geometries differ between manufacturers anyway, so transfer learning is required in any case.

  2. Presence of robotic attachment in images: The robotic attachment has a different X-ray attenuation from the implant, allowing it to be separated via traditional vision tools. In our evaluations, the attachment did not interfere with our automated pose estimation.

  3. Comparison in spatial accuracy with other 2D/3D registration methods: AI-based learning models for implant registration currently yield higher registration errors than traditional 2D/3D approaches. Training AI models with our refined poses (MAE < 0.8 mm, 0.9°) is expected to improve prediction accuracy, potentially outperforming traditional methods. Our diverse, realistic kinematics will further enhance model robustness.
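
As a reading aid for the MAE figures quoted in item 3, the sketch below shows one plausible way to compute a translation and rotation mean absolute error between two sets of 6-DOF poses. It is an assumption, not the authors' evaluation code: the pose layout (three translations in mm followed by three Euler angles in degrees), the per-axis averaging, and the name `pose_mae` are all hypothetical, and the paper may define its rotation error differently.

```python
import numpy as np


def pose_mae(pred, ref):
    """Mean absolute error between two (N, 6) pose arrays.

    Assumed column layout: tx, ty, tz in mm, then rx, ry, rz in degrees.
    Returns (translation MAE in mm, rotation MAE in degrees).
    """
    pred = np.asarray(pred, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    if pred.shape != ref.shape or pred.ndim != 2 or pred.shape[1] != 6:
        raise ValueError("expected two arrays of shape (N, 6)")
    diff = pred - ref
    # Wrap angle differences into [-180, 180) so 179 vs -179 counts as 2.
    diff[:, 3:] = (diff[:, 3:] + 180.0) % 360.0 - 180.0
    abs_err = np.abs(diff)
    return float(abs_err[:, :3].mean()), float(abs_err[:, 3:].mean())


# Made-up poses, for illustration only.
refined = [[1.0, 2.0, 3.0, 179.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 10.0, 20.0, 30.0]]
manual = [[1.5, 2.0, 3.0, -179.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 10.0, 20.0, 33.0]]

t_mae, r_mae = pose_mae(refined, manual)
print(f"translation MAE: {t_mae:.3f} mm, rotation MAE: {r_mae:.3f} deg")
```

The angle wrap matters when poses cross the ±180° boundary; without it, two nearly identical orientations could be reported as almost 360° apart.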

Reviewers 2 & 3

  1. Lack of flat-panel and soft tissue / bone structures: Our lab, like many other research institutes, does not yet have flat-panel detectors, but with our postprocessing the obtained images are suitable for downstream tasks regardless of vendor. An upgrade to flat-panel detectors is planned.
    Our dataset focuses on post-operative implant imaging. With Veriserum and transfer learning, models can be trained with limited clinical data. Models with soft tissue are hard to manipulate across diverse poses, limiting their use. However, ongoing data collection with real participants will soon expand our dataset.

Reviewer 3

  1. Only discrete static frames: Veriserum is not merely a collection of discrete static frames, but rather a set of fluoroscopic data acquired while a high-precision robotic arm executed the motions, so the frames form smooth trajectories when played in sequence. Therefore, with the robot and refined poses provided, Veriserum supports dynamic tracking and time-series tasks to a certain extent.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    The value of the paper is the unique and significant dataset. While it has limitations – no soft tissue – the extension of the dataset and data is valuable.


