Abstract
Red blood cells (RBCs) are fundamental to human health, and precise morphological analysis is critical for diagnosing hematological disorders. Despite the potential of foundation models for medical diagnostics, comprehensive AI solutions for RBC analysis remain limited.
We introduce RedDino, a self-supervised foundation model specifically designed for RBC image analysis. Leveraging an RBC-tailored version of the DINOv2 self-supervised learning framework, RedDino is trained on an extensive, meticulously curated dataset of 1.25 million RBC images from diverse acquisition modalities and sources. Comprehensive evaluations demonstrate that RedDino significantly outperforms existing state-of-the-art models on RBC shape classification. Through systematic assessments, including linear probing and nearest-neighbor classification, we validate the model’s robust feature representation and strong generalization capabilities. Our key contributions are (1) a dedicated foundation model tailored for RBC analysis, (2) detailed ablation studies exploring DINOv2 configurations for RBC modeling, and (3) a comprehensive evaluation of generalization performance.
We address key challenges in computational hematology by developing RedDino, a robust and generalizable model that captures nuanced morphological characteristics and represents a substantial advancement in developing reliable diagnostic tools.
The source code and pretrained models for RedDino are available at https://anonymous.4open.science/r/RedDino-1F17 .
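To make the evaluation protocol mentioned above concrete, below is a minimal linear-probing sketch on frozen embeddings. It assumes features have already been extracted with a frozen backbone (e.g., the released RedDino weights) and saved to the hypothetical files shown; it is an illustration under those assumptions, not the authors' released evaluation code.

```python
# Minimal linear-probing sketch (illustrative; file names are hypothetical).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, balanced_accuracy_score

# Frozen-backbone embeddings of shape (n_samples, dim) and integer class labels.
X_train, y_train = np.load("train_embeddings.npy"), np.load("train_labels.npy")
X_test, y_test = np.load("test_embeddings.npy"), np.load("test_labels.npy")

# A linear probe is a single logistic-regression classifier trained on frozen features.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
pred = probe.predict(X_test)

print("weighted F1:", f1_score(y_test, pred, average="weighted"))
print("balanced accuracy:", balanced_accuracy_score(y_test, pred))
```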
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4083_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/Snarci/RedDino
Link to the Dataset(s)
Provided in the paper
BibTex
@InProceedings{ZedLuc_RedDino_MICCAI2025,
author = { Zedda, Luca and Loddo, Andrea and Di Ruberto, Cecilia and Marr, Carsten},
title = { { RedDino: A foundation model for red blood cell analysis } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15963},
month = {September},
pages = {443--452}
}
Reviews
Review #1
- Please describe the contribution of the paper
The paper curates a large-scale dataset of red blood cell (RBC) images to develop a foundation model for RBC analysis. The authors use the DINOv2 framework and exceed the state of the art in RBC analysis.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
- The authors exceed many of the previously proposed foundation models for similar tasks, outperforming DINOv2 and DinoBloom by ~3% and ~3.5% in F1 and accuracy.
- A compelling visualization of the learned features (PCA, UMAP) is shown, demonstrating the model’s capabilities in distinguishing between visual features across classes.
- The curation of a dataset for training an RBC foundation model is valuable work and no doubt an important contributor to this model’s superior performance.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The methodology section is weak and has only two parts: training data and testing data. In short, it seems the authors mostly fine-tuned DINOv2 on a curated RBC dataset. It is encouraging to see that this was sufficient to beat the SOTA, but there appear to be no additional methodological enhancements.
- Following from the previous point, the technical novelty is limited, with this work being very similar to some of the methods it compares against (e.g., DinoBloom). No new training strategy (SSL or otherwise) is proposed, and the model is not adapted to a new task. Instead, the dataset is slightly different (perhaps bigger and more diverse, which is certainly good) and the model performs better.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(2) Reject — should be rejected, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The limited technical novelty of the work is difficult to overlook, despite the good performance reported by the authors. The curation of a new large dataset is a noteworthy contribution, but there are no methodological enhancements to the DINO method.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #2
- Please describe the contribution of the paper
They trained a self-supervised foundation model based on the DINO architecture, specialized in red blood cell analysis tasks. These models achieve state-of-the-art results on several tasks. The authors plan to provide code and model weights.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1) Experimenting with different ways to train DINO, focusing either on single-cell images or on patches.
2) Aggregating multiple datasets and providing clear visualisations of each one.
3) Providing strong and consistent SOTA results on multiple tasks.
4) Providing potential explanations for the results, such as why the patch-wise approach performs better than the single-cell approach.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1) Many results in the ablation study are below the baseline.
2) Some hypotheses are presented with confidence but are not supported by evidence. In particular, the sentence “Pathological and abnormal RBCs, which should stand out in the feature space, were overly suppressed” is not supported by evidence. The removal of the KoLeo loss increases the F1, but the reason is not proven; the suggestion that it is due to the suppression of abnormal RBCs in the feature space is a hypothesis, not a fact. Please introduce this with less certainty or provide data or a citation that supports this claim. The KoLeo loss penalizes embedding two different samples very close to each other; it is unclear to me why it would suppress abnormal samples (see the KoLeo sketch after this list).
3) Replacing the moving-average centering with Sinkhorn-Knopp centering is an idea already tested in the original DINOv2 paper, and the results in RedDino just replicate the DINOv2 paper results. SK centering is included in the base DINOv2 configuration. Please make this clear in the text.
4) The sentence “We then proceeded to train a suite of models for 2,000 iterations each, after which performance decreased over time, a pattern not limited to our RedDino models but a well-known phenomenon in foundation model research [26]” is problematic: it is not true that training saturates in general after 2k iterations. This depends on dataset size, diversity, and hyperparameters.
5) Does 1-NN make sense? Why did you choose to report 1 and 20 neighbors? Usually 1-NN is avoided due to its high variance. Was 20-NN the optimal number of neighbors? How did you find this number? Grid search?
6) Why is RedDino base better than large in some cases? Perhaps the dataset size and diversity are not enough to provide inductive biases for the largest model. The phrase “Furthermore, RedDino base proved to be a strong general solution, balancing performance with efficiency by utilizing fewer parameters (86 million vs. 304 million)” is not very precise or informative.
7) The claim of “a custom DINOv2 architecture” is not justified, as there have been no changes to the architecture, only to the augmentations, which belong to the dataloader. If there are other modifications, please include them. Please detail your contributions to the architecture or remove this sentence.
8) Please provide better task descriptions, such as the number of classes and approximate class distribution, so the reader has an intuition of task difficulty. Consider also reporting per-class scores.
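For reference, here is a minimal PyTorch sketch of the KoLeo regularizer as described in the DINOv2 paper (a nearest-neighbor differential-entropy estimator applied to L2-normalized features). This is an illustration of the loss the reviewer refers to, not code from the paper under review.

```python
import torch
import torch.nn.functional as F

def koleo_loss(embeddings: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """KoLeo regularizer: encourages features to spread out by maximizing
    the log distance of each sample to its nearest neighbor in the batch."""
    x = F.normalize(embeddings, dim=-1)      # L2-normalize features
    sims = x @ x.t()                         # pairwise cosine similarities
    sims.fill_diagonal_(-1.0)                # exclude self-matches
    nn_idx = sims.argmax(dim=1)              # nearest neighbor of each sample
    nn_dist = (x - x[nn_idx]).norm(dim=1)    # Euclidean distance to that neighbor
    return -torch.log(nn_dist + eps).mean()  # collapsed (tiny) distances are penalized heavily
```

Because the loss grows whenever two embeddings collapse together, it uniformly spreads the feature space; whether that spreading "suppresses" rare abnormal cells is exactly the hypothesis the reviewer asks the authors to support with evidence.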
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
1) The single-cell approach is entirely below the baseline and thus not so relevant; also, DINO was not conceived to work at the single-cell level, since the loss establishes a local-global correspondence, so subpar performance at the single-cell level might be expected. In any case, it is nice to include negative results for the benefit of the research community.
2) On the DSE dataset the balanced accuracy is much lower (~16 pts) than the weighted F1 score, but on the Chula and Elsafty datasets the two scores are very similar. Please comment on these results.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
Applying DINOv2 to a new dataset is very useful for the community, and the model achieves SOTA performance and generalizes across datasets. Furthermore, providing code for reproducibility, 4- or 5-fold cross-validation with standard deviations, and qualitative results is also highly appreciated. However, the novelty is exaggerated, as the DINOv2 architecture is used almost out of the box. More extensive hyperparameter optimization, for example with Bayesian optimization, could have enhanced the results.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Review #3
- Please describe the contribution of the paper
This paper introduces RedDino, an open-weight foundation model family for red blood cell (RBC) representation learning, based on a tailored adaptation of the DINOv2 self-supervised learning framework. Trained on over 3 million RBCs extracted from 18 diverse datasets, RedDino is evaluated on a held-out dataset (Elsafty) and further tested on Chula and DSE. The authors assess generalization performance using linear probing and k-nearest neighbor classification.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The paper addresses an important and underexplored area in medical AI. It contributes a large-scale RBC image resource, and the authors openly share pretrained models and code. The evaluation is satisfactory and includes comparisons to state-of-the-art approaches such as DinoBloom and DINOv2. RedDino demonstrates improved generalization, including in unbalanced and cross-source scenarios.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
The evaluation focuses exclusively on classification tasks. While this is appropriate for benchmarking, it may limit insight into how well the learned representations transfer to other relevant tasks in hematology, such as segmentation, anomaly detection, or progression tracking. Addressing this would strengthen the manuscript.
- Please rate the clarity and organization of this paper
Good
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(5) Accept — should be accepted, independent of rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This is a strong and well-executed paper that presents a meaningful advance in self-supervised representation learning for red blood cell analysis. Despite the focus on classification, the results and resources provided are valuable and relevant to the MICCAI community.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
N/A
- [Post rebuttal] Please justify your final decision from above.
N/A
Author Feedback
Dear Area Chair, dear Reviewers,
Thank you for your valuable feedback and for recognizing the strengths of our work. We are pleased that all reviewers acknowledged our model’s superior performance over the state-of-the-art (SOTA), along with its enhanced generalizability, even in unbalanced scenarios (R1)—a crucial aspect in red blood cell (RBC) analysis.
We appreciate the recognition of our effort in curating the largest source of both patch and single-cell RBC data (R1, R2, R3). We believe that the quality and scale of this dataset have significantly contributed to the improvements over previous models. Moreover, we are grateful for R2’s acknowledgment of our detailed evaluation of patch and single-cell approaches during training, which offered critical insights into their respective benefits for RBC analysis. We also thank R3 for emphasizing the importance of our feature visualization experiments, which provided an explainable analysis of class-specific features and highlighted the emerging properties of RedDino.
Regarding technical novelty, we understand R3’s observation about the similarities with DinoBloom. However, our approach features distinct adaptations, most notably the customized DINOv2 framework. Our modifications include collecting the largest RBC dataset, investigating two distinct image sources (patches and single cells), and adjusting the final loss composition by removing the KoLeo loss. This adaptation enhances model convergence and improves performance, outperforming previous baselines, as confirmed by our ablation studies.
Additionally, we implemented a comprehensive set of 32 augmentations, which, to our knowledge, have not been used in DinoBloom or other DINOv2 SSL pretraining methods. These augmentations further improve the model’s robustness and generalization.
In response to R2’s comments (points 2, 3, and 7), we acknowledge that our hypotheses are based on the results obtained, including the impact of removing the KoLeo loss. We also clarify that the Sinkhorn-Knopp centering is consistent with the original DINOv2 paper (point 3), and that our modifications are limited to augmentations, not architectural changes (point 7). The misleading phrasing mentioned in point 4 will be revised for clarity in the camera-ready version.
Addressing R2’s point 5, we explain our use of 1-NN and 20-NN: 1-NN is effective in imbalanced scenarios, as seen in the DSE dataset, while 20-NN assesses feature robustness across a larger neighborhood. This balanced approach ensures fair comparison with DinoBloom and provides a deeper understanding of model behavior. Applying a more common 5-NN or 10-NN approach to the DSE dataset, for example, would invalidate the results, since its test set contains fewer than five samples for some classes.
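To make this protocol concrete, here is a minimal sketch of the k-NN probe discussed above, reusing the hypothetical frozen-embedding arrays from the linear-probing sketch after the abstract; it is illustrative only, not the authors' evaluation code.

```python
# Hypothetical k-NN probe on frozen embeddings; k = 1 and k = 20 as discussed above.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score, balanced_accuracy_score

for k in (1, 20):
    knn = KNeighborsClassifier(n_neighbors=k, metric="cosine")
    knn.fit(X_train, y_train)                # frozen features, no fine-tuning
    pred = knn.predict(X_test)
    print(f"{k}-NN  weighted F1: {f1_score(y_test, pred, average='weighted'):.3f}  "
          f"balanced acc: {balanced_accuracy_score(y_test, pred):.3f}")
```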
For point 6, we will rephrase our explanation regarding the lower performance of larger models, highlighting that it is linked to limited training data diversity and the well-known “curse of dimensionality.”
We plan to include the class distributions in the camera-ready version (point 8), but we will be unable to provide per-class scores due to space constraints.
Finally, we acknowledge R1’s concern regarding the limited diversity of tasks in our evaluation. Our focus is to develop a single robust model that can be adapted for various RBC tasks, streamlining future research.
Once again, we thank the reviewers for their constructive feedback, which has greatly helped us enhance the quality of our work.
Meta-Review
Meta-review #1
- Your recommendation
Provisional Accept
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A