Abstract

Contrastive pretraining provides robust representations by ensuring their invariance to different image transformations while simultaneously preventing representational collapse. Equivariant contrastive learning, on the other hand, provides representations sensitive to specific image transformations while remaining invariant to others. By introducing equivariance to time-induced transformations, such as disease-related anatomical changes in longitudinal imaging, the model can effectively capture such changes in the representation space. In this work, we propose a Time-equivariant Contrastive Learning (TC) method. First, an encoder embeds two unlabeled scans from different time points of the same patient into the representation space. Next, a temporal equivariance module is trained to predict the representation of a later visit based on the representation from one of the previous visits and the corresponding time interval with a novel regularization loss term while preserving the invariance property to irrelevant image transformations. On a large longitudinal dataset, our model clearly outperforms existing equivariant contrastive methods in predicting progression from intermediate age-related macular degeneration (AMD) to advanced wet-AMD within a specified time-window.
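
The two-step procedure described in the abstract (embed two visits, then predict the later representation from the earlier one and the time interval) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the linear `encode`, the tanh-based displacement in `predict_future`, the hinge-style regularizer, and the names `tc_style_loss`, `reg_weight`, and `min_norm` are all hypothetical choices made for illustration. The invariance (contrastive) term that the paper combines with this objective is omitted here.

```python
import math
import random


def encode(scan, weights):
    """Toy linear encoder (assumption): flat scan -> representation vector."""
    return [sum(w * x for w, x in zip(row, scan)) for row in weights]


def predict_future(z_early, dt, pred_weights):
    """Toy equivariance module (assumption): predicts the later representation
    as the earlier one plus a displacement computed from (z_early, dt)."""
    inp = z_early + [dt]  # concatenate representation and time interval
    displacement = [
        math.tanh(sum(w * v for w, v in zip(row, inp))) for row in pred_weights
    ]
    z_pred = [z + d for z, d in zip(z_early, displacement)]
    return z_pred, displacement


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def tc_style_loss(z_early, z_late, dt, pred_weights, reg_weight=0.1, min_norm=0.5):
    """Prediction error plus a hypothetical regularizer that penalizes
    near-zero displacements, so the predictor cannot collapse to identity."""
    z_pred, displacement = predict_future(z_early, dt, pred_weights)
    equivariance = mse(z_pred, z_late)
    disp_norm = math.sqrt(sum(d * d for d in displacement))
    regularizer = max(0.0, min_norm - disp_norm)  # hinge on displacement norm
    return equivariance + reg_weight * regularizer


# Usage with random toy data: two "scans" of one patient, 6 months apart.
rng = random.Random(0)
dim_in, dim_z = 8, 4
enc_w = [[rng.uniform(-1, 1) for _ in range(dim_in)] for _ in range(dim_z)]
pred_w = [[rng.uniform(-1, 1) for _ in range(dim_z + 1)] for _ in range(dim_z)]
scan_t0 = [rng.random() for _ in range(dim_in)]
scan_t1 = [rng.random() for _ in range(dim_in)]
dt = 6 / 12  # time interval in years (normalization is an assumption)
loss = tc_style_loss(encode(scan_t0, enc_w), encode(scan_t1, enc_w), dt, pred_w)
```

In this sketch the regularizer is the only term that keeps the predictor from ignoring the time interval; the paper's actual regularization loss may take a different form.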

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/3246_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/3246_supp.pdf

Link to the Code Repository

https://github.com/EmreTaha/TC-time_equivariant_disease_progression

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Emr_Learning_MICCAI2024,
        author = { Emre, Taha and Chakravarty, Arunava and Lachinov, Dmitrii and Rivail, Antoine and Schmidt-Erfurth, Ursula and Bogunović, Hrvoje},
        title = { { Learning Temporally Equivariance for Degenerative Disease Progression in OCT by Predicting Future Representations } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15012},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors employ a time-equivariant contrastive learning approach that effectively utilizes images from multiple visits to predict the progression of Age-related Macular Degeneration (AMD). This method demonstrates the application of temporal dynamics in medical image analysis.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper introduces an equivariance module that effectively propagates representation to future stages, and a regularization loss designed to maintain sensitivity to temporal changes.
    2. Extensive experiments are conducted comparing the method against several baselines within two specific time windows (6 and 12 months). Additionally, ablation studies provide insight into the impact of the displacement map (DM) and the regularization loss on model performance.
  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. In the introduction section, the initial discussion on contrastive learning, specifically the Siamese network architecture and its adaptation to methods without negative pairs, lacks citations of several related studies [1, 2, 3].
    2. The datasets utilized for pretraining are relatively small in size, consisting of approximately 10,000 samples. This sample size raises concerns regarding the effectiveness of contrastive pretraining. Expanding the dataset or demonstrating the method’s efficacy on larger datasets would significantly bolster the findings and showcase the model’s robustness in real-world settings.
    3. It would be beneficial to include comparisons with additional contrastive learning baselines in the results section. For instance, methods like SimCLR and DCL are also applicable to smaller batch sizes (such as the batch size of 128 mentioned in this paper). Relevant implementations can be found at this link: https://github.com/lightly-ai/lightly?tab=readme-ov-file

    [1] Chen, Ting, et al. “A simple framework for contrastive learning of visual representations.” International conference on machine learning. PMLR, 2020.
    [2] Wu, Zhirong, et al. “Unsupervised feature learning via non-parametric instance discrimination.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    [3] Grill, Jean-Bastien, et al. “Bootstrap your own latent - a new approach to self-supervised learning.” Advances in neural information processing systems 33 (2020): 21271-21284.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I believe that addressing the points above could significantly strengthen the manuscript.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    While I lean towards a weak accept for this paper, addressing the points mentioned above could strengthen the manuscript significantly.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The contribution of the paper is a time-equivariant contrastive learning framework that makes the representation space sensitive to changes in disease state at different points in AMD progression.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The main strength is a strong evaluation across typical self-supervised methods as well as typical equivariant contrastive methods. Additionally, the motivation for the work makes sense as well as having a methodology that flows directly from the associated formulation of the regularization term.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the paper is that it is currently evaluated with respect to a narrow setting of wet-AMD conversion. I am unsure if there is something specific about the wet-AMD conversion that makes the authors’ method more tractable in this setting or if the proposed work is a more generalizable way of dealing with time-series contrastive learning applications. It would have been interesting to also compare against the typical approaches used in computer vision to see if there is something about your approach that works specifically within this domain. An example would be this paper:

    Jenni, S., & Jin, H. (2021). Time-equivariant contrastive video representation learning. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 9970-9980).

    Another approach would be to have more datasets to compare with within the wet-AMD setting, but I understand that such datasets can be difficult to obtain within a medical context.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    The dataset is publicly available.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    I think one point of interest would be to analyze your method in other settings as well. I understand the intuition behind your approach, but I cannot tell if it is due to the nature of wet-AMD data.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I think the experimental validation and clarity of the work are well done. My only issue is that all the work was done on a single dataset within a constrained application setting, but considering the nature of medical applications this can also be viewed as an advantage of the methodology. Additionally, I believe it would have been informative to see how a traditional computer vision technique for contrastive learning in a time series would have done.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    The paper presents a method for learning time equivariant models from longitudinal medical imaging data.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The topic of the paper is very interesting and relevant. While longitudinal patient data is relatively common, there are not many DL methods specifically tailored to modelling it.

    The methodology seems quite well motivated and is novel. The evaluation is appropriate and includes comparison to strong baselines and an ablation study.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The main weakness of the paper is that it is evaluated on a single clinical application. Evaluation on several time series datasets would have been much more convincing.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The presented methodology seems to be much more general and not specifically limited to this application domain. While I understand the limited space of the manuscript, an inclusion of experiments on additional datasets would make the paper much stronger. A discussion on the limitations of the methods would also be appreciated.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    Well-motivated and novel methodology, extensive evaluation, good results compared to baseline methods.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank all the reviewers for their valuable comments and feedback. The reviewers acknowledged the methodological novelty and the extensive evaluation of our paper with an early acceptance.

The reviewers raised concerns related to the limitations of the method and comparison with additional datasets:

(i) Limitation of the Dataset (R3,R4,R5): Our scans were collected from 1000 patients through multi-institutional collaboration. The irreversible nature of wet-AMD as a degenerative disease makes it the most suitable for our task. Additionally, public datasets have less frequent visits than our dataset (24 visits per patient). Even though our dataset is temporal, we process only 2 time points with a large time difference as input, unlike video-specific models (R5). Finally, we agree with R3 and R5 that multiple medical tasks would enhance the strength of TC. We leave this as an extension to our method.

(ii) Lack of comparison against the other contrastive methods (R4): Our method has been extensively evaluated against multiple equivariant methods. All of them, as well as our method, can be trained with any other contrastive method. We chose VICReg for its popularity, but our loss terms are compatible with SimCLR and DCL. In the camera-ready version, we will discuss contrastive methods more extensively in the introduction section, as the reviewer (R4) suggested.

(iii) Limitation of TC (R3): The major limitation of our method is that it relies on the assumption of irreversible disease progression. Even though most degenerative diseases fall into this category, this assumption limits the general use of the future prediction module. We will extend the conclusion section with this limitation.




Meta-Review

Meta-review not available, early accepted paper.


