Abstract

Medical image analysis suffers from a shortage of data, whether annotated or not. This becomes even more pronounced when it comes to 3D medical images. Self-Supervised Learning (SSL) can partially ease this situation by utilizing unlabeled data. However, most existing SSL methods can only make use of data in a single dimensionality (e.g. 2D or 3D), and are incapable of enlarging the training dataset by using data with differing dimensionalities jointly. In this paper, we propose a new cross-dimensional SSL framework based on a pseudo-3D transformation (CDSSL-P3D), that can leverage both 2D and 3D data for joint pre-training. Specifically, we introduce an image transformation based on the im2col algorithm, which converts 2D images into a format consistent with 3D data. This transformation enables seamless integration of 2D and 3D data, and facilitates cross-dimensional self-supervised learning for 3D medical image analysis. We run extensive experiments on 13 downstream tasks, including 2D and 3D classification and segmentation. The results indicate that our CDSSL-P3D achieves superior performance, outperforming other advanced SSL methods.
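The im2col-based conversion mentioned in the abstract can be illustrated on a toy image. This is a minimal sketch under stated assumptions, not the authors' implementation: the paper's exact formulation is not reproduced on this page, so the window size `k`, the `stride`, the top-left anchoring of each window, the row-by-row ordering along the new axis, and the function name `pseudo3d` are all hypothetical.

```python
def pseudo3d(image, k, stride):
    """Unroll each k x k window of a 2D image into a depth axis (im2col-style).

    Returns a nested list indexed as volume[z][y][x], with z in range(k * k).
    Assumption: windows are anchored at their top-left corner and unrolled
    row by row; the paper may use a different convention.
    """
    height, width = len(image), len(image[0])
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1
    volume = []
    for z in range(k * k):
        dy, dx = divmod(z, k)  # raster-scan offset inside the window
        plane = [
            [image[stride * y + dy][stride * x + dx] for x in range(out_w)]
            for y in range(out_h)
        ]
        volume.append(plane)
    return volume


# Toy 4 x 4 image whose pixel value encodes its position as 10 * row + column.
image = [[10 * r + c for c in range(4)] for r in range(4)]
volume = pseudo3d(image, k=2, stride=2)

# Depth, height and width of the resulting pseudo-3D volume.
print(len(volume), len(volume[0]), len(volume[0][0]))
# The depth column at (y=0, x=0) is the unrolled top-left 2 x 2 window.
print([volume[z][0][0] for z in range(len(volume))])
```

With this layout the output has the same axis structure as a small 3D volume, so it could be fed to a 3D backbone alongside genuine 3D scans, which is the property the abstract relies on.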

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/0308_paper.pdf

SharedIt Link: https://rdcu.be/dV58c

SpringerLink (DOI): https://doi.org/10.1007/978-3-031-72120-5_17

Supplementary Material: N/A

Link to the Code Repository

N/A

Link to the Dataset(s)

N/A

BibTeX

@InProceedings{Gao_CrossDimensional_MICCAI2024,
        author = { Gao, Fei and Wang, Siwen and Zhang, Fandong and Zhou, Hong-Yu and Wang, Yizhou and Wang, Churan and Yu, Gang and Yu, Yizhou},
        title = { { Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15011},
        month = {October},
        pages = {178--188}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    Describes a way of projecting 2D images into 3D so that they can be used to pretrain a 3D model with self-supervised learning. It can be easier to get hold of large numbers of 2D images (such as radiographs) compared to 3D images (CT). The method is evaluated on 13 classification/segmentation tasks and compared to other self-supervised learning methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    Apparently simple approach to turning 2D images into pseudo-3D images. Results suggest improved performance compared to alternative approaches. There is a reasonably large improvement in the 3D segmentation tasks by adding the 2D data, significantly more than that obtained by the other technique which uses 2D and 3D data (UniMiSS). The improvement on the 2D task is smaller (about 2%).

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The key step, converting the 2D image into a 3D volume, is not clearly explained. An equation of the form I_3(x,y,z) = I_2(…, …) would really help. I assume from the sizes of the patches in Fig.2b that the z-row in I_3(x,y,:) is the raster scan of the k x k patch centred at (sx,sy) in the 2D image – is that right? If that is the case, the detailed explanation of the im2col approach is rather confusing, and I would recommend removing it. It is hard to see why this works, as the 2D structure is somewhat mangled when spreading it over 3D.
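An equation of the requested form might look as follows. This is a sketch of the reviewer's reading only, not the paper's definition: the window size k, the stride s, the top-left anchoring of each window (instead of centring it at (sx, sy)) and the row-by-row ordering along z are all assumptions.

```latex
I_3(x, y, z) = I_2\bigl(s x + (z \bmod k),\; s y + \lfloor z / k \rfloor\bigr),
\qquad 0 \le z < k^2
```

Under these assumptions, fixing z gives a strided subsample of I_2, and fixing (x, y) gives the raster scan of one k x k window.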

  • Please rate the clarity and organization of this paper

    Satisfactory

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    If the main algorithm can be described clearly then it should be fairly reproducible.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    The idea of attempting to leverage 2D data to help 3D tasks is a good one, and the proposed method does better than the alternative method (UniMiSS) at this. It is hard to see why this works, as the 2D structure is somewhat mangled when spreading it over 3D. Presumably one slice of the new 3D volume is effectively a downsampled version of the original image, so some structure is retained in the (x,y) dimensions, but the z direction is a bit strange, as it is a patchwork made up of different rows from the original. It would have been helpful to show alternative schemes for the 2D to 3D mapping to give more insight, for instance creating 3D volumes by duplicating 2D slices.
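Both observations in this comment can be checked on a toy image. The sketch below assumes the reading of the transformation given in this review (the depth index z selects a raster-scan offset inside a k x k window, with windows anchored at their top-left corner); `depth_plane` and `duplicate_slices` are hypothetical helper names, and neither is taken from the paper.

```python
def depth_plane(image, k, stride, z):
    """Plane z of the assumed pseudo-3D volume: a strided subsample of the image."""
    dy, dx = divmod(z, k)  # assumed raster-scan offset inside each k x k window
    rows = range(dy, len(image) - k + dy + 1, stride)
    cols = range(dx, len(image[0]) - k + dx + 1, stride)
    return [[image[r][c] for c in cols] for r in rows]


def duplicate_slices(image, depth):
    """Alternative mentioned in the review: stack copies of one 2D slice."""
    return [[row[:] for row in image] for _ in range(depth)]


# Toy 4 x 4 image whose pixel value encodes its position as 10 * row + column.
image = [[10 * r + c for c in range(4)] for r in range(4)]

first = depth_plane(image, k=2, stride=2, z=0)
last = depth_plane(image, k=2, stride=2, z=3)
stacked = duplicate_slices(image, depth=4)

print(first)  # every second pixel, starting at (0, 0)
print(last)   # every second pixel, starting at (1, 1)
print(all(plane == image for plane in stacked))
```

In the assumed transformation each plane is a differently shifted subsample, so the volume varies along z. With plain duplication every plane is identical and the z axis carries no additional information, which is one reason a comparison between the two schemes would be informative.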

    Minor points:
      • I suggest changing “utilizing” -> “using”, “utilize” -> “use”
      • Fig.1: “Fintune” -> “Fine-tune”
      • P3: “preblem” -> “problem”
      • P4: “is maintained in two dimension of” -> “are maintained in two dimensions of”
      • P4: “is converted to complex 3D space” -> “is converted to 3D space” (“complex” doesn’t add anything useful)
      • P5: “force the model capture” -> “force the model to capture”

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Reject — could be rejected, dependent on rebuttal (3)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper shows that using large amounts of 2D data can help train 3D tasks, but it isn’t clear why it works, given that the 2D data is rather mangled when mapping to 3D.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    A method to transform 2D images to 3D with locally realistic appearance, so that they can be used for self-supervised pretraining.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    An interesting idea of transforming 2D to 3D images. Good experimental validation, good results.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The method is not described very clearly; explicit equations would help. It takes 3-4 pages until the main idea is exposed.

    If I understand correctly, the k × k possible shifts are transformed to one new dimension. We do not know exactly how, but I expect that the shifts are stacked row by row. In that case, there would be a discontinuity when jumping between the last pixel of one row and the first pixel of the next row.
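The discontinuity described here can be made concrete. The sketch below assumes the row-by-row stacking that the reviewer expects (the paper's actual ordering is not confirmed on this page) and measures how far apart, in the original 2D image, two neighbours along the new axis are. The helper names `raster_offsets` and `step_lengths` are hypothetical.

```python
def raster_offsets(k):
    """(dy, dx) offsets of a k x k window in row-by-row order."""
    return [divmod(z, k) for z in range(k * k)]


def step_lengths(offsets):
    """Chebyshev distance in the 2D image between consecutive offsets."""
    return [
        max(abs(b[0] - a[0]), abs(b[1] - a[1]))
        for a, b in zip(offsets, offsets[1:])
    ]


steps = step_lengths(raster_offsets(3))
print(steps)  # a value above 1 marks a jump from the end of one row to the next
```

For k = 3 the steps larger than 1 occur exactly at the two row wraps, so neighbours along the new axis are usually, but not always, neighbours in the original image.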

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Exponential notation should be used instead of writing e.g. “1e-5”

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    good idea, good experimental evaluation, good results

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes a Cross-Dimensional Self-Supervised Learning framework based on a Pseudo-3D transformation (CDSSL-P3D) that can utilize 2D and 3D medical images for dictionary learning. By transforming 2D medical images into 3D based on the im2col algorithm, CDSSL-P3D enables data from different dimensions to be utilized for learning together. This enables the AI model to learn the representation of medical images from a more diverse perspective. Furthermore, CDSSL-P3D has the advantage of being compatible regardless of the backbone type, making it more generalizable. The authors demonstrate the proposed framework’s effectiveness through extensive experiments.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. This paper contributes to solving one of the biggest challenges in medical AI, the lack of data, by enabling the utilization of 2D and 3D medical images together within a single framework.

    2. The proposed framework is not confined to a specific network architecture, be it CNN-based or Transformer-based. Its adaptability allows for the utilization of various models for downstream tasks, fostering a culture of innovation and generalization in the field of medical AI.

    3. Overall, the paper is written in a clean and straightforward manner, showing well what the authors are proposing and arguing.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.
    1. The figure describing the pseudo-3D transformation is unclear.

    2. To enhance reproducibility, more description of the exact structure of the proposed framework or implementation code is needed.

    3. A few typos should be corrected.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Do you have any additional comments regarding the paper’s reproducibility?

    See below.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html
    1. The authors’ arguments in this paper are well presented in the overall flow of [1. Introduction ~ 2.Method], and [3. Experiments] demonstrate the effectiveness of the proposed framework through experiments on 13 tasks.

    2. The pseudo-3D transformation described in Figure 2 needs to be clarified. When each window of a 2D image is unrolled in the pseudo-3D transformation, it is unclear whether each window of the pseudo-3D image (i.e., x_i^{p3d}) is formed simply by stacking multiple copies of the same window.

    3. -> Experimental results for this should be included in the paper.

    4. -> This interpretation is questionable, as the performance was higher when the window size was set to 5x5 than when it was set to 7x7. Further experimentation or analysis is needed.

    5. Minor typos:
      • 2.1 preblem -> problem
      • 3.1 6453 -> 6,453
      • 3.1 377088 -> 377,088
      • 3.3 3d -> 3D
      • 3.3 2.1%,2.7% -> 2.1%, 2.7%
      • 3.3 1.0%,1.6% -> 1.0%, 1.6%
  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The lack of data is arguably the biggest factor hindering technological progress in medical AI. This paper contributes to solving this problem by proposing a framework that can utilize 2D and 3D medical images together. The best part is that it is not limited to a specific network architecture and is universally compatible.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

N/A




Meta-Review

Meta-review not available, early accepted paper.


