Abstract

Radiology deep learning pipelines predominantly employ end-to-end 3D networks based on models pre-trained on other tasks, which are then fine-tuned on the task at hand. In contrast, adjacent medical fields such as pathology, which focus on 2D images, have effectively adopted task-agnostic foundational models based on self-supervised learning, combined with weakly-supervised deep learning. However, the field of radiology still lacks task-agnostic representation models due to the computational and data demands of 3D imaging and the anatomical complexity inherent to radiology scans. To address this gap, we propose CLEAR, a framework for 3D radiology images that uses extracted embeddings from 2D slices along with attention-based aggregation to efficiently predict clinical endpoints. As part of this framework, we introduce LECL, a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans. Specifically, we trained single-domain contrastive learning approaches using three different architectures: Vision Transformers, Vision State Space Models and Gated Convolutional Neural Networks. We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models, including BiomedCLIP. Our findings demonstrate that CLEAR, using representations learned through LECL, outperforms existing foundation models, while being substantially more compute- and data-efficient. The code is available at https://github.com/KatherLab/CLEAR.
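The attention-based aggregation of 2D slice embeddings mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes a generic attention pooling in which a single scoring vector `w` (hypothetical, standing in for learned parameters) assigns each slice a weight, and the scan-level embedding is the weighted sum of the slice embeddings.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(slice_embeddings, w):
    """Aggregate N per-slice embeddings (each of length D) into one vector.

    `w` is a hypothetical scoring vector of length D; in a trained model it
    would be learned, here it is supplied by the caller for illustration.
    """
    # One scalar relevance score per slice (dot product with w).
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in slice_embeddings]
    # Normalise scores into attention weights that sum to 1.
    weights = softmax(scores)
    dim = len(slice_embeddings[0])
    # Weighted sum of slice embeddings gives the scan-level embedding.
    pooled = [
        sum(a * x[d] for a, x in zip(weights, slice_embeddings))
        for d in range(dim)
    ]
    return pooled, weights

# Toy example: three "slices" with 2-dimensional embeddings.
slices = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(slices, w=[0.0, 0.0])
```

With an all-zero scoring vector every slice receives the same weight, so the pooled vector reduces to the mean of the slice embeddings; a non-zero `w` shifts weight toward the slices it scores highly. Because only the small aggregation head operates on the whole scan, the slice encoder itself can stay 2D.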

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4381_paper.pdf

SharedIt Link: https://rdcu.be/eHaVi

SpringerLink (DOI): https://doi.org/10.1007/978-3-032-04965-0_2

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/KatherLab/CLEAR

Link to the Dataset(s)

N/A

BibTex

@InProceedings{LigMar_AbnormalityDriven_MICCAI2025,
        author = { Ligero, Marta AND Lenz, Tim AND Wölflein, Georg AND El Nahhas, Omar S. M. AND Truhn, Daniel AND Kather, Jakob Nikolas},
        title = { { Abnormality-Driven Representation Learning for Radiology Imaging } },
        booktitle = {proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15963},
        month = {September},
        pages = {14 -- 24}
}

Reviews

Review #1

Please describe the contribution of the paper

This paper introduces a contrastive-learning-based framework for radiology, together with a lesion-enhanced semi-supervised algorithm.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

Strengths: 1) The proposed CLEAR framework can be more computationally efficient, as it uses a 2D image encoder instead of a 3D encoder. 2) The evaluation is holistic: it covers several vision encoder architectures to identify the best encoder and to show the superiority of the LeCL framework over the alternatives.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

Weaknesses: 1) Although the paper claims that the 2D model improves computational efficiency, it is still important to compare its performance against a 3D vision encoder. This would show whether the additional computational cost of a 3D encoder is justified. 2) Table 3 might also need to report the balanced accuracy (BACC) metric, as the labels are imbalanced.
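To illustrate why balanced accuracy matters under label imbalance, here is a minimal sketch. It assumes the usual definition of BACC as the unweighted mean of per-class recall; the labels and predictions are made-up toy values, not results from the paper.

```python
def accuracy(y_true, y_pred):
    # Fraction of samples predicted correctly.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    # Mean of per-class recall, so every class counts equally
    # regardless of how many samples it has.
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Toy imbalanced case: 9 negatives, 1 positive, classifier always says 0.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
acc = accuracy(y_true, y_pred)            # 0.9
bacc = balanced_accuracy(y_true, y_pred)  # 0.5
```

A classifier that always predicts the majority class scores 0.9 on plain accuracy here but only 0.5 on balanced accuracy, which is the kind of gap the reviewer's suggestion is meant to expose.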
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(4) Weak Accept — could be accepted, dependent on rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

Needs minimal revision to improve the evaluation of the CLEAR framework.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #2

Please describe the contribution of the paper

This paper introduces CLEAR, a lightweight and scalable framework for radiology image analysis that leverages abnormality-driven contrastive representation learning.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
1. The idea of leveraging multiple instance learning, a technique commonly used in computational pathology, for modeling slices in 3D radiology data is both novel and interesting.
2. The proposed LeCL method introduces abnormality-driven supervision within a contrastive learning framework, enhancing the model’s ability to learn lesion-aware representations.
3. The framework is evaluated across three diverse clinical tasks and datasets, demonstrating comparable or superior performance to state-of-the-art foundation models while using fewer parameters and computational resources.
4. The source code is available and supports reproducibility.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. While the paper makes a strong case for abnormality-driven learning, it does not sufficiently review prior works that have applied contrastive learning techniques for lesion-aware representation learning in radiology. Some examples are listed below. The authors should review and discuss these related studies to better position their contribution. [1] Foundation model for cancer imaging biomarkers. Nature Machine Intelligence, 2024, 6(3): 354-367. [2] Cross-grained contrastive representation for unsupervised lesion segmentation in medical images. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023: 2347-2354. [3] HiFi-Syn: Hierarchical granularity discrimination for high-fidelity synthesis of MR images with structure preservation. Medical Image Analysis, 2025, 100: 103390.
2. It would strengthen the paper to include a comparison against 3D pre-trained encoders within the proposed framework, to better highlight its computational efficiency and explicitly demonstrate the performance gap (if any) between 2D and 3D encoder designs.
3. The DeepLesion dataset primarily contains lesion patches and lacks normal tissue patches, raising questions about how the authors implemented Lesion-enhanced Contrastive Learning (LeCL). More detail is needed on how negative samples were defined and balanced, and whether this skew in the dataset influenced the learning dynamics.
4. The use of fixed patch sizes in LeCL is conceptually sound, but the rationale for choosing these specific sizes is insufficiently discussed. Very small patches may yield fragmented representations, while large patches may conflate distinct tissue types (as discussed in [2]). The authors could add an ablation study or discuss this in the paper.
5. Figure 1C illustrates that LeCL is used for encoder pretraining, while Figure 1B shows that the encoder is frozen. The pipeline would benefit from additional clarification.
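The question of how negative samples are defined (point 3 above) can be made concrete with a generic contrastive objective. The sketch below is a standard InfoNCE loss for a single anchor, not the LeCL loss from the paper; the function name `info_nce`, the temperature value, and the toy vectors are illustrative assumptions.

```python
import math

def cosine(u, v):
    # Cosine similarity between two non-zero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor embedding.

    The loss is small when the anchor is closer to its positive than to
    every negative, so the choice of negatives shapes what is learned.
    """
    pos = cosine(anchor, positive) / temperature
    negs = [cosine(anchor, n) / temperature for n in negatives]
    logits = [pos] + negs
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - pos

anchor, positive = [1.0, 0.0], [1.0, 0.0]
easy = info_nce(anchor, positive, negatives=[[0.0, 1.0]], temperature=1.0)
hard = info_nce(anchor, positive, negatives=[[1.0, 0.1]], temperature=1.0)
```

In this toy setting the negative that nearly coincides with the anchor produces a larger loss than the orthogonal one. If a dataset contains mostly lesion patches, most negatives are themselves lesions, which is why the reviewer asks how negatives were defined and balanced.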
Please rate the clarity and organization of this paper

Good
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

I rated this paper positively due to its novel integration of lesion-aware contrastive learning with an efficient attention-based framework tailored for radiology. The results across multiple clinical tasks, combined with the open-source release, support the practical utility of the approach.
Reviewer confidence

Confident but not absolutely certain (3)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Review #3

Please describe the contribution of the paper

The paper introduces a framework for 3D radiology that uses abnormality-driven 2D slice embeddings with attention-based aggregation, along with a contrastive learning approach.
Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

A key strength of this paper is its use of 2D axial slices with attention-based aggregation, which reduces computational load while preserving spatial context. It also introduces a domain-specific contrastive learning strategy that captures abnormality-driven features, improving clinical relevance of the learned representations.
Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
1. The study focuses solely on CT scans; it’s unclear how well the approach generalizes to other imaging modalities like MRI or PET.
2. LeCL relies on lesion-centered annotations for its contrastive strategy, which can limit scalability in low-resource or unlabeled settings.
Please rate the clarity and organization of this paper

Satisfactory
Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

N/A
Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

(5) Accept — should be accepted, independent of rebuttal
Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

The scope is somewhat narrow, focusing solely on CT data, and the reliance on annotated lesion slices hinders generalizability. Some architectural choices, such as the attention aggregation and the contrastive setup, would benefit from deeper ablation to understand their individual impact. The comparison with baselines, while relevant, could also be expanded.

Despite these limitations, the paper is technically sound, addresses a meaningful challenge, and presents a method with both empirical strength and practical relevance. With minor improvements in analysis and broader validation, it could become a strong contribution. Thus, I recommend Accept.
Reviewer confidence

Very confident (4)
[Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

N/A
[Post rebuttal] Please justify your final decision from above.

N/A

Author Feedback

N/A

Meta-Review

Meta-review #1

Your recommendation

Provisional Accept
If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

N/A

Abnormality-Driven Representation Learning for Radiology Imaging

Author(s):