Abstract
Scoring systems are widely adopted in medical applications for their inherent simplicity and transparency, particularly for classification tasks involving tabular data. In this work, we introduce RegScore, a novel, sparse, and interpretable scoring system specifically designed for regression tasks. Unlike conventional scoring systems constrained to integer-valued coefficients, RegScore leverages beam search and k-sparse ridge regression to relax these restrictions, thus enhancing predictive performance. We extend RegScore to bimodal deep learning by integrating tabular data with medical images. We utilize the classification token from the TIP (Tabular Image Pretraining) transformer to generate Personalized Linear Regression parameters and a Personalized RegScore, enabling individualized scoring. We demonstrate the effectiveness of RegScore by estimating mean Pulmonary Artery Pressure using tabular data and further refine these estimates by incorporating cardiac MRI images. Experimental results show that RegScore and its personalized bimodal extensions achieve performance comparable to, or better than, state-of-the-art black-box models. Our method provides a transparent and interpretable approach for regression tasks in clinical settings, promoting more informed and trustworthy decision-making. We provide our code at https://github.com/SanoScience/RegScore.
Links to Paper and Supplementary Materials
Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/3065_paper.pdf
SharedIt Link: Not yet available
SpringerLink (DOI): Not yet available
Supplementary Material: Not Submitted
Link to the Code Repository
https://github.com/SanoScience/RegScore
Link to the Dataset(s)
N/A
BibTex
@InProceedings{GrzMic_RegScore_MICCAI2025,
author = { Grzeszczyk, Michal K. and Szczepański, Tomasz and Renc, Pawel and Yoon, Siyeop and Charton, Jerome and Trzciński, Tomasz and Sitek, Arkadiusz},
title = { { RegScore: Scoring Systems for Regression Tasks } },
booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
year = {2025},
publisher = {Springer Nature Switzerland},
volume = {LNCS 15973},
month = {September},
pages = {530 -- 540}
}
Reviews
Review #1
- Please describe the contribution of the paper
In this work, the authors propose an updated scoring system for healthcare that can be used for regression tasks. In summary, the aim of this work is interesting. However, I found the manuscript neither well presented nor clear in all its parts. Sections 1 and 2 would benefit, respectively, from a clearer description of the task and specific contributions, and from a clearer and more rigorous description of the problem formulations, sequence models, etc. Also, Section 3 does not provide clear evidence of any advantage of the proposed method over other methods, and includes questionable comparisons with methods from different modalities.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The objective is very relevant and fits the conference’s aims
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
- The presentation must be improved, including the writing and organization
- The novelty and technological contribution are limited
- The experimental results are questionable
- Please rate the clarity and organization of this paper
Poor
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission does not provide sufficient information for reproducibility.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
1) Section 1 does not provide a good introduction to the task; it presents the existing methods in a rather hazy way and does not give a meaningful idea of the problems to be solved. This shortcoming is even more evident considering that a proper related-work section is missing, which is necessary to provide a baseline for readers.
2) Even the description of the proposed method is too concise and incomplete. Some modules are presented with almost no justification or description of their rationale, except for very concise and vague motivations. On the other hand, they mostly appear to be taken from previous works. Even the methodology proposed to address the task is not particularly innovative, since in recent years several methods integrating tabular data into image-based decision support systems have been proposed.
3) In Section 3 the experiments are rather superficial and incomplete, and do not show any clear evidence of an advantage of the proposed method over other methods. Indeed, the tables compare very different approaches and modalities, and the results are not outstanding. One would think that even a traditional machine learning method could achieve better performance by incorporating features from images.
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(3) Weak Reject — could be rejected, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
see comments above
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
In the first round, I recommended a weak reject for this submission. While my opinion regarding the contribution of the paper has not significantly changed, I am now giving an accept only because the final decision appears to be between two options, and the other reviewers have already expressed a (weak) accept—thus pushing the overall evaluation above the rejection threshold. I do not want to be the sole dissenting voice in a case where the consensus leans toward acceptance.
Review #2
- Please describe the contribution of the paper
This paper describes a method to create a scoring system (akin to a nomogram) from both tabular and imaging data. The method combines previous research on scoring systems and also leverages some recent deep learning approaches. The paper conducts a comprehensive literature review and comparison of existing methods.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
The strengths of this paper are:
- Very comprehensive evaluation of existing methods
- A combination of tabular and image data in one framework, with several good methods combined to work with these multi-modal data
- Improved performance on a standard dataset
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
This paper could be stronger if the authors were to include:
- Whether a foundation model for the imaging branch can improve performance
- A benchmark of this scoring system against a fully continuous approach without binning tabular or image data
- An explanation in the rebuttal of the difference between a sparse ridge penalty and a LASSO penalty
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
This is already a very well written paper, but its topic is not very strongly related to MIC. Still, I would strongly recommend it to the MICCAI community if the above three limitations can be addressed.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Accept
- [Post rebuttal] Please justify your final decision from above.
The rebuttal sufficiently addresses my comments.
Review #3
- Please describe the contribution of the paper
The paper introduces RegScore, a scoring system specifically designed for regression tasks such as Pulmonary Hypertension diagnosis. By leveraging beam search and k-sparse ridge regression, it achieves improved predictive performance while ensuring interpretability, with bimodal extensions (PLR & PRS) that integrate tabular and imaging data via the Tabular Image Pretraining (TIP) model.
- Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
Enabling deep-learning-level predictions in PH diagnosis with personalized, interpretable results. Instead of classification, the paper introduces a regression-based scoring system, replacing the integer constraints with sparse ridge regression. Simple yet effective. An extended bimodal deep learning approach considers tabular as well as image context in personalized predictions. Extensive experiments demonstrate outperformance over ML architectures and competitiveness with deep learning models, with a time- and compute-saving approach. Ablation studies attest to the importance of each module: tertile bins, images, and SSL.
- Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
Reduced regression performance in favor of interpretability (TIP). The feature selection process might struggle with datasets where feature interactions are highly non-linear. Using CLS token embeddings for personalized weights may cause overfitting, since it relies on learned representations of the tabular inputs. The choice of MDLP and tertile binning for the discretization function (ξ) is unjustified, even though it is an important factor in model performance. It is unclear how the nearly 500 missing values were filled. The choice of k values is unjustified; have you tested other values (beyond 5 and 50)?
- Please rate the clarity and organization of this paper
Satisfactory
- Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.
The submission has provided an anonymized link to the source code, dataset, or any other dependencies.
- Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html
N/A
- Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.
(4) Weak Accept — could be accepted, dependent on rebuttal
- Please justify your recommendation. What were the major factors that led you to your overall score for this paper?
The choice of some important parameters in the architecture was not justified or backed by an ablation study, yet it has a major effect on model performance.
- Reviewer confidence
Confident but not absolutely certain (3)
- [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.
Reject
- [Post rebuttal] Please justify your final decision from above.
There remain many limitations that would need to be addressed to make this a remarkable and generalizable architecture.
Author Feedback
We thank the Reviewers for their valuable feedback. We appreciate that the Reviewers found our paper relevant to the MICCAI community (R1,R2), well-written (R2), and supported by extensive experiments and ablations (R2,R3). Below, we address the main concerns:
Unclear contributions, motivation, and limited novelty (R1): As stated in the introduction, existing scoring systems handle only classification. We explore them for regression and propose RegScore, which surpasses existing state-of-the-art (SOTA) methods. We further introduce Personalized Linear Regression (PLR) and Personalized RegScore (PRS) models, generating interpretable predictions from bimodal, self-supervised Tabular Image Pretraining (TIP) that are competitive with black-box methods. We evaluate these on Pulmonary Hypertension diagnosis, a regression-then-thresholding task, for which our method is more suitable than classification-based systems.
Incomplete/superficial experiments (R1): We respectfully disagree. As noted by R2,R3, we conducted extensive experiments. We benchmarked RegScore against SOTA scoring systems and personalized extensions of RegScore integrating imaging data (PLR, PRS) against bimodal black-box architectures, showing competitive or superior performance. Finally, we conducted ablations on key components that R3 explicitly highlighted.
No related works section (R1): Due to space constraints, MICCAI papers often integrate related works into the introduction, which we have done thoroughly, as noted by R2.
No reproducibility (R1): We provided the anonymized source code.
Lack of evidence for advantages (R1): RegScore is specifically designed for regression and outperforms other scoring systems with statistical significance, as shown in our results, while PLR/PRS generate interpretable predictions from a bimodal architecture.
Limited MIC aspect (R2): We agree that only part of our paper addresses MIC through bimodal deep learning. The broader aim is to contribute interpretable clinical decision support, which aligns well with CAI and remains relevant for MICCAI.
Foundation models to improve PLR/PRS (R2): We agree and appreciate this suggestion. Leveraging a foundation model for imaging could further enhance PLR and PRS, since SSL proved to be valuable. An interesting future work could develop a bimodal (tabular+imaging) foundation model as a base for PLR/PRS.
Benchmarking scoring systems against continuous approaches (R2): It can be done by comparing results from Table 1 with those in Table 2 (e.g., linear regression).
Sparse ridge penalty vs LASSO penalty (R2): The main difference is that ridge uses the L2-norm and LASSO uses the L1-norm. We use OKRidge [16] as one of the backbones of RegScore, because it solves a k-sparse ridge regression problem in a provably optimal way via branch-and-bound search.
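The penalty distinction above can be sketched numerically. For an orthonormal design, both penalized solutions have simple closed forms: ridge (L2) shrinks all coefficients uniformly and keeps them nonzero, while LASSO (L1) soft-thresholds them, zeroing small ones exactly. A minimal NumPy illustration on hypothetical data (not from the paper, and distinct from the L0-constrained k-sparse ridge problem that OKRidge solves):

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal design (X.T @ X = I), so both penalized solutions
# have simple closed forms in terms of the OLS estimate X.T @ y.
X, _ = np.linalg.qr(rng.normal(size=(100, 5)))
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 0.2])
y = X @ beta_true + 0.01 * rng.normal(size=100)

ols = X.T @ y
lam = 1.0

# Ridge (L2 penalty): uniform shrinkage; every coefficient stays nonzero.
ridge = ols / (1.0 + lam)

# LASSO (L1 penalty): soft-thresholding; small coefficients become exactly 0.
lasso = np.sign(ols) * np.maximum(np.abs(ols) - lam, 0.0)

print("ridge nonzeros:", np.count_nonzero(ridge))
print("lasso nonzeros:", np.count_nonzero(lasso))
```

Note that OKRidge instead enforces sparsity through an explicit cardinality constraint (at most k nonzero coefficients) combined with the L2 penalty, rather than relying on L1 shrinkage to induce zeros.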
No ablation of model size k (R3): We compare RegScore with the second-best method, FasterRisk, for varying k (3-50) in Fig. 2. We use k=5, the most common choice in prior work.
Trade-off between interpretability and performance (R3): We agree and stated this as a limitation; however, this aspect allows clinicians to choose between more interpretable or higher-performing models based on their needs.
Feature selection and non-linear interactions (R3): We acknowledge this limitation, common to scoring systems, and will address it more thoroughly in the revised version.
CLS token and overfitting (R3): We agree, generating sample-specific weights from CLS token embeddings increases model flexibility and can amplify overfitting risk. To mitigate this, we use self-supervised pretraining (TIP) and cross-validation.
MDLP and tertile binning choice (R3): We followed standard practices from scoring systems literature (e.g. [10]), using consistent binning across methods. We agree that it is worth exploring in future work.
Missing value imputation (R3): We imputed missing values using the mean for continuous and the mode for categorical features.
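The imputation rule stated above can be sketched as follows; this is a minimal pandas example with hypothetical column names, not the authors' actual pipeline:

```python
import pandas as pd

# Toy tabular data with missing entries (hypothetical columns).
df = pd.DataFrame({
    "age": [54.0, None, 61.0, 70.0],   # continuous feature
    "sex": ["F", "M", None, "M"],      # categorical feature
})

# Mean imputation for continuous columns, mode for categorical ones,
# matching the rule described in the rebuttal.
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        df[col] = df[col].fillna(df[col].mean())
    else:
        df[col] = df[col].fillna(df[col].mode().iloc[0])
```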
Meta-Review
Meta-review #1
- Your recommendation
Invite for Rebuttal
- If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.
N/A
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #2
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A
Meta-review #3
- After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.
Accept
- Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’
N/A