Abstract

Portal hypertension (PHT), a critical complication of liver disease, is primarily assessed via invasive procedures that carry inherent risks and discomfort. Recent advances in deep learning have shown potential for non-invasive diagnostic assistance based on computed tomography (CT) images. However, the small sample size and pronounced imbalance of PHT clinical data severely restrict the performance of deep learning methods, while the bias introduced by label discretization further compromises model robustness. To address these challenges, we propose a Regression-assisted Classification (RAC) method for non-invasive PHT diagnosis. Instead of classifying directly, RAC first produces a fine-grained estimate of the hepatic venous pressure gradient (HVPG) and only then makes the categorical decision, thereby reducing the bias caused by discrete label assignment. Moreover, we propose a boundary-aware weighted learning method that jointly optimizes model parameters and the loss function by dynamically assigning online bucket-based weights and enforcing gradient balance across decision boundaries. We show that this approach significantly reduces the impact of data imbalance and helps address the challenges of small-sample learning in PHT diagnosis. On our collected clinical CT dataset, the method achieves 83.28% accuracy and 82.69% area under the receiver operating characteristic curve in the three-class PHT classification task, outperforming the cross-entropy baseline by +1.01% and +2.38%, respectively. These results demonstrate leading performance in multi-class PHT classification and offer an effective solution for the direct diagnosis of PHT from CT images.
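The bucket-based weighting idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' BAW loss (whose weights are assigned online during training); the fixed bucket edges at the clinical HVPG cut-offs and the inverse-frequency scheme below are assumptions made purely for illustration:

```python
import numpy as np

def bucket_weights(hvpg_targets, edges=(5.0, 12.0)):
    """Assign each training sample a weight inversely proportional to the
    frequency of its HVPG bucket, so that sparsely populated buckets near
    the decision boundaries contribute comparably to the loss.
    NOTE: a simplified stand-in for the paper's boundary-aware weighted
    (BAW) loss, whose weights are learned online during training."""
    targets = np.asarray(hvpg_targets, dtype=float)
    buckets = np.digitize(targets, bins=edges)            # bucket index 0, 1, or 2
    counts = np.bincount(buckets, minlength=len(edges) + 1)
    inv = 1.0 / np.maximum(counts, 1)                     # inverse bucket frequency
    w = inv[buckets]
    return w / w.mean()                                   # normalize to mean 1

# Imbalanced toy targets: one "normal" sample, many high-HVPG samples.
targets = [3.0, 8.0, 9.0, 10.0, 14.0, 16.0, 18.0, 20.0]
w = bucket_weights(targets)
# The single sample in the "normal" bucket receives the largest weight.
```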

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2025/paper/4338_paper.pdf

SharedIt Link: Not yet available

SpringerLink (DOI): Not yet available

Supplementary Material: Not Submitted

Link to the Code Repository

https://github.com/GuoLab-UESTC/RAC-for-CT-based-PHT

Link to the Dataset(s)

PHT CT data subset: https://drive.google.com/drive/folders/1DEQ9fm_4E2hIRZ0ymvecKbUJXfxVcMr4?usp=drive_link

BibTex

@InProceedings{CaiWuq_Regressionassisted_MICCAI2025,
        author = { Cai, Wuque and He, Jiayi and Guo, Xu and Sun, Hongze and Tong, Huan and Wei, Bo and Wu, Hao and Yao, Dezhong and Guo, Daqing},
        title = { { Regression-assisted Classification for CT-based Portal Hypertension Diagnosis } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2025},
        year = {2025},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15974},
        month = {September},
}


Reviews

Review #1

  • Please describe the contribution of the paper

    This paper proposes a novel Regression-assisted Classification (RAC) framework for diagnosing portal hypertension (PHT) using CT images. By jointly learning continuous HVPG values and discrete classes, along with a new Boundary-aware Weighted (BAW) loss function, the method addresses challenges of data imbalance and limited clinical data.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.
    1. The paper is overall well-written.
    2. The proposed RAC framework has certain novelty by combining regression and classification to reduce the bias introduced by discretizing continuous HVPG measurements.
    3. A novel boundary-aware weighted loss for improving learning from imbalanced regression is proposed.
    4. Authors have demonstrated superior performance on a CT dataset.
  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.
    1. The proposed method is only evaluated on a single CT dataset collected from one institution.
    2. Given that the method is specifically designed to address class imbalance, it is important to report per-class metrics in addition to macro-averaged metrics.
    3. While LDAM is a widely used loss function in imbalanced classification tasks, the rationale for choosing LDAM in this specific context over other alternatives is not clearly discussed. Furthermore, since the BAW loss uses sample-wise trainable weights w_i , scalability could become a concern for large-scale datasets.
    4. I am a bit confused about the actual task being solved. If the main task is PHT classification, why do we need to include the regression task of estimating HVPG? In the abstract you mention that you are trying to tackle the imbalance issue in PHT data, but in the introduction the task becomes solving the imbalance issue in HVPG.
    5. Table 3 mentions a 3-class classification setup, but the exact class definitions are not provided.
    6. Code is missing, and it could be hard to reimplement the method based on the current explanation in the implementation details section. E.g., what is the number of prompts used in VPT-Deep? What is the feature extractor backbone and its pretraining dataset? Why is VPT chosen over other fine-tuning strategies, does the method work without fine-tuning, and is efficient parameter tuning a central contribution?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not provide sufficient information for reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (3) Weak Reject — could be rejected, dependent on rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    I find the paper to address an important and clinically relevant problem with a novel framework that integrates regression and classification for imbalanced CT-based PHT diagnosis. However, several aspects, including the experimental scope, clarity of the task setup, justification for design choices, and missing implementation details, limit the paper’s clarity and reproducibility in its current form. I am currently rating the paper as a weak reject, but I am open to increasing my score if the authors can adequately address the concerns outlined above.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    I thank the authors for their efforts in addressing my concerns. As all of my issues have been satisfactorily resolved, I am willing to increase my score to accept.



Review #2

  • Please describe the contribution of the paper

    This paper presents a regression-assisted classification framework for non-invasive diagnosis of portal hypertension from CT images. To address data imbalance and label discretization issues, the authors introduce fine-grained HVPG estimation (regression-assisted classification) and a boundary-aware weighted loss. The method achieves state-of-the-art performance on a private clinical dataset, demonstrating its effectiveness in small-sample, multi-class PHT diagnosis.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The paper is clearly written and guides the reader gradually to understand the advantages of the new loss function. Particularly useful is Section 2.3 to show the effect of the gradient calculation.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    Questions/notes:

    • In the abstract, instead of the absolute numbers, I would report the performance gain of the proposed method over the state of the art, to highlight the improvements.
    • “To address data scarcity and standardize sample size, multiple layers are combined while maintaining strict train-test separation based on medical records to prevent data leakage.”: what do the authors mean with “multiple layers are combined”?
    • The authors refer to gamma_j saying “represents the decision boundary”, should it not be “margin”?
  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    N/A

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well written and clear. The new loss function is clearly introduced and compared against other widely used methods from the state of the art. I only have a few questions which I shared in my previous comment.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors addressed my and other reviewers’ concerns in the rebuttal, clarifying some unresolved questions. I confirm my positive feedback about this work and recommend for acceptance.



Review #3

  • Please describe the contribution of the paper

    This paper proposes a novel framework called Regression-Assisted Classification (RAC) for diagnosing portal hypertension (PHT) using CT imaging. The standard method for assessing PHT involves measuring the hepatic venous pressure gradient (HVPG), which is invasive, expensive, and clinically burdensome. The authors aim to offer a non-invasive, CT-based alternative, overcoming key challenges in current deep learning models, particularly label discretization bias and data imbalance in small-sample medical datasets.

  • Please list the major strengths of the paper: you should highlight a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The use of regression to assist classification is elegant, addressing the limitations of label discretization while preserving interpretability for clinical decision-making.

    Effectively addresses small-sample and imbalanced label distribution problems common in clinical datasets.

    Uses macro-averaged metrics to avoid bias from dominant classes.

    Includes ablation studies, ROC curve visualizations, and comparisons with existing methods (e.g., LDAM, Focal, BMC, LDS, etc.).

    Achieves a notable accuracy of 83.28% and AUC of 82.69% for the 3-class PHT diagnosis task.

    Multi-modal integration: Incorporates Child-Pugh scores (a known clinical factor) alongside CT images for better prediction.

    The workflow (CT + clinical score input) is realistic in many hospital systems.

  • Please list the major weaknesses of the paper. Please provide details: for instance, if you state that a formulation, way of using data, demonstration of clinical feasibility, or application is not novel, then you must provide specific references to prior work.

    The paper lacks commitment to releasing code, model weights, or a public subset of the dataset for reproducibility.

    The model relies on known components (e.g., VPT-Deep, attention visualization, bucketed weighting). The main novelty lies in loss formulation and task combination rather than architecture.

    While performance is strong, integrating such a regression-classification hybrid model into real-time radiology workflows could pose interpretability and usability challenges.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission does not mention open access to source code or data but provides a clear and detailed description of the algorithm to ensure reproducibility.

  • Optional: If you have any additional comments to share with the authors, please provide them here. Please also refer to our Reviewer’s guide on what makes a good review and pay specific attention to the different assessment criteria for the different paper categories: https://conferences.miccai.org/2025/en/REVIEWER-GUIDELINES.html

    Consider releasing a preprocessed subset of your dataset or providing synthetic CT cases with HVPG annotations to aid reproducibility.

    It would be helpful to assess the model’s performance on external validation sets, possibly from other institutions or populations.

    Future work could explore integrating interpretability tools (e.g., saliency maps, SHAP values) to support clinical transparency.

    The use of Child-Pugh score is insightful — extending the model with additional clinical parameters (e.g., lab tests, patient history) could further improve diagnostic precision.

    Additionally, I suggest the authors consider citing recent related work such as DiGAN (https://doi.org/10.1016/j.cmpbup.2024.100152), which proposes a GAN-based approach to address imbalanced medical datasets. While the modalities differ, the shared focus on correcting data imbalance in small-sample clinical settings may offer useful comparative insights and strengthen the paper’s discussion on methodological relevance across domains.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making.

    (5) Accept — should be accepted, independent of rebuttal

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This work presents a well-motivated and clinically impactful framework that addresses two common challenges in medical AI — discretization of continuous clinical measurements and data imbalance in small patient cohorts. The RAC framework, with its novel BAW loss, is tested rigorously on real clinical data, demonstrating improvements across accuracy, recall, and AUC. Despite the lack of external validation or open code, the method is promising, well-executed, and relevant to MICCAI’s goals of clinical translation. It deserves acceptance.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the authors’ rebuttal, please state your final opinion of the paper.

    Accept

  • [Post rebuttal] Please justify your final decision from above.

    The authors have provided a thorough and well-structured rebuttal that effectively addresses the concerns raised during the initial review.




Author Feedback

We thank Reviewers 1-3 for their insightful comments and suggestions (weakness: W, comment: C), and provide point-by-point responses (A) below.

R1W6, R3W1, R3C1: Code implementations and data subsets are not provided, and reproducibility is insufficient. A: Our code, model parameters, and a subset of the data will be released after formal acceptance. Implementation details will be provided in the code.

R1W1, R3C2: The method using only a single institution’s CT dataset may lack external validation. A: This is an insightful comment. However, there are currently no public CT datasets for hepatic venous pressure gradient (HVPG) prediction and portal hypertension (PHT) assessment. Accordingly, we used independently collected clinical data from our own institution in this study.

R1W4, R1W5: Clarification is needed regarding the 3-category criteria and the relationship between HVPG and PHT. A: We thank the reviewer for this important feedback. Our categories are: HVPG ≤ 5 mmHg (normal portal pressure), 5 < HVPG ≤ 12 mmHg (portal hypertension), and HVPG > 12 mmHg (variceal bleeding threshold), with sample sizes detailed in Section 3.1. As the clinical gold standard, HVPG enables both continuous regression predictions and threshold-based classification. We will clarify this dual approach in the revision.
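The cut-offs stated here map directly to a discretization rule. A minimal sketch of threshold-based classification of a regressed HVPG value (the function name is illustrative, not from the paper's code):

```python
def hvpg_to_class(hvpg_mmhg: float) -> int:
    """Map a predicted continuous HVPG value (mmHg) to one of the three
    diagnostic classes, using the stated clinical cut-offs:
    0 = normal portal pressure (HVPG <= 5 mmHg),
    1 = portal hypertension (5 < HVPG <= 12 mmHg),
    2 = above the variceal bleeding threshold (HVPG > 12 mmHg)."""
    if hvpg_mmhg <= 5.0:
        return 0
    elif hvpg_mmhg <= 12.0:
        return 1
    return 2

# The model first regresses a fine-grained HVPG estimate from CT features,
# and only then discretizes it into a class:
predicted_hvpg = 9.3  # example regressor output, in mmHg
print(hvpg_to_class(predicted_hvpg))  # -> 1 (portal hypertension)
```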

R1W2: Per-class metrics are not reported, limiting evaluation of class imbalance handling. A: Thanks for this important comment. Based on rebuttal guidelines, we cannot add new experimental results. To a certain extent, our reported macro metrics (Recall/Precision) demonstrate balanced inter-category performance. We will report detailed per-class metrics on large-scale datasets in future work.
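For reference, macro-averaged metrics give each class equal weight regardless of its sample count, which is why they partly reflect inter-category balance; a minimal sketch (not the authors' evaluation code):

```python
import numpy as np

def macro_recall(y_true, y_pred, n_classes=3):
    """Macro-averaged recall: per-class recall averaged with equal weight
    per class, so a dominant class cannot mask poor minority-class
    performance the way a micro/overall average can."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = []
    for c in range(n_classes):
        mask = (y_true == c)
        if mask.sum() == 0:
            continue  # skip classes absent from the ground truth
        recalls.append((y_pred[mask] == c).mean())
    return float(np.mean(recalls))

# Toy 3-class example: per-class recalls are 3/4, 1/2, and 2/2.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2]
print(macro_recall(y_true, y_pred))  # -> 0.75
```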

R1W3: The justification for LDAM loss selection is insufficient, and BAW loss’s sample weights may have scalability issues in large-scale applications. A: This is an insightful comment. LDAM directly optimizes category-level decision boundaries, offering theoretical advantages over hyperparameter-dependent methods. As shown in Table 1, LDAM achieves competitive performance alone, with further gains when combined with BAW loss. While current dataset size limits full scalability evaluation, please note that our ongoing data collection will enable large-scale validation in future work.
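For reference, the category-level margin rule of LDAM (Cao et al., 2019) sets each class margin proportional to n_j^(-1/4), so rarer classes receive larger margins at their decision boundaries. A small sketch of that standard formulation, not the authors' implementation:

```python
import numpy as np

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins as in LDAM (Cao et al., 2019): margin_j is
    proportional to n_j^(-1/4), rescaled so the rarest class receives
    max_margin. Rarer classes thus get larger enforced margins."""
    m = 1.0 / np.power(np.asarray(class_counts, dtype=float), 0.25)
    return m * (max_margin / m.max())

# e.g. an imbalanced 3-class split: the rarest class (20 samples)
# receives the largest margin.
print(ldam_margins([200, 80, 20]))
```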

R2W1: The performance improvement relative to the state of the art (SOTA) is not highlighted in the abstract, only the absolute value provided. A: We will highlight the SOTA performance improvement in the revised abstract.

R2W2: The meaning of “Multiple layers are combined” is not clearly explained. A: Thanks for the careful reading. Multiple layers here refer to the salient-layers part in Fig. 1. We will explain it in the revision.

R2W3: There is an ambiguity in describing gamma_j as a boundary or a margin. A: Thanks for this constructive feedback. gamma_j indeed represents a decision boundary in our formulation. We will revise the manuscript to use “boundary” consistently.

R3W2: The main novelty lies in the loss formulation and task combinations rather than architecture (framework). A: This constructive suggestion is greatly appreciated. Our primary innovation indeed lies in the novel loss function design and multi-task integration strategy. We will consistently refer to this as the “RAC method” (rather than framework) in revision to better reflect its technical nature.

R3W3, R3C3: There are concerns about the interpretability and usability challenges of our hybrid model. A: We appreciate the reviewers’ observation. To address interpretability, we provide attention visualizations in the current work, with plans to integrate additional explainability tools such as saliency maps. For clinical usability, we have implemented the complete pipeline as modular code (to be released) enabling straightforward deployment. These efforts collectively enhance clinical translation potential.

R3C5: Relevant recent work such as DiGAN is not cited. A: We have studied the suggested work (DiGAN) and will cite it in revision.




Meta-Review

Meta-review #1

  • Your recommendation

    Invite for Rebuttal

  • If your recommendation is “Provisional Reject”, then summarize the factors that went into this decision. In case you deviate from the reviewers’ recommendations, explain in detail the reasons why. You do not need to provide a justification for a recommendation of “Provisional Accept” or “Invite for Rebuttal”.

    N/A

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A



Meta-review #2

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    This paper proposes a regression-assisted classification (RAC) method for CT-based portal hypertension diagnosis, combining regression and classification tasks with a novel loss function design. All three reviewers support acceptance, noting that the authors provided a thorough and constructive rebuttal that addressed key concerns around data labelling, task formulation, and interpretability. While the study is based on data from a single institution and public datasets are unavailable, the authors demonstrate clear awareness of generalisability limitations and have committed to releasing code and data subsets. The paper is methodologically sound, clinically relevant, and well-positioned to contribute meaningfully to AI-driven diagnosis.



Meta-review #3

  • After you have reviewed the rebuttal and updated reviews, please provide your recommendation based on all reviews and the authors’ rebuttal.

    Accept

  • Please justify your recommendation. You may optionally write justifications for ‘accepts’, but are expected to write a justification for ‘rejects’

    N/A


