Abstract

Right Heart Catheterization is the gold standard procedure for diagnosing Pulmonary Hypertension by measuring mean Pulmonary Artery Pressure (mPAP). However, it is invasive, costly and time-consuming, and it carries risks. In this paper, for the first time, we explore the estimation of mPAP from videos of noninvasive Cardiac Magnetic Resonance Imaging. To enhance the predictive capabilities of Deep Learning models used for this task, we introduce an additional modality in the form of demographic features and clinical measurements. Inspired by all-Multilayer Perceptron architectures, we present TabMixer, a novel module enabling the integration of imaging and tabular data through spatial, temporal and channel mixing. Specifically, we present the first approach that utilizes Multilayer Perceptrons to interchange tabular information with imaging features in vision models. We test TabMixer for mPAP estimation and show that it enhances the performance of Convolutional Neural Networks, 3D-MLP and Vision Transformers while being competitive with previous modules for imaging and tabular data. Our approach has the potential to improve clinical processes involving both modalities, particularly noninvasive mPAP estimation, and thereby to enhance the quality of life of individuals affected by Pulmonary Hypertension. Source code for TabMixer is available at https://github.com/SanoScience/TabMixer.

Links to Paper and Supplementary Materials

Main Paper (Open Access Version): https://papers.miccai.org/miccai-2024/paper/2329_paper.pdf

SharedIt Link: pending

SpringerLink (DOI): pending

Supplementary Material: https://papers.miccai.org/miccai-2024/supp/2329_supp.pdf

Link to the Code Repository

https://github.com/SanoScience/TabMixer

Link to the Dataset(s)

N/A

BibTex

@InProceedings{Grz_TabMixer_MICCAI2024,
        author = { Grzeszczyk, Michal K. and Korzeniowski, Przemysław and Alabed, Samer and Swift, Andrew J. and Trzciński, Tomasz and Sitek, Arkadiusz},
        title = { { TabMixer: Noninvasive Estimation of the Mean Pulmonary Artery Pressure via Imaging and Tabular Data Mixing } },
        booktitle = {Proceedings of Medical Image Computing and Computer Assisted Intervention -- MICCAI 2024},
        year = {2024},
        publisher = {Springer Nature Switzerland},
        volume = {LNCS 15005},
        month = {October},
        page = {pending}
}


Reviews

Review #1

  • Please describe the contribution of the paper

    The authors propose an approach to integrate imaging and tabular data utilising Multilayer Perceptrons (MLPs), naming their method TabMixer. The motivation is to improve the estimation of mean Pulmonary Artery Pressure from cardiac MR, as an alternative to the invasive procedure of right heart catheterisation. The results show improvement in terms of MAE, RMSE and MAPE over other SOTA methods for integrating tabular and imaging data, across different types of backbone architectures.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    The authors propose a method that uses only MLPs to carry out the mixing of tabular and imaging data which does seem to be novel. They also perform an extensive comparison to SOTA methods that integrate tabular and imaging data, as well as those that utilise imaging-only and tabular-only data. Furthermore, they perform an ablation study to demonstrate the utility of each added component of their proposed method.

    The paper also includes a comprehensive discussion of previous and related work, and justification of design choices.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    The authors claim that the third contribution of the paper is that they are the first to demonstrate noninvasive mPAP estimation using CMR videos. There are non-deep-learning methods that do this (e.g. Ramos 2020, Reiter 2021). Also, given that the deep learning method would require more compute power and likely more time to train (the authors themselves mention that TabMixer is not lightweight), a comparison to such non-DL methods would be necessary to see whether the extra computational needs give a performance benefit.

    Ramos JG, Fyrdahl A, Wieslander B, et al. Cardiovascular magnetic resonance 4D flow analysis has a higher diagnostic yield than Doppler echocardiography for detecting increased pulmonary artery pressure. BMC Medical Imaging 2020;20(1):28.

    Reiter U, Kovacs G, Reiter C, et al. MR 4D flow-based mean pulmonary arterial pressure tracking in pulmonary hypertension. Eur Radiol 2021;31(4):1883-1893.

  • Please rate the clarity and organization of this paper

    Excellent

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please soften the statement claiming that this work is the first to demonstrate noninvasive mPAP estimation using CMR. Double-check the literature and refer to other works that do this (including non-deep-learning methods), clarifying what this work does differently. This will help to highlight the contribution of the paper.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The authors have presented a novel method of integrating tabular and imaging data using MLPs and demonstrate its use via extensive comparison to other techniques, as well as an ablation study that evaluates the different components of their method. However, some alterations should be made, in particular to the claim of being the first method to estimate mPAP using CMR videos.

  • Reviewer confidence

    Somewhat confident (2)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #2

  • Please describe the contribution of the paper

    The paper proposes a novel method to estimate mean Pulmonary Artery Pressure (mPAP) using noninvasive Cardiac Magnetic Resonance Imaging (CMR) videos. The authors introduce TabMixer, a module that integrates imaging and tabular data through spatial, temporal and channel mixing. They evaluate TabMixer on the mPAP prediction task and show that it outperforms existing methods. This is the first demonstration of noninvasive mPAP estimation using CMR videos.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    • Novelty: The paper proposes a novel method, TabMixer, for combining imaging and tabular data in 3D networks.
    • Improved performance: TabMixer demonstrates improvement in mPAP estimation compared to existing methods on both short-axis and 4-chamber CMR planes.
    • Explainability: The use of MLPs allows for interpretability of the model’s decision-making process.
    • Robustness: The model exhibits resilience to noise in both imaging and tabular data, with TabMixer showing greater robustness to noisy videos.
    • Clinical relevance: The non-invasive estimation of mPAP using TabMixer has the potential to reduce the need for invasive procedures.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    • Lacks validation on external datasets, limiting generalizability claims.
    • Focuses on a single CMR plane (SA) for estimation, potentially missing complementary information from other planes (4CH).
    • TabMixer can be computationally expensive due to extensive use of MLP layers, leading to a high number of trainable parameters.
    • The paper does not explore alternative lightweight architectures for TabMixer that maintain its effectiveness.
    • Comparison with tabular-only methods: While TabMixer outperforms most imaging-only methods, some tabular-only methods (such as Random Forest) still achieve better performance, which could indicate that most of the diagnostic information is contained in the clinical data. This weakens the motivation for adding imaging data.

  • Please rate the clarity and organization of this paper

    Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    Sharing the source code for TabMixer is commendable and promotes reproducibility.

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Suggestions for Improvement:

    • External Validation: Validate TabMixer’s performance on independent datasets to strengthen generalizability claims.
    • Lightweight TabMixer: Explore methods to reduce the computational cost of TabMixer. This could involve using more efficient architectures, parameter reduction techniques, or a hybrid approach with simpler methods for specific data types.
    • Tabular Data Analysis: Conduct feature importance analysis to identify which types of tabular features contribute most significantly to mPAP prediction. This knowledge can guide data collection and model development.
    • Comparison with Recent Fusion Methods: Consider including comparisons with more recent advancements in image and tabular data fusion beyond DAFT and FiLM.
    • 4CH and SA Video Combination: Investigate the potential benefits of combining predictions from both 4CH and SA video planes for improved accuracy.

    For extension:

    • Self-supervised Pre-training: Explore pre-training TabMixer on large, unlabeled datasets to improve feature learning and reduce reliance on large amounts of labeled data. For example using the method presented in [1].

    [1] Best of Both Worlds: Multimodal Contrastive Learning with Tabular and Imaging Data

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Weak Accept — could be accepted, dependent on rebuttal (4)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    The paper is well-written and clearly explains the proposed method, evaluation metrics, and results. The ablation study effectively demonstrates the importance of each component in TabMixer. The inclusion of the source code for TabMixer is a valuable resource for the community. This paper presents a promising novel method (TabMixer) with clear strengths in combining imaging and tabular data for mPAP estimation. While limitations exist regarding generalizability, computational cost, and potential for further improvement, the overall contribution is significant.

  • Reviewer confidence

    Very confident (4)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A



Review #3

  • Please describe the contribution of the paper

    This paper proposes the first noninvasive mPAP estimation using CMR videos. It introduces TabMixer, a new approach to mixing tabular data with imaging features, and demonstrates performance improvements over previous methods.

  • Please list the main strengths of the paper; you should write about a novel formulation, an original way to use data, demonstration of clinical feasibility, a novel application, a particularly strong evaluation, or anything else that is a strong aspect of this work. Please provide details, for instance, if a method is novel, explain what aspect is novel and why this is interesting.

    This paper proposes using CMR videos for mPAP estimation, which is novel in the domain.

    The proposed TabMixer demonstrates stable and significant performance gain on this particular task over prior methods.

    The paper demonstrates technical soundness with sufficient comparative analysis and ablation studies. It’s also clearly written and easy to follow.

  • Please list the main weaknesses of the paper. Please provide details, for instance, if you think a method is not novel, explain why and provide a reference to prior work.

    There is one concern regarding the network details. Since the CMR videos are grayscale, meaning the number of color channels is 1, it is unclear why a vision backbone generates C color channels. This needs to be explained.

  • Please rate the clarity and organization of this paper

    Very Good

  • Please comment on the reproducibility of the paper. Please be aware that providing code and data is a plus, but not a requirement for acceptance.

    The submission has provided an anonymized link to the source code, dataset, or any other dependencies.

  • Do you have any additional comments regarding the paper’s reproducibility?

    N/A

  • Please provide detailed and constructive comments for the authors. Please also refer to our Reviewer’s guide on what makes a good review. Pay specific attention to the different assessment criteria for the different paper categories (MIC, CAI, Clinical Translation of Methodology, Health Equity): https://conferences.miccai.org/2024/en/REVIEWER-GUIDELINES.html

    Please provide an illustration of the color channel issue.

    The noise resistance study (Fig. 3) could include raw I3D performance.

  • Rate the paper on a scale of 1-6, 6 being the strongest (6-4: accept; 3-1: reject). Please use the entire range of the distribution. Spreading the score helps create a distribution for decision-making

    Accept — should be accepted, independent of rebuttal (5)

  • Please justify your recommendation. What were the major factors that led you to your overall score for this paper?

    This is a very well-written, scientifically sound paper that demonstrated the feasibility of using multimodal data (CMR video+tabular) to perform mPAP estimation. The proposed TabMixer module can potentially benefit other applications.

  • Reviewer confidence

    Confident but not absolutely certain (3)

  • [Post rebuttal] After reading the author’s rebuttal, state your overall opinion of the paper if it has been changed

    N/A

  • [Post rebuttal] Please justify your decision

    N/A




Author Feedback

We thank the Reviewers for their constructive comments and positive feedback. Below, we address the main concerns regarding our work.

It is not the first work that uses CMR for mPAP estimation (R1): We agree that our work is not the first to utilize CMR for mPAP estimation. While some previous studies use features extracted from CMRs (such as systolic septal angle, ventricle volumes, etc.) to predict mPAP, to the best of our knowledge, we are the first to make this prediction directly from CMR videos without additional processing steps or the extraction of manually chosen features. We will clarify this distinction in our camera-ready version and include references to other works in this domain.

The vision backbone generates C channels while the input is grayscale and has 1 color channel (R6): This is standard processing within hierarchical vision models. In these models, the number of channels increases in successive layers while the spatial resolution decreases.
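
The channel growth described above can be illustrated with a small shape-tracing sketch. It is not the authors' backbone: the input size, the three stages and their channel counts are assumptions chosen only to show how a single grayscale channel becomes C feature channels while the temporal and spatial resolution shrinks. The helper names `conv_out` and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of one axis after a standard strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_shape, stages):
    """Track (C, T, H, W) through a stack of 3D convolution stages.

    input_shape: (channels, frames, height, width)
    stages: list of (out_channels, kernel, stride, padding), applied with
            the same kernel, stride and padding on all three axes.
    """
    c, t, h, w = input_shape
    shapes = [(c, t, h, w)]
    for out_channels, kernel, stride, padding in stages:
        # Each stage halves the resolution (for stride 2) on T, H and W.
        t, h, w = (conv_out(s, kernel, stride, padding) for s in (t, h, w))
        # The channel count is set by the number of filters, not by the input.
        c = out_channels
        shapes.append((c, t, h, w))
    return shapes


# Assumed example: a 16-frame, 128x128 grayscale video (1 input channel)
# passed through three stride-2 stages with 64, 128 and 256 filters.
stages = [(64, 3, 2, 1), (128, 3, 2, 1), (256, 3, 2, 1)]
shapes = trace_shapes((1, 16, 128, 128), stages)
for shape in shapes:
    print(shape)
```

In this sketch the channel dimension of the feature maps counts learned filters rather than colors, which is why it is independent of the single grayscale input channel.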

Tabular-only methods achieve better performance (R7): Although some imaging methods result in higher prediction errors compared to tabular-only methods, the addition of TabMixer improves overall performance. Notably, the combination of I3D with TabMixer and the SA plane achieves the lowest error among all tested methods (imaging and/or tabular). This underscores the importance of the optimal combination of the vision backbone and tabular module for the task.
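
For reference, the errors compared here are the ones named in the reviews: MAE, RMSE and MAPE. The sketch below uses their standard definitions, which are assumed to match those used in the paper. The function name `regression_errors` is hypothetical and the mPAP values are made up for illustration; they are not results from the paper.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in percent) for paired true/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    n = len(diffs)
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # MAPE divides by the true value, so true values must be non-zero.
    mape = 100.0 * sum(abs(d) / abs(t) for d, t in zip(diffs, y_true)) / n
    return mae, rmse, mape


# Made-up mPAP values in mmHg, for illustration only.
measured = [20.0, 40.0, 50.0]
estimated = [22.0, 36.0, 50.0]
mae, rmse, mape = regression_errors(measured, estimated)
print(f"MAE={mae:.2f} mmHg, RMSE={rmse:.2f} mmHg, MAPE={mape:.2f}%")
```

RMSE penalizes large individual errors more than MAE does, while MAPE expresses the error relative to the measured pressure, so the three metrics can rank methods differently.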

Lightweight alternatives of TabMixer, self-supervised pretraining, combination of multiple CMR planes, and tabular data analysis (R7): We agree that these are promising research directions worth exploring.




Meta-Review

Meta-review not available, early accepted paper.


