What's New

Risk stratification in pulmonary arterial hypertension using Bayesian analysis

Kanwar MK, Gomberg-Maitland M, Hoeper M, Pausch C, Pittrow D, Strange G, Anderson J, Zhao C, Scott J, Druzdzel M, Kraisangka J, Lohmueller L, Antaki J, Benza R. Risk stratification in pulmonary arterial hypertension using Bayesian analysis. Eur Respir J. Accepted April 21, 2020.


Background Current risk stratification tools in pulmonary arterial hypertension (PAH) are limited in their discriminatory ability, partly because they assume that prognostic clinical variables have an independent and linear relationship to clinical outcomes. We sought to demonstrate the utility of Bayesian network (BN)-based machine learning in enhancing the predictive ability of an existing state-of-the-art risk stratification tool, REVEAL 2.0.

Methods We derived a Tree Augmented Naïve Bayes model (titled PHORA) to predict one-year survival in PAH patients included in the REVEAL registry, using the same variables and cut-points found in REVEAL 2.0. PHORA models were validated internally (within the REVEAL registry) and externally (in the COMPERA and PHSANZ registries). Patients were classified as low, intermediate and high risk (<5%, 5-10% and >10% 12-month mortality, respectively) based on the 2015 ESC/ERS guidelines.
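The risk classification step can be illustrated with a minimal sketch. It assumes the 2015 ESC/ERS mortality bands (below 5%, 5 to 10%, above 10% at 12 months) and a model that outputs a predicted 12-month mortality probability. The function name `risk_group` is hypothetical and is not part of PHORA or REVEAL 2.0.

```python
def risk_group(mortality_12m: float) -> str:
    """Map a predicted 12-month mortality probability (0-1) to a risk stratum.

    Thresholds follow the 2015 ESC/ERS bands as an assumption of this sketch:
    low < 5%, intermediate 5-10%, high > 10%.
    """
    if not 0.0 <= mortality_12m <= 1.0:
        raise ValueError("mortality_12m must be a probability between 0 and 1")
    if mortality_12m < 0.05:
        return "low"
    if mortality_12m <= 0.10:
        return "intermediate"
    return "high"


# Illustrative inputs only; these are not registry data.
for p in (0.03, 0.08, 0.25):
    print(f"{p:.2f} -> {risk_group(p)}")
```

A model that predicts one-year survival rather than mortality would pass `1 - survival` to this function.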

Results PHORA had an AUC of 0.80 for predicting one-year survival, which was an improvement over REVEAL 2.0 (AUC of 0.76). When validated in the COMPERA and PHSANZ registries, PHORA demonstrated AUCs of 0.74 and 0.80, respectively. One-year survival rates predicted by PHORA were higher for patients with lower risk scores and lower for those with higher risk scores (P<.001), with excellent separation between low-, intermediate-, and high-risk groups in all three registries.
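For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The sketch below computes it by direct pairwise comparison. The function name `auc` and the toy inputs are assumptions made for illustration; they are not the authors' code or registry data.

```python
def auc(scores, died):
    """Area under the ROC curve via pairwise concordance.

    scores: predicted risk for each patient (higher means higher risk)
    died:   1/True if the patient died within one year, else 0/False
    Ties between a death and a survivor count as half a concordant pair.
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return concordant / (len(pos) * len(neg))


# Toy example: 3 of the 4 death/survivor pairs are ranked correctly.
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))  # 0.75
```

On this scale, 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the frame for comparing the reported values of 0.76 and 0.80.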

Conclusion Our BN-derived risk prediction model, PHORA, demonstrated an improvement in discrimination over existing models. This reflects the ability of BN-based models to account for the interrelationships among clinical variables in their effect on outcome, and their tolerance of missing data elements when calculating predictions.