Prediction of Incident Hypertension Within the Next Year: Prospective Study Using Statewide Electronic Health Records and Machine Learning

Chengyin Ye (Hangzhou Normal University), Tianyun Fu, Shiying Hao (Lucile Packard Children's Hospital), Yan Zhang (First Hospital of Shijiazhuang), Oliver Wang, Bo Jin, Minjie Xia, Modi Liu, Xin Zhou (Pingjin Hospital), Qian Wu (North China Electric Power University), Yanting Guo (Zhejiang University), Chunqing Zhu, Yuming Li (Pingjin Hospital), Devore S. Culver (Lifenet Health), Shaun T. Alfreds (Lifenet Health), Frank Stearns, Karl G. Sylvester (Stanford University), Eric Widen, Doff B. McElhinney (Lucile Packard Children's Hospital), Xuefeng B. Ling (Lucile Packard Children's Hospital)
Journal of Medical Internet Research
January 30, 2018
Cited by 250 | Open Access

Abstract

BACKGROUND: Hypertension is a highly prevalent condition that is clinically costly, difficult to manage, and often leads to severe and life-threatening diseases such as cardiovascular disease (CVD) and stroke.
OBJECTIVE: The aim of this study was to develop and prospectively validate a risk prediction model for incident essential hypertension within the following year.
METHODS: Data from individual patient electronic health records (EHRs) were extracted from the Maine Health Information Exchange network. Retrospective (N=823,627, calendar year 2013) and prospective (N=680,810, calendar year 2014) cohorts were formed. A machine learning algorithm, XGBoost, was used for feature selection and model building. It generated an ensemble of classification trees and assigned a final predictive risk score to each individual.
RESULTS: The 1-year incident hypertension risk model attained areas under the curve (AUCs) of 0.917 and 0.870 in the retrospective and prospective cohorts, respectively. Risk scores were calculated and stratified into five risk categories: 4526 of 381,544 patients (1.19%) in the lowest risk category (score 0-0.05) and 21,050 of 41,329 patients (50.93%) in the highest risk category (score 0.4-1) received a diagnosis of incident hypertension within the following year. Type 2 diabetes, lipid disorders, CVDs, mental illness, clinical utilization indicators, and socioeconomic determinants were recognized as driving or associated features of incident essential hypertension. The very high risk population mainly comprised older individuals (age >50 years) with multiple chronic conditions, especially those receiving medications for mental disorders. Disparities were also found in social determinants: some community-level factors were associated with higher risk, whereas others were protective against hypertension.
CONCLUSIONS: Using statewide EHR datasets, our study prospectively validated an accurate 1-year risk prediction model for incident essential hypertension. Our real-time predictive analytic model has been deployed in the state of Maine, where it can inform interventions for hypertension and related diseases and may help improve hypertension care.
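
The stratified incidence figures reported in the results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the `RiskCategory` class, the helper `incidence_percent`, and the variable names are hypothetical. It uses only the two categories whose counts and score ranges appear in the abstract; the cut points of the three middle categories are not given here, so they are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskCategory:
    """One risk stratum: score range plus observed 1-year outcomes."""
    label: str
    score_low: float
    score_high: float
    cases: int      # patients diagnosed with incident hypertension
    patients: int   # all patients assigned to this category

    @property
    def incidence(self) -> float:
        """Observed 1-year incidence as a fraction of the category."""
        return self.cases / self.patients


def incidence_percent(category: RiskCategory) -> float:
    """Incidence as a percentage, rounded to two decimals."""
    return round(100 * category.incidence, 2)


# Counts and score ranges as reported in the abstract.
lowest = RiskCategory("lowest", 0.0, 0.05, cases=4526, patients=381_544)
highest = RiskCategory("highest", 0.4, 1.0, cases=21_050, patients=41_329)

# How many times more often the highest category was diagnosed
# than the lowest category (a ratio of observed incidences).
relative_incidence = highest.incidence / lowest.incidence

for category in (lowest, highest):
    print(
        f"{category.label}: score {category.score_low}-{category.score_high}, "
        f"{incidence_percent(category)}% incident hypertension"
    )
print(f"highest vs lowest incidence ratio: {relative_incidence:.1f}")
```

The ratio printed at the end is a derived quantity that the abstract does not report; it is included only to show how far apart the two extreme categories sit.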

