Predicting atrial fibrillation in primary care using machine learning

Nathan R. Hill (Bristol-Myers Squibb (United Kingdom)), Daniel Ayoubkhani (Health Economics and Outcomes Research (United Kingdom)), Phil McEwan (Health Economics and Outcomes Research (United Kingdom)), Daniel Sugrue (Health Economics and Outcomes Research (United Kingdom)), Usman Farooqui (Bristol-Myers Squibb (United Kingdom)), Steven Lister (Bristol-Myers Squibb (United Kingdom)), Matthew Lumley (Pfizer (United Kingdom)), Ameet Bakhai (The Royal Free Hospital), Alexander T. Cohen (King's College London), Mark O’Neill (King's College London), David A. Clifton (University of Oxford), Jason Gordon (Health Economics and Outcomes Research (United Kingdom))
PLoS ONE
November 1, 2019
Cited by 145
Open Access

Abstract

BACKGROUND: Atrial fibrillation (AF) is the most common sustained heart arrhythmia. However, as many cases are asymptomatic, a large proportion of patients remain undiagnosed until serious complications arise. Efficient, cost-effective detection of undiagnosed patients may be supported by risk-prediction models relating patient factors to AF risk. However, there remains a need for an implementable risk model that is contemporaneous and informed by routinely collected patient data, reflecting the real-world pathology of AF.

METHODS: This study sought to develop and evaluate novel and conventional statistical and machine learning models for risk prediction of AF. This was a retrospective cohort study of adults (aged ≥30 years) without a history of AF, listed on the Clinical Practice Research Datalink from January 2006 to December 2016. Models evaluated included published risk models (Framingham, ARIC, CHARGE-AF), machine learning models using baseline and time-updated information (neural networks, LASSO, random forests, support vector machines), and Cox regression.

RESULTS: Analysis of 2,994,837 individuals (3.2% AF) identified the time-varying neural network as the optimal model. Compared with the best existing model, CHARGE-AF, it achieved an AUROC of 0.827 vs. 0.725 and a number needed to screen of 9 vs. 13 patients at 75% sensitivity. The optimal model confirmed known baseline risk factors (age, previous cardiovascular disease, antihypertensive medication usage) and identified additional time-varying predictors (proximity of cardiovascular events, body mass index (both levels and changes), pulse pressure, and the frequency of blood pressure measurements).

CONCLUSION: The optimal time-varying machine learning model exhibited greater predictive performance than existing AF risk models and reflected known and new patient risk factors for AF.
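The two performance measures reported in the results, AUROC and number needed to screen (NNS) at a fixed sensitivity, can be illustrated with a minimal sketch. It assumes NNS is defined as the number of patients flagged per true AF case found (the reciprocal of positive predictive value) at the highest score threshold reaching the target sensitivity; the abstract does not spell out the exact definition. The function names and toy data are hypothetical and are not taken from the study.

```python
import numpy as np


def auroc(y_true, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    order = np.argsort(s, kind="mergesort")
    ranks = np.empty(len(s), dtype=float)
    ranks[order] = np.arange(1, len(s) + 1)
    for value in np.unique(s):  # replace ranks of tied scores by their mean
        tied = s == value
        ranks[tied] = ranks[tied].mean()
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    u = ranks[y].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def nns_at_sensitivity(y_true, scores, target=0.75):
    """Patients flagged per true case at the highest threshold reaching `target` sensitivity.

    Assumption: NNS = 1 / positive predictive value at that threshold.
    """
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    if not y.any():
        raise ValueError("at least one positive case is required")
    for threshold in np.sort(np.unique(s))[::-1]:  # strictest threshold first
        flagged = s >= threshold
        true_pos = int(np.sum(flagged & y))
        if true_pos / y.sum() >= target:
            return flagged.sum() / true_pos
    raise ValueError("target sensitivity not reachable")


# Toy data (hypothetical): 1 = developed AF, 0 = did not
y = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
print(auroc(y, risk))
print(nns_at_sensitivity(y, risk, target=0.75))
```

Under this definition, a lower NNS at the same sensitivity means fewer patients must be screened to find each AF case, which is how the 9 vs. 13 comparison reads.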

