Integrating large scale genetic and clinical information to predict cases of heart failure
Abstract
Heart failure (HF) is a major global cause of death. Early risk prediction and intervention could mitigate disease progression. We aimed to improve HF prediction by integrating risk scores derived from genome-wide association studies (GWAS) and from electronic health records (EHR). We previously performed a large HF GWAS within the Global Biobank Meta-analysis Initiative to create a polygenic risk score (PRS). Three Michigan Medicine (MM) cohorts were used to develop the clinical risk score (ClinRS): 1) Primary Care Provider cohort (MM-PCP; N = 61,849), 2) Heart Failure cohort (MM-HF; N = 53,272), and 3) Michigan Genomics Initiative cohort (MM-MGI; N = 60,215). To extract information from high-dimensional EHR data, we leveraged natural language processing to generate 350 latent phenotypes representing EHR codes. We then used the coefficients from a LASSO regression on these phenotypes in a training set as weights to calculate ClinRS in a validation set. Using logistic regression, we compared model performance between a baseline model and models with risk scores added: 1) PRS, 2) ClinRS, and 3) ClinRS+PRS. We further compared the proposed models with the Atherosclerosis Risk in Communities (ARIC) HF risk score. PRS and ClinRS each predict HF outcomes significantly better than the baseline model, up to eight years prior to HF diagnosis. Including both PRS and ClinRS further improves prediction performance up to ten years prior to diagnosis, two years earlier than either score alone. Additionally, ClinRS significantly outperforms the ARIC model one year prior to diagnosis. We demonstrate the additive power of integrating GWAS- and EHR-derived risk scores to predict HF cases prior to diagnosis. This standardizable and scalable risk predictor may enable physicians to provide earlier interventions to improve patient outcomes.
Heart failure (HF) is a leading cause of death worldwide. Early identification of individuals at high risk could facilitate interventions to slow disease progression. In this study, we develop an approach to improve HF risk prediction by combining patient genetic information and clinical information from electronic health records (EHR). We create two risk scores: a polygenic risk score (PRS) based on genetic information, and a clinical risk score (ClinRS) based on patient EHR. We test how well these scores predict HF before diagnosis. Both PRS and ClinRS improve predictions individually and identify high-risk individuals up to eight years in advance. When used together, they provide greater accuracy, predicting HF up to ten years before diagnosis. We suggest that combining genetic and clinical information could help doctors detect HF earlier for better treatment and prevention strategies in the future.
Kuan-Han et al. examine whether integrating patient genetic and clinical information from electronic health records can better predict heart failure in patients. Their findings show improvement in heart failure prediction up to ten years prior to diagnosis, which is two years earlier than using a single risk score alone.
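The abstract describes ClinRS as a weighted sum: LASSO coefficients estimated in a training set serve as weights on each patient's latent phenotype values in a validation set. The sketch below illustrates only that scoring step, under stated assumptions. The phenotype names, coefficient values, and patient data are hypothetical, three phenotypes stand in for the 350 used in the study, and the LASSO fit itself is not shown.

```python
from math import fsum

# Hypothetical LASSO coefficients, as if already estimated in a training set.
# LASSO shrinks many coefficients to exactly zero, so those latent phenotypes
# contribute nothing to the score.
lasso_weights = {"latent_1": 0.8, "latent_2": 0.0, "latent_3": -0.3}


def clinical_risk_score(phenotypes, weights):
    """Return the weighted sum of a patient's latent phenotype values.

    A phenotype missing for a patient is treated as 0.0 (an assumption
    made for this illustration only).
    """
    return fsum(w * phenotypes.get(name, 0.0) for name, w in weights.items())


# Hypothetical validation-set patients with their latent phenotype values.
validation_patients = {
    "patient_a": {"latent_1": 1.0, "latent_2": 2.0, "latent_3": 0.5},
    "patient_b": {"latent_1": 0.2, "latent_2": 0.1, "latent_3": 1.5},
}

# One score per patient; a higher score means higher predicted clinical risk.
scores = {
    pid: clinical_risk_score(values, lasso_weights)
    for pid, values in validation_patients.items()
}
```

In a setup like the one the abstract outlines, a score of this kind would then enter a logistic regression as a covariate, alone or together with the PRS, for comparison against a baseline model.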