Proteomic signatures improve risk prediction for common and rare diseases

Julia Carrasco-Zanini (Queen Mary University of London), Maik Pietzner (Queen Mary University of London), Jonathan Davitte, Praveen Surendran (Age UK), Damien C. Croteau‐Chonka, Chloe Robins, Ana Torralbo (University College London), Christopher Tomlinson (National Institute for Health and Care Research), Florian Grünschläger (German Cancer Research Center), Natalie Fitzpatrick (University College London), C. R. Ytsma (University College London), Tokuwa Kanno, Stephan Gade, Daniel F. Freitag (Age UK), Frederik Ziebell, Simon Haas (Queen Mary University of London), Spiros Denaxas (National Institute for Health and Care Research), Joanna Betts (Age UK), Nicholas J. Wareham (University of Cambridge), Harry Hemingway (National Institute for Health and Care Research), Robert A. Scott (Age UK), Claudia Langenberg (Queen Mary University of London)
Nature Medicine
July 22, 2024
Cited by 184 | Open Access

Abstract

For many diseases there are delays in diagnosis due to a lack of objective biomarkers for disease onset. Here, in 41,931 individuals from the United Kingdom Biobank Pharma Proteomics Project, we integrated measurements of ~3,000 plasma proteins with clinical information to derive sparse prediction models for the 10-year incidence of 218 common and rare diseases (81-6,038 cases). We then compared prediction models developed using proteomic data with models developed using either basic clinical information alone or clinical information combined with data from 37 clinical assays. The predictive performance of sparse models including as few as 5 to 20 proteins was superior to the performance of models developed using basic clinical information for 67 pathologically diverse diseases (median delta C-index = 0.07; range = 0.02-0.31). Sparse protein models further outperformed models developed using basic information combined with clinical assay data for 52 diseases, including multiple myeloma, non-Hodgkin lymphoma, motor neuron disease, pulmonary fibrosis and dilated cardiomyopathy. For multiple myeloma, single-cell RNA sequencing from bone marrow in newly diagnosed patients showed that four of the five predictor proteins were expressed specifically in plasma cells, consistent with the strong predictive power of these proteins. External replication of sparse protein models in the EPIC-Norfolk study showed good generalizability for prediction of the six diseases tested. These findings show that sparse plasma protein signatures, including both disease-specific proteins and protein predictors shared across several diseases, offer clinically useful prediction of common and rare diseases.
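
The delta C-index quoted in the abstract is the difference in concordance index between two prediction models evaluated on the same individuals. The following is a minimal sketch of that comparison, assuming Harrell's C-index for right-censored data (the abstract does not state which estimator the authors used). The function name `concordance_index`, the variable names, and the toy follow-up data are all hypothetical and are not taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored follow-up data (illustrative).

    A pair is comparable when the subject with the shorter follow-up time
    had the event. It is concordant when that subject also has the higher
    predicted risk. Tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied follow-up times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the ordering is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not from the study).
times = [2.0, 4.0, 5.0, 7.0, 10.0, 10.0]   # years of follow-up
events = [1, 1, 0, 1, 0, 0]                # 1 = incident disease, 0 = censored
clinical_risk = [0.3, 0.6, 0.2, 0.5, 0.4, 0.1]  # hypothetical clinical-model scores
protein_risk = [0.9, 0.7, 0.3, 0.5, 0.2, 0.1]   # hypothetical protein-model scores

c_clinical = concordance_index(times, events, clinical_risk)
c_protein = concordance_index(times, events, protein_risk)
delta_c = c_protein - c_clinical  # positive when the protein model ranks better
print(f"clinical={c_clinical:.3f} protein={c_protein:.3f} delta={delta_c:.3f}")
```

A positive `delta_c` means the second model ranks individuals who develop disease earlier above those who do not more often than the first model does. The pairwise loop is quadratic in the number of individuals, so a cohort of the size described here would need an optimized implementation from an established survival analysis library.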

