A chronological map of 308 physical and mental health conditions from 4 million individuals in the English National Health Service
Background: To effectively prevent, detect, and treat health conditions that affect people during their lifecourse, health-care professionals and researchers need to know which sections of the population are susceptible to which health conditions and at which ages. Hence, we aimed to map the course of human health by identifying the 50 most common health conditions in each decade of life and estimating the median age at first diagnosis. Methods: We developed phenotyping algorithms and codelists for physical and mental health conditions that involve intensive use of health-care resources. Individuals older than 1 year were included in the study if their primary-care and hospital-admission records met research standards set by the Clinical Practice Research Datalink and they had been registered in a general practice in England contributing up-to-standard data for at least 1 year during the study period. We used linked records of individuals from the CALIBER platform to calculate the sex-standardised cumulative incidence for these conditions by 10-year age groups between April 1, 2010, and March 31, 2015. We also derived the median age at diagnosis and prevalence estimates stratified by age, sex, and ethnicity (black, white, south Asian) over the study period from the primary-care and secondary-care records of patients. Findings: We developed case definitions for 308 disease phenotypes. We used records of 2 784 138 patients for the calculation of cumulative incidence and of 3 872 451 patients for the calculation of period prevalence and median age at diagnosis of these conditions.
Conditions that first gained prominence at key stages of life were: atopic conditions and infections that led to hospital admission in children (<10 years); acne and menstrual disorders in the teenage years (10-19 years); mental health conditions, obesity, and migraine in individuals aged 20-29 years; soft-tissue disorders and gastro-oesophageal reflux disease in individuals aged 30-39 years; dyslipidaemia, hypertension, and erectile dysfunction in individuals aged 40-59 years; cancer, osteoarthritis, benign prostatic hyperplasia, cataract, diverticular disease, type 2 diabetes, and deafness in individuals aged 60-79 years; and atrial fibrillation, dementia, acute and chronic kidney disease, heart failure, ischaemic heart disease, anaemia, and osteoporosis in individuals aged 80 years or older. Black or south-Asian individuals were diagnosed earlier than white individuals for 258 (84%) of the 308 conditions. Bone fractures and atopic conditions were recorded earlier in male individuals, whereas female individuals were diagnosed at younger ages with nutritional anaemias, tubulointerstitial nephritis, and urinary disorders. Interpretation: We have produced the first chronological map of human health with cumulative-incidence and period-prevalence estimates for multiple morbidities in parallel from birth to advanced age. This can guide clinicians, policy makers, and researchers on how to formulate differential diagnoses, allocate resources, and target research priorities on the basis of the knowledge of who gets which diseases when. We have published our phenotyping algorithms on the CALIBER open-access Portal which will facilitate future research by providing a curated list of reusable case definitions. 
Funding: Wellcome Trust, National Institute for Health Research, Medical Research Council, Arthritis Research UK, British Heart Foundation, Cancer Research UK, Chief Scientist Office of the Scottish Government Health and Social Care Directorates, Department of Health and Social Care (England), Health and Social Care Research and Development Division (Welsh Government), Public Health Agency (Northern Ireland), Economic and Social Research Council, Engineering and Physical Sciences Research Council, National Institute for Social Care and Health Research, and The Alan Turing Institute.
Improving the odds of drug development success through human genomics: modelling study
Lack of efficacy in the intended disease indication is the major cause of clinical phase drug development failure. Explanations could include the poor external validity of preclinical (cell, tissue, and animal) models of human disease and the high false discovery rate (FDR) in preclinical science. FDR is related to the proportion of true relationships available for discovery (γ), and the type 1 (false-positive) and type 2 (false-negative) error rates of the experiments designed to uncover them. We estimated the FDR in preclinical science, its effect on drug development success rates, and improvements expected from use of human genomics rather than preclinical studies as the primary source of evidence for drug target identification. Calculations were based on a sample space defined by all human diseases (the 'disease-ome'), represented as columns, and all protein-coding genes (the 'protein-coding genome'), represented as rows, producing a matrix of unique gene- (or protein-) disease pairings. We parameterised the space based on 10,000 diseases, 20,000 protein-coding genes, 100 causal genes per disease, and 4000 genes encoding druggable targets, examining the effect of varying the parameters and a range of underlying assumptions on the inferences drawn. We estimated γ, defined mathematical relationships between preclinical FDR and drug development success rates, and estimated improvements in success rates based on human genomics (rather than orthodox preclinical studies). Around one in every 200 protein-disease pairings was estimated to be causal (γ = 0.005), giving an FDR in preclinical research of 92.6%, which likely makes a major contribution to the reported drug development failure rate of 96%. The observed success rate was only slightly greater than expected for a random pick from the sample space.
Values for γ back-calculated from reported preclinical and clinical drug development success rates were also close to the a priori estimates. Substituting genome-wide (or druggable genome-wide) association studies for preclinical studies as the major information source for drug target identification was estimated to reverse the probability of late-stage failure because of the more stringent type 1 error rate employed and the ability to interrogate every potential druggable target in the same experiment. Genetic studies conducted at much larger scale, with greater resolution of disease end-points (e.g. by connecting genomics and electronic health record data within healthcare systems), have the potential to produce radical improvement in drug development success rates.
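The headline figures in this abstract can be checked with a few lines of arithmetic. The Python sketch below derives γ from the stated sample space and then applies the standard relationship FDR = α(1 − γ) / (α(1 − γ) + (1 − β)γ). The abstract does not state the error rates it used; α = 0.05 and power (1 − β) = 0.8 are conventional values assumed here, and with them the formula gives the reported 92.6%. The genomic comparison uses α = 5 × 10⁻⁸ (the usual genome-wide significance threshold) with the same power, again as an assumption for illustration rather than the authors' parameterisation. Function names are hypothetical.

```python
def gamma_from_sample_space(n_diseases, n_genes, causal_genes_per_disease):
    """Proportion of gene-disease pairings that are truly causal."""
    total_pairs = n_diseases * n_genes
    causal_pairs = n_diseases * causal_genes_per_disease
    return causal_pairs / total_pairs


def false_discovery_rate(gamma, alpha, power):
    """Share of declared positives that are false, given prior gamma."""
    false_positives = alpha * (1 - gamma)   # null pairings declared positive
    true_positives = power * gamma          # causal pairings correctly detected
    return false_positives / (false_positives + true_positives)


# Sample-space parameters taken from the abstract.
gamma = gamma_from_sample_space(10_000, 20_000, 100)

# ASSUMPTION: alpha = 0.05 and power = 0.8 are not stated in the abstract.
fdr_preclinical = false_discovery_rate(gamma, alpha=0.05, power=0.8)

# ASSUMPTION: genome-wide significance threshold, same power, for illustration.
fdr_genomic = false_discovery_rate(gamma, alpha=5e-8, power=0.8)

print(f"gamma = {gamma:.3f}")
print(f"preclinical FDR = {fdr_preclinical:.1%}")
print(f"genomic FDR = {fdr_genomic:.2e}")
```

Because γ is small, the false-positive term dominates the denominator unless α is made very small, which is why the stringency of the type 1 error rate matters so much in this model.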
UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER
Spiros Denaxas, Arturo González-Izquierdo, Kenan Direk et al. | Journal of the American Medical Informatics Association | 2019
OBJECTIVE: Electronic health records (EHRs) are a rich source of information on human diseases, but the information is variably structured, fragmented, curated using different coding systems, and collected for purposes other than medical research. We describe an approach for developing, validating, and sharing reproducible phenotypes from national structured EHR in the United Kingdom with applications for translational research. MATERIALS AND METHODS: We implemented a rule-based phenotyping framework, with up to 6 approaches of validation. We applied our framework to a sample of 15 million individuals in a national EHR data source (population-based primary care, all ages) linked to hospitalization and death records in England. Data comprised continuous measurements (for example, blood pressure), medication information, and coded diagnoses, symptoms, procedures, and referrals, recorded using 5 controlled clinical terminologies: (1) Read (primary care, subset of SNOMED-CT [Systematized Nomenclature of Medicine Clinical Terms]), (2) International Classification of Diseases-Ninth Revision and Tenth Revision (secondary care diagnoses and cause of mortality), (3) Office of Population Censuses and Surveys Classification of Surgical Operations and Procedures, Fourth Revision (hospital surgical procedures), and (4) DM+D prescription codes. RESULTS: Using the CALIBER phenotyping framework, we created algorithms for 51 diseases, syndromes, biomarkers, and lifestyle risk factors and provide up to 6 validation approaches. The EHR phenotypes are curated in the open-access CALIBER Portal (https://www.caliberresearch.org/portal) and have been used by 40 national and international research groups in 60 peer-reviewed publications.
CONCLUSIONS: We describe a UK EHR phenomics approach within the CALIBER EHR data platform with initial evidence of validity and use, as an important step toward international use of UK EHR data for health research.
Identifying and visualising multimorbidity and comorbidity patterns in patients in the English National Health Service: a population-based study
BACKGROUND: Globally, there is a paucity of multimorbidity and comorbidity data, especially for minority ethnic groups and younger people. We estimated the frequency of common disease combinations and identified non-random disease associations for all ages in a multiethnic population. METHODS: In this population-based study, we examined multimorbidity and comorbidity patterns stratified by ethnicity or race, sex, and age for 308 health conditions using electronic health records from individuals included on the Clinical Practice Research Datalink linked with the Hospital Episode Statistics admitted patient care dataset in England. We included individuals who were older than 1 year and who had been registered for at least 1 year in a participating general practice during the study period (between April 1, 2010, and March 31, 2015). We identified the most common combinations of conditions and comorbidities for index conditions. We defined comorbidity as the accumulation of additional conditions to an index condition over an individual's lifetime. We used network analysis to identify conditions that co-occurred more often than expected by chance. We developed online interactive tools to explore multimorbidity and comorbidity patterns overall and by subgroup based on ethnicity, sex, and age. FINDINGS: We collected data for 3 872 451 eligible patients, of whom 1 955 700 (50·5%) were women and girls, 1 916 751 (49·5%) were men and boys, 2 666 234 (68·9%) were White, 155 435 (4·0%) were south Asian, and 98 815 (2·6%) were Black.
We found that a higher proportion of boys aged 1-9 years (132 506 [47·8%] of 277 158) had two or more diagnosed conditions than did girls in the same age group (106 982 [40·3%] of 265 179), but more women and girls were diagnosed with multimorbidity than were boys aged 10 years and older and men (1 361 232 [80·5%] of 1 690 521 vs 1 161 308 [70·8%] of 1 639 593). White individuals (2 097 536 [78·7%] of 2 666 234) were more likely to be diagnosed with two or more conditions than were Black (59 339 [60·1%] of 98 815) or south Asian individuals (93 617 [60·2%] of 155 435). Depression commonly co-occurred with anxiety, migraine, obesity, atopic conditions, deafness, soft-tissue disorders, and gastrointestinal disorders across all subgroups. Heart failure often co-occurred with hypertension, atrial fibrillation, osteoarthritis, stable angina, myocardial infarction, chronic kidney disease, type 2 diabetes, and chronic obstructive pulmonary disease. Spinal fractures were most strongly non-randomly associated with malignancy in Black individuals, but with osteoporosis in White individuals. Hypertension was most strongly associated with kidney disorders in those aged 20-29 years, but with dyslipidaemia, obesity, and type 2 diabetes in individuals aged 40 years and older. Breast cancer was associated with different comorbidities in individuals from different ethnic groups. Asthma was associated with different comorbidities between males and females. Bipolar disorder was associated with different comorbidities in younger age groups compared with older age groups. 
INTERPRETATION: Our findings and interactive online tools are a resource for: patients and their clinicians, to prevent and detect comorbid conditions; research funders and policy makers, to redesign service provision, training priorities, and guideline development; and biomedical researchers and manufacturers of medicines, to provide leads for research into common or sequential pathways of disease and inform the design of clinical trials. FUNDING: UK Research and Innovation, Medical Research Council, National Institute for Health and Care Research, Department of Health and Social Care, Wellcome Trust, British Heart Foundation, and The Alan Turing Institute.
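The idea of two conditions co-occurring "more often than expected by chance" can be illustrated with an observed-to-expected ratio. The abstract does not say which association measure the network analysis used, so the Python sketch below is a simple stand-in rather than the authors' method: under independence, the expected number of people with both conditions is n_a × n_b / n_total, and a ratio above 1 indicates co-occurrence in excess of chance. The function name and all counts are hypothetical.

```python
def observed_to_expected(n_total, n_a, n_b, n_both):
    """Observed co-occurrence divided by the count expected under independence."""
    if min(n_total, n_a, n_b) <= 0:
        raise ValueError("n_total, n_a and n_b must be positive")
    expected_both = n_a * n_b / n_total
    return n_both / expected_both


# HYPOTHETICAL counts, chosen only to illustrate the arithmetic.
ratio = observed_to_expected(
    n_total=100_000,  # people in the population
    n_a=10_000,       # people with condition A
    n_b=5_000,        # people with condition B
    n_both=1_500,     # people with both A and B
)
print(f"observed/expected = {ratio:.1f}")
```

In a network view, each condition would be a node and an edge would be drawn where a ratio of this kind (or a related statistic) is convincingly above 1 within the subgroup of interest.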
Association of Smoking, Alcohol Consumption, Blood Pressure, Body Mass Index, and Glycemic Risk Factors With Age-Related Macular Degeneration
IMPORTANCE: Advanced age-related macular degeneration (AMD) is a leading cause of blindness in Western countries. Causal, modifiable risk factors need to be identified to develop preventive measures for advanced AMD. OBJECTIVE: To assess whether smoking, alcohol consumption, blood pressure, body mass index, and glycemic traits are associated with increased risk of advanced AMD. DESIGN, SETTING, AND PARTICIPANTS: This study used 2-sample mendelian randomization. Genetic instruments composed of variants associated with risk factors at genome-wide significance (P < 5 × 10⁻⁸) were obtained from published genome-wide association studies. Summary-level statistics for these instruments were obtained for advanced AMD from the International AMD Genomics Consortium 2016 data set, which consisted of 16 144 individuals with AMD and 17 832 control individuals. Data were analyzed from July 2020 to September 2021. EXPOSURES: Smoking initiation, smoking cessation, lifetime smoking, age at smoking initiation, alcoholic drinks per week, body mass index, systolic and diastolic blood pressure, type 2 diabetes, glycated hemoglobin, fasting glucose, and fasting insulin. MAIN OUTCOMES AND MEASURES: Advanced AMD and its subtypes, geographic atrophy (GA), and neovascular AMD. RESULTS: A 1-SD increase in log odds of genetically predicted smoking initiation was associated with higher risk of advanced AMD (odds ratio [OR], 1.26; 95% CI, 1.13-1.40; P < .001), while a 1-SD increase in log odds of genetically predicted smoking cessation (former vs current smoking) was associated with lower risk of advanced AMD (OR, 0.66; 95% CI, 0.50-0.87; P = .003). Genetically predicted increased lifetime smoking was associated with increased risk of advanced AMD (OR per 1-SD increase in lifetime smoking behavior, 1.32; 95% CI, 1.09-1.59; P = .004).
Genetically predicted alcohol consumption was associated with higher risk of GA (OR per 1-SD increase of log-transformed alcoholic drinks per week, 2.70; 95% CI, 1.48-4.94; P = .001). There was insufficient evidence to suggest that genetically predicted blood pressure, body mass index, and glycemic traits were associated with advanced AMD. CONCLUSIONS AND RELEVANCE: This study provides genetic evidence that increased alcohol intake may be a causal risk factor for GA. As there are currently no known treatments for GA, this finding has important public health implications. These results also support previous observational studies associating smoking behavior with risk of advanced AMD, thus reinforcing existing public health messages regarding the risk of blindness associated with smoking.
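For readers unfamiliar with 2-sample mendelian randomization, the Python sketch below shows how per-variant summary statistics can be combined into a single odds ratio. The abstract does not name the estimator used; the inverse-variance weighted (IVW) method shown here is a common choice and is an assumption, as are the function name and the toy summary statistics. Each variant contributes a ratio estimate (outcome effect divided by exposure effect), weighted by the precision of that ratio.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal log odds ratio and its SE."""
    # First-order weights: precision of each per-variant ratio estimate.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    # Per-variant ratio estimates: outcome effect per unit of exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1 / total_weight)
    return beta, se


# HYPOTHETICAL summary statistics for two variants (not from the study).
beta_exposure = [0.1, 0.2]    # variant effects on the exposure
beta_outcome = [0.02, 0.04]   # variant effects on the outcome (log odds)
se_outcome = [0.01, 0.01]     # standard errors of the outcome effects

beta, se = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Exponentiating the pooled estimate and its confidence limits yields an odds ratio with a 95% CI in the same form as those reported in the Results.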