Plasma proteomic associations with genetics and health in the UK Biobank
Abstract: The Pharma Proteomics Project is a precompetitive biopharmaceutical consortium characterizing the plasma proteomic profiles of 54,219 UK Biobank participants. Here we provide a detailed summary of this initiative, including technical and biological validations, insights into proteomic disease signatures, and prediction modelling for various demographic and health indicators. We present comprehensive protein quantitative trait locus (pQTL) mapping of 2,923 proteins that identifies 14,287 primary genetic associations, of which 81% are previously undescribed, alongside ancestry-specific pQTL mapping in non-European individuals. The study provides an updated characterization of the genetic architecture of the plasma proteome, contextualized with projected pQTL discovery rates as sample sizes and proteomic assay coverages increase over time. We offer extensive insights into trans pQTLs across multiple biological domains, highlight genetic influences on ligand–receptor interactions and pathway perturbations across a diverse collection of cytokines and complement networks, and illustrate long-range epistatic effects of ABO blood group and FUT2 secretor status on proteins with gastrointestinal tissue-enriched expression. We demonstrate the utility of these data for drug discovery by extending the genetic proxied effects of protein targets, such as PCSK9, on additional endpoints, and disentangle specific genes and proteins perturbed at loci associated with COVID-19 susceptibility. This public–private partnership provides the scientific community with an open-access proteomics resource of considerable breadth and depth to help to elucidate the biological mechanisms underlying proteo-genomic discoveries and accelerate the development of biomarkers, predictive models and therapeutics¹.
Diverse Sources of C. difficile Infection Identified on Whole-Genome Sequencing
David W. Eyre, Madeleine Cule, Daniel J. Wilson et al. | New England Journal of Medicine | 2013
BACKGROUND: It has been thought that Clostridium difficile infection is transmitted predominantly within health care settings. However, endemic spread has hampered identification of precise sources of infection and the assessment of the efficacy of interventions. METHODS: From September 2007 through March 2011, we performed whole-genome sequencing on isolates obtained from all symptomatic patients with C. difficile infection identified in health care settings or in the community in Oxfordshire, United Kingdom. We compared single-nucleotide variants (SNVs) between the isolates, using C. difficile evolution rates estimated on the basis of the first and last samples obtained from each of 145 patients, with 0 to 2 SNVs expected between transmitted isolates obtained less than 124 days apart, on the basis of a 95% prediction interval. We then identified plausible epidemiologic links among genetically related cases from data on hospital admissions and community location. RESULTS: Of 1250 C. difficile cases that were evaluated, 1223 (98%) were successfully sequenced. In a comparison of 957 samples obtained from April 2008 through March 2011 with those obtained from September 2007 onward, a total of 333 isolates (35%) had no more than 2 SNVs from at least 1 earlier case, and 428 isolates (45%) had more than 10 SNVs from all previous cases. Reductions in incidence over time were similar in the two groups, a finding that suggests an effect of interventions targeting the transition from exposure to disease. Of the 333 patients with no more than 2 SNVs (consistent with transmission), 126 patients (38%) had close hospital contact with another patient, and 120 patients (36%) had no hospital or community contact with another patient.
Distinct subtypes of infection continued to be identified throughout the study, which suggests a considerable reservoir of C. difficile. CONCLUSIONS: Over a 3-year period, 45% of C. difficile cases in Oxfordshire were genetically distinct from all previous cases. Genetically diverse sources, in addition to symptomatic patients, play a major part in C. difficile transmission. (Funded by the U.K. Clinical Research Collaboration Translational Infection Research Initiative and others.)
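The SNV thresholds quoted in this abstract (no more than 2 SNVs from at least one earlier case, more than 10 SNVs from all previous cases) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the "intermediate" label, and the use of plain aligned strings as stand-ins for genomes are assumptions made for illustration, and the sketch ignores the 124-day sampling window mentioned in the methods.

```python
from typing import List

# Thresholds quoted in the abstract; how they are applied here is an assumption.
TRANSMISSION_MAX_SNVS = 2   # "no more than 2 SNVs" from at least one earlier case
DISTINCT_MIN_SNVS = 10      # "more than 10 SNVs" from all previous cases


def snv_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def classify_case(isolate: str, earlier: List[str]) -> str:
    """Label an isolate by its SNV distance to the closest earlier isolate.

    The labels are hypothetical names chosen for this sketch.
    """
    if not earlier:
        return "no_earlier_cases"
    closest = min(snv_distance(isolate, e) for e in earlier)
    if closest <= TRANSMISSION_MAX_SNVS:
        return "consistent_with_transmission"
    if closest > DISTINCT_MIN_SNVS:
        return "genetically_distinct"
    # Between 3 and 10 SNVs: neither category in the abstract applies.
    return "intermediate"
```

Taking the minimum over earlier isolates reflects the wording of the abstract: one close earlier case is enough to be consistent with transmission, whereas a case counts as genetically distinct only if every earlier case is more than 10 SNVs away.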
Evolutionary dynamics of Staphylococcus aureus during progression from carriage to disease
Bernadette Young, Tanya Golubchik, Elizabeth M. Batty et al. | Proceedings of the National Academy of Sciences | 2012
Whole-genome sequencing offers new insights into the evolution of bacterial pathogens and the etiology of bacterial disease. Staphylococcus aureus is a major cause of bacteria-associated mortality and invasive disease and is carried asymptomatically by 27% of adults. Eighty percent of bacteremias match the carried strain. However, the role of evolutionary change in the pathogen during the progression from carriage to disease is incompletely understood. Here we use high-throughput genome sequencing to discover the genetic changes that accompany the transition from nasal carriage to fatal bloodstream infection in an individual colonized with methicillin-sensitive S. aureus. We found a single, cohesive population exhibiting a repertoire of 30 single-nucleotide polymorphisms and four insertion/deletion variants. Mutations accumulated at a steady rate over a 13-mo period, except for a cluster of mutations preceding the transition to disease. Although bloodstream bacteria differed by just eight mutations from the original nasally carried bacteria, half of those mutations caused truncation of proteins, including a premature stop codon in an AraC-family transcriptional regulator that has been implicated in pathogenicity. Comparison with evolution in two asymptomatic carriers supported the conclusion that clusters of protein-truncating mutations are highly unusual. Our results demonstrate that bacterial diversity in vivo is limited but nonetheless detectable by whole-genome sequencing, enabling the study of evolutionary dynamics within the host.
Regulatory or structural changes that occur during carriage may be functionally important for pathogenesis; therefore identifying those changes is a crucial step in understanding the biological causes of invasive bacterial disease.
Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning
Cardiometabolic diseases are an increasing global health burden. While socioeconomic, environmental, behavioural, and genetic risk factors have been identified, a better understanding of the underlying mechanisms is required to develop more effective interventions. Magnetic resonance imaging (MRI) has been used to assess organ health, but biobank-scale studies are still in their infancy. Using over 38,000 abdominal MRI scans in the UK Biobank, we used deep learning to quantify volume, fat, and iron in seven organs and tissues, and demonstrate that imaging-derived phenotypes reflect health status. We show that these traits have a substantial heritable component (8-44%) and identify 93 independent genome-wide significant associations, including four associations with liver traits that have not previously been reported. Our work demonstrates the tractability of deep learning to systematically quantify health parameters from high-throughput MRI across a range of organs and tissues, and uses the largest-ever study of its kind to generate new insights into the genetic architecture of these traits.
Within-Host Evolution of Staphylococcus aureus during Asymptomatic Carriage
BACKGROUND: Staphylococcus aureus is a major cause of healthcare-associated mortality, but like many important bacterial pathogens, it is a common constituent of the normal human body flora. Around a third of healthy adults are carriers. Recent evidence suggests that evolution of S. aureus during nasal carriage may be associated with progression to invasive disease. However, a more detailed understanding of within-host evolution under natural conditions is required to appreciate the evolutionary and mechanistic reasons why commensal bacteria such as S. aureus cause disease. Therefore we examined in detail the evolutionary dynamics of normal, asymptomatic carriage. Sequencing a total of 131 genomes across 13 singly colonized hosts using the Illumina platform, we investigated diversity, selection, population dynamics and transmission during the short-term evolution of S. aureus. PRINCIPAL FINDINGS: We characterized the processes by which the raw material for evolution is generated: micro-mutation (point mutation and small insertions/deletions), macro-mutation (large insertions/deletions) and the loss or acquisition of mobile elements (plasmids and bacteriophages). Through an analysis of synonymous, non-synonymous and intergenic mutations we discovered a fitness landscape dominated by purifying selection, with rare examples of adaptive change in genes encoding surface-anchored proteins and an enterotoxin. We found evidence for dramatic, hundred-fold fluctuations in the size of the within-host population over time, which we related to the cycle of colonization and clearance. Using a newly developed population genetics approach to detect recent transmission among hosts, we revealed evidence for recent transmission between some of our subjects, including a husband and wife both carrying populations of methicillin-resistant S. aureus (MRSA).
SIGNIFICANCE: This investigation begins to paint a picture of the within-host evolution of an important bacterial pathogen during its prevailing natural state, asymptomatic carriage. These results also have wider significance as a benchmark for future systematic studies of evolution during invasive S. aureus disease.