Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Abstract: The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
Inherited causes of clonal haematopoiesis in 97,691 whole genomes
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
Daniel Taliun, Daniel Harris, Michael D. Kessler et al. | bioRxiv (Cold Spring Harbor Laboratory) | 2019
Summary paragraph: The Trans-Omics for Precision Medicine (TOPMed) program seeks to elucidate the genetic architecture and disease biology of heart, lung, blood, and sleep disorders, with the ultimate goal of improving diagnosis, treatment, and prevention. The initial phases of the program focus on whole genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here, we describe TOPMed goals and design as well as resources and early insights from the sequence data. The resources include a variant browser, a genotype imputation panel, and sharing of genomic and phenotypic data via dbGaP. In 53,581 TOPMed samples, >400 million single-nucleotide and insertion/deletion variants were detected by alignment with the reference genome. Additional novel variants are detectable through assembly of unmapped reads and customized analysis in highly variable loci. Among the >400 million variants detected, 97% have frequency <1% and 46% are singletons. These rare variants provide insights into mutational processes and recent human evolutionary history. The nearly complete catalog of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and non-coding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and extends the reach of nearly all genome-wide association studies to include variants down to ~0.01% in frequency.
Genome‐wide association scan of quantitative traits for attention deficit hyperactivity disorder identifies novel associations and confirms candidate gene associations
Jessica Lasky‐Su, Benjamin M. Neale, Barbara Franke et al. | American Journal of Medical Genetics Part B Neuropsychiatric Genetics | 2008
Attention deficit hyperactivity disorder (ADHD) is a complex condition with environmental and genetic etiologies. Up to this point, research has identified genetic associations with candidate genes from known biological pathways. In order to identify novel ADHD susceptibility genes, 600,000 SNPs were genotyped in 958 ADHD proband-parent trios. After applying data cleaning procedures, we examined 429,981 autosomal SNPs in 909 family trios. We generated six quantitative phenotypes from 18 ADHD symptoms to be used in genome-wide association analyses. With the PBAT screening algorithm, we identified 2 SNPs, rs6565113 and rs552655, that met the criteria for significance within a specified phenotype. These SNPs are located in intronic regions of the genes CDH13 and GFOD1, respectively. CDH13 has been implicated previously in substance use disorders. We also evaluated the association of SNPs from a list of 37 ADHD candidate genes that was specified a priori. These findings, along with association P-values with a magnitude less than 10^-5, are discussed in this manuscript. Seventeen of these candidate genes had association P-values lower than 0.01: SLC6A1, SLC9A9, HES1, ADRB2, HTR1E, DDC, ADRA1A, DBH, DRD2, BDNF, TPH2, HTR2A, SLC6A2, PER1, CHRNA4, SNAP25, and COMT. Among the candidate genes, SLC9A9 had the strongest overall associations, with 58 association test P-values lower than 0.01 and multiple association P-values at a magnitude of 10^-5 in this gene. In sum, these findings identify novel genetic associations at viable ADHD candidate genes and provide confirmatory evidence for associations at previous candidate genes. Replication of these results is necessary in order to confirm the proposed genetic variants for ADHD.
MMP12, Lung Function, and COPD in High-Risk Populations
BACKGROUND: Genetic variants influencing lung function in children and adults may ultimately lead to the development of chronic obstructive pulmonary disease (COPD), particularly in high-risk groups.
METHODS: We tested for an association between single-nucleotide polymorphisms (SNPs) in the gene encoding matrix metalloproteinase 12 (MMP12) and a measure of lung function (prebronchodilator forced expiratory volume in 1 second [FEV1]) in more than 8300 subjects in seven cohorts that included children and adults. Within the Normative Aging Study (NAS), a cohort of initially healthy adult men, we tested for an association between SNPs that were associated with FEV1 and the time to the onset of COPD. We then examined the relationship between MMP12 SNPs and COPD in two cohorts of adults with COPD or at risk for COPD.
RESULTS: The minor allele (G) of a functional variant in the promoter region of MMP12 (rs2276109 [-82A→G]) was positively associated with FEV1 in a combined analysis of children with asthma and adult former and current smokers in all cohorts (P=2×10^-6). This allele was also associated with a reduced risk of the onset of COPD in the NAS cohort (hazard ratio, 0.65; 95% confidence interval [CI], 0.46 to 0.92; P=0.02) and with a reduced risk of COPD in a cohort of smokers (odds ratio, 0.63; 95% CI, 0.45 to 0.88; P=0.005) and among participants in a family-based study of early-onset COPD (P=0.006).
CONCLUSIONS: The minor allele of a SNP in MMP12 (rs2276109) is associated with a positive effect on lung function in children with asthma and in adults who smoke. This allele is also associated with a reduced risk of COPD in adult smokers.