Comprehensive functional genomic resource and integrative model for the human brain
INTRODUCTION Strong genetic associations have been found for a number of psychiatric disorders. However, understanding the underlying molecular mechanisms remains challenging. RATIONALE To address this challenge, the PsychENCODE Consortium has developed a comprehensive online resource and integrative models for the functional genomics of the human brain. RESULTS The base of the pyramidal resource consists of the datasets generated by PsychENCODE, including bulk transcriptome, chromatin, genotype, and Hi-C datasets and single-cell transcriptomic data from ~32,000 cells for major brain regions. We have merged these with data from Genotype-Tissue Expression (GTEx), ENCODE, Roadmap Epigenomics, and single-cell analyses. Via uniform processing, we created a harmonized resource, allowing us to survey functional genomics data on the brain over a sample size of 1866 individuals. From this uniformly processed dataset, we created derived data products. These include lists of brain-expressed genes, coexpression modules, and single-cell expression profiles for many brain cell types; ~79,000 brain-active enhancers with associated Hi-C loops and topologically associating domains; and ~2.5 million expression quantitative-trait loci (QTLs) comprising ~238,000 linkage-disequilibrium–independent single-nucleotide polymorphisms, along with other types of QTLs associated with splice isoforms, cell fractions, and chromatin activity. By using these, we found that >88% of the cross-population variation in brain gene expression can be accounted for by cell fraction changes. Furthermore, a number of disorders and aging are associated with changes in cell-type proportions. The derived data also enable comparison between the brain and other tissues.
In particular, by using spectral analyses, we found that the brain has distinct expression and epigenetic patterns, including a greater extent of noncoding transcription than other tissues. The top level of the resource consists of integrative networks for regulation and machine-learning models for disease prediction. The networks include a full gene regulatory network (GRN) for the brain, linking transcription factors, enhancers, and target genes from merging of the QTLs, generalized element-activity correlations, and Hi-C data. By using this network, we link disease genes to genome-wide association study (GWAS) variants for psychiatric disorders. For schizophrenia, we linked 321 genes to the 142 reported GWAS loci. We then embedded the regulatory network into a deep-learning model to predict psychiatric phenotypes from genotype and expression. Our model gives a ~6-fold improvement in prediction over additive polygenic risk scores. Moreover, it achieves a ~3-fold improvement over additive models, even when the gene expression data are imputed, highlighting the value of having just a small amount of transcriptome data for disease prediction. Lastly, it highlights key genes and pathways associated with disorder prediction, including immunological, synaptic, and metabolic pathways, recapitulating de novo results from more targeted analyses. CONCLUSION Our resource and integrative analyses have uncovered genomic elements and networks in the brain, which in turn have provided insight into the molecular mechanisms underlying psychiatric disorders. Our deep-learning model improves disease risk prediction over traditional approaches and can be extended with additional data types (e.g., microRNA and neuroimaging). A comprehensive functional genomic resource for the adult human brain. The resource forms a three-layer pyramid. The bottom layer includes sequencing datasets for traits, such as schizophrenia. 
The middle layer represents derived datasets, including functional genomic elements and QTLs. The top layer contains integrated models, which link genotypes to phenotypes. DSPN, Deep Structured Phenotype Network; PC1 and PC2, principal components 1 and 2; ref, reference; alt, alternate; H3K27ac, histone H3 acetylation at lysine 27.
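The additive polygenic risk score that the abstract above uses as its comparison baseline can be illustrated with a minimal sketch. It assumes the common textbook form of such a score, a sum over variants of effect size times risk-allele dosage; the variant IDs, effect sizes, and the function name `additive_prs` are hypothetical and are not taken from the PsychENCODE resource.

```python
from typing import Dict


def additive_prs(dosages: Dict[str, float], effect_sizes: Dict[str, float]) -> float:
    """Return the sum of effect size times risk-allele dosage (0, 1, or 2)."""
    score = 0.0
    for variant_id, beta in effect_sizes.items():
        # Assumption: a variant missing from the genotype contributes nothing.
        score += beta * dosages.get(variant_id, 0.0)
    return score


# Hypothetical variant IDs and effect sizes, for illustration only.
effect_sizes = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 0.125}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(additive_prs(person, effect_sizes))  # 0.5*2 - 0.25*1 + 0.125*0 = 0.75
```

Because each variant contributes independently, a score of this kind cannot capture interactions among variants, genes, and regulatory elements, which is the gap the network-based deep-learning model described above is meant to address.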
Transcriptome and epigenome landscape of human cortical development modeled in organoids
INTRODUCTION The human cerebral cortex has undergone an extraordinary increase in size and complexity during mammalian evolution. Cortical cell lineages are specified in the embryo, and genetic and epidemiological evidence implicates early cortical development in the etiology of neuropsychiatric disorders such as autism spectrum disorder (ASD), intellectual disabilities, and schizophrenia. Most of the disease-implicated genomic variants are located outside of genes, and the interpretation of noncoding mutations is lagging behind owing to limited annotation of functional elements in the noncoding genome. RATIONALE We set out to discover gene-regulatory elements and chart their dynamic activity during prenatal human cortical development, focusing on enhancers, which carry most of the weight in regulating gene expression. We longitudinally modeled human brain development using human induced pluripotent stem cell (hiPSC)–derived cortical organoids and compared organoids to isogenic fetal brain tissue. RESULTS Fetal fibroblast–derived hiPSC lines were used to generate cortically patterned organoids and to compare the organoids’ epigenome and transcriptome to those of isogenic fetal brains and external datasets. Organoids model cortical development between 5 and 16 postconception weeks, thus enabling us to study transitions from cortical stem cells to progenitors to early neurons. The greatest changes occur at the transition from stem cells to progenitors. The regulatory landscape encompasses a total set of 96,375 enhancers linked to target genes, with 49,640 enhancers being active in organoids but not in mid-fetal brain, suggesting major roles in cortical neuron specification. Enhancers that gained activity in the human lineage are active in the earliest stages of organoid development, when they target genes that regulate the growth of radial glial cells.
Parallel weighted gene coexpression network analysis (WGCNA) of transcriptome and enhancer activities defined a number of modules of coexpressed genes and coactive enhancers, following just six and four global temporal patterns, respectively, that we refer to as supermodules, likely reflecting fundamental programs in embryonic and fetal brain. Correlations between gene expression and enhancer activity allowed stratifying enhancers into two categories: activating regulators (A-regs) and repressive regulators (R-regs). Several enhancer modules converged with gene modules, suggesting that coexpressed genes are regulated by enhancers with correlated patterns of activity. Furthermore, enhancers active in organoids and fetal brains were enriched for ASD de novo variants that disrupt binding sites of homeodomain, Hes1, NR4A2, Sox3, and NFIX transcription factors. CONCLUSION We validated hiPSC-derived cortical organoids as a suitable model system for studying gene regulation in human embryonic brain development, evolution, and disease. Our results suggest that organoids may reveal how noncoding mutations contribute to ASD etiology. Summary of the study, analyses, and main results. Data were generated for iPSC-derived human telencephalic organoids and isogenic fetal cortex. Organoids modeled embryonic and early fetal cortex and showed a larger repertoire of enhancers. Enhancers could be divided into activators and repressors of gene expression. We derived networks of modules and supermodules with correlated gene and enhancer activities, some of which were implicated in autism spectrum disorders (ASD).
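The stratification of enhancers into A-regs and R-regs by correlation with target-gene expression can be sketched as follows. This is an illustration only: the abstract does not state which correlation measure or cutoff was used, so the Pearson coefficient, the `min_abs_r` threshold of 0.5, the "unclassified" label, and the function names are all assumptions.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def classify_enhancer(
    enhancer_activity: Sequence[float],
    gene_expression: Sequence[float],
    min_abs_r: float = 0.5,  # hypothetical cutoff, not from the source
) -> str:
    """Label an enhancer by the sign of its correlation with its target gene."""
    r = pearson(enhancer_activity, gene_expression)
    if r >= min_abs_r:
        return "A-reg"  # activity rises and falls with expression
    if r <= -min_abs_r:
        return "R-reg"  # activity moves opposite to expression
    return "unclassified"


# Toy time courses across four hypothetical developmental stages.
activity = [1.0, 2.0, 3.0, 4.0]
print(classify_enhancer(activity, [2.0, 4.0, 6.0, 8.0]))
print(classify_enhancer(activity, [8.0, 6.0, 4.0, 2.0]))
```

The two toy calls cover a positively and a negatively correlated enhancer-gene pair; a real analysis would also need a significance test and multiple-testing correction across enhancer-gene pairs.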
Functional assessment of human enhancer activities using whole-genome STARR-sequencing
BACKGROUND: Genome-wide quantification of enhancer activity in the human genome has proven to be a challenging problem. Recent efforts have led to the development of powerful tools for enhancer quantification. However, because of genome size and complexity, these tools have yet to be applied to the whole human genome. RESULTS: In the current study, we use a human prostate cancer cell line, LNCaP, as a model to perform whole human genome STARR-seq (WHG-STARR-seq) to reliably obtain an assessment of enhancer activity. This approach builds upon previously developed STARR-seq in the fly genome and CapSTARR-seq techniques in targeted human genomic regions. With an improved library preparation strategy, our approach greatly increases the library complexity per unit of starting material, which makes it feasible and cost-effective to explore the landscape of regulatory activity in the much larger human genome. In addition to our ability to identify active, accessible enhancers located in open chromatin regions, we can also detect sequences with the potential for enhancer activity that are located in inaccessible, closed chromatin regions. When cells are treated with the histone deacetylase inhibitor trichostatin A, genes near this latter class of enhancers are up-regulated, demonstrating the potential for endogenous functionality of these regulatory elements. CONCLUSION: WHG-STARR-seq provides an improved approach to current pipelines for analysis of high complexity genomes to gain a better understanding of the intricacies of transcriptional regulation.
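The abstract above does not say how enhancer activity is scored. One common convention for STARR-seq-type assays is to compare the abundance of a fragment among the self-transcribed RNA reads with its abundance in the input DNA library. The sketch below assumes that convention; the function name, the pseudocount, and the read counts are hypothetical and do not describe the WHG-STARR-seq pipeline itself.

```python
import math


def starr_activity(
    rna_count: int,
    dna_count: int,
    rna_total: int,
    dna_total: int,
    pseudocount: float = 1.0,  # assumption: avoids division by zero for unseen fragments
) -> float:
    """Return log2 of library-size-normalized output RNA reads over input DNA reads."""
    rna_cpm = (rna_count + pseudocount) / rna_total * 1e6
    dna_cpm = (dna_count + pseudocount) / dna_total * 1e6
    return math.log2(rna_cpm / dna_cpm)


# Hypothetical counts for one genomic fragment in libraries of one million reads each.
score = starr_activity(rna_count=79, dna_count=19, rna_total=1_000_000, dna_total=1_000_000)
print(round(score, 3))
```

A positive score means the fragment is over-represented among transcripts relative to the input library, which is the signature of enhancer activity in a self-transcribing reporter design.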
Human Organoids Share Structural and Genetic Features with Primary Pancreatic Adenocarcinoma Tumors
Abstract: Patient-derived pancreatic ductal adenocarcinoma (PDAC) organoid systems show great promise for understanding the biological underpinnings of disease and advancing therapeutic precision medicine. Despite the increased use of organoids, the fidelity of molecular features, genetic heterogeneity, and drug response to the tumor of origin remain important unanswered questions limiting their utility. To address this gap in knowledge, primary tumor- and patient-derived xenograft (PDX)-derived organoids and 2D cultures were created for in-depth genomic and histopathologic comparisons with the primary tumor. Histopathologic features and PDAC representative protein markers (e.g., claudin 4 and CA19-9) showed strong concordance. DNA- and RNA-sequencing (RNAseq) of single organoids revealed patient-specific genomic and transcriptomic consistency. Single-cell RNAseq demonstrated that organoids are primarily a clonal population. In drug response assays, organoids displayed patient-specific sensitivities. In addition, the in vivo PDX response to FOLFIRINOX and gemcitabine/abraxane treatments was examined and was recapitulated in vitro with organoids. This study has demonstrated that organoids are potentially invaluable for precision medicine as well as preclinical drug treatment studies because they maintain distinct patient phenotypes and respond differently to drug combinations and dosage. Implications: The patient-specific molecular and histopathologic fidelity of organoids indicates that they can be used to understand the etiology of the patient's tumor and the differential response to therapies and suggests utility for predicting drug responses.
The transcription factor POU3F2 regulates a gene coexpression network in brain tissue from patients with psychiatric disorders
Chao Chen, Qingtuan Meng, Yan Xia et al. | Science Translational Medicine | 2018
POU3F2 regulates expression of key genes in postmortem brain tissue from patients with schizophrenia or bipolar disorder.