Cagatay Dursun

GENCODE: reference annotation for the human and mouse genomes in 2023

Adam Frankish, Sílvia Carbonell Sala, Mark Diekhans et al.|Nucleic Acids Research|2022

Cited by 637|Open Access

GENCODE produces high quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

GENCODE 2025: reference gene annotation for human and mouse

Jonathan M. Mudge, Sílvia Carbonell Sala, Mark Diekhans et al.|Nucleic Acids Research|2024

Cited by 295|Open Access

GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result. Meanwhile, we are incorporating data from state-of-the-art proteomics and Ribo-seq experiments to fine-tune our annotation of translated sequences, while further insights into function can be gained from multi-genome alignments that grow richer as more species' genomes are sequenced. Such methodologies are combined into a fully integrated annotation workflow. However, the increasing complexity of our resources can present usability challenges, and we are resolving these with the creation of filtered genesets such as MANE Select and GENCODE Primary. The next challenge is to propagate annotations throughout multiple human and mouse genomes, as we enter the pangenome era. Our resources are freely available at our web portal www.gencodegenes.org, and via the Ensembl and UCSC genome browsers.

Single-cell genomics and regulatory networks for 388 human brains

Prashant S. Emani, Jason Liu et al.|Science|2024

Cited by 144|Open Access

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.

Single-cell multi-cohort dissection of the schizophrenia transcriptome

W. Brad Ruzicka, Shahin Mohammadi, John F. Fullard et al.|Science|2024

Cited by 125|Open Access

The complexity and heterogeneity of schizophrenia have hindered mechanistic elucidation and the development of more effective therapies. Here, we performed single-cell dissection of schizophrenia-associated transcriptomic changes in the human prefrontal cortex across 140 individuals in two independent cohorts. Excitatory neurons were the most affected cell group, with transcriptional changes converging on neurodevelopment and synapse-related molecular pathways. Transcriptional alterations included known genetic risk factors, suggesting convergence of rare and common genomic variants on neuronal population-specific alterations in schizophrenia. Based on the magnitude of schizophrenia-associated transcriptional change, we identified two populations of individuals with schizophrenia marked by expression of specific excitatory and inhibitory neuronal cell states. This single-cell atlas links transcriptomic changes to etiological genetic risk factors, contextualizing established knowledge within the human cortical cytoarchitecture and facilitating mechanistic understanding of schizophrenia pathophysiology and heterogeneity.

A data-driven single-cell and spatial transcriptomic map of the human prefrontal cortex

Louise A. Huuki-Myers, Abby Spangler, Nicholas J. Eagles et al.|Science|2024

Cited by 64|Open Access

The molecular organization of the human neocortex historically has been studied in the context of its histological layers. However, emerging spatial transcriptomic technologies have enabled unbiased identification of transcriptionally defined spatial domains that move beyond classic cytoarchitecture. We used the Visium spatial gene expression platform to generate a data-driven molecular neuroanatomical atlas across the anterior-posterior axis of the human dorsolateral prefrontal cortex. Integration with paired single-nucleus RNA-sequencing data revealed distinct cell type compositions and cell-cell interactions across spatial domains. Using PsychENCODE and publicly available data, we mapped the enrichment of cell types and genes associated with neuropsychiatric disorders to discrete spatial domains.

Top publications by citations