Landscape of transcription in human cellsEukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene. A description is given of the ENCODE effort to provide a complete catalogue of primary and processed RNAs found either in specific subcellular compartments or throughout the cell, revealing that three-quarters of the human genome can be transcribed, and providing a wealth of information on the range and levels of expression, localization, processing fates and modifications of known and previously unannotated RNAs. These authors describe the ENCODE (Encyclopedia of DNA Elements) effort to provide a complete catalogue of primary and processed RNAs found either in specific sub-cellular compartments or throughout the cell. They show that three-quarters of the human genome can be transcribed, and provide a wealth of information about the range and levels of expression, localization, processing fates and modifications of both known and previously unannotated RNAs. Collectively, these observations suggest that the current concept of a gene should be revisited.
An encyclopedia of mouse DNA elements (Mouse ENCODE)e laboratory mouse is the premier mammalian model organism for the study of human disease, and it has played a vital role in both the annotation of the human genome and the study of gene function and regulation. Similar to humans, mice naturally develop diverse diseases that affect the hematologic, nervous, cardiovascular, endocrine, musculoskeletal, renal and other systems, providing excellent experimental paradigms for studying the pathogenesis of cancer, autoimmune disease, diabetes, obesity, atherosclerosis, hypertension, gastrointestinal disorders and diverse neurodegenerative states. Mouse models are currently available for hundreds of human disorders [1-4], spanning diverse quantitative and behavioral phenotypes and physiological systems. ese comprise both inbred strains and genetically engineered mutants, many of which have been extensively characterized. For these reasons, the mouse has emerged as a premier system for translating basic human genetic, genomic and physiologic research into paradigms for therapeutic development.
Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machineChenghai Xue, Fei Li, Tao He et al.|BMC Bioinformatics|2005 BACKGROUND: MicroRNAs (miRNAs) are a group of short (approximately 22 nt) non-coding RNAs that play important regulatory roles. MiRNA precursors (pre-miRNAs) are characterized by their hairpin structures. However, a large amount of similar hairpins can be folded in many genomes. Almost all current methods for computational prediction of miRNAs use comparative genomic approaches to identify putative pre-miRNAs from candidate hairpins. Ab initio method for distinguishing pre-miRNAs from sequence segments with pre-miRNA-like hairpin structures is lacking. Being able to classify real vs. pseudo pre-miRNAs is important both for understanding of the nature of miRNAs and for developing ab initio prediction methods that can discovery new miRNAs without known homology. RESULTS: A set of novel features of local contiguous structure-sequence information is proposed for distinguishing the hairpins of real pre-miRNAs and pseudo pre-miRNAs. Support vector machine (SVM) is applied on these features to classify real vs. pseudo pre-miRNAs, achieving about 90% accuracy on human data. Remarkably, the SVM classifier built on human data can correctly identify up to 90% of the pre-miRNAs from other species, including plants and virus, without utilizing any comparative genomics information. CONCLUSION: The local structure-sequence features reflect discriminative and conserved characteristics of miRNAs, and the successful ab initio classification of real and pseudo pre-miRNAs opens a new approach for discovering new miRNAs.
HITS-CLIP and Integrative Modeling Define the Rbfox Splicing-Regulatory Network Linked to Brain Development and AutismThe RNA binding proteins Rbfox1/2/3 regulate alternative splicing in the nervous system, and disruption of Rbfox1 has been implicated in autism. However, comprehensive identification of functional Rbfox targets has been challenging. Here, we perform HITS-CLIP for all three Rbfox family members in order to globally map, at a single-nucleotide resolution, their in vivo RNA interaction sites in the mouse brain. We find that the two guanines in the Rbfox binding motif UGCAUG are critical for protein-RNA interactions and crosslinking. Using integrative modeling, these interaction sites, combined with additional datasets, define 1,059 direct Rbfox target alternative splicing events. Over half of the quantifiable targets show dynamic changes during brain development. Of particular interest are 111 events from 48 candidate autism-susceptibility genes, including syndromic autism genes Shank3, Cacna1c, and Tsc2. Alteration of Rbfox targets in some autistic brains is correlated with downregulation of all three Rbfox proteins, supporting the potential clinical relevance of the splicing-regulatory network.
Comparative analysis of the transcriptome across distant species