
Jörg Drenkow

Cold Spring Harbor Laboratory

Publishes on Genomics and Chromatin Dynamics, RNA and protein synthesis mechanisms, RNA Research and Splicing. 56 papers and 84.7k citations.



Top publications by citations

STAR: ultrafast universal RNA-seq aligner
Alexander Dobin, Carrie Davis, Felix Schlesinger et al.|Bioinformatics|2012
Cited by 55.7k|Open Access

MOTIVATION: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. RESULTS: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. AVAILABILITY AND IMPLEMENTATION: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.
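The "sequential maximum mappable seed search" named in the abstract can be illustrated with a toy sketch. The code below is not STAR's implementation: it assumes exact matching only, builds a naive suffix array by sorting, and uses hypothetical function names (`build_suffix_array`, `maximal_mappable_prefix`, `seed_search`) and a made-up 30-base "genome". It only shows the idea of repeatedly finding the longest read prefix that occurs in the reference and then restarting the search from the first unmapped base.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    """Number of leading characters shared by strings a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def maximal_mappable_prefix(read, genome, sa):
    """Return (length, positions) of the longest prefix of `read` found in `genome`."""
    # Binary search for where `read` would be inserted among the sorted suffixes.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if genome[sa[mid]:] < read:
            lo = mid + 1
        else:
            hi = mid
    # The suffix sharing the longest prefix with `read` is adjacent to that point.
    best = 0
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            best = max(best, common_prefix_len(read, genome[sa[idx]:]))
    if best == 0:
        return 0, []
    # Linear scan for every occurrence of the prefix (kept simple for the sketch).
    prefix = read[:best]
    return best, sorted(p for p in sa if genome.startswith(prefix, p))


def seed_search(read, genome, sa):
    """Split `read` into seeds as (read_offset, length, genome_positions) tuples."""
    seeds = []
    offset = 0
    while offset < len(read):
        length, positions = maximal_mappable_prefix(read[offset:], genome, sa)
        if length == 0:
            offset += 1  # assumption: skip a base that maps nowhere
            continue
        seeds.append((offset, length, positions))
        offset += length  # restart the search at the first unmapped base
    return seeds


# Toy reference: two "exons" separated by a 14-base "intron" (GT...AG).
exon1, intron, exon2 = "ACGTTGCA", "GTAAGTCCCTTTAG", "TTCGGATC"
genome = exon1 + intron + exon2
sa = build_suffix_array(genome)

# A read spanning the splice junction: end of exon1 joined to start of exon2.
read = exon1[-5:] + exon2[:6]
print(seed_search(read, genome, sa))
```

On this toy input the read splits into two seeds, one ending at the exon 1 boundary and one starting at exon 2, so the gap between them in the reference equals the intron length. The clustering and stitching step that the abstract mentions is not shown.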

Landscape of transcription in human cells
Cited by 5.4k|Open Access

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell’s regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

These authors describe the ENCODE (Encyclopedia of DNA Elements) effort to provide a complete catalogue of primary and processed RNAs found either in specific subcellular compartments or throughout the cell. They show that three-quarters of the human genome can be transcribed, and provide a wealth of information about the range and levels of expression, localization, processing fates and modifications of both known and previously unannotated RNAs. Collectively, these observations suggest that the current concept of a gene should be revisited.

RNA Maps Reveal New RNA Classes and a Possible Function for Pervasive Transcription
Philipp Kapranov, Jill Cheng, Sujit Dike et al.|Science|2007
Cited by 2.6k

Significant fractions of eukaryotic genomes give rise to RNA, much of which is unannotated and has reduced protein-coding potential. The genomic origins and the associations of human nuclear and cytosolic polyadenylated RNAs longer than 200 nucleotides (nt) and whole-cell RNAs less than 200 nt were investigated in this genome-wide study. Subcellular addresses for nucleotides present in detected RNAs were assigned, and their potential processing into short RNAs was investigated. Taken together, these observations suggest a novel role for some unannotated RNAs as primary transcripts for the production of short RNAs. Three potentially functional classes of RNAs have been identified, two of which are syntenically conserved and correlate with the expression state of protein-coding genes. These data support a highly interleaved organization of the human transcriptome.

Expanded encyclopaedias of DNA elements in the human and mouse genomes
Cited by 2.6k|Open Access

The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.