Generation and analysis of 280,000 human expressed sequence tags.

L. Hillier (Washington University in St. Louis), Gregory G. Lennon (Lawrence Livermore National Laboratory), Michael A. Becker (Washington University in St. Louis), Maria F. Bonaldo (Washington University in St. Louis), Brandi J. Chiapelli (Washington University in St. Louis), Stephanie L. Chissoe (Washington University in St. Louis), Nicole Dietrich (Washington University in St. Louis), T. DuBuque (Washington University in St. Louis), A. Favello (Washington University in St. Louis), Warren Gish (Washington University in St. Louis), Malcolm Hawkins (Washington University in St. Louis), Maria T. Hultman (Washington University in St. Louis), Tamara A. Kucaba (Washington University in St. Louis), Martha Q. Lacy (Washington University in St. Louis), Minh Lê (Washington University in St. Louis), Nhu D. Le (Washington University in St. Louis), Elaine R. Mardis (Washington University in St. Louis), Barry Moore (Washington University in St. Louis), Michael A. Morris (Washington University in St. Louis), Jeremy Parsons (Washington University in St. Louis), Christa Prange (Washington University in St. Louis), Larry Rifkin (Washington University in St. Louis), Theresa Rohlfing (Washington University in St. Louis), Katja Schellenberg (Washington University in St. Louis), Marco A. Marra (Washington University in St. Louis)
Genome Research
September 1, 1996
Cited by 469. Open Access.

Abstract

We report the generation of 319,311 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' and 3' ends of 194,031 human cDNA clones. Our goal has been to obtain tag sequences from many different genes and to deposit these in the publicly accessible Data Base for Expressed Sequence Tags. Highly efficient automatic screening of the data allows deposition of the annotated sequences without delay. Sequences have been generated from 26 oligo(dT)-primed, directionally cloned libraries, of which 18 were normalized. The libraries were constructed using mRNA isolated from 17 different tissues representing three developmental states. Comparisons of a subset of our data with nonredundant human mRNA and protein databases show that the ESTs represent many known sequences and contain many that are novel. Analysis of protein families using Hidden Markov Models confirms this observation and supports the contention that although normalization significantly reduces the relative abundance of redundant cDNA clones, it does not result in the complete removal of members of gene families.

