From First Base: The Sequence of the Tip of the X Chromosome of Drosophila melanogaster, a Comparison of Two Sequencing Strategies

Panayiotis V. Benos (European Bioinformatics Institute), Melanie K. Gatt (Centre National de la Recherche Scientifique), Lee Murphy (Centre National de la Recherche Scientifique), David Harris (Centre National de la Recherche Scientifique), Bart Barrell (Centre National de la Recherche Scientifique), Concepción Ferraz (Centre National de la Recherche Scientifique), Sophie Vidal (Centre National de la Recherche Scientifique), Christine Brun (Centre National de la Recherche Scientifique), Jacques Demaille (Centre National de la Recherche Scientifique), Édouard Cadieu (Centre National de la Recherche Scientifique), Stéphane Dréano (Centre National de la Recherche Scientifique), Stéphanie Gloux (Centre National de la Recherche Scientifique), Valérie Lelaure (Centre National de la Recherche Scientifique), Stéphanie Mottier (Centre National de la Recherche Scientifique), Francis Galibert (Centre National de la Recherche Scientifique), Dana Borkova (Centre National de la Recherche Scientifique), Belén Miñana (Centre National de la Recherche Scientifique), Fotis C. Kafatos (Centre National de la Recherche Scientifique), Slava Bolshakov (Centre National de la Recherche Scientifique), Inga Sidén‐Kiamos (Centre National de la Recherche Scientifique), George Papagiannakis (Centre National de la Recherche Scientifique), Lefteris Spanos (Centre National de la Recherche Scientifique), Christos Louis (Centre National de la Recherche Scientifique), Encarnación Madueño (Centre National de la Recherche Scientifique), Beatriz de Pablos (Centre National de la Recherche Scientifique), Juan Modolell (Centre National de la Recherche Scientifique), Annette Peter (Centre National de la Recherche Scientifique), Petra Schöttler (Centre National de la Recherche Scientifique), Meike Werner (Centre National de la Recherche Scientifique), Foteini Mourkioti (Centre National de la Recherche Scientifique), Nicole Beinert (Centre National de la Recherche Scientifique), Gordon Dowe (Centre National de la Recherche Scientifique), Ulrich Schäfer (Centre National de la Recherche Scientifique), Herbert Jäckle (Centre National de la Recherche Scientifique), Alain Bucheton (Centre National de la Recherche Scientifique), Debbie Callister (Centre National de la Recherche Scientifique), Lorna Campbell (Centre National de la Recherche Scientifique), Nadine S. Henderson (Centre National de la Recherche Scientifique), Paul J. McMillan (Centre National de la Recherche Scientifique), Cathy Salles (Centre National de la Recherche Scientifique), Evelyn Tait (Centre National de la Recherche Scientifique), Phillipe Valenti (Centre National de la Recherche Scientifique), Robert D. C. Saunders (Centre National de la Recherche Scientifique), Alain Billaud (Centre National de la Recherche Scientifique), Lior Pachter (Centre National de la Recherche Scientifique), David M. Glover (Centre National de la Recherche Scientifique), Michael Ashburner (Centre National de la Recherche Scientifique)
Genome Research
May 1, 2001
Cited by 28; Open Access

Abstract

We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different transposable elements. We show that an interval between bands 3A2 and 3C2, believed in the 1970s to show a correlation between the number of bands on the polytene chromosomes and the 20 genes identified by conventional genetics, is predicted to contain 45 genes from its DNA sequence. We have determined the insertion sites of P-elements from 111 mutant lines, about half of which are in a position likely to affect the expression of novel predicted genes, thus representing a resource for subsequent functional genomic analysis. We compare the European Drosophila Genome Project sequence with the corresponding part of the independently assembled and annotated Joint Sequence determined through “shotgun” sequencing. Discounting differences in the distribution of known transposable elements between the strains sequenced in the two projects, we detected three major sequence differences, two of which are probably explained by errors in assembly; the origin of the third major difference is unclear. In addition there are eight sequence gaps within the Joint Sequence. At least six of these eight gaps are likely to be sites of transposable elements; the other two are complex. Of the 275 genes in common to both projects, 60% are identical within 1% of their predicted amino-acid sequence and 31% show minor differences such as in choice of translation initiation or termination codons; the remaining 9% show major differences in interpretation. 
[All of the sequences analyzed in this paper have been deposited in the EMBL-Bank database under the following accession nos.: AL009146, AL009147, AL009171, AL009188–AL009196, AL021067, AL021086, AL021106–AL021108, AL021726, AL021728, AL022017, AL022018, AL022139, AL023873, AL023874, AL023893, AL024453, AL024455–AL024457, AL024485, AL030993, AL030994, AL031024–AL031028, AL031128, AL031173, AL031366, AL031367, AL031581–AL031583, AL031640, AL031765, AL031883, AL031884, AL034388, AL034544, AL035104, AL035105, AL035207, AL035245, AL035331, AL035632, AL049535, AL050231, AL050232, AL109630, AL121804, AL121806, AL132651, AL132792, AL132797, AL133503–AL133506, AL138678, …]

