The Bioperl Toolkit: Perl Modules for the Life Sciences
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit and the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort. [Supplemental material is available online at www.genome.org. Bioperl is available as open-source software free of charge and is licensed under the Perl Artistic License (http://www.perl.com/pub/a/language/misc/Artistic.html). It is available for download at http://www.bioperl.org. Support inquiries should be addressed to bioperl-l@bioperl.org.]
Analyses of pig genomes provide insight into porcine demography and evolution
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ∼1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.
This study presents the assembly and analysis of the genome sequence of a female domestic Duroc pig and a comparison with the genomes of wild and domestic pigs from Europe and Asia; the results shed light on the evolutionary relationship between European and Asian wild boars.
The domestic pig (Sus scrofa) is an important livestock species, its genome shaped by thousands of years of domestication and, latterly, sophisticated breeding practices. A high-quality draft genome sequence for a female domestic Duroc pig is published in this issue of Nature, under the auspices of the Swine Genome Sequencing Consortium.
Comparisons of the genomes of wild and domestic pigs shed light on the evolutionary relationship between European and Asian wild boars, and reveal the rapid evolution of genes involved in the immune response and in olfaction. The authors identify many possible disease-causing gene variants, increasing the potential of the pig as a biomedical model, and present a detailed analysis of endogenous porcine retroviruses, knowledge of which is important for the possible use of pigs in xenotransplantation.
The Genome Sequence of Taurine Cattle: A Window to Ruminant Biology and Evolution
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
The DNA sequence of human chromosome 22
Knowledge of the complete genomic DNA sequence of an organism allows a systematic approach to defining its genetic components. The genomic sequence provides access to the complete structures of all genes, including those without known function, their control elements, and, by inference, the proteins they encode, as well as all other biologically important sequences. Furthermore, the sequence is a rich and permanent source of information for the design of further biological studies of the organism and for the study of evolution through cross-species sequence comparison. The power of this approach has been amply demonstrated by the determination of the sequences of a number of microbial and model organisms. The next step is to obtain the complete sequence of the entire human genome. Here we report the sequence of the euchromatic part of human chromosome 22. The sequence obtained consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.
DNA sequence and analysis of human chromosome 9