Phylogeny.fr: robust phylogenetic analysis for the non-specialistPhylogenetic analyses are central to many research areas in biology and typically involve the identification of homologous sequences, their multiple alignment, the phylogenetic reconstruction and the graphical representation of the inferred tree. The Phylogeny.fr platform transparently chains programs to automatically perform these tasks. It is primarily designed for biologists with no experience in phylogeny, but can also meet the needs of specialists; the first ones will find up-to-date tools chained in a phylogeny pipeline to analyze their data in a simple and robust way, while the specialists will be able to easily build and run sophisticated analyses. Phylogeny.fr offers three main modes. The 'One Click' mode targets non-specialists and provides a ready-to-use pipeline chaining programs with recognized accuracy and speed: MUSCLE for multiple alignment, PhyML for tree building, and TreeDyn for tree rendering. All parameters are set up to suit most studies, and users only have to provide their input sequences to obtain a ready-to-print tree. The 'Advanced' mode uses the same pipeline but allows the parameters of each program to be customized by users. The 'A la Carte' mode offers more flexibility and sophistication, as users can build their own pipeline by selecting and setting up the required steps from a large choice of tools to suit their specific needs. Prior to phylogenetic analysis, users can also collect neighbors of a query sequence by running BLAST on general or specialized databases. A guide tree then helps to select neighbor sequences to be used as input for the phylogeny pipeline. Phylogeny.fr is available at: http://www.phylogeny.fr/
BLAST-EXPLORER helps you building datasets for phylogenetic analysisBACKGROUND: The right sampling of homologous sequences for phylogenetic or molecular evolution analyses is a crucial step, the quality of which can have a significant impact on the final interpretation of the study. There is no single way for constructing datasets suitable for phylogenetic analysis, because this task intimately depends on the scientific question we want to address, Moreover, database mining softwares such as BLAST which are routinely used for searching homologous sequences are not specifically optimized for this task. RESULTS: To fill this gap, we designed BLAST-Explorer, an original and friendly web-based application that combines a BLAST search with a suite of tools that allows interactive, phylogenetic-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the selection of the BLAST hits is done using BLAST-Explorer, the corresponding sequence can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the Phylogeny.fr platform. CONCLUSIONS: BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieving of the BLAST hit sequences based a wide range of criterions. Although BLAST-Explorer primarily aims at helping the construction of sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at http://www.phylogeny.fr.
The coffee genome provides insight into the convergent evolution of caffeine biosynthesisCoffee is a valuable beverage crop due to its characteristic flavor, aroma, and the stimulating effects of caffeine. We generated a high-quality draft genome of the species Coffea canephora, which displays a conserved chromosomal gene order among asterid angiosperms. Although it shows no sign of the whole-genome triplication identified in Solanaceae species such as tomato, the genome includes several species-specific gene family expansions, among them N-methyltransferases (NMTs) involved in caffeine production, defense-related genes, and alkaloid and flavonoid enzymes involved in secondary compound synthesis. Comparative analyses of caffeine NMTs demonstrate that these genes expanded through sequential tandem duplications independently of genes from cacao and tea, suggesting that caffeine in eudicots is of polyphyletic origin.
The Banana Genome HubBanana is one of the world's favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allows harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved banana existing systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we have made interoperable with the banana hub, other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several uses cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/
Genetic diversity, linkage disequilibrium and power of a large grapevine (Vitis vinifera L) diversity panel newly designed for association studiesBACKGROUND: As for many crops, new high-quality grapevine varieties requiring less pesticide and adapted to climate change are needed. In perennial species, breeding is a long process which can be speeded up by gaining knowledge about quantitative trait loci linked to agronomic traits variation. However, due to the long juvenile period of these species, establishing numerous highly recombinant populations for high resolution mapping is both costly and time-consuming. Genome wide association studies in germplasm panels is an alternative method of choice, since it allows identifying the main quantitative trait loci with high resolution by exploiting past recombination events between cultivars. Such studies require adequate panel design to represent most of the available genetic and phenotypic diversity. Assessing linkage disequilibrium extent and panel power is also needed to determine the marker density required for association studies. RESULTS: Starting from the largest grapevine collection worldwide maintained in Vassal (France), we designed a diversity panel of 279 cultivars with limited relatedness, reflecting the low structuration in three genetic pools resulting from different uses (table vs wine) and geographical origin (East vs West), and including the major founders of modern cultivars. With 20 simple sequence repeat markers and five quantitative traits, we showed that our panel adequately captured most of the genetic and phenotypic diversity existing within the entire Vassal collection. To assess linkage disequilibrium extent and panel power, we genotyped single nucleotide polymorphisms: 372 over four genomic regions and 129 distributed over the whole genome. Linkage disequilibrium, measured by correlation corrected for kinship, reached 0.2 for a physical distance between 9 and 458 Kb depending on genetic pool and genomic region, with varying size of linkage disequilibrium blocks. This panel achieved reasonable power to detect associations between traits with high broad-sense heritability (> 0.7) and causal loci with intermediate allelic frequency and strong effect (explaining > 10 % of total variance). CONCLUSIONS: Our association panel constitutes a new, highly valuable resource for genetic association studies in grapevine, and deserves dissemination to diverse field and greenhouse trials to gain more insight into the genetic control of many agronomic traits and their interaction with the environment.