Treeio: An R Package for Phylogenetic Tree Input and Output with Richly Annotated and Associated Data
Ligen Wang, Tommy Tsan‐Yuk Lam, Shuangbin Xu et al. | Molecular Biology and Evolution | 2019
Phylogenetic trees and data are often stored in incompatible and inconsistent formats. The outputs of software tools that contain trees with analysis findings are often not compatible with each other, making it hard to integrate the results of different analyses in a comparative study. The treeio package is designed to connect phylogenetic tree input and output. It supports extracting phylogenetic trees as well as the outputs of commonly used analytical software. It can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context. Treeio also supports export of a phylogenetic tree with heterogeneous associated data to a single tree file, including BEAST-compatible NEXUS and jtree formats; these facilitate data sharing as well as file format conversion for downstream analysis. The treeio package is designed to work with the tidytree and ggtree packages. Tree data can be processed using the tidy interface with tidytree and visualized by ggtree. The treeio package is released within the Bioconductor and rOpenSci projects. It is available at https://www.bioconductor.org/packages/treeio/.
Ggtree: A serialized data object for visualization of a phylogenetic tree and annotation data
Abstract: While phylogenetic trees and associated data have been getting easier to generate, it has been difficult to reuse, combine, and synthesize the information they provide, because published trees are often only available as image files and associated data are often stored in incompatible formats. To increase the reproducibility and reusability of phylogenetic data, the ggtree object was designed for storing a phylogenetic tree and its associated data, as well as visualization directives. The ggtree object itself is a graphic object and can be rendered as a static image. More importantly, the input tree and associated data that are used in visualization can be extracted from the graphic object, making it an ideal data structure for publishing a tree (image, tree, and data in one single object) and thus enhancing data reuse and analytical reproducibility, as well as facilitating integrative and comparative studies. The ggtree package is freely available at https://www.bioconductor.org/packages/ggtree.
MicrobiotaProcess: A comprehensive R package for deep mining microbiome
Shuangbin Xu, Li Zhan, Wenli Tang et al. | The Innovation | 2023
• MicrobiotaProcess is a bioinformatics tool for microbiome profiling.
• MicrobiotaProcess defines an MPSE structure to better integrate both primary and intermediate microbiome datasets.
• MicrobiotaProcess provides a set of functions under a unified tidy framework, which helps users to explore related datasets more efficiently.
• MicrobiotaProcess improves the integration and exploration of downstream data analysis.
• MicrobiotaProcess offers many visual methods to quickly render clear and comprehensive visualizations that reveal meaningful insights.
The data output from microbiome research is growing at an accelerating rate, yet mining the data quickly and efficiently remains difficult. There is still a lack of an effective data structure to represent and manage data, as well as flexible and composable analysis methods. In response to these two issues, we designed and developed the MicrobiotaProcess package. It provides a comprehensive data structure, MPSE, to better integrate the primary and intermediate data, which improves the integration and exploration of the downstream data. Around this data structure, the downstream analysis tasks are decomposed and a set of functions is designed under a tidy framework. These functions independently perform simple tasks and can be combined to perform complex tasks. This gives users the ability to explore data, conduct personalized analyses, and develop analysis workflows. Moreover, MicrobiotaProcess can interoperate with other packages in the R community, which further expands its analytical capabilities. This article demonstrates the use of MicrobiotaProcess for analyzing microbiome data as well as other ecological data through several examples. It connects upstream data, provides flexible downstream analysis components, and provides visualization methods to assist in presenting and interpreting results.