Fast and accurate protein structure search with Foldseek
As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.
Fast and accurate protein structure search with Foldseek
Michel van Kempen, Stephanie Kim, Charlotte Tumescheit et al.|bioRxiv (Cold Spring Harbor Laboratory)|2022
As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing the amino acid backbone of proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of DALI, TM-align and CE, respectively.
CIAlign: A highly customisable command line tool to clean, interpret and visualise multiple sequence alignments
Background: Throughout biology, multiple sequence alignments (MSAs) form the basis of much investigation into biological features and relationships. These alignments are at the heart of many bioinformatics analyses. However, sequences in MSAs are often incomplete or very divergent, which can lead to poor alignment and large gaps. This slows down computation and can impact conclusions without being biologically relevant. Cleaning the alignment by removing common issues such as gaps, divergent sequences, large insertions and deletions and poorly aligned sequence ends can substantially improve analyses. Manual editing of MSAs is very widespread but is time-consuming and difficult to reproduce.
Results: We present a comprehensive, user-friendly MSA trimming tool with multiple visualisation options. Our highly customisable command line tool aims to give intervention power to the user by offering various options, and outputs graphical representations of the alignment before and after processing to give the user a clear overview of what has been removed. The main functionalities of the tool include removing regions of low coverage due to insertions, removing gaps, cropping poorly aligned sequence ends and removing sequences that are too divergent or too short. The thresholds for each function can be specified by the user and parameters can be adjusted to each individual MSA. CIAlign is designed with an emphasis on solving specific and common alignment problems and on providing transparency to the user.
Conclusion: CIAlign effectively removes problematic regions and sequences from MSAs and provides novel visualisation options. This tool can be used to fine-tune alignments for further analysis and processing. The tool is aimed at anyone who wishes to automatically clean up parts of an MSA and those requiring a new, accessible way of visualising large MSAs.
Detection of novel and recognized RNA viruses in mosquitoes from the Yucatan Peninsula of Mexico using metagenomics and characterization of their in vitro host ranges
A metagenomics approach was used to detect novel and recognized RNA viruses in mosquitoes from the Yucatan Peninsula of Mexico. A total of 1359 mosquitoes of 7 species and 5 genera (Aedes, Anopheles, Culex, Mansonia and Psorophora) were sorted into 37 pools, homogenized and inoculated onto monolayers of Aedes albopictus (C6/36) cells. A second blind passage was performed and then total RNA was extracted and analysed by RNA-seq. Two novel viruses, designated Uxmal virus and Mayapan virus, were identified. Uxmal virus was isolated from three pools of Aedes (Ochlerotatus) taeniorhynchus and phylogenetic data indicate that it should be classified within the recently proposed taxon Negevirus. Mayapan virus was recovered from two pools of Psorophora ferox and is most closely related to unclassified Nodaviridae-like viruses. Two recognized viruses were also detected: Culex flavivirus (family Flaviviridae) and Houston virus (family Mesoniviridae), with one and two isolates being recovered, respectively. The in vitro host ranges of all four viruses were determined by assessing their replicative abilities in cell lines of avian, human, monkey, hamster, murine, lepidopteran and mosquito (Aedes, Anopheles and Culex) origin, revealing that all viruses possess vertebrate replication-incompetent phenotypes. In conclusion, we report the isolation of both novel and recognized RNA viruses from mosquitoes collected in Mexico, and add to the growing plethora of viruses discovered recently through the use of metagenomics.
Evolutionary balance between foldability and functionality of a glucose transporter
Hyun-Kyu Choi, Hyunook Kang, Chanwoo Lee et al.|Nature Chemical Biology|2022