Accurate SHAPE-directed RNA structure determination | Katherine E. Deigan, Tian W. Li, David H. Mathews et al. | Proceedings of the National Academy of Sciences | 2008
Almost all RNAs can fold to form extensive base-paired secondary structures. Many of these structures then modulate numerous fundamental elements of gene expression. Deducing these structure-function relationships requires that it be possible to predict RNA secondary structures accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure for a single sequence reliably represents the correct structure, has remained an unsolved problem. Here, we demonstrate that quantitative, nucleotide-resolution information from a SHAPE experiment can be interpreted as a pseudo-free energy change term and used to determine RNA secondary structure with high accuracy. Free energy minimization, by using SHAPE pseudo-free energies, in conjunction with nearest neighbor parameters, predicts the secondary structure of deproteinized Escherichia coli 16S rRNA (>1,300 nt) and a set of smaller RNAs (75-155 nt) with accuracies of up to 96-100%, which are comparable to the best accuracies achievable by comparative sequence analysis.
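The idea of turning a per-nucleotide SHAPE reactivity into a pseudo-free energy term can be sketched in a few lines of Python. The abstract does not give the functional form, so the log-linear expression below, ΔG_SHAPE(i) = m ln(reactivity_i + 1) + b, and the default values of `m` and `b` are assumptions made for illustration. The function name and the handling of missing data are likewise hypothetical.

```python
import math

def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Return an illustrative pseudo-free energy (kcal/mol) for one nucleotide.

    Assumed form: m * ln(reactivity + 1) + b. The defaults for m and b are
    assumptions for this sketch, not values stated in the abstract.
    """
    # Assumption: a missing or negative reactivity means "no data",
    # so the nucleotide contributes no pseudo-energy.
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b

def shape_pseudo_energies(reactivities, m=2.6, b=-0.8):
    """Apply the per-nucleotide term to a whole reactivity profile."""
    return [shape_pseudo_energy(r, m, b) for r in reactivities]
```

With these assumed parameters, an unreactive nucleotide (reactivity 0) receives a favorable bonus equal to `b`, while a highly reactive nucleotide receives a penalty, which is what lets the term steer free energy minimization away from pairing flexible positions.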
Accurate SHAPE‐directed RNA structure prediction
Almost all RNAs can fold to form extensive secondary structures. Many of these structures then modulate numerous elements of gene expression. Deducing these structure‐function relationships requires that it be possible to predict RNA secondary structure accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure reliably represents the correct structure, has remained an unsolved problem. Here we demonstrate that quantitative, nucleotide‐resolution information from a SHAPE experiment can be interpreted as a pseudo‐free energy change term and used to determine RNA secondary structure with high accuracy. We use three metrics to evaluate the prediction accuracy for E. coli 16S rRNA (1542 nts). Taking the structure determined by comparative sequence analysis as the standard, we correctly predict 90% of all phylogenetically supported base pairs. Allowing for experimentally supported local refolding relative to the phylogenetic structure, the prediction accuracy is 95%. As judged by the ability to identify helices of 3 base pairs or greater, and thus the overall topology of the RNA, the prediction accuracy is again 95%. This work demonstrates that, given sufficient quantitative in‐solution information, it is possible to predict the structure of an important subset of RNAs with accuracies comparable to those achievable by comparative sequence analysis.
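The first accuracy metric described above, the fraction of reference base pairs that a prediction recovers, can be illustrated with a minimal Python sketch. The function name, the representation of a structure as a list of (i, j) index pairs, and the use of exact-match scoring are assumptions of this sketch; the abstract does not specify how matches were counted.

```python
def base_pair_sensitivity(reference_pairs, predicted_pairs):
    """Fraction of reference base pairs that also appear in the prediction.

    Pairs are (i, j) nucleotide indices; (i, j) and (j, i) are treated as
    the same pair. Exact matching is an assumption of this sketch.
    """
    # Normalize pair order so that (10, 1) and (1, 10) compare equal.
    reference = {tuple(sorted(pair)) for pair in reference_pairs}
    predicted = {tuple(sorted(pair)) for pair in predicted_pairs}
    if not reference:
        raise ValueError("reference structure contains no base pairs")
    return len(reference & predicted) / len(reference)

# Hypothetical toy structures, not data from the paper.
reference = [(1, 10), (2, 9), (3, 8), (20, 30)]
predicted = [(10, 1), (2, 9), (3, 8), (21, 29)]
print(base_pair_sensitivity(reference, predicted))  # 0.75
```

In this toy case three of the four reference pairs are recovered, so the score is 0.75. The other two metrics in the abstract (allowing local refolding, and scoring helices of 3 or more base pairs) would need additional rules that the abstract does not spell out.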