Joseph N. Zadeh

NUPACK: Analysis and design of nucleic acid systems

Joseph N. Zadeh, Conrad Steenberg, Justin S. Bois et al.|Journal of Computational Chemistry|2010

Cited by 1.8k|Open Access

The Nucleic Acid Package (NUPACK) is a growing software suite for the analysis and design of nucleic acid systems. The NUPACK web server (http://www.nupack.org) currently enables: Analysis: thermodynamic analysis of dilute solutions of interacting nucleic acid strands. Design: sequence design for complexes of nucleic acid strands intended to adopt a target secondary structure at equilibrium. Utilities: evaluation, display, and annotation of equilibrium properties of a complex of nucleic acid strands. NUPACK algorithms are formulated in terms of nucleic acid secondary structure. In most cases, pseudoknots are excluded from the structural ensemble.

Nucleic acid sequence design via efficient ensemble defect optimization

Joseph N. Zadeh, Brian R. Wolfe, Niles A. Pierce|Journal of Computational Chemistry|2010

Cited by 197

We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user-specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree-decomposition of the target structure. During leaf optimization, defect-weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N³) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N³). For target structures containing N ∈ {100, 200, 400, 800, 1600, 3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp.
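
The ensemble defect and the defect-weighted sampling step described in this abstract can be illustrated with a minimal sketch. It is not the NUPACK implementation. It assumes a single strand, a target given in dot-parens notation, and an equilibrium pair probability matrix `p` supplied by the caller (in practice this would come from the Θ(N³) dynamic program, which is not reproduced here). Column N of each matrix holds the unpaired entry, and all function names are hypothetical.

```python
import random


def target_matrix(structure):
    """Convert a dot-parens target into an N x (N+1) 0/1 matrix.

    Entry [i][j] is 1 if base i pairs with base j in the target;
    entry [i][N] is 1 if base i is unpaired in the target.
    Assumption: one strand, characters limited to '(', ')' and '.'.
    """
    n = len(structure)
    s = [[0.0] * (n + 1) for _ in range(n)]
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            s[i][j] = s[j][i] = 1.0
        elif ch == ".":
            s[i][n] = 1.0
        else:
            raise ValueError("unsupported character: " + ch)
    if stack:
        raise ValueError("unbalanced structure")
    return s


def nucleotide_defects(p, s):
    """Per-nucleotide defect: 1 minus the probability of the target state."""
    return [
        1.0 - sum(pij * sij for pij, sij in zip(p_row, s_row))
        for p_row, s_row in zip(p, s)
    ]


def ensemble_defect(p, s):
    """Average number of incorrectly paired nucleotides at equilibrium."""
    return sum(nucleotide_defects(p, s))


def sample_mutation_position(p, s, rng=random):
    """Pick a position with probability proportional to its defect."""
    weights = [max(d, 0.0) for d in nucleotide_defects(p, s)]
    if sum(weights) == 0.0:
        # No defect anywhere: fall back to a uniform choice.
        return rng.randrange(len(weights))
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

As a usage example, for the target `(..)` and a hypothetical ensemble in which every base is unpaired with probability 1, bases 0 and 3 each contribute a defect of 1 and bases 1 and 2 contribute 0, so the ensemble defect is 2 and only positions 0 and 3 can be sampled for mutation.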

Constrained Multistate Sequence Design for Nucleic Acid Reaction Pathway Engineering

Brian R. Wolfe, Nicholas J. Porubsky, Joseph N. Zadeh et al.|Journal of the American Chemical Society|2017

Cited by 116|Open Access

We describe a framework for designing the sequences of multiple nucleic acid strands intended to hybridize in solution via a prescribed reaction pathway. Sequence design is formulated as a multistate optimization problem using a set of target test tubes to represent reactant, intermediate, and product states of the system, as well as to model crosstalk between components. Each target test tube contains a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Optimization of the equilibrium ensemble properties of the target test tubes implements both a positive design paradigm, explicitly designing for on-pathway elementary steps, and a negative design paradigm, explicitly designing against off-pathway crosstalk. Sequence design is performed subject to diverse user-specified sequence constraints including composition constraints, complementarity constraints, pattern prevention constraints, and biological constraints. Constrained multistate sequence design facilitates nucleic acid reaction pathway engineering for diverse applications in molecular programming and synthetic biology. Design jobs can be run online via the NUPACK web application.

Algorithms for Nucleic Acid Sequence Design

Joseph N. Zadeh|PhD thesis|2010

Cited by 2

Motivated by a growing field of research focused on programming function into biomolecules, we seek to decrease the cost of high-quality rational nucleic acid sequence design while increasing its versatility and availability. We begin by describing an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Using ensemble defect optimization, we seek to minimize the average number of incorrectly paired nucleotides at equilibrium, calculated over the entire ensemble of unpseudoknotted secondary structures. Empirically, the algorithm exhibits asymptotic optimality and costs 4/3 the time of a single objective function evaluation for large structures. We then extend this algorithm to design multi-state systems with an arbitrary number of linked targets and demonstrate its efficacy on systems invented by molecular engineers. To improve the ease of use and availability of nucleic acid analysis and design tools, we present NUPACK, a web application already in wide use that allows the international research community to share a high-performance compute cluster for the analysis and design of systems of interacting nucleic acids.
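
The 4/3 factor quoted above can be recovered from a short geometric-series argument. The following is a sketch under simplifying assumptions that are not stated in the abstract: the target is split by a balanced binary decomposition of depth D, a node at depth d contains N/2^d nucleotides, each node requires at least one defect evaluation, and one evaluation on n nucleotides costs c n^3 for some constant c. The symbols c, d, and D are introduced here for illustration only.

```latex
\begin{align*}
c_{\mathrm{des}}(N)
  &\ge \sum_{d=0}^{D} 2^{d}\, c \left(\frac{N}{2^{d}}\right)^{3}
   = c N^{3} \sum_{d=0}^{D} \left(\frac{1}{4}\right)^{d} \\
  &= \frac{4}{3}\left(1 - 4^{-(D+1)}\right) c N^{3}
   \;\longrightarrow\; \frac{4}{3}\, c N^{3}
   \quad \text{as } D \to \infty .
\end{align*}
```

Under these assumptions the root evaluation dominates, and all deeper levels of the tree together add only one third of its cost, which is consistent with a design cost that approaches 4/3 of a single full-sequence evaluation for large structures.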

