NUPACK
NUPACK, the Nucleic Acid Package, is a growing software suite for the analysis and design of nucleic acid systems. Jobs can be run online on the NUPACK web server, or the NUPACK source code can be downloaded and compiled locally. NUPACK algorithms are formulated in terms of nucleic acid secondary structure. In most cases, pseudoknots are excluded from the structural ensemble.

Secondary structure model


The secondary structure of multiple interacting strands is defined by a list of base pairs. A polymer graph for a secondary structure can be constructed by ordering the strands around a circle, drawing the backbones in succession from 5’ to 3’ around the circumference with a nick between each strand, and drawing straight lines connecting paired bases. A secondary structure is pseudoknotted if every strand ordering corresponds to a polymer graph with crossing lines. A secondary structure is connected if no subset of the strands is free of the others. Algorithms are formulated in terms of ordered complexes, each corresponding to the structural ensemble of all connected polymer graphs with no crossing lines for a particular ordering of a set of strands. The free energy of an unpseudoknotted secondary structure is calculated using nearest-neighbor empirical parameters for RNA in 1 M Na+ or for DNA in user-specified Na+ and Mg++ concentrations; additional parameters are employed for the analysis of pseudoknots (single RNA strands only).
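The polymer-graph definitions above can be illustrated with a minimal Python sketch. It is not NUPACK code: the function names, the `(strand, index)` representation of a base, and the example strands are all assumptions made for illustration. The sketch lays the strands out in a given order, tests whether any two base-pair lines cross, and applies the two definitions (pseudoknotted: every ordering has a crossing; connected: no subset of strands is free of the others).

```python
from itertools import permutations


def has_crossing(pairs):
    """True if two pairs (i, j) and (k, l) interleave as i < k < j < l."""
    norm = sorted((min(a, b), max(a, b)) for a, b in pairs)
    for x, (i, j) in enumerate(norm):
        for k, l in norm[x + 1:]:
            if i < k < j < l:
                return True
    return False


def crossing_for_order(order, lengths, pairs):
    """Lay the strands out 5' to 3' in the given order and test for crossings."""
    offset, total = {}, 0
    for strand in order:
        offset[strand] = total
        total += lengths[strand]
    flat = [(offset[s1] + i1, offset[s2] + i2) for (s1, i1), (s2, i2) in pairs]
    return has_crossing(flat)


def is_pseudoknotted(lengths, pairs):
    """Pseudoknotted: every distinct circular strand ordering has a crossing."""
    strands = sorted(lengths)
    first, rest = strands[0], strands[1:]
    # Rotating a circular ordering gives the same polymer graph, so fix one strand.
    return all(crossing_for_order((first, *perm), lengths, pairs)
               for perm in permutations(rest))


def is_connected(lengths, pairs):
    """Connected: base pairs link all strands into a single group (union-find)."""
    parent = {s: s for s in lengths}

    def find(s):
        while parent[s] != s:
            s = parent[s]
        return s

    for (s1, _), (s2, _) in pairs:
        parent[find(s1)] = find(s2)
    return len({find(s) for s in lengths}) == 1


# Hypothetical example: three 4-nt strands; bases are (strand, 0-based index).
lengths = {"A": 4, "B": 4, "C": 4}
pairs = [(("A", 0), ("C", 3)), (("A", 3), ("B", 0))]
print(crossing_for_order(("A", "B", "C"), lengths, pairs))  # False
print(crossing_for_order(("A", "C", "B"), lengths, pairs))  # True
print(is_pseudoknotted(lengths, pairs))                     # False
print(is_connected(lengths, pairs))                         # True
```

In this example the ordering A, B, C yields a polymer graph without crossing lines, so the structure is not pseudoknotted even though the ordering A, C, B produces a crossing. The brute-force loop over orderings is only practical for a handful of strands.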

Analysis

The Analysis page allows users to analyze the thermodynamic properties of a dilute solution of interacting nucleic acid strands in the absence of pseudoknots (e.g., a test tube of DNA or RNA strand species). For a dilute solution containing multiple strand species interacting to form multiple species of ordered complexes, NUPACK calculates for each ordered complex:
  • the partition function,
  • the minimum free energy (MFE) secondary structure,
  • the equilibrium base-pairing probabilities, and
  • the equilibrium concentration,

including rigorous treatment of distinguishability issues that arise in the multi-stranded setting.
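To make the first three quantities concrete, the following Python sketch computes them by brute force for a tiny, hand-enumerated ensemble. It is an illustration under stated assumptions, not NUPACK's method: NUPACK uses dynamic programs over the full ensemble, whereas here the structures and their free energies (in kcal/mol) are invented values, the function name `ensemble_stats` is hypothetical, and the distinguishability corrections and equilibrium concentrations mentioned above are not treated.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def ensemble_stats(energies, temperature_k=310.15):
    """Boltzmann statistics for {structure: free energy}.

    Each structure is a frozenset of base pairs (i, j) with i < j.
    Returns (partition function, MFE structure, structure probabilities,
    base-pairing probabilities).
    """
    kt = R_KCAL * temperature_k
    weights = {s: math.exp(-g / kt) for s, g in energies.items()}
    q = sum(weights.values())                  # partition function
    probs = {s: w / q for s, w in weights.items()}
    mfe = min(energies, key=energies.get)      # lowest free energy structure
    pair_probs = {}
    for s, p in probs.items():
        for pair in s:                         # sum probability over structures
            pair_probs[pair] = pair_probs.get(pair, 0.0) + p
    return q, mfe, probs, pair_probs


# Toy ensemble for an 8-nt strand; the energies are illustrative, not measured.
unpaired = frozenset()
one_pair = frozenset({(0, 7)})
hairpin = frozenset({(0, 7), (1, 6)})
toy = {unpaired: 0.0, one_pair: -0.3, hairpin: -1.5}

q, mfe, probs, pair_probs = ensemble_stats(toy)
print(sorted(mfe))        # [(0, 7), (1, 6)]
print(pair_probs[(0, 7)] > pair_probs[(1, 6)])  # True
```

The pair (0, 7) appears in two of the three structures, so its equilibrium probability is the sum of their probabilities and exceeds that of (1, 6), which appears only in the hairpin.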

Design

The Design page allows users to design sequences for one or more strands intended to adopt an unpseudoknotted target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user-specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium, evaluated over the structural ensemble of the ordered complex. For a target secondary structure with N nucleotides, the algorithm seeks to achieve an ensemble defect below N/100. Empirically, the design algorithm exhibits asymptotic optimality as N increases: for sufficiently large N, the cost of sequence design is typically only 4/3 the cost of a single evaluation of the ensemble defect.
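The ensemble defect can be sketched in Python from a table of equilibrium base-pairing probabilities. This is a minimal illustration, not NUPACK's implementation: the function names, the dictionary representation of the probabilities, and the numbers in the example are assumptions. The sketch counts, for each nucleotide, the probability that it is in its target state (paired with its target partner, or unpaired if the target leaves it unpaired) and subtracts the total from N.

```python
def ensemble_defect(pair_probs, target_pairs, n):
    """Expected number of incorrectly paired nucleotides.

    pair_probs:   {(i, j): probability} with i < j, 0-based indices
    target_pairs: iterable of (i, j) base pairs in the target structure
    n:            number of nucleotides
    """
    # Probability that each nucleotide is paired with anything at all.
    paired_prob = [0.0] * n
    for (i, j), p in pair_probs.items():
        paired_prob[i] += p
        paired_prob[j] += p

    partner = {}
    for i, j in target_pairs:
        partner[i], partner[j] = j, i

    correct = 0.0
    for i in range(n):
        if i in partner:
            j = partner[i]
            correct += pair_probs.get((min(i, j), max(i, j)), 0.0)
        else:
            correct += 1.0 - paired_prob[i]  # target says unpaired
    return n - correct


def meets_stop_condition(defect, n, fraction=0.01):
    """Stop condition from the text: ensemble defect below N/100."""
    return defect < fraction * n


# Hypothetical 4-nt example: the target pairs nucleotides 0 and 3.
probs = {(0, 3): 0.9, (1, 2): 0.2}
defect = ensemble_defect(probs, [(0, 3)], 4)
print(round(defect, 6))                 # 0.6
print(meets_stop_condition(defect, 4))  # False
```

Here nucleotides 0 and 3 are each correct with probability 0.9 and nucleotides 1 and 2 with probability 0.8, so the defect is 4 - 3.4 = 0.6, well above the N/100 threshold of 0.04.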

Utilities

The Utilities page allows users to evaluate, display, and annotate the equilibrium properties of a complex of interacting nucleic acid strands. The page accepts as input either sequence information, structure information, or both, performing diverse functions based on the information provided, including automatic layout and rendering of secondary structures with or without ideal helical geometry. In either case, the structure layout can be edited dynamically within the web application.


Implementation

The NUPACK web application is programmed within the Ruby on Rails framework, employing AJAX and the Dojo Toolkit to implement dynamic features and interactive graphics. Plots and graphics are generated using NumPy and matplotlib. The site is supported on current versions of the Safari, Chrome, and Firefox browsers. The NUPACK library of analysis and design algorithms is written in the C programming language. Dynamic programs are parallelized using MPI.

Terms of use

The NUPACK web server and NUPACK source code are provided for non-commercial research purposes.

Funding

NUPACK development is funded by the National Science Foundation via the Molecular Programming Project and by the Beckman Institute at Caltech.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 