De novo protein structure prediction
In computational biology, de novo protein structure prediction is the task of estimating a protein's tertiary structure from its amino acid sequence alone. The problem is very difficult and has occupied leading scientists for decades. Research has focused on three areas: alternate lower-resolution representations of proteins, accurate energy functions, and efficient sampling methods. At present, the most successful methods have a reasonable probability of predicting the fold of a small protein domain to within 5 angstroms.
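
As an illustration of what an accuracy figure such as "within 5 angstroms" can mean in practice, the sketch below compares a predicted structure with a reference structure. The text does not say which distance measure is intended; this example assumes distance RMSD (dRMSD), a superposition-free measure computed over one representative atom per residue. The function names and the default cutoff argument are hypothetical and chosen only for illustration.

```python
import math

def distance_rmsd(coords_a, coords_b):
    """Root-mean-square difference between corresponding pairwise distances.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in
    angstroms, e.g. one C-alpha position per residue (an assumption here).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    n = len(coords_a)
    if n < 2:
        raise ValueError("need at least two points")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Compare the i-j distance inside each structure; no superposition needed.
            d_a = math.dist(coords_a[i], coords_a[j])
            d_b = math.dist(coords_b[i], coords_b[j])
            total += (d_a - d_b) ** 2
            pairs += 1
    return math.sqrt(total / pairs)

def within_cutoff(predicted, reference, cutoff=5.0):
    """True if the predicted structure is within `cutoff` angstroms dRMSD."""
    return distance_rmsd(predicted, reference) <= cutoff
```

Because dRMSD uses only internal distances, it is unaffected by rotation and translation, but it also cannot distinguish a structure from its mirror image. Coordinate RMSD after optimal superposition is a common alternative.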

De novo protein structure prediction methods attempt to predict tertiary structures from sequences based on general principles that govern protein folding energetics and/or statistical tendencies of conformational features that native structures acquire, without the use of explicit templates. A general paradigm for de novo prediction involves sampling conformation space, guided by scoring functions and other sequence-dependent biases, such that a large set of candidate ("decoy") structures is generated. Native-like conformations are then selected from these decoys using scoring functions as well as conformer clustering. High-resolution refinement is sometimes used as a final step to fine-tune native-like structures.

There are two major classes of scoring functions. Physics-based functions are based on mathematical models describing aspects of the known physics of molecular interaction. Knowledge-based functions are formed with statistical models capturing aspects of the properties of native protein conformations.
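
The sample, score, and select paradigm can be sketched on a deliberately simplified model. The example below is not the method of any particular prediction program. It assumes a two-dimensional lattice chain with two residue types (H for hydrophobic, P for polar), a toy contact score, and Metropolis Monte Carlo sampling. All names, the scoring rule, and the parameters are hypothetical, and the clustering and refinement steps are omitted.

```python
import math
import random

# Lattice step directions (toy low-resolution representation).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def build(directions):
    """Convert step directions to lattice coordinates; None if the chain overlaps itself."""
    coords = [(0, 0)]
    for d in directions:
        dx, dy = MOVES[d]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in coords:
            return None
        coords.append(nxt)
    return coords

def score(sequence, coords):
    """Toy score: -1 for each H-H pair that is a lattice neighbour but not adjacent in the chain."""
    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    energy -= 1
    return energy

def sample_decoys(sequence, n_steps=2000, temperature=1.0, seed=0):
    """Metropolis sampling; returns a list of (score, directions) decoys."""
    rng = random.Random(seed)
    directions = ["R"] * (len(sequence) - 1)  # start from a fully extended chain
    energy = score(sequence, build(directions))
    decoys = [(energy, tuple(directions))]
    for _ in range(n_steps):
        # Trial move: change one step direction, which shifts the rest of the chain.
        trial = list(directions)
        trial[rng.randrange(len(trial))] = rng.choice("UDLR")
        trial_coords = build(trial)
        if trial_coords is None:
            continue  # reject self-overlapping conformations
        trial_energy = score(sequence, trial_coords)
        delta = trial_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            directions, energy = trial, trial_energy
            decoys.append((energy, tuple(directions)))
    return decoys

def select_best(decoys):
    """Pick the lowest-scoring decoy (a stand-in for score-based selection)."""
    return min(decoys, key=lambda d: d[0])
```

For example, `select_best(sample_decoys("HPHPPHHPHH"))` returns the lowest-scoring conformation found for that toy sequence. Real methods replace each piece with far richer counterparts: more detailed chain representations, physics-based or knowledge-based scoring functions, and selection that also clusters similar decoys.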

De novo methods tend to require vast computational resources, and have thus only been carried out for relatively small proteins. Predicting the structure of larger proteins de novo will require better algorithms and larger computational resources, such as those afforded by powerful supercomputers (such as Blue Gene or MDGRAPE-3) or distributed computing projects (such as Folding@home, Rosetta@home, the Human Proteome Folding Project, or Nutritious Rice for the World). Although the computational barriers are vast, the potential benefits of structural genomics (by predicted or experimental methods) make de novo structure prediction an active research field.

See also

  • Protein structure prediction
  • Protein structure prediction software
  • Protein design


External links

CASP:
  • http://predictioncenter.org/


Folding@home:
  • http://folding.stanford.edu/


HPF project:
  • http://www.worldcommunitygrid.org/projects_showcase/viewHpf2Research.do


Foldit:
  • http://fold.it/portal/
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 