In biochemistry and molecular biology, the tertiary structure of a protein or any other macromolecule is its three-dimensional structure, as defined by the atomic coordinates.
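As an illustration of what "defined by the atomic coordinates" means in practice, the following minimal sketch reads atom positions from text in the fixed-column layout used by Protein Data Bank files. The two sample records and the function name parse_atom_coordinates are made up for this example and do not come from a real entry; the column positions are assumed to follow the standard ATOM record layout.

```python
from typing import List, Tuple

Atom = Tuple[str, str, Tuple[float, float, float]]

# Two illustrative ATOM records (hypothetical values, PDB fixed-column layout).
SAMPLE_PDB = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614\n"
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842\n"
)


def parse_atom_coordinates(pdb_text: str) -> List[Atom]:
    """Return (atom name, residue name, (x, y, z)) for each ATOM/HETATM line."""
    atoms: List[Atom] = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()      # columns 13-16: atom name
            residue = line[17:20].strip()   # columns 18-20: residue name
            x = float(line[30:38])          # columns 31-38: x in angstroms
            y = float(line[38:46])          # columns 39-46: y in angstroms
            z = float(line[46:54])          # columns 47-54: z in angstroms
            atoms.append((name, residue, (x, y, z)))
    return atoms


if __name__ == "__main__":
    for atom in parse_atom_coordinates(SAMPLE_PDB):
        print(atom)
```

The full list of such coordinates, one triple per atom, is the tertiary structure in the sense used above.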
Relationship to primary structure
Tertiary structure is considered to be largely determined by the protein's primary structure: the sequence of amino acids of which it is composed. Efforts to predict tertiary structure from the primary structure are known generally as protein structure prediction. However, the environment in which a protein is synthesized and allowed to fold is a significant determinant of its final shape and is usually not directly taken into account by current prediction methods. Most such methods do rely on comparisons between the sequence to be predicted and sequences of known structure in the Protein Data Bank, and thus account for environment indirectly, assuming the target and template sequences share similar cellular contexts.
Stanford University's Folding@Home project is a distributed computing research effort that uses its approximately 5 petaFLOPS (~10 x86 petaFLOPS) of computing power to attempt to model the tertiary (and quaternary) structures of proteins, as well as other aspects of how and why proteins fold into the complex and varied shapes they take. No currently existing algorithm is yet able to consistently predict a protein's tertiary or quaternary structure given only its primary structure; learning how to accurately predict the tertiary and quaternary structure of any protein given only its amino acid sequence and the pertinent cellular conditions would be a monumental achievement. Although this goal has yet to be achieved, researchers have discovered how to combine several of the best of Folding@Home's algorithms to accurately predict the folded structure of some proteins under certain conditions. The calculations performed by the algorithms are constantly evolving, increasing in complexity and nuance, and involve enormous numbers of variables. These techniques are superficially comparable to weather models that show hurricane storm tracks: each of several algorithms independently models a complex system (the weather, in this case) somewhat differently from its sister algorithms, and the average of all the algorithms' output is taken to be the most likely "storm track". The shape of proteins can be elucidated through a somewhat similar process.
Researchers are also interested in proteins that can fold into more than one stable configuration. Protein aggregation diseases such as Alzheimer's disease and Huntington's disease, as well as prion diseases such as mad cow disease, can be better understood by constructing (and deconstructing) disease models. The most common way of doing this is to induce the desired disease state in test animals, for example by administering MPTP to induce symptoms of Parkinson's disease, or by knocking out a gene essential for the prevention of certain tumors from the animals' genomes. Folding@Home allows for the modelling of disease states that are not as easily induced, without the need for test animals. Perhaps more importantly, fully human proteins encoded by fully human genes can be used without any of the ethical problems that arise in studying living human beings. Because of their flexibility, coupled with their ability to improve over time, Folding@Home and projects like it are quickly becoming indispensable tools among researchers from a broad variety of disciplines. Should a reliable way to accurately model the final tertiary or quaternary structure of human proteins be found, the possibilities in medicine, biology, pathology, nuclear physics, and other scientific disciplines would be almost limitless. Proteins, due to the precise conformations they fold into, are nature's original nanomachines; developing an inexpensive and practical way to design and target proteins would revolutionize medicine and would have far-reaching implications. The significance of such a discovery cannot be overstated. To date, over 78 scientific papers have been published on discoveries that relied on Folding@Home.
Determinants of tertiary structure
In globular proteins, tertiary interactions are frequently stabilized by the sequestration of hydrophobic amino acid residues in the protein core, from which water is excluded, and by the consequent enrichment of charged or hydrophilic residues on the protein's water-exposed surface. In secreted proteins that do not spend time in the cytoplasm, disulfide bonds between cysteine residues help to maintain the protein's tertiary structure. A variety of common and stable tertiary structures appear in a large number of proteins that are unrelated in both function and evolution. For example, many proteins are shaped like a TIM barrel, named for the enzyme triosephosphate isomerase. Another common structure is the highly stable coiled coil, composed of 2-7 alpha helices (two in the dimeric form). Proteins are classified by the folds they represent in databases like SCOP and CATH.
Stability of native states
The most typical conformation of a protein in its cellular environment is generally referred to as the native state or native conformation. It is commonly assumed that this most-populated state is also the most thermodynamically stable conformation attainable for a given primary structure. This is a reasonable first approximation, but the claim assumes that the reaction is not under kinetic control, that is, that the time required for the protein to attain its native conformation after being translated is small.
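The link between "most populated" and "most thermodynamically stable" can be made concrete with the Boltzmann distribution, which applies when the conformations have had time to equilibrate (that is, under thermodynamic rather than kinetic control). The sketch below is illustrative only: the three free energies are hypothetical values, not measurements, and the function name boltzmann_populations is an assumption of this example.

```python
import math
from typing import List, Sequence

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)


def boltzmann_populations(free_energies_kcal: Sequence[float],
                          temperature_k: float = 298.15) -> List[float]:
    """Equilibrium fraction of molecules in each conformation.

    Each population is proportional to exp(-G / RT); energies are shifted
    by the minimum first, which leaves the ratios unchanged.
    """
    rt = R_KCAL_PER_MOL_K * temperature_k
    g_min = min(free_energies_kcal)
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical free energies (kcal/mol) of three conformations.
    energies = [0.0, 1.0, 3.0]
    for g, p in zip(energies, boltzmann_populations(energies)):
        print(f"G = {g:.1f} kcal/mol -> population {p:.3f}")
```

Under these assumptions the lowest-free-energy conformation is always the most populated one. A kinetically trapped protein does not reach this equilibrium on the relevant timescale, so the calculation does not describe it.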
In the cell, a variety of protein chaperones assist a newly synthesized polypeptide in attaining its native conformation. Some such proteins are highly specific in their function, such as protein disulfide isomerase; others are very general and can be of assistance to most globular proteins. The prokaryotic GroEL/GroES system and the homologous eukaryotic heat shock protein Hsp60/Hsp10 system fall into this category.
Some proteins explicitly take advantage of the fact that they can become kinetically trapped in a relatively high-energy conformation due to folding kinetics. Influenza hemagglutinin, for example, is synthesized as a single polypeptide chain that acts as a kinetic trap. The "mature" activated protein is proteolytically cleaved to form two polypeptide chains that are trapped in a high-energy conformation. Upon encountering a drop in pH, the protein undergoes an energetically favorable conformational rearrangement that enables it to penetrate a host cell membrane.
Many serpins (serine protease inhibitors) are metastable, and undergo a conformational change when a loop of the protein is cut by a protease.
Experimental determination
The majority of protein structures known to date have been solved with the experimental technique of X-ray crystallography, which typically provides data of high resolution but provides no time-dependent information on the protein's conformational flexibility. A second common way of solving protein structures uses nuclear magnetic resonance (NMR) spectroscopy, which in general provides somewhat lower-resolution data and is limited to relatively small proteins, but can provide time-dependent information about the motion of a protein in solution.
Dual polarisation interferometry is a time-resolved analytical method for determining the overall conformation and conformational changes in surface-captured proteins, providing complementary information to these high-resolution methods. More is known about the tertiary structural features of soluble globular proteins than about membrane proteins because the latter class is extremely difficult to study using these methods.
Interactions stabilizing tertiary structure
- Disulfide bonds
- Hydrophobic interactions
- Hydrogen bonds
- Ionic bonds
History
Because the tertiary structure of proteins is of central importance in biochemistry, and because experimental structure determination is relatively difficult, protein structure prediction has been a long-standing problem. The first predicted structure of globular proteins was the cyclol model of Dorothy Wrinch, but this was quickly discounted as being inconsistent with experimental data. Modern methods are sometimes able to predict the tertiary structure de novo to within 5 Å for small proteins (<120 residues) and under favorable conditions, e.g., confident secondary structure predictions.
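A figure such as "within 5 Å" is commonly a root-mean-square deviation (RMSD) between predicted and experimentally determined atom positions, although the text above does not name the metric, so that reading is an assumption here. The sketch below computes a plain RMSD for two coordinate lists that are assumed to be already superimposed and to list the same atoms in the same order; it does not perform the optimal superposition step that real comparisons usually include. The function name rmsd and the sample coordinates are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical positions in angstroms: every atom displaced by (3, 4, 0).
    reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    predicted = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]
    print(rmsd(reference, predicted))
```

Because the value depends on which atoms are compared (for example, alpha carbons only versus all heavy atoms), a reported RMSD is meaningful only alongside that choice.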
See also
- Folding (chemistry)
- Quaternary structure
- Structural biology
- Protein Contact Map
- Proteopedia, a collaborative encyclopedia of proteins and other molecules.
External links