





 

Base pair



 
 








In molecular biology, two nucleotides on opposite complementary DNA or RNA strands that are connected via hydrogen bonds are called a base pair (often abbreviated bp). In the canonical Watson-Crick base pairing, adenine (A) forms a base pair with thymine (T), as does guanine (G) with cytosine (C) in DNA. In RNA, thymine is replaced by uracil (U). Non-Watson-Crick base pairing with alternate hydrogen bonding patterns also occurs, especially in RNA; a common such pattern is the Hoogsteen base pair. Pairing is also the mechanism by which codons on messenger RNA molecules are recognized by anticodons on transfer RNA during protein translation. Some DNA- or RNA-binding enzymes can recognize specific base pairing patterns that identify particular regulatory regions of genes.

The size of an individual gene or an organism's entire genome is often measured in base pairs because DNA is usually double-stranded. Hence, the total number of base pairs is equal to the number of nucleotides in one of the strands (with the exception of non-coding single-stranded regions of telomeres). The haploid human genome (23 chromosomes) is estimated to be about 3 billion base pairs long and to contain 20,000-25,000 distinct genes.

A kilobase (kb) is a unit of measurement in molecular biology equal to 1000 base pairs of DNA or RNA.

Examples

The following sequences illustrate paired double-stranded patterns. By convention, the top strand is written from the 5' end to the 3' end; thus the bottom strand is written 3' to 5'.

A base-paired DNA sequence:
ATCGAT
TAGCTA
The corresponding RNA sequence, in which uracil is substituted for thymine:
AUCGAU
UAGCUA
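
The pairing rules behind these examples can be sketched in a few lines of Python. This is only an illustrative sketch: the names DNA_PAIRS, RNA_PAIRS, complement and reverse_complement are made up for this example and do not come from any particular library.

```python
# Watson-Crick partners for DNA and RNA (single-letter base codes).
# These table and function names are hypothetical, chosen for illustration.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complement(strand, pairs=DNA_PAIRS):
    """Return the partner strand, position by position.

    If `strand` is written 5' to 3', the result reads 3' to 5',
    matching the top/bottom layout used in the examples above.
    """
    return "".join(pairs[base] for base in strand.upper())


def reverse_complement(strand, pairs=DNA_PAIRS):
    """Return the partner strand rewritten in the 5' to 3' direction."""
    return complement(strand, pairs)[::-1]


print(complement("ATCGAT"))             # bottom strand of the DNA example
print(complement("AUCGAU", RNA_PAIRS))  # bottom strand of the RNA example
```

Because the bottom strand runs in the opposite direction, reverse_complement is the form usually wanted when the partner strand has to be written 5' to 3'.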

Length measurements

The following abbreviations are commonly used to describe the length of a DNA/RNA molecule:
  • bp = base pair(s)—one bp corresponds to circa 3.4 Å of length along the strand
  • kb (= kbp) = kilo base pairs = 1,000 bp
  • Mb = mega base pairs = 1,000,000 bp
  • Gb = giga base pairs = 1,000,000,000 bp


In the case of single-stranded DNA/RNA, units of nucleotides are used, abbreviated nt (or knt, Mnt, Gnt), as they are not paired. To avoid confusion with units of computer storage, kbp, Mbp, Gbp, etc. may be used for base pairs.
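
As a rough illustration of these units, the following Python sketch formats a length with the unambiguous kbp/Mbp/Gbp (or knt/Mnt/Gnt) abbreviations and converts a base-pair count into a physical length using the figure of about 3.4 Å per base pair quoted above. The function names are hypothetical.

```python
ANGSTROM_PER_BP = 3.4  # approximate length per base pair, as quoted in the list above


def format_length(count, double_stranded=True):
    """Format a length in bp (double-stranded) or nt (single-stranded)."""
    unit = "bp" if double_stranded else "nt"
    for factor, prefix in ((10**9, "G"), (10**6, "M"), (10**3, "k")):
        if count >= factor:
            return f"{count / factor:g} {prefix}{unit}"
    return f"{count} {unit}"


def contour_length_m(bp):
    """Approximate end-to-end length in metres of a duplex of `bp` base pairs."""
    return bp * ANGSTROM_PER_BP * 1e-10  # 1 angstrom = 1e-10 m


print(format_length(1500))              # a 1,500 bp fragment
print(format_length(250, False))        # a 250 nt single strand
print(contour_length_m(3_000_000_000))  # haploid human genome, roughly one metre
```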

The centimorgan is also often used to imply distance along a chromosome, but the number of base pairs it corresponds to varies widely. In the human genome, it is about 1 million base pairs.

Hydrogen bonding and stability


Hydrogen bonding is the chemical mechanism that underlies the base-pairing rules described above. Appropriate geometrical correspondence of hydrogen bond donors and acceptors allows only the "right" pairs to form stably. DNA with high GC-content is more stable than DNA with low GC-content, but contrary to popular belief, the hydrogen bonds do not stabilize the DNA significantly; stabilization is mainly due to stacking interactions.

The larger nucleobases, adenine and guanine, are members of a class of doubly-ringed chemical structures called purines; the smaller nucleobases, cytosine and thymine (and uracil), are members of a class of singly-ringed chemical structures called pyrimidines. Purines are only complementary with pyrimidines: pyrimidine-pyrimidine pairings are energetically unfavorable because the molecules are too far apart for hydrogen bonding to be established; purine-purine pairings are energetically unfavorable because the molecules are too close, leading to overlap repulsion. The only other possible pairings are GT and AC; these pairings are mismatches because the patterns of hydrogen donors and acceptors do not correspond. (The GU pairing, with two hydrogen bonds, does occur fairly often in RNA but rarely in DNA.)
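
The classification in the preceding paragraph can be expressed as a small lookup. The sketch below is illustrative only; the function name classify_pair and the category labels are invented here, and the code simply encodes the rules as stated in the text rather than any energetic calculation.

```python
PURINES = {"A", "G"}           # doubly-ringed bases
PYRIMIDINES = {"C", "T", "U"}  # singly-ringed bases

# Unordered pairs, so that ("A", "T") and ("T", "A") are treated alike.
WATSON_CRICK = {frozenset("AT"), frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}     # occurs fairly often in RNA


def classify_pair(x, y):
    """Classify a pairing of two bases according to the rules in the text."""
    x, y = x.upper(), y.upper()
    known = PURINES | PYRIMIDINES
    if x not in known or y not in known:
        raise ValueError(f"unknown base in pair: {x!r}, {y!r}")
    pair = frozenset((x, y))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    if x in PURINES and y in PURINES:
        return "purine-purine"          # too close: overlap repulsion
    if x in PYRIMIDINES and y in PYRIMIDINES:
        return "pyrimidine-pyrimidine"  # too far apart for hydrogen bonding
    return "mismatch"                   # e.g. GT or AC


for a, b in [("A", "T"), ("G", "U"), ("G", "T"), ("A", "G"), ("C", "T")]:
    print(a, b, classify_pair(a, b))
```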

Paired DNA and RNA molecules are comparatively stable at room temperature, but the two nucleotide strands will separate above a melting point that is determined by the length of the molecules, the extent of mispairing (if any), and the GC content. Higher GC content results in higher melting temperatures; it is therefore unsurprising that the genomes of extremophile organisms such as Thermus thermophilus are particularly GC-rich. Conversely, regions of a genome that need to separate frequently - for example, the promoter regions for often-transcribed genes - are comparatively GC-poor (for example, see TATA box). GC content and melting temperature must also be taken into account when designing primers for PCR.
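
A minimal Python sketch of how GC content feeds into a melting-temperature estimate follows. The estimate uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not mentioned in the text above and is only a crude approximation for short oligonucleotides such as primers; it ignores salt concentration, strand concentration and mispairing. The function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees Celsius for a short DNA oligo.

    Wallace rule (an assumption introduced here, not from the article):
    Tm = 2 * (A + T) + 4 * (G + C).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two oligos of equal length: the GC-rich one gets the higher estimate.
print(gc_content("AAAATTTT"), wallace_tm("AAAATTTT"))
print(gc_content("GGGGCCCC"), wallace_tm("GGGGCCCC"))
```

For real primer design, nearest-neighbour thermodynamic models are generally preferred over this rule, since they account for the stacking interactions between adjacent base pairs.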

Base stacking


Base stacking interactions between the pi orbitals of the bases' aromatic rings also contribute to stability, and again GC stacking interactions with adjacent bases tend to be more favorable. (Note, though, that a GC stacking interaction with the next base pair is geometrically different from a CG interaction.) Base stacking effects are especially important in the secondary structure of RNA; for example, RNA stem-loop structures are stabilized by base stacking in the loop region.
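
To make the stem-loop idea concrete, the sketch below counts how many base pairs the two ends of a single RNA strand could form when folded back on itself, working inward from both ends. It considers pairing only and does not model stacking energies. The function name, the inclusion of GU wobble pairs, and the minimum loop size of three unpaired nucleotides are all assumptions made for this illustration.

```python
# Pairs allowed in the stem: Watson-Crick plus the GU wobble pair (an assumption).
RNA_STEM_PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def stem_length(seq, min_loop=3):
    """Count consecutive base pairs formed by the two ends of `seq`.

    Pairing proceeds inward from both ends and stops at the first
    non-pairing position, or when fewer than `min_loop` unpaired
    nucleotides would remain in the loop.
    """
    seq = seq.upper()
    i, j = 0, len(seq) - 1
    pairs = 0
    while j - i - 1 >= min_loop and frozenset((seq[i], seq[j])) in RNA_STEM_PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


print(stem_length("GGGAAAUCCC"))    # three G-C pairs closing a four-nucleotide loop
print(stem_length("GCGCUUCGGCGC"))  # four pairs closing a four-nucleotide loop
```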

Base analogs and intercalators


Chemical analogs of nucleotides can take the place of proper nucleotides and establish non-canonical base-pairing, leading to errors (mostly point mutations) in DNA replication and DNA transcription. One common mutagenic base analog is 5-bromouracil, which resembles thymine but can base-pair to guanine in its enol form.

Other chemicals, known as DNA intercalators, fit into the gap between adjacent bases on a single strand and induce frameshift mutations by "masquerading" as a base, causing the DNA replication machinery to skip or insert additional nucleotides at the intercalated site. Most intercalators are large polyaromatic compounds and are known or suspected carcinogens. Examples include ethidium bromide and acridine.

See also

  • DNA
  • Nucleobase
  • Wobble base pair
  • Hoogsteen base pair
  • List of binary polymorphisms


External links

  • Webserver version of the EMBOSS tool for calculating melting temperatures





General references

  • Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R. (2004). Molecular Biology of the Gene. 5th ed. Pearson Benjamin Cummings: CSHL Press. See esp. ch. 6 and 9.