Pseudogene

Overview
Pseudogenes are dysfunctional relatives of known genes that have lost their protein-coding ability or are otherwise no longer expressed in the cell. Although some do not have introns or promoters (these pseudogenes are copied from mRNA and incorporated into the chromosome and are called processed pseudogenes), most have some gene-like features (such as promoters, CpG islands, and splice sites). They are nonetheless considered nonfunctional, due to their lack of protein-coding ability resulting from various genetic disablements (premature stop codons, frameshifts, or a lack of transcription) or their inability to encode RNA (such as with rRNA pseudogenes). The term, coined in 1977 by Jacq et al., is composed of the prefix pseudo, which means false, and the root gene, which is the central unit of molecular genetics.

Because pseudogenes are generally thought of as the last stop for genomic material that is to be removed from the genome, they are often labeled as junk DNA. Nonetheless, pseudogenes contain fascinating biological and evolutionary histories within their sequences. This is due to a pseudogene's shared ancestry with a functional gene: in the same way that Darwin thought of two species as possibly having a shared common ancestry followed by millions of years of evolutionary divergence (see speciation), a pseudogene and its associated functional gene also share a common ancestor and have diverged as separate genetic entities over millions of years.

Properties of pseudogenes


Pseudogenes are characterized by a combination of homology to a known gene and nonfunctionality. That is, although every pseudogene has a DNA sequence that is similar to some functional gene, it is nonetheless unable to produce a functional final product. Pseudogenes are quite difficult to identify and characterize in genomes, because the two requirements of homology and nonfunctionality are inferred from sequence calculations and alignments rather than biologically proven.
  1. Homology is implied by sequence identity between the DNA sequences of the pseudogene and parent gene. After aligning the two sequences, the percentage of identical base pairs is computed. A high sequence identity (usually between 40% and 100%) means that it is highly likely that these two sequences diverged from a common ancestral sequence (are homologous), and highly unlikely that these two sequences were independently created (see Convergent evolution).
  2. Nonfunctionality can manifest itself in many ways. Normally, a gene must go through several steps in going from a genetic DNA sequence to a fully functional protein: transcription, pre-mRNA processing, translation, and protein folding are all required parts of this process. If any of these steps fails, then the sequence may be considered nonfunctional. In high-throughput pseudogene identification, the most commonly identified disablements are premature stop codons and frameshifts, which almost universally prevent the translation of a functional protein product.
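
The two criteria above can be sketched in code. What follows is a minimal Python illustration, not a published pipeline: the function names (percent_identity, premature_stops, frameshift_gaps) and the toy sequences are assumptions made for this example. It presumes the two sequences have already been aligned, with "-" marking gaps, and that the reading frame starts at the first base.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of gap-free alignment columns in which both bases agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def premature_stops(cds: str) -> list[int]:
    """Codon indices of in-frame stop codons that occur before the final codon."""
    cds = cds.upper().replace("-", "")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def frameshift_gaps(aligned_a: str, aligned_b: str) -> list[tuple[int, int]]:
    """(start, length) of gap runs whose length is not a multiple of three."""
    shifts = []
    for seq in (aligned_a, aligned_b):
        for run in re.finditer(r"-+", seq):
            if len(run.group()) % 3 != 0:
                shifts.append((run.start(), len(run.group())))
    return sorted(shifts)


# Toy data (hypothetical): a parent coding sequence and two disabled copies.
parent = "ATGGCTGAATTTTAA"
nonsense_copy = "ATGGCTTAATTTTAA"    # GAA -> TAA introduces an early stop
frameshift_copy = "ATGGC-GAATTTTAA"  # single-base deletion, shown as a gap

print(percent_identity(parent, nonsense_copy))
print(premature_stops(nonsense_copy))
print(frameshift_gaps(parent, frameshift_copy))
```

A high identity together with at least one disablement is only suggestive; as noted above, neither homology nor nonfunctionality is biologically proven by such calculations.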


Pseudogenes for RNA genes are often easier to discover. Many RNA genes occur as multiple copy genes, and pseudogenes are identified through sequence identity and location within the region.

Types and origin of pseudogenes


There are three main types of pseudogenes, all with distinct mechanisms of origin and characteristic features. The classifications of pseudogenes are as follows:
  1. Processed (or retrotransposed) pseudogenes. In higher eukaryotes, particularly mammals, retrotransposition is a fairly common event that has had a huge impact on the composition of the genome. For example, somewhere between 30% and 44% of the human genome consists of repetitive elements such as SINEs and LINEs (see retrotransposons). In the process of retrotransposition, a portion of the mRNA transcript of a gene is spontaneously reverse transcribed back into DNA and inserted into chromosomal DNA. Although retrotransposons usually create copies of themselves, it has been shown in an in vitro system that they can create retrotransposed copies of random genes, too. Once these pseudogenes are inserted back into the genome, they usually contain a poly-A tail and usually have had their introns spliced out; these are both hallmark features of cDNAs. However, because they are derived from a mature mRNA product, processed pseudogenes also lack the upstream promoters of normal genes; thus, they are considered "dead on arrival", becoming non-functional pseudogenes immediately upon the retrotransposition event. Occasionally, however, these insertions contribute exons to existing genes, usually via alternatively spliced transcripts. A further characteristic of processed pseudogenes is common truncation of the 5' end relative to the parent sequence, which is a result of the relatively non-processive retrotransposition mechanism that creates processed pseudogenes.
  2. Non-processed (or duplicated) pseudogenes. Gene duplication is another common and important process in the evolution of genomes. A copy of a functional gene may arise as a result of a gene duplication event and subsequently acquire mutations that cause it to become nonfunctional. Duplicated pseudogenes usually have all the same characteristics as genes, including an intact exon-intron structure and promoter sequences. The loss of a duplicated gene's functionality usually has little effect on an organism's fitness, since an intact functional copy still exists. According to some evolutionary models, shared duplicated pseudogenes indicate the evolutionary relatedness of humans and the other primates.
  3. Disabled genes, or unitary pseudogenes. Various mutations can stop a gene from being successfully transcribed or translated, and a gene may become nonfunctional or deactivated if such a mutation becomes fixed in the population. This is the same mechanism by which non-processed pseudogenes become deactivated, but the difference in this case is that the gene was not duplicated before becoming disabled. Normally, such gene deactivation would be unlikely to become fixed in a population, but various population effects, such as genetic drift, a population bottleneck, or in some cases natural selection, can lead to fixation. The classic example of a unitary pseudogene is the gene that presumably coded for the enzyme L-gulono-γ-lactone oxidase (GULO) in primates. In all mammals studied other than primates and guinea pigs, GULO aids in the biosynthesis of ascorbic acid (vitamin C), but it exists as a disabled gene (GULOP) in humans and other primates. Another, more recent example of a disabled gene links the deactivation of the caspase 12 gene (through a nonsense mutation) to positive selection in humans.
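
The distinguishing features of the three types can be summarized as a toy decision rule. This is a sketch under stated assumptions, not how any particular annotation pipeline works: the Candidate record, its field names, and the classify function are hypothetical, and the evidence flags are assumed to have been determined beforehand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Evidence about one nonfunctional, gene-like locus (hypothetical fields)."""
    has_introns: bool             # exon-intron structure of the parent retained
    has_poly_a_tail: bool         # poly-A tract at the 3' end of the insertion
    has_functional_paralog: bool  # intact functional copy elsewhere in the genome


def classify(c: Candidate) -> str:
    """Assign one of the three pseudogene types described above."""
    # Hallmarks of a cDNA-like insertion: introns spliced out and a poly-A tail.
    if not c.has_introns and c.has_poly_a_tail:
        return "processed"
    # Gene-like structure with an intact functional copy still present.
    if c.has_functional_paralog:
        return "duplicated"
    # The gene was disabled without having been duplicated first.
    return "unitary"


print(classify(Candidate(has_introns=False, has_poly_a_tail=True, has_functional_paralog=True)))
print(classify(Candidate(has_introns=True, has_poly_a_tail=False, has_functional_paralog=True)))
print(classify(Candidate(has_introns=True, has_poly_a_tail=False, has_functional_paralog=False)))
```

The order of the checks matters, since a processed pseudogene also has a functional parent gene elsewhere in the genome. A rule this simple would mislabel cases such as copies of genes that never had introns, so it should be read only as a summary of the text.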


Pseudogenes can complicate molecular genetic studies. For example, a researcher who wants to amplify a gene by PCR may simultaneously amplify a pseudogene that shares similar sequences. This is known as PCR bias or amplification bias. Similarly, pseudogenes are sometimes annotated as genes in genome sequences.
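
The amplification problem can be illustrated with a rough primer check. The sketch below is an assumption-laden simplification: it only counts mismatches between a primer and the forward strand of each template, whereas real primer specificity also depends on factors such as mismatch position and annealing conditions. The function names, the min_gap threshold, and the sequences are hypothetical.

```python
def min_mismatches(primer: str, template: str) -> int:
    """Fewest mismatches between the primer and any same-length window of the template."""
    primer, template = primer.upper(), template.upper()
    n = len(primer)
    if n == 0 or n > len(template):
        raise ValueError("primer must be non-empty and no longer than the template")
    return min(
        sum(p != t for p, t in zip(primer, template[i:i + n]))
        for i in range(len(template) - n + 1)
    )


def discriminates(primer: str, gene: str, pseudogene: str, min_gap: int = 2) -> bool:
    """True if the primer fits the gene better than the pseudogene by at least min_gap mismatches."""
    return min_mismatches(primer, pseudogene) - min_mismatches(primer, gene) >= min_gap


# Toy templates (hypothetical): the pseudogene differs from the gene at two positions.
gene = "ATGGCTGAATTTGCACGT"
pseudogene = "ATGGCTTAATTCGCACGT"

print(discriminates("ATGGCT", gene, pseudogene))    # primer lies in a shared region
print(discriminates("GAATTTGC", gene, pseudogene))  # primer spans both differences
```

A primer placed in a region that the gene and pseudogene share matches both templates equally well, which is the situation in which both would be amplified together.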

Processed pseudogenes often pose a problem for gene prediction programs, which can misidentify them as real genes or exons. It has been proposed that identification of processed pseudogenes can help improve the accuracy of gene prediction methods.

It has also been shown that the parent sequences that give rise to processed pseudogenes lose their coding potential faster than those giving rise to non-processed pseudogenes.

Functional pseudogenes?


By definition, pseudogenes lack a function. However, the classification of pseudogenes generally relies on computational analysis of genomic sequences using complex algorithms. This has led to the incorrect identification of pseudogenes. For example, the functional, chimeric gene jingwei in Drosophila was once thought to be a processed pseudogene.

It has been established that quite a few pseudogenes can be transcribed, either from their own promoter, if it is still intact, or in some cases from the promoter of a nearby gene; this expression of pseudogenes also appears to be tissue-specific. In 2003, Hirotsune et al. identified a retrotransposed pseudogene whose transcript purportedly plays a trans-regulatory role in the expression of its homologous gene, Makorin1 (MKRN1) (see also RING finger domain and ubiquitin ligases), and suggested this as a general model under which pseudogenes may play an important biological role. Other researchers have since hypothesized similar roles for other pseudogenes. A bioinformatics analysis has shown that processed pseudogenes can be inserted into introns of annotated genes and be incorporated into alternatively spliced transcripts. Hirotsune's report prompted two molecular biologists to carefully review scientific literature on the subject of pseudogenes. To the surprise of many, they found a number of examples in which pseudogenes play a role in gene regulation and expression, forcing Hirotsune's group to rescind their claim that they were the first to identify pseudogene function. Furthermore, the original findings of Hirotsune et al. concerning Makorin1 have recently been strongly contested; thus, the possibility that some pseudogenes could have important biological functions was disputed.
Additionally, University of Chicago and University of Cincinnati scientists reported in 2002 that a processed pseudogene called phosphoglycerate mutase 3 actually produces a functional protein.

Two 2008 publications in Nature report that some endogenous siRNAs are derived from pseudogenes, and thus that some pseudogenes play a role in regulating protein-coding transcripts.
In June 2010, Nature published an article showing that the mRNA levels of the tumour suppressor PTEN and the oncogenic KRAS are affected by their pseudogenes PTENP1 and KRAS1P. This discovery demonstrated an miRNA decoy function for pseudogenes and identified their transcripts as biologically active units in tumor biology, thus attributing a novel biological role to expressed pseudogenes, as they can regulate coding gene expression, and revealing a non-coding function for mRNAs in disease progression.
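
The decoy idea can be pictured as a gene transcript and a pseudogene transcript carrying the same miRNA binding site, so that both can be bound by the same miRNA. The sketch below is only an illustration of that idea and rests on assumptions not taken from the article: it treats nucleotides 2 to 8 of the miRNA as the seed, requires perfect complementarity, and uses made-up sequences and a hypothetical function name.

```python
# Complement table for the RNA alphabet.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_sites(mirna: str, transcript: str) -> list[int]:
    """Start positions in the transcript that are perfectly complementary to the miRNA seed."""
    seed = mirna.upper()[1:8]                 # assumed seed: nucleotides 2-8
    site = seed.translate(_COMPLEMENT)[::-1]  # reverse complement of the seed
    transcript = transcript.upper()
    hits = []
    start = transcript.find(site)
    while start != -1:
        hits.append(start)
        start = transcript.find(site, start + 1)
    return hits


# Toy RNA sequences (hypothetical, not real PTEN or PTENP1 sequence).
mirna = "AUGGCAUCGAUU"
gene_transcript = "AAGAUGCCAUU"
pseudogene_transcript = "CCGAUGCCAGG"

print(seed_sites(mirna, gene_transcript))
print(seed_sites(mirna, pseudogene_transcript))
```

When both transcripts return at least one site for the same miRNA, the pseudogene transcript is a candidate competitor for that miRNA under the assumptions above; this says nothing about whether such competition occurs in a cell.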

See also

  • Human disabled pseudogenes list
  • Molecular evolution
  • Retrotransposon
  • Retroposon