Phylogenetic tree



 
 








A phylogenetic tree or evolutionary tree is a tree showing the evolutionary relationships among various biological species or other entities that are believed to have a common ancestor. In a phylogenetic tree, each node with descendants represents the most recent common ancestor of the descendants, and the edge lengths in some trees correspond to time estimates. Each node is called a taxonomic unit. Internal nodes are generally called hypothetical taxonomic units (HTUs) as they cannot be directly observed.
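
As a minimal sketch of the structure just described, a rooted tree can be modelled as nodes that each hold a list of children: leaves are the observed taxonomic units and internal nodes play the role of HTUs. The `Node` class, the `mrca` helper and the three-taxon topology are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A taxonomic unit; internal nodes stand in for unobserved HTUs."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def leaf_names(self) -> Set[str]:
        """Names of all leaves at or below this node."""
        if self.is_leaf():
            return {self.name}
        names: Set[str] = set()
        for child in self.children:
            names |= child.leaf_names()
        return names


def mrca(root: Node, taxa: Set[str]) -> Optional[Node]:
    """Return the deepest node whose descendant leaves include all of `taxa`."""
    if not taxa <= root.leaf_names():
        return None
    for child in root.children:
        found = mrca(child, taxa)
        if found is not None:
            return found
    return root


# Illustrative topology only: ((human, chimp), gorilla)
human, chimp, gorilla = Node("human"), Node("chimp"), Node("gorilla")
htu1 = Node("HTU1", [human, chimp])
root = Node("HTU2", [htu1, gorilla])

print(mrca(root, {"human", "chimp"}).name)    # HTU1
print(mrca(root, {"human", "gorilla"}).name)  # HTU2
```

The search recomputes leaf sets at every level, which is fine for small trees; a larger tree would normally cache them.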

History

Although the idea of a "tree of life" arose from ancient notions of a ladder-like progression from lower to higher forms of life (such as in the Great Chain of Being), Charles Darwin (1859) first illustrated and popularized the notion of an evolutionary "tree" in his seminal book The Origin of Species. Over a century later, evolutionary biologists still use tree diagrams to depict evolution because the floral analogy effectively conveys the concept that speciation occurs through the adaptive and random splitting of lineages. Over time, species classification has become less static and more dynamic.

Adolf Engler (1844–1930) and Karl A. E. Prantl (1849–1893) published a system of plant classification in their monograph Die Natürlichen Pflanzenfamilien. In it, they arranged the families and orders of flowering plants on the basis of complexity of floral morphology. Characters like a perianth with one whorl, unisexual flowers and pollination by wind were considered primitive as compared to a perianth with two whorls, bisexual flowers and pollination by insects.

In this system, the plant kingdom is further divided into divisions, sub-divisions, classes, orders and families, and monocotyledons are considered more primitive than dicotyledons. The system also treats angiosperms as having evolved from a single source, and its sequence of orders and families shows parallel evolution.

Types

Figure 1. An unrooted phylogenetic tree for myosin.

A rooted phylogenetic tree is a directed tree with a unique node corresponding to the (usually imputed) most recent common ancestor of all the entities at the leaves of the tree. The most common method for rooting trees is the use of an uncontroversial outgroup, one that is close enough to allow inference from sequence or trait data, but far enough to be a clear outgroup.

Unrooted trees illustrate the relatedness of the leaf nodes without making assumptions about common ancestry. While unrooted trees can always be generated from rooted ones by simply omitting the root, a root cannot be inferred from an unrooted tree without some means of identifying ancestry; this is normally done by including an outgroup in the input data or introducing additional assumptions about the relative rates of evolution on each branch, such as an application of the molecular clock hypothesis. Figure 1 depicts an unrooted phylogenetic tree for myosin, a superfamily of proteins.
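
One way to see what omitting the root discards is to compare trees by the leaf bipartitions (splits) that their branches induce. The sketch below assumes rooted trees written as nested tuples of leaf names; the `leaves` and `splits` helpers and the taxon labels A to D are hypothetical. Two different rootings of the same underlying tree yield the same set of splits, while a genuinely different topology does not.

```python
from typing import FrozenSet, Set, Tuple, Union

# A rooted tree is a leaf name or a tuple of subtrees (illustrative encoding).
Tree = Union[str, Tuple["Tree", ...]]
Split = FrozenSet[FrozenSet[str]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """All leaf names at or below this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree: Tree) -> Set[Split]:
    """Non-trivial leaf bipartitions induced by the branches of a rooted tree.

    Each branch separates the leaves below it from all the others. Ignoring
    which side holds the root gives a description of the unrooted tree.
    """
    all_leaves = leaves(tree)
    found: Set[Split] = set()

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            return
        below = leaves(node)
        rest = all_leaves - below
        # Skip the root itself and branches that cut off a single leaf.
        if len(below) > 1 and len(rest) > 1:
            found.add(frozenset([below, rest]))
        for child in node:
            visit(child)

    visit(tree)
    return found


balanced = (("A", "B"), ("C", "D"))      # root placed on the internal branch
ladder = ("A", ("B", ("C", "D")))        # root placed on the branch leading to A
different = (("A", "C"), ("B", "D"))     # a different unrooted topology

print(splits(balanced) == splits(ladder))     # True
print(splits(balanced) == splits(different))  # False
```

Going the other way, from splits to a rooted tree, needs extra information such as an outgroup, which is the asymmetry the paragraph above describes.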

Both rooted and unrooted phylogenetic trees can be either bifurcating or multifurcating, and either labeled or unlabeled. A bifurcating tree has exactly two descendants arising from each interior node, while a multifurcating tree may have more than two. A labeled tree has specific values assigned to its leaves, while an unlabeled tree, sometimes called a tree shape, only defines a topology. The number of possible trees for a given number of leaf nodes depends on the specific type of tree, but there are always more multifurcating than bifurcating trees, more labeled than unlabeled trees, and more rooted than unrooted trees. The last distinction is the most biologically relevant; it arises because there are many places on an unrooted tree to put the root. For labeled bifurcating trees, there are

$$(2n-3)!! = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}, \quad n \ge 2$$

total rooted trees and

$$(2n-5)!! = \frac{(2n-5)!}{2^{n-3}\,(n-3)!}, \quad n \ge 3$$

total unrooted trees, where n represents the number of leaf nodes. The number of unrooted trees for n input sequences or species is equal to the number of rooted trees for n-1 sequences.
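
The relationship between the two counts can be checked numerically. The sketch below assumes the standard double-factorial counts for labeled bifurcating trees, (2n-3)!! rooted and (2n-5)!! unrooted; the function names are illustrative.

```python
from math import factorial


def double_factorial(k: int) -> int:
    """k!! for odd k >= -1, with (-1)!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_trees(n: int) -> int:
    """Number of labeled rooted bifurcating trees on n leaves (n >= 2)."""
    if n < 2:
        raise ValueError("need at least 2 leaves")
    return double_factorial(2 * n - 3)


def unrooted_trees(n: int) -> int:
    """Number of labeled unrooted bifurcating trees on n leaves (n >= 3)."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return double_factorial(2 * n - 5)


for n in range(3, 11):
    # The closed form (2n-3)! / (2^(n-2) (n-2)!) agrees with the double factorial.
    closed = factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))
    assert closed == rooted_trees(n)
    # Unrooted trees on n leaves equal rooted trees on n-1 leaves.
    assert unrooted_trees(n) == rooted_trees(n - 1)
    print(n, rooted_trees(n), unrooted_trees(n))
# The last line printed is: 10 34459425 2027025
```

Even at ten leaves there are tens of millions of rooted topologies, which is why exhaustive search quickly becomes impractical.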

A dendrogram is a broad term for the diagrammatic representation of a phylogenetic tree.

A cladogram is a tree formed using cladistic methods. This type of tree only represents a branching pattern, i.e., its branch lengths do not represent time.

A phylogram is a phylogenetic tree that explicitly represents the number of character changes through its branch lengths.

An ultrametric tree or chronogram is a phylogenetic tree that explicitly represents evolutionary time through its branch lengths.
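
The difference between a chronogram and a general phylogram can be tested mechanically: in an ultrametric tree every leaf lies at the same distance from the root. The sketch below assumes a simple encoding in which a tree is either a leaf name or a list of (subtree, branch length) pairs; the encoding, the function names and the branch lengths are all illustrative.

```python
import math
from typing import Dict, List, Tuple, Union

# A tree is either a leaf name or a list of (subtree, branch_length) pairs.
Tree = Union[str, List[Tuple["Tree", float]]]


def root_to_tip(tree: Tree, depth: float = 0.0) -> Dict[str, float]:
    """Map each leaf name to the summed branch length from the root."""
    if isinstance(tree, str):
        return {tree: depth}
    distances: Dict[str, float] = {}
    for subtree, length in tree:
        distances.update(root_to_tip(subtree, depth + length))
    return distances


def is_ultrametric(tree: Tree, rel_tol: float = 1e-9) -> bool:
    """True if all leaves are equally far from the root (within tolerance)."""
    depths = list(root_to_tip(tree).values())
    return all(math.isclose(d, depths[0], rel_tol=rel_tol) for d in depths)


# Made-up branch lengths: every tip of the first tree sits at depth 3.0.
chronogram = [([("A", 1.0), ("B", 1.0)], 2.0), ("C", 3.0)]
# Same topology, but branch lengths vary between lineages.
phylogram = [([("A", 0.2), ("B", 1.5)], 0.7), ("C", 3.0)]

print(is_ultrametric(chronogram))  # True
print(is_ultrametric(phylogram))   # False
```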

Construction

Phylogenetic trees among a nontrivial number of input sequences are constructed using computational phylogenetics methods. Distance-matrix methods such as neighbor-joining or UPGMA, which calculate genetic distance from multiple sequence alignments, are simplest to implement, but do not invoke an evolutionary model. Many sequence alignment methods such as ClustalW also create trees by using the simpler algorithms (i.e. those based on distance) of tree construction. Maximum parsimony is another simple method of estimating phylogenetic trees, but implies an implicit model of evolution (i.e. parsimony). More advanced methods use the optimality criterion of maximum likelihood, often within a Bayesian framework, and apply an explicit model of evolution to phylogenetic tree estimation. Identifying the optimal tree using many of these techniques is NP-hard, so heuristic search and optimization methods are used in combination with tree-scoring functions to identify a reasonably good tree that fits the data.
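
To make the distance-matrix approach concrete, here is a minimal sketch of UPGMA, which repeatedly merges the two closest clusters and averages their distances to the rest. It is an illustration under simplifying assumptions, not a production implementation: the `upgma` function, its input format (a dict of pairwise distances) and the example distances are hypothetical, and the method itself assumes a constant rate of evolution. The result is returned as a Newick-style string.

```python
from itertools import combinations
from typing import Dict, Tuple


def upgma(distances: Dict[Tuple[str, str], float]) -> str:
    """Cluster taxa by average linkage and return a Newick-style string.

    `distances` must contain one entry for every unordered pair of taxa.
    """
    # Symmetric lookup keyed by unordered pairs of cluster labels.
    d = {frozenset(pair): value for pair, value in distances.items()}
    size = {name: 1 for pair in distances for name in pair}
    height = {name: 0.0 for name in size}

    while len(size) > 1:
        # Pick the closest pair of current clusters (ties broken by label order).
        a, b = min(combinations(sorted(size), 2), key=lambda p: d[frozenset(p)])
        h = d[frozenset((a, b))] / 2.0  # the new node sits halfway between them
        merged = f"({a}:{h - height[a]:.2f},{b}:{h - height[b]:.2f})"
        # Distance from the new cluster to each other cluster is the
        # size-weighted average of the two merged clusters' distances.
        for c in size:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        height[merged] = h
        del size[a], size[b]

    return next(iter(size)) + ";"


# Made-up, perfectly clock-like distances between four taxa.
example = {
    ("A", "B"): 2.0,
    ("A", "C"): 4.0, ("B", "C"): 4.0,
    ("A", "D"): 6.0, ("B", "D"): 6.0, ("C", "D"): 6.0,
}
print(upgma(example))  # (((A:1.00,B:1.00):1.00,C:2.00):1.00,D:3.00);
```

Because the example distances are clock-like, the output tree is ultrametric; with real data that violate the constant-rate assumption, UPGMA can recover the wrong topology, which is one reason the model-based methods mentioned above are preferred.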

Tree-building methods can be assessed on the basis of several criteria:
  • efficiency (how long does it take to compute the answer, how much memory does it need?)
  • power (does it make good use of the data, or is information being wasted?)
  • consistency (will it converge on the same answer repeatedly, if each time given different data for the same model problem?)
  • robustness (does it cope well with violations of the assumptions of the underlying model?)
  • falsifiability (does it alert us when it is not good to use, i.e. when assumptions are violated?)


Tree-building techniques have also gained the attention of mathematicians. Trees can also be built using T-theory.

Limitations

Although phylogenetic trees produced on the basis of sequenced genes or genomic data in different species can provide evolutionary insight, they have important limitations. They do not necessarily accurately represent the species' evolutionary history. The data on which they are based are noisy; the analysis can be confounded by horizontal gene transfer, hybridisation between species that were not nearest neighbors on the tree before hybridisation takes place, convergent evolution, and conserved sequences.

Basing the analysis on a single type of character, such as a single gene or protein, or only on morphological analysis, is also problematic: trees constructed from another, unrelated data source often differ from the first, so great care is needed in inferring phylogenetic relationships among species. This is most true of genetic material that is subject to lateral gene transfer and recombination, where different haplotype blocks can have different histories. In general, the output tree of a phylogenetic analysis is an estimate of the characters' phylogeny (i.e. a gene tree) and not the phylogeny of the taxa (i.e. species tree) from which these characters were sampled, though ideally, both should be very close. For this reason, serious phylogenetic studies generally use a combination of genes that come from different genomic sources (e.g., from mitochondrial or plastid vs. nuclear genomes), or genes that would be expected to evolve under different selective regimes, so that homoplasy (false homology) would be unlikely to result from natural selection.
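
A simple way to see how far gene trees disagree is to count how many of them support each clade. The sketch below is only an illustration: the nested-tuple encoding, the `clades` and `clade_support` helpers, and the three gene trees are all made up, and real studies use far more careful summary methods.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set, Tuple, Union

# A rooted tree is a leaf name or a tuple of subtrees (illustrative encoding).
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Leaf sets under each internal node, excluding the full leaf set."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        found.add(below)
        return below

    found.discard(visit(tree))  # the root clade is shared by every tree
    return found


def clade_support(gene_trees: Iterable[Tree]) -> Counter:
    """Count how many gene trees contain each clade."""
    counts: Counter = Counter()
    for tree in gene_trees:
        counts.update(clades(tree))
    return counts


# Hypothetical gene trees for the same three taxa; the third one conflicts.
gene_trees = [
    (("human", "chimp"), "gorilla"),
    (("human", "chimp"), "gorilla"),
    (("human", "gorilla"), "chimp"),
]
support = clade_support(gene_trees)
print(support[frozenset(["human", "chimp"])])    # 2
print(support[frozenset(["human", "gorilla"])])  # 1
```

A clade supported by most genes from independent genomic sources is a better candidate for the species tree than one recovered from a single gene.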

When extinct species are included in a tree, they are terminal nodes, as it is unlikely that they are direct ancestors of any extant species. Scepticism must apply when extinct species are included in trees that are wholly or partly based on DNA sequence data, because little useful "ancient DNA" is preserved for longer than 100,000 years, and except in the most unusual circumstances no DNA sequences long enough for use in phylogenetic analyses have yet been recovered from material over 1 million years old.

In some organisms, endosymbionts have an independent genetic history from the host.

Phylogenetic networks are used when bifurcating trees are not suitable, due to these complications which suggest a more reticulate evolutionary history of the organisms sampled.

See also


The "tree of life"

  • Life - The top level for Wikipedia articles on living species, reflecting a diversity of classification systems.
  • Wikispecies - An external Wikimedia Foundation project to construct a "tree of life" appropriate for use by scientists
  • Evolutionary history of life - An overview of the major time periods of life on earth
  • Three-domain system (cell types)


Fields of study

  • Evolutionary biology
  • Phylogenetics
  • Comparative phylogenetics
  • Computational phylogenetics
  • Cladistics


Further reading

  • The Ancestor's Tale by Richard Dawkins


External links


Images

  • In 2003, the journal Science dedicated a special issue to the tree of life.


General

  • An interactive tree based on the U.S. National Science Foundation's Assembling the Tree of Life Project
  • The most detailed and comprehensive family tree of dinosaurs yet available