Haplotype



 
 


The term haplotype is a contraction of "haploid genotype." In genetics, a haplotype (from the Greek haploûs, "onefold, single, simple") is a combination of alleles at multiple loci that are transmitted together on the same chromosome. A haplotype may span as few as two loci or an entire chromosome, depending on the number of recombination events that have occurred between a given set of loci.

In a second meaning, a haplotype is a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated. It is thought that these associations, together with the identification of a few alleles of a haplotype block, can unambiguously identify all other polymorphic sites in the region. Such information is very valuable for investigating the genetics behind common diseases, and has been investigated in humans by the International HapMap Project.

Many genetic testing companies use the term "haplotype" to refer to an individual's collection of short tandem repeat (STR) allele mutations within a genetic segment, while using the term "haplogroup" to refer to the SNP/unique event polymorphism (UEP) mutations that represent the clade to which a collection of potential haplotypes belongs.

Haplotype resolution


An organism's genotype may not uniquely define its haplotype. For example, consider a diploid organism and two bi-allelic loci on the same chromosome, such as single nucleotide polymorphisms (SNPs). The first locus has alleles A and T, with three possible genotypes AA, AT, and TT; the second locus has alleles G and C, again giving three possible genotypes GG, GC, and CC. For a given individual, there are therefore nine possible configurations for the genotypes at these two loci, as shown in the Punnett square below, which lists the possible genotypes that an individual may carry and the corresponding haplotypes that these resolve to. For individuals that are homozygous at one or both loci, it is clear what the haplotypes are; it is only when an individual is heterozygous at both loci that the phase is ambiguous.

      AA       AT                  TT
GG    AG AG    AG TG               TG TG
GC    AG AC    AG TC or AC TG      TG TC
CC    AC AC    AC TC               TC TC
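
The resolutions in the table can be reproduced mechanically. The following Python sketch is illustrative only; the function name `resolve_phase` and the encoding of a genotype as a list of two-letter strings are assumptions made for this example. It tries every way of assigning the two alleles at each locus to the two chromosome copies and collects the distinct unordered haplotype pairs.

```python
from itertools import product

def resolve_phase(genotype):
    """Return the set of unordered haplotype pairs consistent with an
    unphased genotype.

    genotype: list of two-character strings, one per locus,
              e.g. ["AT", "GC"] (hypothetical encoding for this sketch).
    """
    resolutions = set()
    # At each locus, choose which of the two alleles sits on the first copy;
    # the other allele then sits on the second copy.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(g[c] for g, c in zip(genotype, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotype, choice))
        # Sort so that (hap1, hap2) and (hap2, hap1) count as the same pair.
        resolutions.add(tuple(sorted((hap1, hap2))))
    return resolutions

# Homozygous at the first locus: a single resolution.
print(sorted(resolve_phase(["AA", "GC"])))  # [('AC', 'AG')]

# Heterozygous at both loci: two possible resolutions.
print(sorted(resolve_phase(["AT", "GC"])))  # [('AC', 'TG'), ('AG', 'TC')]

# Count how many of the nine genotype configurations are ambiguous.
ambiguous = [
    (first, second)
    for first in ("AA", "AT", "TT")
    for second in ("GG", "GC", "CC")
    if len(resolve_phase([first, second])) > 1
]
print(ambiguous)  # [('AT', 'GC')]
```

Only the double heterozygote (AT at the first locus, GC at the second) yields more than one resolution, which matches the single ambiguous cell of the table.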


The only unequivocal method of resolving phase ambiguity is by sequencing. However, it is possible to estimate the probability of a particular haplotype when phase is ambiguous using a sample of individuals.

Given the genotypes for a number of individuals, the haplotypes can be inferred by haplotype resolution or haplotype phasing techniques. These methods work by applying the observation that certain haplotypes are common in certain genomic regions. Therefore, given a set of possible haplotype resolutions, these methods choose those that use fewer different haplotypes overall. The specifics of these methods vary: some are based on combinatorial approaches (e.g., parsimony), whereas others use likelihood functions based on different models and assumptions such as the Hardy-Weinberg principle, the coalescent theory model, or perfect phylogeny. These models are combined with optimization algorithms such as the expectation-maximization (EM) algorithm or Markov chain Monte Carlo (MCMC).
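
As a rough illustration of the likelihood-based approach, the sketch below applies an EM iteration to the two-locus example (alleles A/T and G/C), assuming Hardy-Weinberg proportions. It is a minimal sketch rather than the method of any particular software package; the function name, its parameters, and the counts in the usage example are invented for illustration. Individuals whose phase is unambiguous contribute known haplotype counts, while double heterozygotes are split between the two possible resolutions (AG/TC or AC/TG) in proportion to the current frequency estimates.

```python
HAPS = ("AG", "AC", "TG", "TC")

def em_haplotype_freqs(known_counts, n_double_het, iterations=50):
    """Estimate two-locus haplotype frequencies by EM.

    known_counts: dict mapping haplotype -> number of copies observed in
                  individuals whose phase is unambiguous (hypothetical input).
    n_double_het: number of individuals heterozygous at both loci.
    """
    total = sum(known_counts.get(h, 0) for h in HAPS) + 2 * n_double_het
    if total == 0:
        raise ValueError("no haplotype copies to estimate from")

    # Start from equal frequencies.
    freqs = {h: 1.0 / len(HAPS) for h in HAPS}
    for _ in range(iterations):
        # E-step: under Hardy-Weinberg, the probability that a double
        # heterozygote is AG/TC rather than AC/TG.
        w_agtc = freqs["AG"] * freqs["TC"]
        w_actg = freqs["AC"] * freqs["TG"]
        denom = w_agtc + w_actg
        p_agtc = w_agtc / denom if denom > 0 else 0.5

        # M-step: add the expected haplotype copies from double
        # heterozygotes to the known counts and renormalise.
        expected = {h: float(known_counts.get(h, 0)) for h in HAPS}
        expected["AG"] += n_double_het * p_agtc
        expected["TC"] += n_double_het * p_agtc
        expected["AC"] += n_double_het * (1.0 - p_agtc)
        expected["TG"] += n_double_het * (1.0 - p_agtc)
        freqs = {h: expected[h] / total for h in HAPS}
    return freqs

# Invented counts: AG and TC are common among phase-known individuals,
# so the ten double heterozygotes are resolved almost entirely as AG/TC.
estimates = em_haplotype_freqs({"AG": 40, "TC": 40}, n_double_het=10)
```

This reflects the intuition in the text: ambiguous individuals are resolved towards haplotypes that are already common in the sample.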

Y-DNA haplotypes from genealogical DNA tests


Unlike other chromosomes, Y chromosomes do not come in pairs. Every human male has only one copy of that chromosome. This means that there is no lottery as to which copy to inherit, and also (for most of the chromosome) no shuffling between copies by recombination. Unlike autosomal haplotypes, there is therefore effectively no randomisation of the Y-chromosome haplotype between generations, and a human male should largely share the same Y chromosome as his father, give or take a few mutations.

In particular, the numbered results of a Y-DNA genealogical DNA test should match between father and son, barring mutations. Within genealogical and popular discussion, this is sometimes referred to as the "DNA signature" of a particular male human, or of his paternal bloodline.

UEP results (SNP results)

UEPs, such as SNPs, represent haplogroups; STRs represent haplotypes. The results that make up the full Y-DNA haplotype from the Y chromosome DNA test can be divided into two parts: the results for unique event polymorphisms (UEPs), sometimes loosely called the SNP results as most UEPs are single nucleotide polymorphisms, and the results for microsatellite short tandem repeat sequences (Y-STRs), often designated by DYS numbers.

The UEP results reflect the inheritance of events that are assumed to have happened only once in all human history. These can be used to directly identify the individual's Y-DNA haplogroup, his place on the broad family tree of the whole of humanity. Different Y-DNA haplogroups identify genetic populations that are often closely tied to geography, reflecting the migrations of current individuals' direct patrilineal ancestors tens of thousands of years ago.

Y-STR haplotypes


The other possible part of the genetic results is the Y-STR haplotype, the set of results from the Y-STR markers tested.

Unlike the UEPs, the Y-STRs mutate much more easily, which gives them much more resolution to distinguish recent genealogy. But it also means that, rather than the population of descendants of a genetic event all sharing the same result, the Y-STR haplotypes are likely to have spread apart, forming a cluster of more or less similar results. Typically, this cluster will have a definite most probable center, the modal haplotype (presumably close to the haplotype of the original founding event), and also a haplotype diversity: the degree to which it has become spread out. The further in the past the defining event occurred, and the earlier the subsequent population growth occurred, the greater the haplotype diversity for a particular number of descendants will be. Conversely, if the haplotype diversity is smaller for a particular number of descendants, this may indicate a more recent common ancestor, or a more recent population expansion.
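
The idea of a cluster with a center and a spread can be made concrete with a small Python sketch. It rests on assumptions made for illustration: a Y-STR haplotype is represented as a tuple of repeat counts (one per marker, in a fixed order), the modal haplotype is taken marker by marker, and the spread is measured as the mean number of repeat-count steps separating each haplotype from the modal one. That spread measure is only one simple choice among several possible definitions of diversity, and the sample values are invented.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Return the per-marker most common repeat count.

    haplotypes: list of equal-length tuples of repeat counts.
    Ties are broken in favour of the value seen first.
    """
    return tuple(
        Counter(column).most_common(1)[0][0]
        for column in zip(*haplotypes)
    )

def mean_distance_from_modal(haplotypes):
    """Mean total repeat-count difference between each haplotype and the
    modal haplotype (an illustrative spread measure, not a standard one)."""
    modal = modal_haplotype(haplotypes)
    distances = [
        sum(abs(a - b) for a, b in zip(hap, modal))
        for hap in haplotypes
    ]
    return sum(distances) / len(distances)

# Invented results for four men at three markers.
sample = [
    (13, 24, 14),
    (13, 24, 14),
    (13, 25, 14),
    (12, 24, 15),
]
print(modal_haplotype(sample))           # (13, 24, 14)
print(mean_distance_from_modal(sample))  # 0.75
```

A tight cluster gives a value near zero; a cluster that has had more time to accumulate mutations gives a larger value.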

It is important to note that, unlike for UEPs, there is no guarantee that two individuals with a similar Y-STR haplotype will necessarily share a similar ancestry. There is no uniqueness about Y-STR events. Instead, the clusters of Y-STR haplotype results inheriting from different events and different histories all tend to overlap.

Thus, although a Y-STR haplotype may sometimes be directly indicative of a particular Y-DNA haplogroup, in most cases a long time has passed since the haplogroup's defining event. The cluster of Y-STR haplotype results associated with descendants of that event has therefore typically become rather broad, and tends to overlap significantly with the (similarly broad) clusters of Y-STR haplotypes associated with other haplogroups. This makes it impossible to predict with absolute certainty which Y-DNA haplogroup a Y-STR haplotype points to. If the UEPs are not actually tested, all that can be done from the Y-STRs is to predict probabilities for haplogroup ancestry (as the online program at https://home.comcast.net/~whitathey/hapest5/ does), not certainties.

A similar scenario exists for surnames. A cluster of similar Y-STR haplotypes may indicate a shared common ancestor, with an identifiable modal haplotype, but only if the cluster is sufficiently distinct from what may have arisen by chance from different individuals historically having adopted the same name independently. Establishing this may require typing quite an extensive haplotype, which has spurred DNA testing companies to offer ever-larger sets of markers: 24, then 37, then 67, and perhaps soon even more.

Plausibly establishing relatedness between different surnames data-mined from a database is significantly harder, because now it must be established not that a randomly-selected member of the population is unlikely to have such a close match by accident, but rather that the very nearest member of the population in question, chosen purposely from the population for that very reason, would even under those circumstances be unlikely to match by accident. This is for the foreseeable future likely to be impossible, except in special cases where there is further information to drastically limit the size of that population of candidates under consideration.
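
The difference between testing one randomly selected individual and picking the closest match out of a whole database can be illustrated numerically. The sketch below is a simplification under stated assumptions: each unrelated candidate is taken to match by chance independently with the same probability, and the probability and database size used are invented numbers, not estimates for any real marker set.

```python
def prob_any_chance_match(p_single, n_candidates):
    """Probability that at least one of n_candidates unrelated individuals
    matches purely by chance, assuming independent matches with equal
    probability p_single (a simplifying assumption for this sketch)."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_candidates

# Invented figures: a 1-in-10,000 chance match per individual.
one_person = prob_any_chance_match(1e-4, 1)
whole_database = prob_any_chance_match(1e-4, 10_000)
print(round(whole_database, 3))  # 0.632
```

With these illustrative numbers, a match that would be very surprising for a single randomly chosen person becomes more likely than not once the nearest of ten thousand candidates is selected, which is why restricting the candidate population matters so much.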

See also

  • International HapMap Project
  • Genealogical DNA test
  • Haplogroup


Software


  • — EM based haplotype estimation and association tests in unrelated and nuclear families.


  • — A software package for analyses of haplotype block structure.


  • Haploview: Visualisation of linkage disequilibrium, haplotype estimation and haplotype tagging.


  • — Haplotype analysis software - Haplotype Trend Regression (HTR), haplotypic association tests, and haplotype frequency estimation using both the expectation-maximization (EM) algorithm and composite haplotype method (CHM).


  • — A software for haplotype reconstruction, and recombination rate estimation from population data.


  • — EM based software for estimating haplotype frequencies from unphased genotypes.




  • haplotype based association analysis.


External links

  • — Comprehensive resource for DNA testing.
  • — homepage for the International HapMap Project.
  • — the difference between haplogroup & haplotype explained.