Polya urn model
In statistics, a Polya urn model (also known as a Polya urn scheme or simply as Pólya's urn), named after George Pólya, is a type of statistical model used as an idealized mental exercise to understand the nature of certain statistical distributions.

In an urn model, objects of real interest (such as atoms, people, cars, etc.) are represented as colored balls in an urn or other container. In the basic urn model, the urn contains x white and y black balls; one ball is drawn randomly from the urn and its color observed; it is then placed back in the urn, and the selection process is repeated. Questions can then be asked about the probability of drawing one color or another, or some other properties.
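
As an illustration of the kind of question the basic model answers, the probabilities below follow from drawing with replacement from x white and y black balls. The symbols n (number of draws) and k (number of white balls among them) are introduced here for the example and are not part of the text above.

```latex
P(\text{white}) = \frac{x}{x+y},
\qquad
P(k \text{ white in } n \text{ draws})
  = \binom{n}{k}\left(\frac{x}{x+y}\right)^{k}\left(\frac{y}{x+y}\right)^{n-k}
```

Because the urn is restored after every draw, the draws are independent and the count of white balls is binomial.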

The Polya urn model differs only in that, when a ball of a particular color is drawn, that ball is put back along with a new ball of the same color. Thus, unlike in the basic model, the contents of the urn change over time, with a self-reinforcing property sometimes expressed as "the rich get richer".
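
The reinforcement rule can be sketched as a short simulation. This is a minimal illustration, not a reference implementation: the function name `polya_urn`, its parameters, and the choice of starting with one white and one black ball are all assumptions made for the example.

```python
import random

def polya_urn(white, black, draws, rng):
    """Simulate a two-color Polya urn and return the colors drawn, in order.

    `white` and `black` are the initial ball counts (hypothetical parameter
    names); `rng` is a random.Random instance.
    """
    history = []
    for _ in range(draws):
        # Draw one ball uniformly at random from the current contents.
        if rng.random() < white / (white + black):
            white += 1  # the white ball goes back with one extra white ball
            history.append("white")
        else:
            black += 1  # the black ball goes back with one extra black ball
            history.append("black")
    return history

rng = random.Random(0)
for run in range(5):
    colors = polya_urn(1, 1, 1000, rng)
    # Fraction of white draws in this run; it typically differs a lot
    # from run to run because early draws are reinforced.
    print(run, colors.count("white") / len(colors))
```

Running several independent urns side by side is one way to see the self-reinforcing behavior: whichever color gets ahead early tends to stay ahead.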

Note that in some sense, the Polya urn model is the "opposite" of the model of sampling without replacement. When sampling without replacement, every time a particular value is observed, it is less likely to be observed again, whereas in a Polya urn model, an observed value is more likely to be observed again. In both of these models, the act of measurement has an effect on the outcome of future measurements. (For comparison, when sampling with replacement, observation of a particular value has no effect on how likely it is to observe that value again.) Note also that in a Polya urn model, successive acts of measurement over time have less and less effect on future measurements, whereas in sampling without replacement, the opposite is true: After a certain number of measurements of a particular value, that value will never be seen again.
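
The contrast between the three schemes can be made concrete with exact arithmetic. The sketch below computes the probability that the second draw is white given that the first draw was white; the function name `p_white_again` and the scheme labels are hypothetical names chosen for this example.

```python
from fractions import Fraction

def p_white_again(white, black, scheme):
    """P(second draw is white | first draw was white) under a sampling scheme."""
    if scheme == "without replacement":
        white -= 1  # the drawn white ball is removed
    elif scheme == "polya":
        white += 1  # the drawn white ball returns with a same-colored copy
    elif scheme != "with replacement":
        raise ValueError(f"unknown scheme: {scheme}")
    return Fraction(white, white + black)

# Urn with 3 white and 2 black balls; the unconditional chance of white is 3/5.
for scheme in ("without replacement", "with replacement", "polya"):
    print(scheme, p_white_again(3, 2, scheme))
# without replacement 1/2
# with replacement 3/5
# polya 2/3
```

Relative to the baseline of 3/5, observing white lowers the chance of white again without replacement, leaves it unchanged with replacement, and raises it in the Polya urn.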

Distributions related to the Polya urn

  • beta-binomial distribution: The distribution of the number of successful draws (trials), e.g. the number of white balls drawn, given n draws from a Polya urn.
  • multivariate Pólya distribution (also known as the Dirichlet compound multinomial distribution): The distribution over the number of balls of each color, given n draws from a Polya urn where there are k different colors instead of only two.
  • martingales, the beta-binomial distribution and the beta distribution: Let w and b be the numbers of white and black balls initially in the urn, and let w_n be the number of white balls in the urn after n draws. Then each term of the sequence of proportions w_n/(w+b+n), for n = 1, 2, 3, ..., follows a normalized (shifted and rescaled) version of the beta-binomial distribution. The sequence is a martingale and converges to a random variable with a beta distribution, with parameters w and b, as n → ∞.
  • Dirichlet process, Chinese restaurant process: Imagine a modified Polya urn scheme as follows. We start with an urn containing α black balls. When drawing a ball from the urn, if we draw a black ball, we put the ball back along with a new ball of a new non-black color randomly generated from a uniform distribution, and consider the newly generated color to be the "value" of the draw. Otherwise, we put the ball back along with another ball of the same color, as in the standard Polya urn scheme. The colors of an infinite sequence of draws from this modified Polya urn scheme follow a Chinese restaurant process. If, instead of generating a new color, we draw a random value from a given base distribution and use that value to label the ball, the labels of an infinite sequence of draws follow a Dirichlet process.
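
Two of the items above can be checked or sketched in a few lines. The first part computes the exact distribution of the number of white draws by stepping through the urn, and compares it with the beta-binomial formula for integer starting counts. The second part is a rough simulation of the modified urn. All function names are hypothetical, `alpha` is an assumed name for the amount of black mass in the urn (it need not be an integer), and labeling new colors by consecutive integers is a simplification chosen for this example.

```python
import random
from fractions import Fraction
from math import comb, factorial

def beta_int(a, b):
    """Beta function B(a, b) for positive integers, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))

def beta_binomial_pmf(k, n, w, b):
    """P(k white draws in n draws) under a beta-binomial(n, w, b) law."""
    return comb(n, k) * beta_int(k + w, n - k + b) / beta_int(w, b)

def urn_white_draw_distribution(n, w, b):
    """Exact distribution of the number of white draws in n Polya-urn draws."""
    dist = {0: Fraction(1)}
    for step in range(n):
        total = w + b + step  # the urn gains one ball per draw
        nxt = {}
        for k, p in dist.items():
            p_white = Fraction(w + k, total)
            nxt[k + 1] = nxt.get(k + 1, 0) + p * p_white
            nxt[k] = nxt.get(k, 0) + p * (1 - p_white)
        dist = nxt
    return dist

def modified_urn(alpha, draws, rng, new_value=None):
    """Simulate the modified urn with `alpha` > 0 black balls.

    Each draw adds exactly one non-black ball carrying the drawn value,
    so the list of past draws is also the list of non-black balls.
    """
    seen = []
    n_colors = 0
    for _ in range(draws):
        if rng.random() * (alpha + len(seen)) < alpha:
            # Black ball drawn: create a new value (or a new integer label).
            value = new_value(rng) if new_value else n_colors
            n_colors += 1
        else:
            # Non-black ball drawn: repeat the value of a uniformly chosen ball.
            value = rng.choice(seen)
        seen.append(value)
    return seen

n, w, b = 6, 2, 3
dist = urn_white_draw_distribution(n, w, b)
print(all(dist[k] == beta_binomial_pmf(k, n, w, b) for k in range(n + 1)))  # True
# Expected white fraction after n draws equals the initial fraction w/(w+b).
print(sum(p * Fraction(w + k, w + b + n) for k, p in dist.items()))  # 2/5

print(modified_urn(1.0, 20, random.Random(0)))
# Base-distribution variant; a standard normal is used purely as an example.
print(modified_urn(1.0, 5, random.Random(0), new_value=lambda r: r.gauss(0, 1)))
```

The constant expected fraction is only a consequence of the martingale property, not a proof of it, and the simulation produces a finite prefix of the infinite sequence described in the last item.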
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 