Law of large numbers



 
 
The law of large numbers (LLN) is a theorem in probability that describes the long-term stability of the mean of a random variable. Given a random variable with a finite expected value, if its values are repeatedly sampled, then as the number of these observations increases, their mean will tend to approach and stay close to the expected value.

The LLN can easily be illustrated using the rolls of a die, that is, outcomes of a multinomial distribution in which the numbers 1, 2, 3, 4, 5, and 6 are equally likely to be chosen. The population mean (or "expected value") of the outcomes is:

(1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5.


The graph to the right plots the results of an experiment in which a die is rolled repeatedly. In this experiment the average of the die rolls deviates wildly at first. As predicted by the LLN, the average stabilizes around the expected value of 3.5 as the number of observations becomes large.
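The die-rolling experiment can be reproduced with a short simulation. The following is a minimal sketch, not the code behind the original graph: the function name `running_means`, the seed, and the number of rolls are illustrative assumptions.

```python
import random

def running_means(n_rolls, seed=0):
    """Return the cumulative average after each of n_rolls fair-die rolls."""
    rng = random.Random(seed)  # fixed seed (arbitrary) so the run is reproducible
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # faces 1..6, each equally likely
        means.append(total / i)     # average of the first i rolls
    return means

means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} rolls: average = {means[n - 1]:.4f}")
```

The early averages typically wander noticeably, while the later ones should sit close to 3.5; the exact figures depend on the seed.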

Another example is the flip of a coin. Given repeated flips of a fair coin, the frequency of heads (or tails) will increasingly approach 50% over a large number of trials. It is certain that the absolute difference in the number of heads and tails will tend to get large as the number of flips becomes large. That is, the probability that the absolute difference is a small number approaches zero as the number of trials becomes large. It is also certain that the ratio of the absolute difference to the number of flips will approach zero. Intuitively, the expected absolute difference grows, but at a slower rate than the number of flips, as the number of flips grows.

For example, we may see 520 heads after 1000 flips and 5096 heads after 10000 flips. While the proportion of heads has moved from 0.52 to 0.5096, closer to the expected 50%, the absolute difference from the expected number of heads has increased from 20 to 96.
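The two effects described above, a growing absolute deviation and a shrinking relative deviation, can be checked numerically. This is a rough sketch under stated assumptions: the helper names, the seed, and the choice of 200 repeated experiments are all illustrative.

```python
import random

def heads_count(n_flips, rng):
    """Count heads in n_flips independent fair-coin flips."""
    return sum(rng.random() < 0.5 for _ in range(n_flips))

def mean_abs_deviation(n_flips, n_trials=200, seed=1):
    """Average |heads - n_flips/2| over n_trials repeated experiments."""
    rng = random.Random(seed)  # arbitrary fixed seed for reproducibility
    devs = [abs(heads_count(n_flips, rng) - n_flips / 2) for _ in range(n_trials)]
    return sum(devs) / n_trials

for n in (100, 10_000):
    d = mean_abs_deviation(n)
    print(f"n = {n:>6}: mean |heads - n/2| = {d:8.2f}, as a fraction of n = {d / n:.5f}")
```

With 100 times as many flips, the average absolute deviation should come out roughly ten times larger, while its share of the total number of flips should be roughly ten times smaller.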

The LLN is important because it "guarantees" stable long-term results for random events. For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. It is important to remember that the LLN only applies (as the name indicates) when a large number of observations are considered. There is no principle that a small number of observations will converge to the expected value or that a streak of one value will immediately be "balanced" by the others. See the Gambler's fallacy.
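The casino example can be made concrete with a small simulation. The article does not specify the wheel or the bet, so this sketch assumes a single-zero wheel with 37 pockets and a one-unit even-money bet on red on every spin; the function name and seed are likewise illustrative.

```python
import random

POCKETS = 37      # assumption: single-zero wheel (0 plus 1..36)
RED_POCKETS = 18  # 18 of those pockets are red

def casino_profit_per_spin(n_spins, seed=2):
    """Average casino profit per unit staked over n_spins even-money bets on red."""
    rng = random.Random(seed)
    profit = 0
    for _ in range(n_spins):
        player_wins = rng.randrange(POCKETS) < RED_POCKETS
        profit += -1 if player_wins else 1  # casino pays 1 on a win, keeps 1 on a loss
    return profit / n_spins

# Expected casino profit per spin: P(player loses) - P(player wins) = 19/37 - 18/37
expected_edge = (POCKETS - 2 * RED_POCKETS) / POCKETS

for n in (10, 1_000, 400_000):
    print(f"{n:>7} spins: profit per spin = {casino_profit_per_spin(n):+.4f} "
          f"(expected {expected_edge:+.4f})")
```

Over a handful of spins the casino may well be behind, but over hundreds of thousands of spins its average profit per spin should settle near 1/37 under these assumptions.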

History

The LLN was first described by Jacob Bernoulli. It took him over 20 years to develop a sufficiently rigorous mathematical proof, which was published in his Ars Conjectandi (The Art of Conjecturing) in 1713. He named this his "Golden Theorem", but it became generally known as "Bernoulli's Theorem". This should not be confused with the principle in physics with the same name, named after Jacob Bernoulli's nephew Daniel Bernoulli. In 1835, S.D. Poisson further described it under the name "La loi des grands nombres" ("The law of large numbers"). Thereafter, it was known under both names, but the "Law of large numbers" is most frequently used.

After Bernoulli and Poisson published their efforts, other mathematicians also contributed to refinement of the law, including Chebyshev, Markov, Borel, Cantelli and Kolmogorov. These further studies have given rise to two prominent forms of the LLN. One is called the "weak" law and the other the "strong" law. These forms do not describe different laws but instead refer to different ways of describing the mode of convergence of the cumulative sample means to the expected value, and the strong form implies the weak.

Forms

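Both versions of the law state that the sample average

$$\overline{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)$$

converges to the expected value

$$\overline{X}_n \to \mu \quad \text{as } n \to \infty,$$

where $X_1, X_2, \ldots$ is an infinite sequence of i.i.d. random variables with finite expected value $E(X_1) = E(X_2) = \cdots = \mu < \infty$.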

An assumption of finite variance $\operatorname{Var}(X_1) = \operatorname{Var}(X_2) = \cdots = \sigma^2 < \infty$ is not necessary. Large or infinite variance will make the convergence slower, but the LLN holds anyway. This assumption is often used because it makes the proofs easier and shorter.
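To see why the finite-variance assumption shortens the proofs, here is a sketch of the standard argument based on Chebyshev's inequality. It is not necessarily the proof the article links to. It assumes independent observations with common mean $\mu$ and common variance $\sigma^2 < \infty$, writes $\overline{X}_n$ for the average of the first $n$ observations, and takes $\varepsilon > 0$ to be an arbitrary margin.

```latex
\begin{align*}
\operatorname{Var}(\overline{X}_n)
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^2}{n}, \\
P\bigl(|\overline{X}_n - \mu| \ge \varepsilon\bigr)
  &\le \frac{\operatorname{Var}(\overline{X}_n)}{\varepsilon^2}
   = \frac{\sigma^2}{n\,\varepsilon^2}
   \;\longrightarrow\; 0 \quad (n \to \infty).
\end{align*}
```

The first line uses independence to add the variances; the second is Chebyshev's inequality applied to $\overline{X}_n$, whose mean is $\mu$. The bound gives convergence in probability only, and the argument is unavailable when the variance is infinite, in which case other proofs are needed.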

The difference between the strong and the weak version is concerned with the mode of convergence being asserted.

The weak law

The weak law of large numbers states that the sample average converges in probability towards the expected value

$$\overline{X}_n \xrightarrow{P} \mu \quad \text{as } n \to \infty.$$

That is to say that for any positive number $\varepsilon$,

$$\lim_{n \to \infty} P\left(\left|\overline{X}_n - \mu\right| < \varepsilon\right) = 1.$$


Interpreting this result, the weak law essentially states that for any nonzero margin specified, no matter how small, with a sufficiently large sample there will be a very high probability that the average of the observations will be close to the expected value, that is, within the margin.
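The interpretation above can be checked by estimating, for a fixed margin, the probability that the sample average falls within that margin. The sketch below uses fair-die rolls (expected value 3.5) and a margin of 0.1; the function name, seed, margin, and number of Monte Carlo trials are illustrative assumptions.

```python
import random

MU = 3.5  # expected value of one fair-die roll

def prob_within_margin(n, eps, n_trials=1_000, seed=3):
    """Monte Carlo estimate of P(|mean of n rolls - MU| < eps)."""
    rng = random.Random(seed)  # arbitrary fixed seed for reproducibility
    hits = 0
    for _ in range(n_trials):
        mean = sum(rng.randint(1, 6) for _ in range(n)) / n
        if abs(mean - MU) < eps:
            hits += 1
    return hits / n_trials

for n in (10, 100, 1_000):
    p = prob_within_margin(n, 0.1)
    print(f"n = {n:>5}: estimated P(|mean - 3.5| < 0.1) = {p:.3f}")
```

The estimated probability should rise toward 1 as the sample size grows, which is the behaviour the weak law asserts for every fixed margin.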

Convergence in probability is also called weak convergence of random variables. This version is called the weak law because random variables may converge weakly (in probability) as above without converging strongly (almost surely) as below.

A consequence of the weak LLN is the asymptotic equipartition property.

The strong law

The strong law of large numbers states that the sample average converges almost surely to the expected value

$$\overline{X}_n \xrightarrow{\text{a.s.}} \mu \quad \text{as } n \to \infty.$$

That is,

$$P\left(\lim_{n \to \infty} \overline{X}_n = \mu\right) = 1.$$

The proof is more complex than that of the weak law. This law justifies the intuitive interpretation of the expected value of a random variable as the "long-term average when sampling repeatedly."

Almost sure convergence is also called strong convergence of random variables. This version is called the strong law because random variables which converge strongly (almost surely) are guaranteed to converge weakly (in probability). The strong law implies the weak law.

The strong law of large numbers can itself be seen as a special case of the pointwise ergodic theorem.

Differences between the weak law and the strong law


The weak law states that, for a specified large $n$, $\overline{X}_n$ is likely to be near $\mu$. Thus, it leaves open the possibility that $|\overline{X}_n - \mu| > \varepsilon$ happens an infinite number of times, although it happens at infrequent intervals.

The strong law shows that this cannot occur. In particular, it implies that with probability 1, for any positive value $\varepsilon$, the inequality $|\overline{X}_n - \mu| > \varepsilon$ is true only a finite number of times (as opposed to an infinite, but infrequent, number of times).
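The "only a finite number of times" statement concerns a single infinite sequence of observations, which a simulation can only approximate. As a rough illustration, the sketch below follows one long path of fair-coin flips and records every sample size at which the running proportion of heads is more than a margin away from 0.5. The function name, seed, margin of 0.05, and path length are illustrative assumptions, and a finite run cannot prove the strong law.

```python
import random

def exceedance_indices(n_flips, eps, seed=4):
    """Return every n at which |running proportion of heads - 0.5| > eps on one path."""
    rng = random.Random(seed)  # arbitrary fixed seed: one particular path
    heads = 0
    exceed = []
    for n in range(1, n_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1
        if abs(heads / n - 0.5) > eps:
            exceed.append(n)
    return exceed

exceed = exceedance_indices(200_000, eps=0.05)
print("number of exceedances:", len(exceed))
print("last exceedance at n =", exceed[-1])
```

The first flip always counts as an exceedance, since the proportion is then 0 or 1. On a typical path the exceedances are confined to the early part of the sequence, after which the running proportion stays inside the margin for the rest of the run.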

Activities and demonstrations

There are a variety of ways to illustrate the theory and applications of the laws of large numbers using interactive aids. The SOCR resource provides paired interactive aids that demonstrate the power and usability of the law of large numbers.

See also

  • Central limit theorem
  • Gambler's fallacy
  • Law of averages

