Binomial distribution



 
 
Also see: Negative binomial distribution.

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.








A binomial distribution should not be confused with a bimodal distribution.

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn from a population of size N. Since the samples are not independent (this is sampling without replacement), the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution is a good approximation and is widely used.
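The quality of this approximation can be checked numerically. The following Python sketch compares the hypergeometric and binomial probability mass functions for a small and a large population. The helper names and the population figures (N = 50 and N = 50,000, each with 20% successes, and n = 10 draws) are illustrative assumptions, not values from the text.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, N, K, n):
    """Probability of k successes in n draws without replacement
    from a population of N items, K of which are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def max_abs_diff(N, K, n):
    """Largest pointwise gap between the two pmfs, using p = K / N."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
               for k in range(n + 1))

# Assumed illustration: 20% successes, n = 10 draws.
small = max_abs_diff(N=50, K=10, n=10)          # N not much larger than n
large = max_abs_diff(N=50_000, K=10_000, n=10)  # N much larger than n

print(f"max pmf difference, N = 50:     {small:.6f}")
print(f"max pmf difference, N = 50000:  {large:.6f}")
```

The gap should shrink markedly as N grows relative to n, which is the sense in which the binomial model is "a good approximation".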

Examples

An elementary example is this: roll a standard die ten times and count the number of sixes. The distribution of this random number is a binomial distribution with n = 10 and p = 1/6.

As another example, assume 5% of a very large population to be green-eyed. You pick 100 people randomly. The number of green-eyed people you pick is a random variable X which approximately follows a binomial distribution with n = 100 and p = 0.05 (strictly, a hypergeometric distribution).
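Both examples can be evaluated directly from the binomial formula. Below is a minimal Python sketch; the helper `binom_pmf` is a hypothetical name, and the two questions asked (exactly two sixes, exactly five green-eyed people) are chosen only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Dice example: n = 10 rolls, p = 1/6 for a six.
p_two_sixes = binom_pmf(2, 10, 1 / 6)

# Green-eyed example (binomial approximation): n = 100, p = 0.05.
p_five_green = binom_pmf(5, 100, 0.05)

print(f"P(exactly 2 sixes in 10 rolls) = {p_two_sixes:.4f}")
print(f"P(exactly 5 green-eyed in 100) = {p_five_green:.4f}")
```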

Specification


Probability mass function


In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:

$$f(k; n, p) = \Pr(K = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$

for k = 0, 1, 2, ..., n and where

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$

is the binomial coefficient (hence the name of the distribution) "n choose k", also denoted C(n, k), ${}_nC_k$, or ${}^nC_k$. The formula can be understood as follows: we want k successes ($p^k$) and n − k failures ($(1 - p)^{n - k}$). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.

In creating reference tables for binomial distribution probability, usually the table is filled in up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as

$$f(k; n, p) = f(n - k; n, 1 - p).$$

So, one must look to a different k and a different p (the binomial is not symmetrical in general). However, its behavior is not arbitrary. There is always an integer m that satisfies

$$(n + 1)p - 1 < m \le (n + 1)p.$$

As a function of k, the expression f(k; n, p) is monotone increasing for k < m and monotone decreasing for k > m, with the exception of the case where (n + 1)p is an integer. In this case, there are two maximum values, at m = (n + 1)p and at m − 1. m is known as the most probable (most likely) outcome of the Bernoulli trials. Note that the probability of it occurring can be fairly small.
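The properties described in this section can be explored with a short Python sketch. The function names are hypothetical, and `most_probable` assumes the most likely outcome is the greatest integer not exceeding (n + 1)p, as stated for the mode elsewhere in this article.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """f(k; n, p): probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def most_probable(n, p):
    """Greatest integer m with m <= (n + 1) * p."""
    return floor((n + 1) * p)

n, p = 10, 1 / 6
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
m = most_probable(n, p)

print("sum of pmf:", sum(pmf))
print("most probable outcome:", m, "with probability", pmf[m])

# Complement rule: f(k; n, p) = f(n - k; n, 1 - p).
print(binom_pmf(7, 10, 0.3), binom_pmf(3, 10, 0.7))

# Tie case: (n + 1) * p = 3 is an integer, so k = 2 and k = 3 are equally likely.
print(binom_pmf(2, 5, 0.5), binom_pmf(3, 5, 0.5))
```

Because `most_probable` works in floating point, a product (n + 1)p that is mathematically an integer may be rounded slightly below it; exact rational arithmetic (for instance `fractions.Fraction`) would avoid that.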

Cumulative distribution function


The cumulative distribution function can be expressed as:

$$F(x; n, p) = \Pr(X \le x) = \sum_{i=0}^{\lfloor x \rfloor} \binom{n}{i} p^i (1 - p)^{n - i}$$

where $\lfloor x \rfloor$ is the "floor" under x, i.e. the greatest integer less than or equal to x.

It can also be represented in terms of the regularized incomplete beta function, as follows:

$$F(k; n, p) = \Pr(X \le k) = I_{1 - p}(n - k,\, k + 1).$$

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

$$F(k; n, p) \le \exp\left(-2\,\frac{(np - k)^2}{n}\right),$$

and Chernoff's inequality can be used to derive the bound

$$F(k; n, p) \le \exp\left(-\frac{1}{2p}\,\frac{(np - k)^2}{n}\right).$$
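As a rough numerical check, the sketch below computes the exact lower tail by direct summation and compares it with the standard Hoeffding bound exp(−2(np − k)²/n) and the standard multiplicative Chernoff bound exp(−(np − k)²/(2np)). The function names and the test values n = 100, p = 0.5, k = 40 are assumptions made for illustration.

```python
from math import comb, exp, floor

def binom_cdf(x, n, p):
    """F(x; n, p) = Pr(X <= x), summed up to floor(x)."""
    if x < 0:
        return 0.0
    top = min(floor(x), n)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(top + 1))

def hoeffding_bound(k, n, p):
    """Hoeffding upper bound on Pr(X <= k), valid for k <= n * p."""
    return exp(-2 * (n * p - k) ** 2 / n)

def chernoff_bound(k, n, p):
    """Chernoff upper bound on Pr(X <= k), valid for k <= n * p."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))

n, p, k = 100, 0.5, 40  # assumed example values with k <= n * p
exact = binom_cdf(k, n, p)
hoeffding = hoeffding_bound(k, n, p)
chernoff = chernoff_bound(k, n, p)

print(f"exact lower tail: {exact:.4f}")
print(f"Hoeffding bound:  {hoeffding:.4f}")
print(f"Chernoff bound:   {chernoff:.4f}")
```

Both bounds are conservative: they hold for every k ≤ np but can be well above the exact tail probability.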

Mean, variance, and mode


If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is

$$\operatorname{E}(X) = np$$

and the variance is

$$\operatorname{Var}(X) = np(1 - p).$$

This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by µ = p. Using the definition of variance, we have

$$\sigma^2 = (1 - p)^2\, p + (0 - p)^2 (1 - p) = p(1 - p).$$

Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving

$$\sigma_n^2 = \sum_{k=1}^{n} \sigma^2 = np(1 - p).$$

The mode of X is the greatest integer less than or equal to (n + 1)p; if m = (n + 1)p is an integer, then m − 1 and m are both modes.
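These closed forms can be confirmed by computing the moments directly from the probability mass function. A minimal Python sketch, with hypothetical helper names and the dice parameters n = 10, p = 1/6 reused as the test case:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def moments(n, p):
    """Mean and variance computed by summing over the whole pmf."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

n, p = 10, 1 / 6
mean, var = moments(n, p)

print("mean from pmf:", mean, " closed form n*p:", n * p)
print("variance from pmf:", var, " closed form n*p*(1-p):", n * p * (1 - p))
```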

Algebraic derivations of mean and variance


We derive these quantities from first principles. Certain particular sums occur in these two derivations. We rearrange the sums and terms so that sums solely over complete binomial probability mass functions (pmf) arise, which are always unity:

$$\sum_{k=0}^{n} \Pr(X = k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1 - p)^{n - k} = 1.$$

We apply the definition of the expected value of a discrete random variable to the binomial distribution:

$$\operatorname{E}(X) = \sum_{k=0}^{n} k \Pr(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1 - p)^{n - k}.$$

The first term of the series (with index k = 0) has value 0 since the first factor, k, is zero. It may thus be discarded, i.e. we can change the lower limit to k = 1:

$$\operatorname{E}(X) = \sum_{k=1}^{n} k\, \frac{n!}{k!\,(n - k)!}\, p^k (1 - p)^{n - k} = \sum_{k=1}^{n} k\, \frac{n\,(n - 1)!}{k\,(k - 1)!\,(n - k)!}\, p \cdot p^{k - 1} (1 - p)^{n - k}.$$

We've pulled factors of n and k out of the factorials, and one power of p has been split off. We are preparing to redefine the indices.

$$\operatorname{E}(X) = np \sum_{k=1}^{n} \frac{(n - 1)!}{(k - 1)!\,(n - k)!}\, p^{k - 1} (1 - p)^{n - k}.$$

We rename m = n − 1 and s = k − 1. The value of the sum is not changed by this, but it now becomes readily recognizable:

$$\operatorname{E}(X) = np \sum_{s=0}^{m} \frac{m!}{s!\,(m - s)!}\, p^s (1 - p)^{m - s} = np \sum_{s=0}^{m} \binom{m}{s} p^s (1 - p)^{m - s}.$$

The ensuing sum is a sum over a complete binomial pmf (of one order lower than the initial sum, as it happens). Thus

$$\operatorname{E}(X) = np \cdot 1 = np.$$

Variance


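It can be shown that the variance is equal to (see: Computational formula for the variance):

$$\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.$$

In using this formula we see that we now also need the expected value of X²:

$$\operatorname{E}(X^2) = \sum_{k=0}^{n} k^2 \Pr(X = k) = \sum_{k=0}^{n} k^2 \binom{n}{k} p^k (1 - p)^{n - k}.$$

We can use our experience gained above in deriving the mean. We know how to process one factor of k. This gets us as far as

$$\operatorname{E}(X^2) = np \sum_{k=1}^{n} k\, \frac{(n - 1)!}{(k - 1)!\,(n - k)!}\, p^{k - 1} (1 - p)^{n - k} = np \sum_{s=0}^{m} (s + 1) \binom{m}{s} p^s (1 - p)^{m - s}$$

(again, with m = n − 1 and s = k − 1). We split the sum into two separate sums and we recognize each one:

$$\operatorname{E}(X^2) = np \left( \sum_{s=0}^{m} s \binom{m}{s} p^s (1 - p)^{m - s} + \sum_{s=0}^{m} \binom{m}{s} p^s (1 - p)^{m - s} \right).$$

The first sum is identical in form to the one we calculated in the Mean (above). It sums to mp. The second sum is unity. Hence

$$\operatorname{E}(X^2) = np\,(mp + 1) = np\,((n - 1)p + 1) = np\,(np - p + 1).$$

Using this result in the expression for the variance, along with the Mean (E(X) = np), we get

$$\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2 = np\,(np - p + 1) - (np)^2 = np\,(1 - p).$$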

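Using falling factorials to find E(X²)


We have

$$\operatorname{E}(X^2) = \operatorname{E}(X(X - 1)) + \operatorname{E}(X).$$

But

$$\operatorname{E}(X(X - 1)) = \sum_{k=0}^{n} k(k - 1) \binom{n}{k} p^k (1 - p)^{n - k} = n(n - 1)p^2 \sum_{k=2}^{n} \binom{n - 2}{k - 2} p^{k - 2} (1 - p)^{n - k} = n(n - 1)p^2.$$

So

$$\operatorname{E}(X^2) = n(n - 1)p^2 + np.$$

Thus

$$\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2 = n(n - 1)p^2 + np - (np)^2 = np(1 - p).$$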

Relationship to other distributions


Sums of binomials

If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is

$$X + Y \sim B(n + m,\, p).$$
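Since the distribution of a sum of independent variables is the convolution of their distributions, this closure property can be checked numerically. The sketch below convolves the pmfs of B(n, p) and B(m, p) and compares the result with the pmf of a binomial with n + m trials; the helper names and the values n = 4, m = 6, p = 0.3 are illustrative assumptions.

```python
from math import comb

def binom_pmf_list(n, p):
    """List of Pr(X = k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Discrete convolution: pmf of the sum of two independent variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

n, m, p = 4, 6, 0.3
sum_pmf = convolve(binom_pmf_list(n, p), binom_pmf_list(m, p))
direct = binom_pmf_list(n + m, p)

max_gap = max(abs(x - y) for x, y in zip(sum_pmf, direct))
print("largest difference between the two pmfs:", max_gap)
```

Note that the property requires the same p for both variables; with different success probabilities the sum is in general not binomial.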

Normal approximation


If n is large enough, the skew of the distribution is not too great, and a suitable continuity correction is used, then an excellent approximation to B(n, p) is given by the normal distribution

$$\mathcal{N}(np,\; np(1 - p)).$$

Various rules of thumb may be used to decide whether n is large enough. One rule is that both np and n(1 − p) must be greater than 5. However, the specific number varies from source to source, and depends on how good an approximation one wants; some sources give 10. Another commonly used rule holds that the above normal approximation is appropriate only if everything within 3 standard deviations of its mean is within the range of possible values, that is if

$$\mu \pm 3\sigma = np \pm 3\sqrt{np(1 - p)} \in [0, n].$$

The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal approximation gives considerably less accurate results.
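The effect of the correction can be seen in a small numerical experiment. The text does not fix n and p for this example, so the Python sketch below assumes n = 20 and p = 0.5 purely for illustration; the function names are likewise hypothetical.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for X ~ B(n, p), with integer k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Pr(Y <= x) for Y normal with mean mu and standard deviation sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 20, 0.5                      # assumed parameters
mu, sigma = n * p, sqrt(n * p * (1 - p))

exact = binom_cdf(8, n, p)
corrected = normal_cdf(8.5, mu, sigma)   # with continuity correction
uncorrected = normal_cdf(8.0, mu, sigma) # without it

print(f"exact Pr(X <= 8):          {exact:.4f}")
print(f"normal, Pr(Y <= 8.5):      {corrected:.4f}")
print(f"normal, Pr(Y <= 8):        {uncorrected:.4f}")
```

For these assumed values the corrected figure lies much closer to the exact probability than the uncorrected one.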

This approximation is a huge time-saver (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1733. Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p.

For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation $\sigma = \sqrt{p(1 - p)/n}$. Large sample sizes n are good because the standard deviation, as a proportion of the expected value, gets smaller, which allows a more precise estimate of the unknown parameter p.

Poisson approximation


The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed. Therefore the Poisson distribution with parameter λ = np can be used as an approximation to the binomial distribution B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.
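A short Python sketch can illustrate the convergence. It compares the binomial and Poisson probabilities for k = 0, ..., 20 with np held at 5; the helper names, the cut-off at k = 20, and the two parameter pairs (n = 100, p = 0.05 and n = 1000, p = 0.005) are assumptions chosen for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

def max_gap(n, p, k_max=20):
    """Largest pmf difference over k = 0..k_max, using lam = n * p."""
    lam = n * p
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(k_max + 1))

coarse = max_gap(100, 0.05)    # n * p = 5
fine = max_gap(1000, 0.005)    # same n * p, larger n and smaller p

print(f"n = 100,  p = 0.05:  max difference {coarse:.5f}")
print(f"n = 1000, p = 0.005: max difference {fine:.5f}")
```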

Limits of binomial distributions


  • As n approaches ∞ and p approaches 0 while np remains fixed at λ > 0 or at least np approaches λ > 0, then the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ.


  • As n approaches ∞ while p remains fixed, the distribution of

$$\frac{X - np}{\sqrt{np(1 - p)}}$$

    approaches the normal distribution with expected value 0 and variance 1 (this is just a specific case of the central limit theorem).


Generating binomial random variates


  • Luc Devroye, Non-Uniform Random Variate Generation, New York: Springer-Verlag, 1986.


  • Voratas Kachitvichyanukul and Bruce W. Schmeiser, Binomial random variate generation, Communications of the ACM 31(2):216–222, February 1988.
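For orientation only, the simplest way to generate a binomial variate follows directly from the definition: simulate n Bernoulli trials and count the successes. The Python sketch below does this; it is not the method of either reference above, it costs time proportional to n per variate, and the function name, seed, and sample size are assumptions.

```python
import random

def binomial_variate(n, p, rng=random):
    """Draw one B(n, p) variate by counting successes in n Bernoulli trials."""
    return sum(rng.random() < p for _ in range(n))

rng = random.Random(12345)  # fixed seed so the run is repeatable
draws = [binomial_variate(10, 1 / 6, rng) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
print("sample mean:", sample_mean, " expected n*p:", 10 / 6)
```

For large n, faster algorithms such as those treated in the references are preferable to this direct simulation.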


See also

  • Bean machine / Galton box
  • Beta distribution
  • Hypergeometric distribution
  • Multinomial distribution
  • Negative binomial distribution
  • Poisson distribution
  • SOCR
  • Normal distribution
  • Binomial proportion confidence interval


External links

  • (does not require java)
  • Many resources for teaching Statistics including Binomial Distribution
  • by Chris Boucher, Wolfram Demonstrations Project, 2007.
  • Properties and Java simulation from cut-the-knot