Probability density function



 
 
In mathematics, a probability density function (pdf) is a function that represents a probability distribution in terms of integrals.

Formally, a probability distribution has density ƒ if ƒ is a non-negative Lebesgue-integrable function such that the probability of the interval [a, b] is given by

$$\int_a^b f(x)\,dx$$

for any two numbers a and b. This implies that the total integral of ƒ must be 1. Conversely, for any non-negative Lebesgue-integrable function ƒ with total integral 1, there is a probability distribution for which ƒ is the probability density.

Intuitively, if a probability distribution has density ƒ, then the infinitesimal interval [x, x + dx] has probability ƒ(x) dx.

Informally, a probability density function can be seen as a "smoothed out" version of a histogram: if one empirically samples enough values of a continuous random variable, producing a histogram depicting relative frequencies of output ranges (with bar heights scaled so that each bar's area equals its relative frequency), then this histogram will resemble the random variable's probability density, assuming that the output ranges are sufficiently narrow.
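The histogram picture can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it samples a standard normal variable (an arbitrary choice of example), scales the bars so that their areas are relative frequencies, and compares the bar heights with the known density. The helper names are made up for this example.

```python
import math
import random

def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def histogram_density(samples, lo, hi, bins):
    """Bar heights for equal-width bins on [lo, hi), scaled so bar areas are relative frequencies."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            # min() guards against a rounding error pushing the index to `bins`.
            counts[min(int((s - lo) / width), bins - 1)] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

lo, hi, bins = -4.0, 4.0, 40  # assumed plotting range and bin count
width = (hi - lo) / bins
heights = histogram_density(samples, lo, hi, bins)
centres = [lo + (i + 0.5) * width for i in range(bins)]

# Largest vertical distance between a bar and the density at the bar's centre.
max_gap = max(abs(h - standard_normal_pdf(c)) for h, c in zip(heights, centres))
print(f"largest gap between histogram and density: {max_gap:.4f}")
```

Narrower bins reduce the smoothing error but need more samples per bin, which is the trade-off behind "sufficiently narrow".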

Any function ƒ that describes the probability density in terms of the input variable x is a probability density function if and only if it is non-negative and the area under the graph is 1:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$

The probability of an interval is then obtained by integrating ƒ over that interval of the input variable x.

For example, the probability of the variable X being within the interval [4.3, 7.8] would be

$$P(4.3 \le X \le 7.8) = \int_{4.3}^{7.8} f(x)\,dx.$$
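As a rough numerical illustration of the interval [4.3, 7.8], the sketch below integrates a density with a simple midpoint rule. The text does not say which distribution X follows, so a normal distribution with mean 6 and standard deviation 2 is assumed purely for the example; `integrate` is a hypothetical helper.

```python
from statistics import NormalDist

def integrate(f, a, b, n=10_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Assumed density for X: normal with mean 6 and standard deviation 2.
dist = NormalDist(mu=6.0, sigma=2.0)
f = dist.pdf

# A wide finite range stands in for the whole real line.
total_area = integrate(f, 6.0 - 40.0, 6.0 + 40.0)
p_interval = integrate(f, 4.3, 7.8)
p_from_cdf = dist.cdf(7.8) - dist.cdf(4.3)  # cross-check via the CDF

print(f"area under the graph:   {total_area:.6f}")
print(f"P(4.3 <= X <= 7.8):     {p_interval:.6f}")
print(f"same value via the CDF: {p_from_cdf:.6f}")
```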

Further details

For example, the continuous uniform distribution on the interval [0, 1] has probability density ƒ(x) = 1 for 0 ≤ x ≤ 1 and ƒ(x) = 0 elsewhere.

The standard normal distribution has probability density

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}.$$

If a random variable X is given and its distribution admits a probability density function ƒ, then the expected value of X (if it exists) can be calculated as

$$\operatorname{E}(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx.$$

Not every probability distribution has a density function: the distributions of discrete random variables do not; nor does the Cantor distribution, even though it has no discrete component, i.e., does not assign positive probability to any individual point.

A distribution has a density function if and only if its cumulative distribution function F(x) is absolutely continuous. In this case, F is almost everywhere differentiable, and its derivative can be used as probability density:

$$\frac{d}{dx}F(x) = f(x).$$
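The relation between the cumulative distribution function and the density can be spot-checked with a finite difference. The sketch below assumes the standard normal distribution as the example and uses a hypothetical `central_difference` helper; it approximates the derivative of F at a few points and compares it with the density formula.

```python
import math
from statistics import NormalDist

def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

F = NormalDist().cdf  # cumulative distribution function of the standard normal

def central_difference(func, x, h=1e-5):
    """Symmetric finite-difference approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

points = [-2.0, -0.5, 0.0, 1.0, 2.5]
errors = [abs(central_difference(F, x) - standard_normal_pdf(x)) for x in points]

for x, err in zip(points, errors):
    print(f"x = {x:5.2f}   |F'(x) - f(x)| = {err:.2e}")
```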

If a probability distribution admits a density, then the probability of every one-point set is zero.

It is a common mistake to think of ƒ(x) as the probability of {x}, but this is incorrect; in fact, ƒ(x) will often be bigger than 1. Consider a random variable that is uniformly distributed between 0 and ½: its density equals 2 on that interval. Loosely, one may think of ƒ(x) dx as the probability that a random variable (whose probability density function is ƒ) is in the interval from x to x + dx, where dx is a small increment that may be considered infinitely small in the usual way.
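A short sketch of the uniform distribution between 0 and ½ makes the point concrete: the density takes the value 2, yet every probability computed from it stays at most 1. The function names and the step `dx` below are choices made for this illustration only.

```python
def uniform_pdf(x, a=0.0, b=0.5):
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def integrate(f, a, b, n=1_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

density_at_quarter = uniform_pdf(0.25)         # a density value, not a probability
total = integrate(uniform_pdf, 0.0, 0.5)       # total probability

dx = 1e-3                                      # a small but finite increment
approx = uniform_pdf(0.25) * dx                # f(x) dx
exact = integrate(uniform_pdf, 0.25, 0.25 + dx)  # probability of [x, x + dx]

print(density_at_quarter, total, approx, exact)
```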

Two probability densities ƒ and g represent the same probability distribution precisely if they differ only on a set of Lebesgue measure zero.

In the field of statistical physics, a non-formal reformulation of the relation above between the derivative of the cumulative distribution function and the probability density function is generally used as the definition of the probability density function. This alternate definition is the following:

If dt is an infinitely small number, the probability that X is included within the interval (t, t + dt) is equal to ƒ(t) dt, or:

$$P(t < X < t + dt) = f(t)\,dt.$$

Link between discrete and continuous distributions


The definition of a probability density function at the start of this page makes it possible to describe the variable associated with a continuous distribution using a set of binary discrete variables associated with the intervals [a, b] (for example, a variable being worth 1 if X is in [a, b], and 0 if not).

It is also possible to represent certain discrete random variables using a density of probability, via the Dirac delta function. For example, let us consider a binary discrete random variable taking −1 or 1 for values, with probability ½ each.

The density of probability associated with this variable is:

$$f(t) = \frac{1}{2}\left(\delta(t+1) + \delta(t-1)\right).$$

More generally, if a discrete variable can take n different values among real numbers, then the associated probability density function is:

$$f(t) = \sum_{i=1}^{n} p_i\,\delta(t - x_i),$$

where $x_1, \ldots, x_n$ are the discrete values accessible to the variable and $p_1, \ldots, p_n$ are the probabilities associated with these values.

This expression allows for determining statistical characteristics of such a discrete variable (such as its mean, its variance and its kurtosis), starting from the formulas given for a continuous distribution.
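Integrating any function against a sum of weighted Dirac deltas collapses to a weighted sum over the discrete values, which is how the continuous formulas recover the discrete ones. The sketch below applies this to the variable taking −1 or 1 with probability ½ each; `expect` is a hypothetical helper, and kurtosis is taken here to mean the fourth standardized moment rather than excess kurtosis.

```python
def expect(g, values, probs):
    """Integral of g(x) * sum_i p_i * delta(x - x_i) dx, which reduces to sum_i p_i * g(x_i)."""
    return sum(p * g(x) for x, p in zip(values, probs))

values = [-1.0, 1.0]   # discrete values x_i
probs = [0.5, 0.5]     # probabilities p_i

mean = expect(lambda x: x, values, probs)
variance = expect(lambda x: (x - mean) ** 2, values, probs)
# Fourth central moment divided by the squared variance.
kurtosis = expect(lambda x: (x - mean) ** 4, values, probs) / variance ** 2

print(mean, variance, kurtosis)  # 0.0 1.0 1.0
```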

Probability functions associated with multiple variables


For continuous random variables $X_1, \ldots, X_n$, it is also possible to define a probability density function associated to the set as a whole, often called joint probability density function. This density function is defined as a function of the n variables, such that, for any domain D in the n-dimensional space of the values of the variables $X_1, \ldots, X_n$, the probability that a realisation of the set variables falls inside the domain D is

$$P\left((X_1, \ldots, X_n) \in D\right) = \int_D f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$$

For i = 1, 2, …, n, let $f_{X_i}(x_i)$ be the probability density function associated to variable $X_i$ alone. This is called the "marginal" density function, and can be deduced from the joint probability density of the random variables $X_1, \ldots, X_n$ by integrating over all values of the n − 1 other variables:

$$f_{X_i}(x_i) = \int f_{X_1,\ldots,X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_{i-1}\,dx_{i+1} \cdots dx_n.$$

Independence


Continuous random variables $X_1, \ldots, X_n$ are all independent from each other if and only if

$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n).$$

Corollary


If the joint probability density function of a vector of n random variables can be factored into a product of n functions of one variable

$$f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = f_1(x_1) \cdots f_n(x_n),$$

then the n variables in the set are all independent from each other, and the marginal probability density function of each of them is given by

$$f_{X_i}(x_i) = \frac{f_i(x_i)}{\int f_i(x)\,dx}.$$
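The factors in such a product need not be normalised individually, which is why the marginal density divides each factor by its own integral. The sketch below assumes a hypothetical joint density 6·exp(−2x − 3y) on x, y ≥ 0, split into the unnormalised factors 6·exp(−2x) and exp(−3y), and compares the marginal of X obtained by integrating out y with the one obtained from its factor alone. The cut-off `UPPER` stands in for infinity.

```python
import math

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f1(x):
    """Unnormalised factor in x (integrates to 3, not 1)."""
    return 6.0 * math.exp(-2.0 * x) if x >= 0 else 0.0

def f2(y):
    """Unnormalised factor in y (integrates to 1/3, not 1)."""
    return math.exp(-3.0 * y) if y >= 0 else 0.0

def joint(x, y):
    """Joint density written as a product of the two factors."""
    return f1(x) * f2(y)

UPPER = 30.0  # both factors are negligible beyond this point
F1_INTEGRAL = integrate(f1, 0.0, UPPER)

def marginal_x_by_integration(x):
    """Marginal of X: integrate the joint density over all y."""
    return integrate(lambda y: joint(x, y), 0.0, UPPER)

def marginal_x_from_factor(x):
    """Marginal of X: the factor f1 divided by its own integral."""
    return f1(x) / F1_INTEGRAL

x0 = 0.7
print(marginal_x_by_integration(x0), marginal_x_from_factor(x0))
```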

Example


This elementary example illustrates the above definition of multidimensional probability density functions in the simple case of a function of a set of two variables. Let us call $\vec{R}$ a 2-dimensional random vector of coordinates (X, Y): the probability to obtain $\vec{R}$ in the quarter plane of positive x and y is

$$P(X > 0, Y > 0) = \int_0^{\infty}\int_0^{\infty} f_{X,Y}(x, y)\,dx\,dy.$$
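The quarter-plane probability can be approximated with a two-dimensional midpoint rule. The text leaves the joint density unspecified, so the sketch below assumes two independent standard normal coordinates, for which symmetry gives 1/4; the grid size and the cut-off at 8 are arbitrary choices for the illustration.

```python
import math

def joint_pdf(x, y):
    """Assumed joint density: two independent standard normal coordinates."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def integrate_2d(f, ax, bx, ay, by, n=400):
    """Midpoint rule on an n-by-n grid over the rectangle [ax, bx] x [ay, by]."""
    hx = (bx - ax) / n
    hy = (by - ay) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        for j in range(n):
            y = ay + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

# The density is negligible beyond 8 standard deviations, so 8 stands in for infinity.
p_quarter = integrate_2d(joint_pdf, 0.0, 8.0, 0.0, 8.0)
print(f"P(X > 0, Y > 0) is approximately {p_quarter:.6f}")
```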

Sums of independent random variables


The probability density function of the sum of two independent random variables U and V, each of which has a probability density function, is the convolution of their separate density functions:

$$f_{U+V}(x) = \int_{-\infty}^{\infty} f_U(y)\,f_V(x - y)\,dy.$$
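A numerical convolution shows the formula at work. The sketch below assumes U and V are both uniform on [0, 1] (an example chosen here, not given in the text), in which case the sum has the triangular density on [0, 2]. The integration range and step count are illustrative.

```python
def uniform_pdf(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def convolve_at(f_u, f_v, x, lo=-1.0, hi=2.0, n=30_000):
    """Midpoint-rule approximation of the integral of f_u(y) * f_v(x - y) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * h
        total += f_u(y) * f_v(x - y)
    return total * h

def triangular_pdf(x):
    """Known density of the sum of two independent uniform [0, 1] variables."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0

for x in (0.25, 0.5, 1.0, 1.5):
    print(x, convolve_at(uniform_pdf, uniform_pdf, x), triangular_pdf(x))
```

Because the uniform density jumps at its endpoints, the midpoint rule is only accurate to roughly one step width here.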

Dependent variables and change of variables


If the probability density function of an independent random variable x is given as $f_X(x)$, it is possible (but often not necessary; see below) to calculate the probability density function of some variable y = g(x). This is also called a "change of variable" and is in practice used to generate a random variable of arbitrary shape using a known (for instance uniform) random number generator.

If the function g is monotonic, then the resulting density function is

$$f_Y(y) = \left|\frac{1}{g'(g^{-1}(y))}\right| f_X(g^{-1}(y)).$$

Here $g^{-1}$ denotes the inverse function and $g'$ denotes the derivative.

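This follows from the fact that the probability contained in a differential area must be invariant under change of variables. That is,

$$\left|f_Y(y)\,dy\right| = \left|f_X(x)\,dx\right|,$$

or

$$f_Y(y) = \left|\frac{dx}{dy}\right| f_X(x) = \left|\frac{1}{g'(x)}\right| f_X(x).$$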

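For functions which are not monotonic the probability density function for y is

$$\sum_{k=1}^{n(y)} \left|\frac{1}{g'(g_k^{-1}(y))}\right| f_X(g_k^{-1}(y)),$$

where n(y) is the number of solutions in x for the equation g(x) = y, and $g_k^{-1}(y)$ are these solutions.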
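As a sketch of the non-monotonic case, take the hypothetical example y = x² with x standard normal. The equation x² = y has the two solutions ±√y for y > 0, and g′(x) = 2x, so the density of y is a sum of two terms. The code cross-checks the result against the probability computed directly from the normal CDF.

```python
import math
from statistics import NormalDist

std = NormalDist()  # assumed distribution of x: standard normal

def density_of_square(y):
    """Density of y = x**2: sum over both roots of |1 / g'(root)| * f_X(root)."""
    if y <= 0.0:
        return 0.0
    roots = (math.sqrt(y), -math.sqrt(y))  # the solutions of x**2 == y
    return sum(std.pdf(x) / abs(2.0 * x) for x in roots)

def integrate(f, a, b, n=10_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.5, 2.0
p_from_density = integrate(density_of_square, a, b)
# Direct route: a <= x**2 <= b means sqrt(a) <= |x| <= sqrt(b).
p_direct = 2.0 * (std.cdf(math.sqrt(b)) - std.cdf(math.sqrt(a)))

print(p_from_density, p_direct)
```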

It is tempting to think that in order to find the expected value $\operatorname{E}(g(X))$ one must first find the probability density $f_{g(X)}$ of the new random variable Y = g(X). However, rather than computing

$$\operatorname{E}(g(X)) = \int_{-\infty}^{\infty} y\,f_{g(X)}(y)\,dy,$$

one may find instead

$$\operatorname{E}(g(X)) = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx.$$

The values of the two integrals are the same in all cases in which both X and g(X) actually have probability density functions. It is not necessary that g be a one-to-one function. In some cases the latter integral is computed much more easily than the former.
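The agreement of the two integrals can be illustrated with a small example. The sketch below assumes X uniform on [0, 1] and g(x) = x² (both chosen here for illustration); the density of Y = g(X) comes from the monotonic change-of-variables formula, and both routes should give 1/3.

```python
import math

def integrate(f, a, b, n=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f_x(x):
    """Assumed density of X: uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def g(x):
    """Assumed transformation, monotonic on [0, 1]."""
    return x * x

def f_y(y):
    """Density of Y = g(X) via |1 / g'(g^{-1}(y))| * f_X(g^{-1}(y)), with g^{-1}(y) = sqrt(y)."""
    if not 0.0 < y <= 1.0:
        return 0.0
    x = math.sqrt(y)
    return f_x(x) / abs(2.0 * x)

via_density_of_y = integrate(lambda y: y * f_y(y), 0.0, 1.0)   # first integral
via_density_of_x = integrate(lambda x: g(x) * f_x(x), 0.0, 1.0)  # second integral

print(via_density_of_y, via_density_of_x)
```

The second route never needs `f_y`, which is the practical advantage the paragraph describes.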

Multiple variables


The above formulas can be generalized to variables (which we will again call y) depending on more than one other variable. $f(x_1, \ldots, x_m)$ shall denote the probability density function of the variables y depends on, and the dependence shall be $y = g(x_1, \ldots, x_m)$. Then, the resulting density function is

$$\int_{y = g(x_1, \ldots, x_m)} \frac{f(x_1, \ldots, x_m)}{\sqrt{\sum_{j=1}^{m} \left(\frac{\partial g}{\partial x_j}(x_1, \ldots, x_m)\right)^2}}\,dV,$$

where the integral is over the entire (m − 1)-dimensional solution of the subscripted equation and the symbolic dV must be replaced by a parametrization of this solution for a particular calculation; the variables $x_1, \ldots, x_m$ are then of course functions of this parametrization.

This derives from the following, perhaps more intuitive representation: Suppose x is an n-dimensional random variable with joint density f. If y = H(x), where H is a bijective, differentiable function, then y has density g:

$$g(\mathbf{y}) = f(\mathbf{x}) \left|\det\left(\frac{d\mathbf{x}}{d\mathbf{y}}\right)\right|$$

with the differential regarded as the Jacobian of the inverse of H, evaluated at y.
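For a linear map the Jacobian of the inverse is a constant matrix, which keeps an example short. The sketch below assumes x has two independent standard normal coordinates and H(x) = A·x with a hypothetical matrix A = [[2, 1], [0, 3]]; it checks that the transformed density still integrates to 1 over a box wide enough to stand in for the plane.

```python
import math

# Hypothetical invertible linear map H(x) = A x.
A = ((2.0, 1.0), (0.0, 3.0))
DET_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]  # equals 6

def h_inverse(y1, y2):
    """Solve A x = y for x using the 2x2 adjugate formula."""
    x1 = (A[1][1] * y1 - A[0][1] * y2) / DET_A
    x2 = (-A[1][0] * y1 + A[0][0] * y2) / DET_A
    return x1, x2

def f(x1, x2):
    """Assumed joint density of x: two independent standard normal coordinates."""
    return math.exp(-(x1 * x1 + x2 * x2) / 2.0) / (2.0 * math.pi)

def g(y1, y2):
    """Density of y = H(x): f at the preimage times |det| of the inverse map's Jacobian."""
    x1, x2 = h_inverse(y1, y2)
    return f(x1, x2) * abs(1.0 / DET_A)

def integrate_2d(func, lo, hi, n=400):
    """Midpoint rule on an n-by-n grid over the square [lo, hi] x [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        for j in range(n):
            v = lo + (j + 0.5) * h
            total += func(u, v)
    return total * h * h

total_mass = integrate_2d(g, -25.0, 25.0)
print(f"integral of the transformed density: {total_mass:.6f}")
```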

Finding moments and variance


In particular, the nth moment $\operatorname{E}(X^n)$ of the probability distribution of a random variable X is given by

$$\operatorname{E}(X^n) = \int_{-\infty}^{\infty} x^n f(x)\,dx,$$

and the variance is

$$\operatorname{Var}(X) = \operatorname{E}\left[(X - \operatorname{E}(X))^2\right] = \int_{-\infty}^{\infty} (x - \operatorname{E}(X))^2 f(x)\,dx$$

or, expanding,

$$\operatorname{Var}(X) = \operatorname{E}(X^2) - \left[\operatorname{E}(X)\right]^2.$$
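The two expressions for the variance can be compared numerically. The sketch below assumes an exponential density exp(−x) on x ≥ 0 (an example chosen here), for which the first two moments are 1 and 2 and the variance is 1. The upper limit 60 stands in for infinity, and `moment` is a hypothetical helper.

```python
import math

def integrate(f, a, b, n=60_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """Assumed density: exponential distribution with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

UPPER = 60.0  # the integrands are negligible beyond this point

def moment(n):
    """nth moment: integral of x**n * f(x)."""
    return integrate(lambda x: x ** n * f(x), 0.0, UPPER)

mean = moment(1)
second_moment = moment(2)

# Variance from the definition, then from the expanded form.
var_central = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, UPPER)
var_expanded = second_moment - mean ** 2

print(mean, second_moment, var_central, var_expanded)
```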


See also

  • Likelihood function
  • Density estimation
  • Secondary measure