Normalizing constant
The concept of a normalizing constant arises in probability theory and a variety of other areas of mathematics.

Definition and examples

In probability theory, a normalizing constant is a constant by which an everywhere non-negative function must be multiplied so the area under its graph is 1, e.g., to make it a probability density function or a probability mass function. For example, if we define

$$p(x) = e^{-x^2/2}, \quad x \in (-\infty, \infty),$$

we have

$$\int_{-\infty}^{\infty} p(x)\,dx = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi}.$$

If we define the function $\varphi(x)$ as

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\,p(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2},$$

so that

$$\int_{-\infty}^{\infty} \varphi(x)\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = 1,$$

then $\varphi(x)$ is a probability density function. This is the density of the standard normal distribution. (Standard, in this case, means the expected value is 0 and the variance is 1.)

The constant $1/\sqrt{2\pi}$ is the normalizing constant of the function $p(x)$.
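As a numerical check of the standard normal example, the sketch below integrates the unnormalized function e^(−x²/2) with a simple trapezoid rule and takes the reciprocal of the area as the normalizing constant. The function names, the truncation of the real line to [−10, 10], and the step count are illustrative assumptions, not part of the article.

```python
import math

def unnormalized(x):
    # The non-negative function from the example: exp(-x^2 / 2).
    return math.exp(-x * x / 2.0)

def integrate(f, lo, hi, steps=100_000):
    # Composite trapezoid rule on [lo, hi] (assumed helper, not a library call).
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

# Assumption: the tails beyond |x| = 10 are negligible for this function.
area = integrate(unnormalized, -10.0, 10.0)
normalizing_constant = 1.0 / area

def density(x):
    # Multiplying by the normalizing constant gives a function of total area 1.
    return normalizing_constant * unnormalized(x)

print(area, math.sqrt(2.0 * math.pi))
print(integrate(density, -10.0, 10.0))
```

The first printed pair should agree closely, and the second line should be very close to 1.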

Similarly,

$$\sum_{n=0}^{\infty} \frac{\lambda^n}{n!} = e^{\lambda},$$

and consequently

$$f(n) = \frac{\lambda^n e^{-\lambda}}{n!}$$

is a probability mass function on the set of all nonnegative integers. This is the probability mass function of the Poisson distribution with expected value λ.
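The discrete case can be checked the same way. The following sketch sums the unnormalized weights λ^n/n! over a truncated range and compares the result with e^λ, then confirms that multiplying by e^(−λ) yields probabilities that sum to 1. The value of `lam` and the truncation at 100 terms are arbitrary choices made for illustration.

```python
import math

def poisson_weight(n, lam):
    # Unnormalized weight lam^n / n! for the nonnegative integer n.
    return lam ** n / math.factorial(n)

lam = 3.5      # hypothetical rate parameter
terms = 100    # assumption: the tail beyond 100 terms is negligible here

partial_sum = sum(poisson_weight(n, lam) for n in range(terms))

# The normalizing constant is 1 / e^lam = e^(-lam).
pmf = [poisson_weight(n, lam) * math.exp(-lam) for n in range(terms)]
mean = sum(n * p for n, p in enumerate(pmf))

print(partial_sum, math.exp(lam))
print(sum(pmf), mean)
```

The computed mean should come out close to `lam`, consistent with the expected value being λ.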

Note that if the probability density function is a function of various parameters, so too will be its normalizing constant. The parametrised normalizing constant for the Boltzmann distribution plays a central role in statistical mechanics. In that context, the normalizing constant is called the partition function.
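To illustrate how a normalizing constant depends on parameters, here is a minimal sketch for a system with finitely many states, assuming the usual form of the Boltzmann distribution in which the probability of state i is proportional to exp(−E_i / kT). That formula is not stated in this article; the energy levels and the two values of `kT` below are hypothetical and are expressed in the same arbitrary units.

```python
import math

def partition_function(energies, kT):
    # Sum of the unnormalized Boltzmann weights; it changes with kT.
    return sum(math.exp(-e / kT) for e in energies)

def boltzmann_probabilities(energies, kT):
    # Divide each weight by the partition function so the result sums to 1.
    z = partition_function(energies, kT)
    return [math.exp(-e / kT) / z for e in energies]

energies = [0.0, 1.0, 2.0]   # hypothetical energy levels
cold = boltzmann_probabilities(energies, kT=0.5)
hot = boltzmann_probabilities(energies, kT=5.0)

print(partition_function(energies, 0.5), cold)
print(partition_function(energies, 5.0), hot)
```

Both probability lists sum to 1, but the partition function takes a different value at each temperature, which is the sense in which the normalizing constant is parametrised.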

Bayes' theorem

Bayes' theorem says that the posterior probability measure is proportional to the product of the prior probability measure and the likelihood function. Proportional to implies that one must multiply or divide by a normalizing constant to assign measure 1 to the whole space, i.e., to get a probability measure. In a simple discrete case we have

$$P(H_0 \mid D) = \frac{P(D \mid H_0)\,P(H_0)}{P(D)}$$
where $P(H_0)$ is the prior probability that the hypothesis is true; $P(D \mid H_0)$ is the conditional probability of the data given that the hypothesis is true, but given that the data are known it is the likelihood of the hypothesis (or its parameters) given the data; $P(H_0 \mid D)$ is the posterior probability that the hypothesis is true given the data. $P(D)$ is the probability of producing the data, but it is difficult to calculate on its own, so an alternative way to describe this relationship is as one of proportionality:

$$P(H_0 \mid D) \propto P(D \mid H_0)\,P(H_0).$$
Since $P(H \mid D)$ is a probability, the sum over all possible (mutually exclusive) hypotheses should be 1, leading to the conclusion that

$$P(H_0 \mid D) = \frac{P(D \mid H_0)\,P(H_0)}{\sum_i P(D \mid H_i)\,P(H_i)}.$$
In this case, the reciprocal of the value

$$P(D) = \sum_i P(D \mid H_i)\,P(H_i)$$

is the normalizing constant. It can be extended from countably many hypotheses to uncountably many by replacing the sum by an integral.
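The discrete Bayesian update can be sketched in a few lines: multiply each prior by its likelihood, sum the products to obtain P(D), and divide by that sum. The three hypotheses and their prior and likelihood values are made up for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(priors, likelihoods):
    # Unnormalized posterior: P(D | H_i) * P(H_i) for each hypothesis.
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    # P(D) is the sum over all mutually exclusive hypotheses.
    evidence = sum(unnormalized)
    # Multiplying by 1 / P(D), the normalizing constant, gives probabilities.
    return [u / evidence for u in unnormalized], evidence

priors = [0.5, 0.3, 0.2]        # hypothetical P(H_i), summing to 1
likelihoods = [0.1, 0.4, 0.7]   # hypothetical P(D | H_i)

post, evidence = posterior(priors, likelihoods)
print(evidence, post, sum(post))
```

With these numbers P(D) is 0.05 + 0.12 + 0.14 = 0.31, so the normalizing constant is 1/0.31.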

Non-probabilistic uses

The Legendre polynomials are characterized by orthogonality with respect to the uniform measure on the interval [−1, 1] and the fact that they are normalized so that their value at 1 is 1. The constant by which one multiplies a polynomial so its value at 1 is 1 is a normalizing constant.
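As a small worked example of this kind of normalization, consider the polynomial 3x² − 1, which is orthogonal to 1 and to x on [−1, 1] but takes the value 2 at x = 1. The sketch below rescales it so that its value at 1 is 1, which gives the degree-2 Legendre polynomial (3x² − 1)/2. The coefficient-list representation and the helper names are assumptions made for illustration.

```python
def evaluate(coeffs, x):
    # Horner evaluation; coeffs are ordered from the constant term upward.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def normalize_at_one(coeffs):
    # The normalizing constant is the reciprocal of the value at x = 1.
    constant = 1.0 / evaluate(coeffs, 1.0)
    return [constant * c for c in coeffs], constant

q = [-1.0, 0.0, 3.0]             # 3x^2 - 1
p2, constant = normalize_at_one(q)

print(constant)                  # 0.5
print(p2)                        # [-0.5, 0.0, 1.5]
print(evaluate(p2, 1.0))         # 1.0
```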

Orthonormal functions are normalized such that

$$\langle f_i, f_j \rangle = \delta_{i,j}$$

with respect to some inner product $\langle f, g \rangle$.
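A sketch of this normalization, assuming the inner product is the integral of f(x)g(x) over [−1, 1] (one possible choice; the article does not fix a particular inner product). Each function is multiplied by the reciprocal of the square root of its inner product with itself. The midpoint-rule integrator and the helper names are illustrative.

```python
import math

def inner(f, g, lo=-1.0, hi=1.0, steps=20_000):
    # Midpoint-rule approximation of the integral of f(x) * g(x) on [lo, hi].
    h = (hi - lo) / steps
    return h * sum(
        f(lo + (i + 0.5) * h) * g(lo + (i + 0.5) * h) for i in range(steps)
    )

def normalized(f):
    # Normalizing constant 1 / sqrt(<f, f>) makes <c f, c f> equal to 1.
    c = 1.0 / math.sqrt(inner(f, f))
    return (lambda x: c * f(x)), c

e0, c0 = normalized(lambda x: 1.0)   # constant function
e1, c1 = normalized(lambda x: x)     # identity function

print(c0, c1)
print(inner(e0, e0), inner(e1, e1), inner(e0, e1))
```

Under this assumed inner product the constants should be close to 1/√2 and √(3/2), and the last line should print values near 1, 1, and 0.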

The constant 1/√2 is used to establish the hyperbolic functions cosh and sinh from the lengths of the adjacent and opposite sides of a hyperbolic triangle.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 