In probability theory, an empirical measure is a random measure arising from a particular realization of a (usually finite) sequence of random variables. The precise definition is found below. Empirical measures are relevant to mathematical statistics.
The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure $P$. We collect observations $X_1, X_2, \dots, X_n$ and compute relative frequencies. We can estimate $P$, or a related distribution function $F$, by means of the empirical measure or empirical distribution function, respectively. These are uniformly good estimates under certain conditions. Theorems in the area of empirical processes provide rates of this convergence.
Definition
Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables with values in the state space $S$ with probability measure $P$.
Definition
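The empirical measure $P_n$ is defined for measurable subsets of $S$ and given by

$$P_n(A) = \frac{1}{n} \sum_{i=1}^n I_A(X_i) = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}(A),$$

where $I_A$ is the indicator function and $\delta_X$ is the Dirac measure.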
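As an illustration of the definition, the empirical measure of a set is just the fraction of observations that fall in it. The following is a minimal sketch, not a reference implementation; the function name `empirical_measure`, the sample values, and the choice $A = [0, 1]$ are all hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def empirical_measure(sample: Sequence[T], indicator: Callable[[T], bool]) -> float:
    """Return P_n(A) = (1/n) * sum_i I_A(X_i), where `indicator` describes A."""
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    # Each observation contributes mass 1/n, i.e. P_n is the average
    # of the Dirac measures sitting at the observed points.
    return sum(1 for x in sample if indicator(x)) / len(sample)


# Hypothetical realization of X_1, ..., X_5 and the set A = [0, 1].
sample = [0.2, 1.5, 0.7, -0.3, 0.9]
p_n_A = empirical_measure(sample, lambda x: 0.0 <= x <= 1.0)
print(p_n_A)  # three of the five points lie in [0, 1]
```

Passing the set as a predicate keeps the sketch independent of the state space $S$: any type of observation works as long as membership in $A$ can be tested.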
For a fixed measurable set $A$, $nP_n(A)$ is a binomial random variable with mean $nP(A)$ and variance $nP(A)(1 - P(A))$. In particular, $P_n(A)$ is an unbiased estimator of $P(A)$.
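The binomial claim follows in a few lines from the definition. A short sketch of the argument, using only the assumption that the observations $X_1, \dots, X_n$ are independent and identically distributed with law $P$:

```latex
\begin{aligned}
nP_n(A) &= \sum_{i=1}^{n} I_A(X_i),
  \qquad I_A(X_i) \sim \mathrm{Bernoulli}\bigl(P(A)\bigr) \text{ independently}, \\
\mathbb{E}\bigl[P_n(A)\bigr] &= \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}\bigl[I_A(X_i)\bigr] = P(A), \\
\operatorname{Var}\bigl(P_n(A)\bigr) &= \frac{1}{n^{2}} \sum_{i=1}^{n} \operatorname{Var}\bigl(I_A(X_i)\bigr)
  = \frac{P(A)\bigl(1 - P(A)\bigr)}{n}.
\end{aligned}
```

The first line says that the count of observations in $A$ is a sum of $n$ independent Bernoulli variables, hence binomial; the second line is the unbiasedness statement.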
Definition
$\bigl(P_n(c)\bigr)_{c \in \mathcal{C}}$ is the empirical measure indexed by $\mathcal{C}$, a collection of measurable subsets of $S$.
To generalize this notion further, observe that the empirical measure $P_n$ maps measurable functions $f : S \to \mathbb{R}$ to their empirical mean,

$$f \mapsto P_n f = \int_S f \, dP_n = \frac{1}{n} \sum_{i=1}^n f(X_i).$$

In particular, the empirical measure of $A$ is simply the empirical mean of the indicator function, $P_n(A) = P_n I_A$.
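To make the functional view concrete, the sketch below computes the empirical mean $P_n f = \frac{1}{n}\sum_{i=1}^n f(X_i)$ of a function and recovers the empirical measure of a set by plugging in its indicator. The names `empirical_mean` and `indicator_of_interval`, and the sample values, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def empirical_mean(sample: Sequence[float], f: Callable[[float], float]) -> float:
    """Return P_n f = (1/n) * sum_i f(X_i)."""
    if len(sample) == 0:
        raise ValueError("sample must be non-empty")
    return sum(f(x) for x in sample) / len(sample)


def indicator_of_interval(a: float, b: float) -> Callable[[float], float]:
    """Return the indicator function I_A of the closed interval A = [a, b]."""
    return lambda x: 1.0 if a <= x <= b else 0.0


# Hypothetical realization of X_1, ..., X_5.
sample = [0.2, 1.5, 0.7, -0.3, 0.9]

# Empirical mean of f(x) = x^2.
mean_of_square = empirical_mean(sample, lambda x: x * x)

# Empirical measure of A = [0, 1], obtained as the empirical mean of I_A.
measure_of_A = empirical_mean(sample, indicator_of_interval(0.0, 1.0))
```

Here `measure_of_A` is the fraction of sample points in $[0, 1]$, so sets need no separate treatment once functions are handled.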
For a fixed measurable function $f$, $P_n f$ is a random variable with mean $\mathbb{E} f$ and variance $\frac{1}{n} \mathbb{E}(f - \mathbb{E} f)^2$.
By the strong law of large numbers, $P_n(A)$ converges to $P(A)$ almost surely for fixed $A$. Similarly $P_n f$ converges to $\mathbb{E} f$ almost surely for a fixed measurable function $f$. The problem of uniform convergence of $P_n$ to $P$ was open until Vapnik and Chervonenkis solved it in 1968.
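The pointwise convergence for a fixed set can be observed numerically. The sketch below is an illustration under stated assumptions, not a proof: it assumes the observations are Uniform(0, 1), takes the hypothetical set $A = [0, 0.3]$ so that $P(A) = 0.3$, and uses a fixed seed so that the run is reproducible.

```python
import random


def empirical_measure_of_interval(sample, a, b):
    """Return P_n([a, b]), the fraction of sample points in [a, b]."""
    return sum(1 for x in sample if a <= x <= b) / len(sample)


rng = random.Random(0)      # fixed seed: an arbitrary choice for reproducibility
a, b = 0.0, 0.3             # assumed set A = [0, 0.3]
true_p = b - a              # P(A) under the assumed Uniform(0, 1) law

# One long realization; prefixes of it play the role of X_1, ..., X_n.
sample = [rng.random() for _ in range(100_000)]

errors = {
    n: abs(empirical_measure_of_interval(sample[:n], a, b) - true_p)
    for n in (10, 100, 1_000, 10_000, 100_000)
}

for n, err in errors.items():
    print(f"n = {n:>7}: |P_n(A) - P(A)| = {err:.5f}")
```

The error need not decrease monotonically along a single realization; the law of large numbers only guarantees that it tends to zero almost surely as $n$ grows.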
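If the class $\mathcal{C}$ (or $\mathcal{F}$) is Glivenko–Cantelli with respect to $P$ then $P_n$ converges to $P$ uniformly over $c \in \mathcal{C}$ (or $f \in \mathcal{F}$). In other words, with probability 1 we have

$$\|P_n - P\|_{\mathcal{C}} = \sup_{c \in \mathcal{C}} |P_n(c) - P(c)| \to 0,$$

$$\|P_n - P\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |P_n f - \mathbb{E} f| \to 0.$$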
Empirical distribution function
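The empirical distribution function provides an example of empirical measures. For real-valued iid random variables $X_1, \dots, X_n$ it is given by

$$F_n(x) = P_n((-\infty, x]) = P_n I_{(-\infty, x]}.$$

In this case, empirical measures are indexed by a class $\mathcal{C} = \{(-\infty, x] : x \in \mathbb{R}\}$. It has been shown that $\mathcal{C}$ is a uniform Glivenko–Cantelli class, in particular,

$$\sup_F \|F_n(x) - F(x)\|_\infty \to 0$$

with probability 1.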
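The empirical distribution function and its uniform distance to a known distribution function can be computed directly. The following is a minimal sketch under stated assumptions: the helper names `ecdf`, `sup_distance` and `uniform_cdf` are hypothetical, the data are assumed to be Uniform(0, 1), and the formula used for the supremum is valid only when the true distribution function is continuous.

```python
import bisect
import random


def ecdf(sample):
    """Return F_n, where F_n(x) is the fraction of sample points <= x."""
    xs = sorted(sample)
    n = len(xs)

    def F_n(x):
        # bisect_right counts the sample points that are <= x.
        return bisect.bisect_right(xs, x) / n

    return F_n


def sup_distance(sample, cdf):
    """Return sup_x |F_n(x) - F(x)| for a continuous distribution function F.

    F_n is a step function, so for continuous F the supremum is attained
    at a jump point: compare F with F_n just after and just before each jump.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def uniform_cdf(x):
    """Distribution function of the assumed Uniform(0, 1) law."""
    return min(1.0, max(0.0, x))


rng = random.Random(0)  # fixed seed: an arbitrary choice for reproducibility
sample = [rng.random() for _ in range(10_000)]

F_n = ecdf(sample)
d_n = sup_distance(sample, uniform_cdf)
print(f"sup distance for n = {len(sample)}: {d_n:.4f}")
```

For a sample of this size the distance `d_n` is expected to be small, in line with the uniform convergence stated above. The sketch checks a single realization and a single distribution, so it illustrates the statement rather than verifying it.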