Covariance

In probability theory and statistics, covariance is a measure of how much two random variables change together. Variance is a special case of the covariance when the two variables are identical.

Definition

The covariance between two real-valued random variables X and Y with finite second moments is

    Cov(X, Y) = E[(X − E[X])(Y − E[Y])]

where E[X] is the expected value of X. By using some properties of expectations, this can be simplified to

    Cov(X, Y) = E[XY] − E[X]E[Y]

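As an illustrative check (not from the original article), the two formulas above agree when a finite dataset is treated as the whole population, so that every expectation is a plain average:

```python
# Hypothetical numeric check: treating a finite dataset as the whole
# population, the definitional and simplified covariance formulas agree.

def mean(xs):
    return sum(xs) / len(xs)

def cov_definition(xs, ys):
    # Cov(X, Y) = E[(X - E[X]) * (Y - E[Y])]
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def cov_simplified(xs, ys):
    # Cov(X, Y) = E[XY] - E[X] * E[Y]
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
print(cov_definition(xs, ys))  # both print 0.75
print(cov_simplified(xs, ys))
```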
For random vectors X and Y (of dimension m and n respectively) the m×n covariance matrix is equal to

    Cov(X, Y) = E[(X − E[X])(Y − E[Y])^T]

where M^T is the transpose of a matrix (or vector) M.

The (i, j)-th element of this matrix is equal to the covariance Cov(Xi, Yj) between the i-th scalar component of X and the j-th scalar component of Y. In particular, Cov(Y, X) is the transpose of Cov(X, Y).

Random variables whose covariance is zero are called uncorrelated.

The units of measurement of the covariance Cov(X, Y) are those of X times those of Y. By contrast, correlation, which depends on the covariance, is a dimensionless measure of linear dependence.

Properties

If X, Y, W, and V are real-valued random variables and a, b, c, and d are constants ("constant" in this context means non-random), then the following facts are a consequence of the definition of covariance:

    Cov(X, a) = 0
    Cov(X, X) = Var(X)
    Cov(X, Y) = Cov(Y, X)
    Cov(aX, bY) = ab Cov(X, Y)
    Cov(X + a, Y + b) = Cov(X, Y)
    Cov(aX + bY, cW + dV) = ac Cov(X, W) + ad Cov(X, V) + bc Cov(Y, W) + bd Cov(Y, V)

For sequences X1, ..., Xn and Y1, ..., Ym of random variables, we have

    Cov(Σi=1..n Xi, Σj=1..m Yj) = Σi=1..n Σj=1..m Cov(Xi, Yj)

For a sequence X1, ..., Xn of random variables, and constants a1, ..., an, we have

    Var(Σi=1..n ai Xi) = Σi=1..n ai² Var(Xi) + 2 Σi<j ai aj Cov(Xi, Xj)
Uncorrelatedness and independence

If X and Y are independent, then their covariance is zero. This follows because under independence,

    E[XY] = E[X]·E[Y],

so that Cov(X, Y) = E[XY] − E[X]E[Y] = E[X]E[Y] − E[X]E[Y] = 0.
The converse, however, is generally not true: some pairs of random variables have covariance zero although they are not independent.

To see why, consider the example where Y = X², E[X] = 0, and E[X³] = 0 (for instance, X uniformly distributed on {−1, 0, 1}). Then Cov(X, Y) = E[XY] − E[X]E[Y] = E[X³] − E[X]E[X²] = 0, yet X and Y are obviously not independently distributed, since Y is completely determined by X.
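A small numeric illustration of this counterexample (assuming, as a concrete choice not in the original text, that X is uniform on {−1, 0, 1} and Y = X²):

```python
# X uniform on {-1, 0, 1}, Y = X^2: covariance is zero even though
# Y is a deterministic function of X.

def mean(xs):
    return sum(xs) / len(xs)

xs = [-1.0, 0.0, 1.0]        # equally likely outcomes of X
ys = [x * x for x in xs]     # Y = X^2

cov = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
print(cov)  # 0.0
```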

Relationship to inner products

Many of the properties of covariance can be extracted elegantly by observing that it satisfies similar properties to those of an inner product:
  1. bilinear: for constants a and b and random variables X, Y, and U, Cov(aX + bY, U) = a Cov(X, U) + b Cov(Y, U)
  2. symmetric: Cov(X, Y) = Cov(Y, X)
  3. positive semi-definite: Var(X) = Cov(X, X) ≥ 0, and Cov(X, X) = 0 implies that X is a constant random variable.

In fact these properties imply that the covariance defines an inner product over the quotient vector space obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant. (This identification turns the positive semi-definiteness above into positive definiteness.) That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the L2 inner product of real-valued functions on the sample space.

As a result, for random variables with finite variance, the following inequality holds via the Cauchy–Schwarz inequality:

    |Cov(X, Y)| ≤ √(Var(X) Var(Y))

Proof: If Var(Y) = 0, then the inequality holds trivially. Otherwise, let the random variable

    Z = X − (Cov(X, Y) / Var(Y)) Y.

Then we have

    0 ≤ Var(Z) = Var(X) − Cov(X, Y)² / Var(Y),

so Cov(X, Y)² ≤ Var(X) Var(Y). QED.
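An empirical spot check of this bound (illustrative only, not a proof): for random finite datasets, |Cov(X, Y)| never exceeds √(Var(X) Var(Y)):

```python
import math
import random

# Check the Cauchy-Schwarz bound |Cov(X, Y)| <= sqrt(Var(X) * Var(Y))
# on many random datasets, using population formulas throughout.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

random.seed(0)
for _ in range(100):
    xs = [random.gauss(0, 1) for _ in range(10)]
    ys = [random.gauss(0, 1) for _ in range(10)]
    assert abs(cov(xs, ys)) <= math.sqrt(cov(xs, xs) * cov(ys, ys)) + 1e-12
print("Cauchy-Schwarz bound held on all trials")
```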

Calculating the sample covariance

The sample covariance of N observations of K variables is the K-by-K matrix Q = [q_jk] with the entries given by

    q_jk = (1 / (N − 1)) Σi=1..N (Xij − X̄j)(Xik − X̄k)

where X̄j denotes the sample mean of the j-th variable. The sample mean and the sample covariance matrix are unbiased estimates of the mean and the covariance matrix of the random vector X, a row vector whose j-th element (j = 1, ..., K) is one of the random variables. The reason the sample covariance matrix has N − 1 in the denominator rather than N is essentially that the population mean E(X) is not known and is replaced by the sample mean X̄. If the population mean E(X) is known, the analogous unbiased estimate uses N in the denominator:

    q_jk = (1 / N) Σi=1..N (Xij − E(Xj))(Xik − E(Xk))
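The entry-by-entry computation described above can be sketched in a few lines of pure Python (a minimal illustration, not from the original article), using the unbiased 1/(N − 1) normalization:

```python
# Minimal sketch of the K-by-K sample covariance matrix for N
# observations of K variables, with the unbiased 1/(N - 1) denominator.

def sample_cov_matrix(data):
    """data: list of N rows, each a list of K values."""
    n = len(data)
    k = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    q = [[0.0] * k for _ in range(k)]
    for j in range(k):
        for l in range(k):
            q[j][l] = sum((row[j] - means[j]) * (row[l] - means[l])
                          for row in data) / (n - 1)
    return q

data = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
print(sample_cov_matrix(data))
```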

Comments

The covariance is sometimes called a measure of "linear dependence" between the two random variables. That does not mean the same thing as in the context of linear algebra (see linear dependence). When the covariance is normalized, one obtains the correlation matrix. From it, one can obtain the Pearson coefficient, which gives the goodness of fit for the best possible linear function describing the relation between the variables. In this sense covariance is a linear gauge of dependence.
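The normalization mentioned above can be sketched as follows (an illustrative example using population formulas, not from the original article): dividing the covariance by the product of the standard deviations yields the dimensionless Pearson coefficient, which equals 1 for an exact increasing linear relation:

```python
import math

# Normalizing covariance by the standard deviations gives the
# dimensionless Pearson correlation coefficient, which lies in [-1, 1].

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def pearson(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

xs = [1.0, 2.0, 3.0, 4.0]
print(pearson(xs, [2 * x + 1 for x in xs]))  # 1.0 for an exact linear relation
```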

See also

  • Covariance function
  • Covariance matrix
  • Covariance operator
  • Correlation
  • Eddy covariance
  • Law of total covariance
  • Autocovariance
  • Analysis of covariance
  • Algorithms for calculating variance#Covariance

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.