# Endogeneity (economics)

In an econometric model, a parameter or variable is said to be endogenous when it is correlated with the error term. Endogeneity can arise as a result of measurement error, autoregression with autocorrelated errors, simultaneity, omitted variables, and sample selection errors. Broadly, a loop of causality between the independent and dependent variables of a model leads to endogeneity.

For example, in a simple supply and demand model, when predicting the quantity demanded in equilibrium, the price is endogenous because producers change their price in response to demand and consumers change their demand in response to price. In this case, the price variable is said to have total endogeneity once the demand and supply curves are known. In contrast, a change in consumer tastes or preferences would be an exogenous change on the demand curve.

## Exogeneity vs. Endogeneity

In a stochastic model, the notions of ordinary exogeneity, sequential exogeneity, and strong (strict) exogeneity can be defined. Exogeneity is always stated relative to a parameter: a variable or set of variables is exogenous for a parameter $\alpha$. Even if a variable is exogenous for parameter $\alpha$, it might be endogenous for another parameter $\beta$.

When the explanatory variables are not stochastic, they are strongly exogenous for all the parameters.

In econometrics, the problem of endogeneity occurs when the independent variable is correlated with the error term in a regression model. This implies that the regression coefficient in an ordinary least squares (OLS) regression is biased; however, if the correlation is not contemporaneous, the coefficient estimate may still be consistent. There are many methods of overcoming this problem, including instrumental variable regression and Heckman selection correction.
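As an illustration of the instrumental variable idea, the following minimal sketch simulates a regressor that is correlated with the error term and compares the OLS slope with a simple IV slope. The data-generating process, the instrument `w`, the coefficient values, and the function names are all assumptions made for this example; the estimator shown is the basic one-instrument ratio `cov(w, y) / cov(w, x)`, not a full two-stage least squares routine.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate_iv(n=50_000, beta=2.0, seed=0):
    """Return (OLS slope, IV slope) for y = beta * x + u with endogenous x."""
    rng = random.Random(seed)
    w, x, y = [], [], []
    for _ in range(n):
        wi = rng.gauss(0, 1)                   # hypothetical instrument: shifts x, unrelated to u
        ui = rng.gauss(0, 1)                   # structural error term
        xi = wi + 0.8 * ui + rng.gauss(0, 1)   # x depends on u, so x is endogenous
        w.append(wi)
        x.append(xi)
        y.append(beta * xi + ui)
    ols = cov(x, y) / cov(x, x)   # inconsistent: picks up cov(x, u)
    iv = cov(w, y) / cov(w, x)    # consistent if w is uncorrelated with u
    return ols, iv

ols_slope, iv_slope = simulate_iv()
print(f"OLS slope: {ols_slope:.3f}, IV slope: {iv_slope:.3f}, true beta: 2.0")
```

Under these assumptions the OLS slope should settle noticeably above the true coefficient, while the IV slope should stay close to it. The comparison only works because the simulated instrument is built to be uncorrelated with the error; in applied work that condition cannot be checked this directly.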

#### Omitted Variable

In this case, the endogeneity comes from an uncontrolled confounding variable: a variable that is correlated both with an independent variable in the model and with the error term. (Equivalently, the omitted variable both affects the independent variable and separately affects the dependent variable.) Assume that the "true" model to be estimated is

$$y_i = \alpha + \beta x_i + \gamma z_i + u_i$$

but we omit $z_i$ (perhaps because we don't have a measure for it) when we run our regression. $z_i$ will get absorbed by the error term and we will actually estimate

$$y_i = \alpha + \beta x_i + \varepsilon_i \quad (\text{where } \varepsilon_i = \gamma z_i + u_i)$$

If the correlation of $x$ and $z$ is not 0 and $z$ separately affects $y$ (meaning $\gamma \neq 0$), then $x$ is correlated with the error term $\varepsilon$.

Here, $x$ and 1 are not exogenous for $\alpha$ and $\beta$ since, given $x$ and 1, the distribution of $y$ depends not only on $\alpha$ and $\beta$, but also on $z$ and $\gamma$.
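The omitted variable argument can be checked numerically. The sketch below assumes a true model of the form y = alpha + beta·x + gamma·z + u in which x is partly driven by z, then regresses y on x alone. All coefficient values, the way x is generated, and the function names are illustrative assumptions.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate_omitted_variable(n=50_000, alpha=1.0, beta=2.0, gamma=3.0, seed=0):
    """Return (slope of y on x alone, beta + gamma * cov(x, z) / var(x))."""
    rng = random.Random(seed)
    x, z, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)               # the variable that will be omitted
        xi = 0.5 * zi + rng.gauss(0, 1)    # x is correlated with z
        ui = rng.gauss(0, 1)               # error of the true model
        x.append(xi)
        z.append(zi)
        y.append(alpha + beta * xi + gamma * zi + ui)
    short_slope = cov(x, y) / cov(x, x)                  # regression that omits z
    predicted = beta + gamma * cov(x, z) / cov(x, x)     # omitted-variable bias formula
    return short_slope, predicted

short_slope, predicted_slope = simulate_omitted_variable()
print(f"slope omitting z: {short_slope:.3f}, beta + bias term: {predicted_slope:.3f}")
```

With these assumed values the slope from the short regression should sit well above the true beta of 2 and close to the value given by the bias formula. Setting `gamma=0.0`, or generating x independently of z, should remove the gap.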

#### Measurement Error

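Suppose that we do not get a perfect measure of one of our independent variables. Imagine that instead of observing $x_i^*$ we observe $x_i = x_i^* + \nu_i$, where $\nu_i$ is the measurement "noise". When we try to estimate the following univariate regression,

$$y_i = \alpha + \beta x_i^* + \varepsilon_i$$

we actually end up estimating

$$y_i = \alpha + \beta (x_i - \nu_i) + \varepsilon_i = \alpha + \beta x_i + u_i \quad (\text{where } u_i = \varepsilon_i - \beta \nu_i)$$

Since both $x_i$ and $u_i$ depend on $\nu_i$, they are correlated. Measurement error in the dependent variable, however, does not cause endogeneity (though it does increase the variance of the error term).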
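A small simulation can make the contrast between the two kinds of measurement error concrete. The sketch assumes a true regressor with unit variance, measurement noise with unit variance, and a true slope of 2. These values, the variable names, and the function name are assumptions chosen for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate_measurement_error(n=50_000, alpha=1.0, beta=2.0, seed=0):
    """Return OLS slopes with noise in the regressor and with noise in the outcome."""
    rng = random.Random(seed)
    x_true, x_obs, y, y_obs = [], [], [], []
    for _ in range(n):
        xs = rng.gauss(0, 1)                         # true regressor (unobserved in practice)
        yi = alpha + beta * xs + rng.gauss(0, 1)     # outcome from the true model
        x_true.append(xs)
        x_obs.append(xs + rng.gauss(0, 1))           # regressor measured with noise
        y.append(yi)
        y_obs.append(yi + rng.gauss(0, 1))           # outcome measured with noise
    slope_noisy_x = cov(x_obs, y) / cov(x_obs, x_obs)     # attenuated toward zero
    slope_noisy_y = cov(x_true, y_obs) / cov(x_true, x_true)  # still centred on beta
    return slope_noisy_x, slope_noisy_y

slope_noisy_x, slope_noisy_y = simulate_measurement_error()
print(f"noisy regressor: {slope_noisy_x:.3f}, noisy outcome: {slope_noisy_y:.3f}")
```

Because the noise variance equals the variance of the true regressor here, the slope on the noisy regressor should shrink to roughly half the true value, while noise in the outcome should leave the slope near 2.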

### Dynamic Models

The endogeneity problem is particularly relevant in the context of time series analysis of causal processes. It is common for some factors within a causal system to be dependent for their value in period $t$ on the values of other factors in the causal system in period $t-1$. Suppose that the level of pest infestation is independent of all other factors within a given period, but is influenced by the level of rainfall and fertilizer in the preceding period. In this instance it would be correct to say that infestation is exogenous within the period, but endogenous over time.

Let the model be $y = f(x, z) + u$. If the variable $x$ is sequentially exogenous for a parameter $\alpha$, and $y$ does not cause $x$ in the Granger sense, then the variable $x$ is strongly (strictly) exogenous for the parameter $\alpha$.
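The gap between sequential and strict exogeneity can be illustrated with a first-order autoregression, used here as an assumed example that does not appear in the text above. In y_t = rho·y_{t-1} + u_t with independent errors, the regressor y_{t-1} is uncorrelated with the current error but depends on past errors, so it is sequentially but not strictly exogenous. The sketch below, with hypothetical function and parameter names, suggests that OLS is then biased in short samples yet settles near the true value in long ones.

```python
import random

def ar1_ols(T, rho, rng):
    """Simulate y_t = rho * y_{t-1} + u_t and return the OLS slope (no intercept)."""
    y_prev = rng.gauss(0, 1) / (1 - rho ** 2) ** 0.5   # start from the stationary distribution
    num = den = 0.0
    for _ in range(T):
        y = rho * y_prev + rng.gauss(0, 1)   # u_t is independent of y_{t-1}
        num += y * y_prev
        den += y_prev * y_prev
        y_prev = y
    return num / den

rng = random.Random(0)
rho = 0.5
# Average estimate over many short samples: reveals small-sample bias.
short_mean = sum(ar1_ols(20, rho, rng) for _ in range(2000)) / 2000
# One long sample: the estimate should be close to rho.
long_run = ar1_ols(20_000, rho, rng)
print(f"mean estimate, T=20: {short_mean:.3f}; estimate, T=20000: {long_run:.3f}")
```

In this setup the short-sample average typically falls below the true rho, while the long-sample estimate lands near it, which matches the idea that correlation with non-contemporaneous errors causes bias without ruling out consistency.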

#### Simultaneity

Generally speaking, simultaneity occurs in dynamic models, but the example given here is a static one.

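Suppose that two variables are codetermined, with each affecting the other. Suppose that we have two "structural" equations,

$$y_i = \beta_1 x_i + \gamma_1 z_i + u_i$$

$$z_i = \beta_2 x_i + \gamma_2 y_i + v_i$$

We can show that estimating either equation results in endogeneity. In the case of the first structural equation, we will show that $E(z_i u_i) \neq 0$. First, solving for $z_i$ we get (assuming that $1 - \gamma_1 \gamma_2 \neq 0$),

$$z_i = \frac{\beta_2 + \gamma_2 \beta_1}{1 - \gamma_1 \gamma_2} x_i + \frac{1}{1 - \gamma_1 \gamma_2} v_i + \frac{\gamma_2}{1 - \gamma_1 \gamma_2} u_i$$

Assuming that $x_i$ and $v_i$ are uncorrelated with $u_i$, we find that,

$$E(z_i u_i) = \frac{\gamma_2}{1 - \gamma_1 \gamma_2} E(u_i u_i) \neq 0$$

Therefore, attempts at estimating either structural equation will be hampered by endogeneity.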
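The simultaneity result can be sketched numerically. The code assumes a pair of structural equations of the form y = b1·x + g1·z + u and z = b2·x + g2·y + v, with independent standard normal x, u and v. The coefficient names and values are illustrative assumptions. It generates data from the solved (reduced-form) system and compares the sample covariance between z and u with the value g2 / (1 - g1·g2) · var(u) implied by solving the system for z.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate_simultaneity(n=100_000, b1=1.0, g1=0.5, b2=1.0, g2=0.5, seed=0):
    """Return (sample cov(z, u), covariance implied by the reduced form)."""
    rng = random.Random(seed)
    d = 1 - g1 * g2            # must be non-zero for the system to have a solution
    z, u = [], []
    for _ in range(n):
        xi, ui, vi = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        zi = ((b2 + g2 * b1) * xi + vi + g2 * ui) / d   # reduced form for z
        yi = b1 * xi + g1 * zi + ui                     # first structural equation
        # Sanity check: the second structural equation holds for this (y, z) pair.
        assert abs(zi - (b2 * xi + g2 * yi + vi)) < 1e-9
        z.append(zi)
        u.append(ui)
    return cov(z, u), g2 / d * cov(u, u)

cov_zu, implied_cov = simulate_simultaneity()
print(f"sample cov(z, u): {cov_zu:.3f}, implied by reduced form: {implied_cov:.3f}")
```

With these assumed coefficients the two numbers should agree closely and be clearly different from zero, so z cannot be treated as uncorrelated with the error when the first equation is estimated by OLS.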