Margin of error



 
 
The margin of error is a statistic expressing the amount of random sampling error in a survey's results. The larger the margin of error, the less faith one should have that the poll's reported results are close to the "true" figures; that is, the figures for the whole population.

Explanation

The margin of error is usually defined as the "radius" (or half the width) of a confidence interval for a particular statistic from a survey. One example is the percent of people who prefer product A versus product B. When a single, global margin of error is reported for a survey, it refers to the maximum margin of error for all reported percentages using the full sample from the survey. If the statistic is a percentage, this maximum margin of error can be calculated as the radius of the confidence interval for a reported percentage of 50%.

The margin of error has been described as an "absolute" quantity, equal to a confidence interval radius for the statistic. For example, if the true value is 50 percentage points, and the statistic has a confidence interval radius of 5 percentage points, then we say the margin of error is 5 percentage points. As another example, if the true value is 50 people, and the statistic has a confidence interval radius of 5 people, then we might say the margin of error is 5 people.

In some cases, the margin of error is not expressed as an "absolute" quantity; rather it is expressed as a "relative" quantity. For example, suppose the true value is 50 people, and the statistic has a confidence interval radius of 5 people. If we use the "absolute" definition, the margin of error would be 5 people. If we use the "relative" definition, then we express this absolute margin of error as a percent of the true value. So in this case, the absolute margin of error is 5 people, but the "percent relative" margin of error is 10% (because 5 people are ten percent of 50 people). Often, however, the distinction is not explicitly made, yet usually is apparent from context.
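The distinction can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example above; the function name is hypothetical and not part of any library.

```python
def relative_margin(absolute_margin: float, value: float) -> float:
    """Express an absolute margin of error as a fraction of the value it surrounds."""
    if value == 0:
        raise ValueError("relative margin is undefined for a value of zero")
    return absolute_margin / abs(value)

# The example from the text: a radius of 5 people around a true value of 50 people.
absolute = 5.0
relative = relative_margin(absolute, 50.0)
print(f"absolute: {absolute:g} people, relative: {relative:.0%}")
```

The same absolute margin therefore reads as 5 people or as 10%, depending on which definition is in use.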

Like confidence intervals, the margin of error can be defined for any desired confidence level, but usually a level of 90%, 95% or 99% is chosen (typically 95%). This level is the probability that a margin of error around the reported percentage would include the "true" percentage. Along with the confidence level, the sample design for a survey, and in particular its sample size, determines the magnitude of the margin of error. A larger sample size produces a smaller margin of error, all else remaining equal.
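As a rough illustration of the effect of sample size, the sketch below assumes a simple random sample from a large population and the usual normal approximation at 95% confidence, under which the maximum margin of error is about 0.98/√n. The function name is illustrative only.

```python
import math

def max_margin_95(n: int) -> float:
    """Approximate maximum 95% margin of error for a simple random sample of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 0.98 / math.sqrt(n)

# Each step multiplies the sample size by four.
for n in (250, 1000, 4000):
    print(f"n = {n:>4}: about +/- {max_margin_95(n):.1%}")
```

Because the margin shrinks with the square root of n, quadrupling the sample size only halves the margin of error.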

If the exact confidence intervals are used, then the margin of error takes into account both sampling error and non-sampling error. If an approximate confidence interval is used (for example, by assuming the distribution is normal and then modeling the confidence interval accordingly), then the margin of error may only take random sampling error into account. It does not represent other potential sources of error or bias such as a non-representative sample design, poorly phrased questions, people lying or refusing to respond, the exclusion of people who could not be contacted, or miscounts and miscalculations.

Concept


Running example

A running example from the 2004 U.S. presidential campaign will be used to illustrate concepts throughout this article. According to an October 2, 2004 survey by Newsweek, 47% of registered voters would vote for John Kerry/John Edwards if the election were held on that day, 45% would vote for George W. Bush/Dick Cheney, and 2% would vote for Ralph Nader/Peter Camejo. The size of the sample was 1,013. Unless otherwise stated, the remainder of this article uses a 95% level of confidence.

Basic concept

Polls typically involve taking a sample from a certain population. In the case of the Newsweek poll, the population of interest is the population of people who will vote. Because it is impractical to poll everyone who will vote, pollsters take smaller samples that are intended to be representative, that is, a random sample of the population. It is possible that pollsters sample 1,013 voters who happen to vote for Bush when in fact the population is evenly split between Bush and Kerry, but this is extremely unlikely ($p = 2^{-1013} \approx 1.1 \times 10^{-305}$) given that the sample is random.
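The size of that probability can be checked directly. The sketch below assumes, as the text does, that each of the 1,013 sampled voters independently picks Bush with probability one half; the variable names are illustrative.

```python
import math

n = 1013  # sample size of the Newsweek poll

# Probability that all n independently sampled voters name the same given
# candidate when the population is split exactly 50/50.
p_all_same = 0.5 ** n

# The same quantity on a log scale, which avoids very small floats.
log10_p = -n * math.log10(2)

print(f"probability = {p_all_same:.1e}")
print(f"log10(probability) = {log10_p:.2f}")
```

The result is still representable as an ordinary double-precision float, but the logarithmic form is the safer choice for larger samples.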

Sampling theory provides methods for calculating the probability that the poll results differ from reality by more than a certain amount, simply due to chance; for instance, that the poll reports 47% for Kerry but his support is actually as high as 50%, or is really as low as 44%. This theory and some Bayesian assumptions suggest that the "true" percentage will probably be fairly close to 47%. The more people that are sampled, the more confident pollsters can be that the "true" percentage is close to the observed percentage. The margin of error is a measure of how close the results are likely to be.

However, the margin of error only accounts for random sampling error, so it is blind to systematic errors that may be introduced by non-response or by interactions between the survey and subjects' memory, motivation, communication and knowledge.

Calculations assuming random sampling

This section briefly discusses the standard error of a percentage and the corresponding confidence interval, and connects these two concepts to the margin of error. For simplicity, the calculations here assume the poll was based on a simple random sample from a large population.

The standard error of a reported proportion or percentage p measures its accuracy, and is the estimated standard deviation of that percentage. It can be estimated from just p and the sample size, n, if n is small relative to the population size, using the following formula:

$$\text{Standard error} = \sqrt{\frac{p(1-p)}{n}}$$

When the sample is not a simple random sample from a large population, the standard error and the confidence interval must be estimated through more advanced calculations. In most cases, the true confidence interval is approximated by assuming the distribution is normal and computing the interval from that assumption. For normal distributions, the confidence interval radii are proportional to the standard error. Usually, the true standard error is unknown, so an estimate's standard error is calculated from the sample data.

Note that there is not necessarily a strict connection between the true confidence interval and the true standard error. The true p percent confidence interval (where p here denotes the confidence level, not the reported proportion) is the interval [a, b] that contains p percent of the distribution, and where (100 − p)/2 percent of the distribution lies below a, and (100 − p)/2 percent of the distribution lies above b. The true standard error of the statistic is the square root of the true sampling variance of the statistic. These two may not be directly related, although in general, for large samples whose sampling distributions look like normal curves, there is a direct relationship.

In the Newsweek poll, Kerry's level of support p = 0.47 and n = 1,013. The standard error (0.016 or 1.6%) helps to give a sense of the accuracy of Kerry's estimated percentage (47%). A Bayesian interpretation of the standard error is that although we do not know the "true" percentage, it is highly likely to be located within two standard errors of the estimated percentage (47%). The standard error can be used to create a confidence interval within which the "true" percentage should be to a certain level of confidence.
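The figure for Kerry can be reproduced with a short sketch. It assumes the simple-random-sample standard error of a proportion, the square root of p(1 − p)/n; the function name is illustrative.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion p from a simple random sample of size n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("sample size must be positive")
    return math.sqrt(p * (1.0 - p) / n)

se = standard_error(0.47, 1013)  # Kerry's share in the Newsweek poll
low, high = 0.47 - 2 * se, 0.47 + 2 * se

print(f"standard error: {se:.3f}")
print(f"within two standard errors: {low:.3f} to {high:.3f}")
```

The two-standard-error band corresponds to the informal "highly likely" range described above.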

The estimated percentage plus or minus its margin of error is a confidence interval for the percentage. In other words, the margin of error is half the width of the confidence interval. It can be calculated as a multiple of the standard error, with the factor depending on the level of confidence desired; a margin of one standard error gives a 68% confidence interval, while the estimate plus or minus 1.96 standard errors is a 95% confidence interval, and a 99% confidence interval runs 2.58 standard errors on either side of the estimate.
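The multipliers quoted above come from the standard normal distribution. The sketch below recovers them with Python's standard-library `statistics.NormalDist` and applies them to the running example; it assumes the normal approximation and a simple random sample, and the helper names are illustrative.

```python
import math
from statistics import NormalDist

def z_multiplier(confidence: float) -> float:
    """Number of standard errors on each side of the estimate for a two-sided interval."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def margin_of_error(p: float, n: int, confidence: float = 0.95) -> float:
    """Margin of error for proportion p under the normal approximation."""
    return z_multiplier(confidence) * math.sqrt(p * (1.0 - p) / n)

for level in (0.68, 0.95, 0.99):
    print(f"{level:.0%}: {z_multiplier(level):.2f} standard errors")

moe = margin_of_error(0.47, 1013)
print(f"Kerry: 47% +/- {moe:.1%} at 95% confidence")
```

The 68% multiplier comes out slightly under 1 because one standard error covers a little more than 68% of a normal distribution.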

Definition

The margin of error for a particular statistic of interest is usually defined as the radius (or half the width) of the confidence interval for that statistic. The term can also be used to mean sampling error in general. In media reports of poll results, the term usually refers to the maximum margin of error for any percentage from that poll.

Maximum margin of error

The maximum margin of error for any percentage is the radius of the confidence interval when p = 50%. As such, it can be calculated directly from the number of poll respondents. For 95% confidence, assuming a simple random sample from a large population:

$$\text{Maximum margin of error (95\%)} = 1.96 \times \sqrt{\frac{0.5\,(1-0.5)}{n}} = \frac{0.98}{\sqrt{n}}$$

This calculation gives a margin of error of 3% for the Newsweek poll, which reported a margin of error of 4%. The difference was probably due to weighting or complex features of the sampling design that required alternative calculations for the standard error. It is also possible that Newsweek rounded conservatively to avoid overstating the confidence of its results.
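The 3% figure can be checked with a minimal sketch, assuming a simple random sample from a large population and the 95% multiplier of 1.96. The function name and the `reported` variable are illustrative; the 4% value is the one Newsweek published.

```python
import math

def max_margin_of_error(n: int, z: float = 1.96) -> float:
    """Largest margin of error for any percentage, reached at p = 0.5."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(0.5 * (1.0 - 0.5) / n)

calculated = max_margin_of_error(1013)
reported = 0.04  # margin of error published with the poll

print(f"calculated: {calculated:.1%}, reported: {reported:.0%}")
```

The gap between the two values is what the text attributes to weighting, design effects, or conservative rounding.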

Different confidence levels

For a simple random sample from a large population, the maximum margin of error is a simple re-expression of the sample size n. The numerators of these equations are rounded to two decimal places.

Margin of error at 99% confidence $\approx 1.29 / \sqrt{n}$

Margin of error at 95% confidence $\approx 0.98 / \sqrt{n}$

Margin of error at 90% confidence $\approx 0.82 / \sqrt{n}$

If an article about a poll does not report the margin of error, but does state that a simple random sample of a certain size was used, the margin of error can be calculated for a desired degree of confidence using one of the above formulae. Also, if the 95% margin of error is given, one can find the 99% margin of error by increasing the reported margin of error by about 30%.
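The rounded numerators and the "about 30%" rule can both be derived from the normal distribution. The following sketch assumes a simple random sample from a large population; the function names are illustrative.

```python
from statistics import NormalDist

def numerator(confidence: float) -> float:
    """Numerator k in the approximation: maximum margin of error = k / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * 0.5  # 0.5 is sqrt(p * (1 - p)) at p = 0.5

def max_margin(n: int, confidence: float) -> float:
    """Maximum margin of error for sample size n at the given confidence level."""
    return numerator(confidence) / n ** 0.5

for level in (0.99, 0.95, 0.90):
    print(f"{level:.0%}: {numerator(level):.2f} / sqrt(n)")

# Converting a reported 95% margin into a 99% margin.
ratio = numerator(0.99) / numerator(0.95)
print(f"99% margin is about {ratio - 1:.0%} larger than the 95% margin")
```

The conversion factor does not depend on the sample size, which is why a single percentage increase works for any poll.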

Maximum and specific margins of error

While the margin of error typically reported in the media is a poll-wide figure that reflects the maximum sampling variation of any percentage based on all respondents from that poll, the term margin of error also refers to the radius of the confidence interval for a particular statistic.

The margin of error for a particular individual percentage will usually be smaller than the maximum margin of error quoted for the survey. This maximum only applies when the observed percentage is 50%, and the margin of error shrinks as the percentage approaches the extremes of 0% or 100%.

In other words, the maximum margin of error is the radius of a 95% confidence interval for a reported percentage of 50%. If p moves away from 50%, the confidence interval for p will be shorter. Thus, the maximum margin of error represents an upper bound to the uncertainty; one is at least 95% certain that the "true" percentage is within the maximum margin of error of a reported percentage for any reported percentage.
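To see how much smaller a specific margin can be, the sketch below compares the three reported percentages from the running example with the poll-wide maximum. It assumes a simple random sample and the normal approximation, which is known to be rough for percentages near 0% or 100%; the function name is illustrative.

```python
import math

def margin_95(p: float, n: int) -> float:
    """95% margin of error for proportion p under the normal approximation."""
    return 1.96 * math.sqrt(p * (1.0 - p) / n)

n = 1013
maximum = margin_95(0.5, n)

for label, p in (("Kerry", 0.47), ("Bush", 0.45), ("Nader", 0.02)):
    print(f"{label}: {p:.0%} +/- {margin_95(p, n):.1%} (maximum {maximum:.1%})")
```

For the two leading candidates the specific margin is almost identical to the maximum, while for a share as small as 2% it is under one percentage point.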

Effect of population size


The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of the population of interest. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling fraction is less than 5%.

In cases where the sampling fraction exceeds 5%, analysts can adjust the margin of error using a "finite population correction" (FPC) to account for the added precision gained by sampling a larger percentage of the population. The FPC can be calculated using the formula:

$$\text{FPC} = \sqrt{\frac{N-n}{N-1}}$$

To adjust for a large sampling fraction, the FPC is factored into the calculation of the margin of error, which has the effect of narrowing the margin of error. The FPC approaches zero as the sample size (n) approaches the population size (N), which has the effect of eliminating the margin of error entirely. This makes intuitive sense because when N = n, the sample becomes a census and sampling error becomes moot.
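A minimal sketch of the adjustment follows. It assumes the commonly used form of the correction, the square root of (N − n)/(N − 1), applied as a multiplier to the simple-random-sample margin of error; the function names and the population sizes are illustrative.

```python
import math

def finite_population_correction(n: int, N: int) -> float:
    """Correction factor for a sample of size n drawn without replacement from N units."""
    if not 0 < n <= N:
        raise ValueError("require 0 < n <= N")
    if N == 1:
        return 0.0  # a single-unit population is a census by definition
    return math.sqrt((N - n) / (N - 1))

def adjusted_margin_95(p: float, n: int, N: int) -> float:
    """95% margin of error for proportion p, narrowed by the correction factor."""
    unadjusted = 1.96 * math.sqrt(p * (1.0 - p) / n)
    return unadjusted * finite_population_correction(n, N)

for N in (1_013, 10_000, 100_000_000):
    fpc = finite_population_correction(1013, N)
    print(f"N = {N:>11,}: FPC = {fpc:.4f}, margin = {adjusted_margin_95(0.5, 1013, N):.2%}")
```

With a population of 100 million the factor is effectively 1, which is why national polls ignore it, and with N = n it is exactly 0.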

Analysts should be mindful that the sample remain truly random as the sampling fraction grows, lest sampling bias be introduced.

Other statistics

Confidence intervals can be calculated, and so can margins of error, for a range of statistics including individual percentages, differences between percentages, averages, medians and totals.

The margin of error for the difference between two percentages is larger than the margins of error for each of these percentages, and may even be larger than the maximum margin of error for any individual percentage from the survey.

Comparing percentages

In a plurality voting system, it is important to know who is ahead. The terms "statistical tie" and "statistical dead heat" are sometimes used to describe reported percentages that differ by less than a margin of error, but these terms can be misleading. For one thing, the margin of error as generally calculated is applicable to an individual percentage and not the difference between percentages, so the difference between two percentage estimates may not be statistically significant even when they differ by more than the reported margin of error. The survey results also often provide strong information even when there is not a statistically significant difference.

When comparing percentages, it can accordingly be useful to consider the probability that one percentage is higher than another. In simple situations, this probability can be derived with 1) the standard error calculation introduced earlier, 2) the formula for the variance of the difference of two random variables, and 3) an assumption that if anyone does not choose Kerry they will choose Bush, and vice versa; that is, the two percentages are perfectly negatively correlated. This may not be a tenable assumption when there are more than two possible poll responses. For more complex survey designs, different formulas for calculating the standard error of the difference must be used.

The standard error of the difference of percentages p for Kerry and q for Bush, assuming that they are perfectly negatively correlated, follows:

$$\text{Standard error of difference} = \sqrt{\frac{p(1-p) + q(1-q) + 2pq}{n}}$$

Given the observed percentage difference p − q (2% or 0.02) and the standard error of the difference calculated above (0.03), any statistical calculator may be used to calculate the probability that a sample from a normal distribution with mean 0.02 and standard deviation 0.03 is greater than 0.
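In place of a statistical calculator, the calculation can be sketched with Python's standard library. The code assumes a simple random sample and takes the variance of the difference to be [p(1 − p) + q(1 − q) + 2pq]/n, that is, the two individual variances plus twice a covariance term of pq/n; this form is an assumption chosen because it reproduces the 0.03 quoted above. The function names are illustrative.

```python
import math
from statistics import NormalDist

def se_of_difference(p: float, q: float, n: int) -> float:
    """Standard error of p - q for two shares measured in the same sample.

    Assumes Var(p - q) = [p(1-p) + q(1-q) + 2pq] / n.
    """
    return math.sqrt((p * (1.0 - p) + q * (1.0 - q) + 2.0 * p * q) / n)

def probability_of_lead(p: float, q: float, n: int) -> float:
    """Probability that a normal variable centred on p - q exceeds zero."""
    difference = p - q
    se = se_of_difference(p, q, n)
    return 1.0 - NormalDist(mu=difference, sigma=se).cdf(0.0)

se = se_of_difference(0.47, 0.45, 1013)
prob = probability_of_lead(0.47, 0.45, 1013)
print(f"standard error of difference: {se:.3f}")
print(f"probability that Kerry leads: {prob:.0%}")
```

Under these assumptions the probability is roughly three in four, which illustrates why a lead smaller than the margin of error still carries information.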

See also

  • Confidence interval
  • Engineering tolerance
  • Key relevance

