In descriptive statistics, the interquartile range (IQR), also called the midspread or middle fifty, is a measure of statistical dispersion, being equal to the difference between the third and first quartiles.
Unlike the (total) range, the interquartile range is a robust statistic, having a breakdown point of 25%, and is thus often preferred to the total range.
The IQR is used to build box plots, simple graphical representations of a probability distribution.
For a symmetric distribution (so the median equals the midhinge, the average of the first and third quartiles), half the IQR equals the median absolute deviation (MAD).
The median is the corresponding measure of central tendency.
$$\mathrm{IQR} = Q_3 - Q_1$$
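The robustness claim can be illustrated with a small sketch. The sample values and the helper names `iqr` and `total_range` are hypothetical, and the quartiles use the median-of-halves convention, which is only one of several in use. Corrupting a single observation changes the total range drastically while leaving the IQR untouched.

```python
from statistics import median

def iqr(values):
    """IQR using median-of-halves quartiles (an assumed convention)."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For an odd sample size the middle observation is left out of both halves.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(upper) - median(lower)

def total_range(values):
    """Difference between the largest and smallest observation."""
    return max(values) - min(values)

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # hypothetical sample
dirty = clean[:-1] + [1000]           # same sample with one corrupted value

print(total_range(clean), total_range(dirty))  # 8 999
print(iqr(clean), iqr(dirty))                  # 5.0 5.0
```

A breakdown point of 25% means that up to a quarter of the observations can be corrupted in this way before the IQR can be made arbitrarily large.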
Examples
Data set in a table
| i | x[i] | Quartile |
|---|------|----------|
| 1 | 102 | |
| 2 | 104 | |
| 3 | 105 | Q1 |
| 4 | 107 | |
| 5 | 108 | |
| 6 | 109 | Q2 (median) |
| 7 | 110 | |
| 8 | 112 | |
| 9 | 115 | Q3 |
| 10 | 116 | |
| 11 | 118 | |
From this table, the width of the interquartile range is 115 − 105 = 10.
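The quartiles marked in the table can be reproduced with a short sketch. It assumes the median-of-halves convention (the middle observation is excluded from both halves when the sample size is odd), which is consistent with the table but is not the only convention in use; the function names are illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Other quartile conventions exist and may give slightly
    different results on the same data.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Skip the middle observation when the sample size is odd.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(xs), median(upper)

def iqr(values):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

data = [102, 104, 105, 107, 108, 109, 110, 112, 115, 116, 118]
print(quartiles(data))  # (105, 109, 115)
print(iqr(data))        # 10
```

Library routines such as `statistics.quantiles` or `numpy.percentile` offer several interpolation methods, so their results may differ from this sketch on other data sets.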
Data set in a plain-text box plot
                            +-----+-+
  o           *     |-------|     | |---|
                            +-----+-+
+---+---+---+---+---+---+---+---+---+---+---+---+   number line
0   1   2   3   4   5   6   7   8   9   10  11  12
For this data set:
- lower (first) quartile ($Q_1$) = 7
- median (second quartile) ($Q_2$) = 8.5
- upper (third) quartile ($Q_3$) = 9
- interquartile range, $\mathrm{IQR} = Q_3 - Q_1 = 9 - 7 = 2$
Interquartile range of distributions
The interquartile range of a continuous distribution can be calculated by integrating the probability density function (which yields the cumulative distribution function; any other means of calculating the CDF will also work). The lower quartile, Q1, is a number such that the integral of the PDF from −∞ to Q1 equals 0.25, while the upper quartile, Q3, is a number such that the integral from −∞ to Q3 equals 0.75; in terms of the CDF, the quartiles can be defined as follows:
$$Q_1 = \mathrm{CDF}^{-1}(0.25), \qquad Q_3 = \mathrm{CDF}^{-1}(0.75)$$
where $\mathrm{CDF}^{-1}$ is the quantile function.
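When an inverse CDF (quantile function) is available, the IQR of a distribution is simply the difference between its values at 0.75 and 0.25. The following is a minimal sketch under that assumption: `iqr_from_inverse_cdf` and `laplace_inv_cdf` are illustrative helpers rather than library functions, while `NormalDist.inv_cdf` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

def iqr_from_inverse_cdf(inv_cdf):
    """IQR = CDF^{-1}(0.75) - CDF^{-1}(0.25) for a given quantile function."""
    return inv_cdf(0.75) - inv_cdf(0.25)

def laplace_inv_cdf(p, mu=0.0, b=1.0):
    """Quantile function of a Laplace distribution with location mu and scale b."""
    if p < 0.5:
        return mu + b * math.log(2.0 * p)
    return mu - b * math.log(2.0 * (1.0 - p))

# Standard normal (mu = 0, sigma = 1) and standard Laplace (mu = 0, b = 1).
normal_iqr = iqr_from_inverse_cdf(NormalDist(mu=0.0, sigma=1.0).inv_cdf)
laplace_iqr = iqr_from_inverse_cdf(laplace_inv_cdf)

print(round(normal_iqr, 3))                             # 1.349
print(math.isclose(laplace_iqr, 2.0 * math.log(2.0)))   # True
```

For distributions without a closed-form quantile function, the same two values could instead be found by numerically solving CDF(x) = 0.25 and CDF(x) = 0.75.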
The interquartile range and median of some common distributions are shown below:
| Distribution | Median | IQR |
|--------------|--------|-----|
| Normal | μ | 2 Φ⁻¹(0.75) σ ≈ 1.349 σ |
| Laplace | μ | 2b ln(2) |
| Cauchy | μ | 2γ |