Dot plot (statistics)
Encyclopedia
A dot chart or dot plot is a statistical
Statistics
Statistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments....

 chart consisting of data points plotted on a simple scale, typically using filled in circles. There are two common, yet very different, versions of the dot chart. The first is described by Wilkinson as a graph that has been used in hand-drawn (pre-computer era) graphs to depict distributions. The other version is described by Cleveland as an alternative to the bar chart, in which dots are used to depict the quantitative values (e.g. counts) associated with categorical variables.

Wilkinson dot plots

The dot plot as a representation of a distribution consists of group of data points plotted on a simple scale. Dot plots are used for continuous
Continuous function
In mathematics, a continuous function is a function for which, intuitively, "small" changes in the input result in "small" changes in the output. Otherwise, a function is said to be "discontinuous". A continuous function with a continuous inverse function is called "bicontinuous".Continuity of...

, quantitative, univariate
Univariate
In mathematics, univariate refers to an expression, equation, function or polynomial of only one variable. Objects of any of these types but involving more than one variable may be called multivariate...

 data. Data points may be labelled if there are few of them.

Dot plots are one of the simplest statistical plots, and are suitable for small to moderate sized data sets. They are useful for highlighting clusters and gaps, as well as outlier
Outlier
In statistics, an outlier is an observation that is numerically distant from the rest of the data. Grubbs defined an outlier as: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs....

s. Their other advantage is the conservation of numerical information. When dealing with larger data sets (around 20–30 or more data points) the related stemplot
Stemplot
A stemplot , in statistics, is a device for presenting quantitative data in a graphical format, similar to a histogram, to assist in visualizing the shape of a distribution. They evolved from Arthur Bowley's work in the early 1900s, and are useful tools in exploratory data analysis...

, box plot
Box plot
In descriptive statistics, a box plot or boxplot is a convenient way of graphically depicting groups of numerical data through their five-number summaries: the smallest observation , lower quartile , median , upper quartile , and largest observation...

 or histogram
Histogram
In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...

 may be more efficient, as dot plots may become too cluttered after this point. Dot plots may be distinguished from histograms in that dots are not spaced uniformly along the horizontal axis.

Although the plot appears to be simple, its computation and the statistical theory underlying it are not simple. The algorithm for computing a dot plot is closely related to kernel density estimation
Kernel density estimation
In statistics, kernel density estimation is a non-parametric way of estimating the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample...

. The size chosen for the dots affects the appearance of the plot. Choice of dot size is equivalent to choosing the bandwidth for a kernel density estimate.

Cleveland dot plots

Dot plot may also refer to plots of points that each belong to one of several categories. They are an alternative to bar charts or pie charts, and look somewhat like a horizontal bar chart where the bars are replaced by a dots at the values associated with each category. Compared to (vertical) bar charts and pie charts, Cleveland argues that dot plots allow more accurate interpretation of the graph by readers by making the labels easier to read, reducing non-data ink (or graph clutter) and supporting table look-up.

In the R
R (programming language)
R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians for developing statistical software, and R is widely used for statistical software development and data analysis....

programming language this type of plot is also referred to as a stripchart
or stripplot.

Other references

  • Wild, C. and Seber, G. (2000) Chance Encounters: A First Course in Data Analysis and Inference John Wiley and Sons. ISBN 0-471-32936-3

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK