Design of experiments
In general usage, design of experiments (DOE) or experimental design is the design of any information-gathering exercise where variation is present, whether under the full control of the experimenter or not. However, in statistics, these terms are usually used for controlled experiments. Other types of study, and their design, are discussed in the articles on opinion polls and statistical surveys (which are types of observational study), natural experiments and quasi-experiments (for example, quasi-experimental designs). See Experiment for the distinction between these types of experiments or studies.

In the design of experiments, the experimenter is often interested in the effect of some process or intervention (the "treatment") on some objects (the "experimental units"), which may be people, parts of people, groups of people, plants, animals, materials, etc. Design of experiments is thus a discipline that has very broad application across all the natural and social sciences.

Controlled experimentation on scurvy

In 1747, while serving as surgeon on HMS Salisbury, James Lind carried out a controlled experiment to develop a cure for scurvy.

Lind selected 12 men from the ship, all suffering from scurvy. He limited his subjects to men who "were as similar as I could have them"; that is, he imposed strict entry requirements to reduce extraneous variation. He divided them into six pairs, giving each pair a different supplement to their basic diet for two weeks. The treatments were all remedies that had been proposed:
  • A quart of cider every day
  • Twenty-five gutts (drops) of elixir vitriol (sulphuric acid) three times a day upon an empty stomach
  • One half-pint of seawater every day
  • A mixture of garlic, mustard, and horseradish in a lump the size of a nutmeg
  • Two spoonfuls of vinegar three times a day
  • Two oranges and one lemon every day.


The men who had been given citrus fruits recovered dramatically within a week. One of them returned to duty after six days; the other cared for the rest. The remaining patients experienced some improvement, but nothing comparable to the citrus fruits, which proved substantially superior to the other treatments.

Statistical experiments, following Charles S. Peirce

A theory of statistical inference was developed by Charles S. Peirce in "Illustrations of the Logic of Science" (1877–1878) and "A Theory of Probable Inference" (1883), two publications that emphasized the importance of randomization-based inference in statistics.

Randomized experiments

Charles S. Peirce randomly assigned volunteers to a blinded, repeated-measures design to evaluate their ability to discriminate weights.
Peirce's experiment inspired other researchers in psychology and education, who developed a research tradition of randomized experiments in laboratories and specialized textbooks in the 1800s.

Optimal designs for regression models

Charles S. Peirce also contributed the first English-language publication on an optimal design for regression models in 1876. A pioneering optimal design for polynomial regression was suggested by Gergonne in 1815. In 1918 Kirstine Smith published optimal designs for polynomials of degree six (and less).

Sequences of experiments

The use of a sequence of experiments, where the design of each may depend on the results of previous experiments, including the possible decision to stop experimenting, is within the scope of sequential analysis, a field that was pioneered by Abraham Wald in the context of sequential tests of statistical hypotheses. Herman Chernoff wrote an overview of optimal sequential designs, while adaptive designs have been surveyed by S. Zacks. One specific type of sequential design is the "two-armed bandit", generalized to the multi-armed bandit, on which early work was done by Herbert Robbins in 1952.
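A minimal sketch of a sequential design in this spirit is an epsilon-greedy rule on a two-armed bandit, in which the choice of arm at each trial depends on the outcomes observed so far. The arm success probabilities, exploration rate, and trial count below are illustrative assumptions, not values from the literature:

```python
import random

def epsilon_greedy_bandit(p_arms=(0.45, 0.55), epsilon=0.1,
                          n_trials=10000, seed=0):
    """Two-armed bandit: each trial's design (which arm to pull)
    depends on the results observed so far."""
    rng = random.Random(seed)
    pulls = [0, 0]        # times each arm has been pulled
    successes = [0, 0]    # successes observed on each arm
    for _ in range(n_trials):
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(2)                    # explore at random
        else:
            rates = [successes[i] / pulls[i] for i in range(2)]
            arm = rates.index(max(rates))             # exploit current best
        reward = rng.random() < p_arms[arm]           # Bernoulli outcome
        pulls[arm] += 1
        successes[arm] += reward
    return pulls, successes

pulls, successes = epsilon_greedy_bandit()
```

Over many trials the rule concentrates pulls on the arm whose observed success rate is higher, which is exactly the adaptive allocation that distinguishes sequential designs from fixed ones.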

Principles of experimental design, following Ronald A. Fisher

A methodology for designing experiments was proposed by Ronald A. Fisher in his innovative book The Design of Experiments (1935). As an example, he described how to test the hypothesis that a certain lady could distinguish by flavour alone whether the milk or the tea was first placed in the cup. While this sounds like a frivolous application, it allowed him to illustrate the most important ideas of experimental design:

Comparison
In many fields of study it is hard to reproduce measured results exactly. Comparisons between treatments are much more reproducible and are usually preferable. Often one compares against a standard, scientific control, or traditional treatment that acts as a baseline.


Randomization
Random assignment is the process of assigning individuals at random to treatment or control groups (or to different conditions within a group) in an experiment. The random assignment of individuals to groups (or conditions within a group) distinguishes a rigorous, "true" experiment from an adequate, but less-than-rigorous, "quasi-experiment".
There is an extensive body of mathematical theory that explores the consequences of making the allocation of units to treatments by means of some random mechanism such as tables of random numbers, or the use of randomization devices such as playing cards or dice. Provided the sample size is adequate, the risks associated with random allocation (such as failing to obtain a representative sample in a survey, or having a serious imbalance in a key characteristic between a treatment group and a control group) are calculable and hence can be managed down to an acceptable level. Random does not mean haphazard, and great care must be taken that appropriate random methods are used.
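The random mechanisms described above can be sketched in software, with a random permutation standing in for shuffled cards or dice. The unit labels and group names are illustrative:

```python
import random

def randomize(units, treatments=("treatment", "control"), seed=42):
    """Allocate experimental units to equally sized groups using a
    random permutation, the software analogue of shuffled cards."""
    assert len(units) % len(treatments) == 0, "equal group sizes assumed"
    shuffled = units[:]
    random.Random(seed).shuffle(shuffled)
    size = len(shuffled) // len(treatments)
    return {t: shuffled[i * size:(i + 1) * size]
            for i, t in enumerate(treatments)}

# Twelve units split at random into two groups of six.
groups = randomize([f"unit{i}" for i in range(1, 13)])
```

Every unit lands in exactly one group, and which group is decided by the permutation alone rather than by any characteristic of the unit.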


Replication
Measurements are usually subject to variation and uncertainty. Measurements are repeated and full experiments are replicated to help identify the sources of variation, to better estimate the true effects of treatments, to further strengthen the experiment's reliability and validity, and to add to the existing knowledge about the topic. However, certain conditions must be met before the replication of an experiment is commenced: the original research question has been published in a peer-reviewed journal or widely cited; the researcher is independent of the original experiment; the researcher must first try to replicate the original findings using the original data; and the write-up should state that the study conducted is a replication study that tried to follow the original study as strictly as possible.
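A sketch of why repeated measurements help: replicates under the same condition let the measurement noise be estimated, and averaging n replicates shrinks the standard error of the mean by a factor of √n. The true effect and noise level simulated here are illustrative assumptions:

```python
import math
import random
import statistics

rng = random.Random(0)
true_effect, sigma, n_reps = 5.0, 2.0, 100

# n_reps replicate measurements of the same treatment condition.
replicates = [true_effect + rng.gauss(0, sigma) for _ in range(n_reps)]

s = statistics.stdev(replicates)      # estimates the noise sigma
mean = statistics.fmean(replicates)   # estimates the true effect
stderr = s / math.sqrt(n_reps)        # standard error of the mean

# The spread of the replicates estimates the measurement noise, while
# the mean is a far more precise estimate than any single reading.
```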


Blocking
Blocking is the arrangement of experimental units into groups (blocks) consisting of units that are similar to one another. Blocking reduces known but irrelevant sources of variation between units and thus allows greater precision in the estimation of the source of variation under study.
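A common way to combine blocking with randomization is to randomize separately within each block, so every block contributes equally to every treatment. The block and treatment names below are illustrative:

```python
import random

def randomized_block_design(blocks, treatments, seed=1):
    """Assign treatments within each block by an independent random
    permutation, so block-to-block differences cannot bias comparisons."""
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        assert len(units) == len(treatments), "one unit per treatment per block"
        order = treatments[:]
        rng.shuffle(order)
        design[block] = dict(zip(units, order))
    return design

blocks = {"male": ["m1", "m2"], "female": ["f1", "f2"]}
design = randomized_block_design(blocks, ["drug", "placebo"])
# Each block receives exactly one "drug" and one "placebo" unit.
```

Because each block contains every treatment, a systematic difference between blocks (here, sex) cancels out of the drug-versus-placebo comparison.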


Orthogonality
Orthogonality concerns the forms of comparison (contrasts) that can be legitimately and efficiently carried out. Contrasts can be represented by vectors, and sets of orthogonal contrasts are uncorrelated and independently distributed if the data are normal. Because of this independence, each orthogonal treatment provides different information from the others. If there are T treatments and T − 1 orthogonal contrasts, all the information that can be captured from the experiment is obtainable from the set of contrasts.
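For T = 4 treatments, a set of T − 1 = 3 mutually orthogonal contrasts can be checked numerically. The particular contrast vectors are a standard textbook choice, used here for illustration:

```python
# Three contrasts among four treatment means, written as sign vectors.
contrasts = [
    (1, -1,  0,  0),   # treatment 1 vs treatment 2
    (0,  0,  1, -1),   # treatment 3 vs treatment 4
    (1,  1, -1, -1),   # first pair vs second pair
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Every contrast sums to zero (it is a genuine comparison), and every
# pair has zero dot product, so the three comparisons extract
# non-overlapping pieces of information from the experiment.
assert all(sum(c) == 0 for c in contrasts)
assert all(dot(contrasts[i], contrasts[j]) == 0
           for i in range(3) for j in range(i + 1, 3))
```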


Factorial experiments
Use of factorial experiments instead of the one-factor-at-a-time method. These are efficient at evaluating the effects and possible interactions of several factors (independent variables).
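A full factorial design simply enumerates every combination of factor levels. A sketch with two illustrative factors (the factor names and levels are assumptions for the example):

```python
from itertools import product

# Two factors with illustrative levels: a 2 x 3 = 6-run full factorial.
factors = {
    "temperature": [150, 170],
    "time_minutes": [30, 45, 60],
}

# One run per combination of levels across all factors.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

# Six runs cover every temperature/time combination, so main effects
# and the temperature-by-time interaction can both be estimated.
assert len(runs) == 6
```

By contrast, varying one factor at a time from a baseline would miss the interaction entirely.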


Analysis of the design of experiments was built on the foundation of the analysis of variance, a collection of models in which the observed variance is partitioned into components due to different factors, which are estimated and/or tested.
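The partition underlying the analysis of variance can be verified on a tiny data set (the responses below are invented for illustration): the total sum of squares splits exactly into a between-treatments component and a within-treatments component.

```python
from statistics import fmean

# Illustrative responses for three treatment groups.
groups = {
    "A": [12.0, 14.0, 13.0],
    "B": [15.0, 17.0, 16.0],
    "C": [10.0, 9.0, 11.0],
}

grand_mean = fmean(y for ys in groups.values() for y in ys)

ss_total = sum((y - grand_mean) ** 2
               for ys in groups.values() for y in ys)
ss_between = sum(len(ys) * (fmean(ys) - grand_mean) ** 2
                 for ys in groups.values())
ss_within = sum((y - fmean(ys)) ** 2
                for ys in groups.values() for y in ys)

# The fundamental ANOVA identity: SS_total = SS_between + SS_within.
assert abs(ss_total - (ss_between + ss_within)) < 1e-9
```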

Example

This example is attributed to Harold Hotelling. It conveys some of the flavor of those aspects of the subject that involve combinatorial designs.

The weights of eight objects are to be measured using a pan balance and set of standard weights. Each weighing measures the weight difference between objects placed in the left pan and any objects placed in the right pan by adding calibrated weights to the lighter pan until the balance is in equilibrium. Each measurement has a random error. The average error is zero; the standard deviation of the probability distribution of the errors is the same number σ on different weighings; and errors on different weighings are independent. Denote the true weights by θ1, θ2, ..., θ8.
We consider two different experiments:
  1. Weigh each object in one pan, with the other pan empty. Let Xi be the measured weight of the ith object, for i = 1, ..., 8.
  2. Do the eight weighings according to the following schedule and let Yi be the measured difference (left pan minus right pan) for i = 1, ..., 8:

                   left pan           right pan
     1st weighing: 1 2 3 4 5 6 7 8    (empty)
     2nd weighing: 1 3 5 7            2 4 6 8
     3rd weighing: 1 2 5 6            3 4 7 8
     4th weighing: 1 4 5 8            2 3 6 7
     5th weighing: 1 2 3 4            5 6 7 8
     6th weighing: 1 3 6 8            2 4 5 7
     7th weighing: 1 2 7 8            3 4 5 6
     8th weighing: 1 4 6 7            2 3 5 8

Then the estimated value of the weight θ1 is

     θ̂1 = (Y1 + Y2 + Y3 + Y4 + Y5 + Y6 + Y7 + Y8) / 8.

Similar estimates can be found for the weights of the other items. For example

     θ̂2 = (Y1 − Y2 + Y3 − Y4 + Y5 − Y6 + Y7 − Y8) / 8.
The question of design of experiments is: which experiment is better?

The variance of the estimate X1 of θ1 is σ2 if we use the first experiment. But if we use the second experiment, the variance of the estimate given above is σ2/8. Thus the second experiment gives us 8 times as much precision for the estimate of a single item, and estimates all items simultaneously, with the same precision. What is achieved with 8 weighings in the second experiment would require 64 weighings if items are weighed separately. Moreover, because the sign patterns of the eight weighings are mutually orthogonal, the errors of the resulting estimates are uncorrelated with each other.
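The variance comparison can be checked by simulation, using the ±1 sign rows of an 8 × 8 Hadamard matrix as the weighing schedule. The particular true weights and noise level below are illustrative assumptions:

```python
import random

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = [[1]]
    while len(H) < n:
        H = [row * 2 for row in H] + [row + [-x for x in row] for row in H]
    return H

rng = random.Random(0)
theta = [1.2, 0.7, 2.5, 1.9, 0.3, 1.1, 2.2, 0.8]  # illustrative true weights
sigma, n_sims = 0.1, 20000
H = hadamard(8)   # row i gives the left(+1)/right(-1) pans for weighing i

err1 = []   # experiment 1: weigh object 1 alone, X1 = theta1 + noise
err2 = []   # experiment 2: eight signed weighings, combine all the Y's
for _ in range(n_sims):
    err1.append(rng.gauss(0, sigma))
    Y = [sum(H[i][j] * theta[j] for j in range(8)) + rng.gauss(0, sigma)
         for i in range(8)]
    est = sum(H[i][0] * Y[i] for i in range(8)) / 8   # estimate of theta1
    err2.append(est - theta[0])

var1 = sum(e * e for e in err1) / n_sims   # empirically near sigma^2
var2 = sum(e * e for e in err2) / n_sims   # empirically near sigma^2 / 8
```

Each object sits in one pan or the other on every weighing, so every Y carries information about every weight; the orthogonality of the Hadamard rows is what lets the eight weights be disentangled with variance σ2/8 each.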

Many problems of the design of experiments involve combinatorial designs, as in this example.

Statistical control

It is best for a process to be in reasonable statistical control prior to conducting designed experiments. When this is not possible, proper blocking, replication, and randomization allow for the careful conduct of designed experiments.
To control for nuisance variables, researchers institute control checks as additional measures. Investigators should ensure that uncontrolled influences (e.g., source credibility perception) are measured and do not skew the findings of the study. A manipulation check is one example of a control check. Manipulation checks allow investigators to isolate the chief variables to strengthen support that these variables are operating as planned.

Experimental designs after Fisher

Some efficient designs for estimating several main effects simultaneously were found by Raj Chandra Bose and K. Kishen in 1940 at the Indian Statistical Institute, but remained little known until the Plackett–Burman designs were published in Biometrika in 1946. About the same time, C. R. Rao introduced the concepts of orthogonal arrays as experimental designs. This concept played a central role in the development of Taguchi methods by Genichi Taguchi, which took place during his visit to the Indian Statistical Institute in the early 1950s. His methods were successfully applied and adopted by Japanese and Indian industries and subsequently were also embraced by US industry, albeit with some reservations.

In 1950, Gertrude Mary Cox and William Gemmell Cochran published the book Experimental Designs, which became the major reference work on the design of experiments for statisticians for years afterwards.

Developments of the theory of linear models have encompassed and surpassed the cases that concerned early writers. Today, the theory rests on advanced topics in linear algebra, algebra and combinatorics.

As with other branches of statistics, experimental design is pursued using both frequentist and Bayesian approaches: in evaluating statistical procedures like experimental designs, frequentist statistics studies the sampling distribution while Bayesian statistics updates a probability distribution on the parameter space.

Some important contributors to the field of experimental designs are C. S. Peirce, R. A. Fisher, F. Yates, C. R. Rao, R. C. Bose, J. N. Srivastava, S. S. Shrikhande, D. Raghavarao, W. G. Cochran, O. Kempthorne, W. T. Federer, V. V. Fedorov, A. S. Hedayat, J. A. Nelder, R. A. Bailey, J. Kiefer, W. J. Studden, A. Pázman, F. Pukelsheim, D. R. Cox, H. P. Wynn, A. C. Atkinson, G. E. P. Box and G. Taguchi. The textbooks of D. Montgomery and R. Myers have reached generations of students and practitioners.

Further reading

  • Box, G. E. P., Hunter, W. G., Hunter, J. S., Statistics for Experimenters: Design, Innovation, and Discovery, 2nd Edition, Wiley, 2005, ISBN 0471718130. Pre-publication chapters are available on-line.
  • Pearl, Judea. Causality: Models, Reasoning and Inference, Cambridge University Press, 2000.
  • Peirce, C. S. (1876), "Note on the Theory of the Economy of Research", Appendix No. 14 in Coast Survey Report, pp. 197–201, NOAA PDF Eprint. Reprinted 1958 in Collected Papers of Charles Sanders Peirce 7, paragraphs 139–157, and in 1967 in Operations Research 15 (4): pp. 643–648, abstract at JSTOR.
