Holland's Schema Theorem
Encyclopedia
Holland's schema theorem is widely taken to be the foundation for explanations of the power of genetic algorithms. It was proposed by John Holland in the 1970s.

A schema
Schema (genetic algorithms)
A schema is a template in computer science used in the field of genetic algorithms that identifies a subset of strings with similarities at certain string positions. Schemata are a special case of cylinder sets; and so form a topological space.- Description :...

 is a template that identifies a subset
Subset
In mathematics, especially in set theory, a set A is a subset of a set B if A is "contained" inside B. A and B may coincide. The relationship of one set being a subset of another is called inclusion or sometimes containment...

 of strings with similarities at certain string positions. Schemata are a special case of cylinder set
Cylinder set
In mathematics, a cylinder set is the natural open set of a product topology. Cylinder sets are particularly useful in providing the base of the natural topology of the product of a countable number of copies of a set...

s; and so form a topological space
Topological space
Topological spaces are mathematical structures that allow the formal definition of concepts such as convergence, connectedness, and continuity. They appear in virtually every branch of modern mathematics and are a central unifying notion...

.

Description

For example, consider binary strings of length 6. The schema 1*10*1 describes the set of all strings of length 6 with 1's at positions 1, 3 and 6 and a 0 at position 4. The * is a wildcard
Wildcard character
-Telecommunication:In telecommunications, a wildcard character is a character that may be substituted for any of a defined subset of all possible characters....

 symbol, which means that positions 2 and 5 can have a value of either 1 or 0. The order of a schema is defined as the number of fixed positions in the template, while the defining length
Defining length
In genetic algorithms and genetic programming defining length L is the maximum distance between two defining symbols in schema H...

is the distance between the first and last specific positions. The order of 1*10*1 is 4 and its defining length is 5. The fitness of a schema is the average fitness of all strings matching the schema. The fitness of a string is a measure of the value of the encoded problem solution, as computed by a problem-specific evaluation function. Using the established methods and genetic operator
Genetic operator
A genetic operator is an operator used in genetic algorithms to maintain genetic diversity, known as Mutation and to combine existing solutions into others, Crossover...

s of genetic algorithms, the schema theorem states that short, low-order schemata with above-average fitness increase exponentially in successive generations. Expressed as an equation:


Here is the number of strings belonging to schema at generation , is the observed fitness of schema and is the observed average fitness at generation . The probability of disruption is the probability that crossover or mutation will destroy the schema . It can be expressed as:


where is the number of fixed positions, is the length of the code, is the probability of mutation and is the probability of crossover. So a schema with a shorter defining length is less likely to be disrupted.
An often misunderstood point is why the Schema Theorem is an inequality rather than an equality. The answer is in fact simple: the Theorem neglects the small, yet non-zero, probability that a string belonging to the schema will be created "from scratch" by mutation of a single string (or recombination of two strings) that did not belong to in the previous generation.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK