Prior knowledge for pattern recognition

Pattern recognition is a very active field of research intimately bound to machine learning. Also known as classification or statistical classification, pattern recognition aims at building a classifier that can determine the class of an input pattern. This procedure, known as training, corresponds to learning an unknown decision function based only on a set of input-output pairs $(x_i, y_i)$ that form the training data (or training set). In real-world applications such as character recognition, however, a certain amount of information on the problem is usually known beforehand. Incorporating this prior knowledge into the training is the key element that allows performance to increase in many applications.

Definition

Prior knowledge refers to all information about the problem available in addition to the training data. In this most general sense, prior knowledge is unavoidable: determining a model from a finite set of samples without prior knowledge is an ill-posed problem, in the sense that a unique model may not exist. Many classifiers, for instance, incorporate the general smoothness assumption that a test pattern similar to one of the training samples tends to be assigned to the same class.
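
The smoothness assumption can be illustrated with a nearest-neighbor rule, which assigns a test pattern the class of the most similar training sample. The following is a minimal sketch rather than a reference implementation; the toy training set and the function name `nearest_neighbor_predict` are assumptions made for illustration.

```python
import math

def nearest_neighbor_predict(train, x):
    """Return the label of the training sample closest to x (Euclidean distance)."""
    closest_point, closest_label = min(train, key=lambda pair: math.dist(pair[0], x))
    return closest_label

# Hypothetical toy training set of (input, label) pairs.
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]

# Test patterns close to a training sample inherit that sample's class.
label_near_a = nearest_neighbor_predict(train, (0.05, 0.1))
label_near_b = nearest_neighbor_predict(train, (0.95, 1.0))
```

Here the only knowledge used beyond the training pairs is the choice of distance, which encodes what "similar" means for the problem at hand.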

The importance of prior knowledge in machine learning is suggested by its role in search and optimization. Loosely, the no free lunch theorem states that all search algorithms have the same average performance over all problems, and thus implies that to gain in performance on a certain application one must use a specialized algorithm that includes some prior knowledge about the problem.

The different types of prior knowledge encountered in pattern recognition are grouped here under two main categories: class-invariance and knowledge of the data.

Class-invariance

A very common type of prior knowledge in pattern recognition is the invariance of the class (or the output of the classifier) to a transformation of the input pattern. This type of knowledge is referred to as transformation-invariance. The transformations most commonly used in image recognition are:

translation;
rotation;
skewing;
scaling.

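Incorporating the invariance to a transformation $T_\theta : x \mapsto T_\theta x$, parametrized in $\theta$, into a classifier of output $f(x)$ for an input pattern $x$ corresponds to enforcing the equality

$f(x) = f(T_\theta x), \quad \forall x, \theta.$

Local invariance can also be considered for a transformation centered at $\theta = 0$, so that $T_0 x = x$, by the constraint

$\left. \frac{\partial}{\partial \theta} \right|_{\theta = 0} f(T_\theta x) = 0.$

The function $f$ in these equations can be either the decision function of the classifier or its real-valued output.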
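
Both conditions can be checked numerically for a given function. The sketch below is only an illustration under stated assumptions: the transformation is taken to be a rotation of 2-D points (`rotate`), and `f_invariant` and `f_not_invariant` are hypothetical real-valued outputs. Global invariance is tested by comparing outputs before and after the transformation, and local invariance by a finite-difference estimate of the derivative with respect to the parameter at zero.

```python
import math

def rotate(x, theta):
    """Hypothetical transformation: rotate the 2-D point x by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def f_invariant(x):
    """Output depending only on the distance to the origin (rotation-invariant)."""
    return math.hypot(x[0], x[1]) - 1.0

def f_not_invariant(x):
    """Output depending on the first coordinate only (not rotation-invariant)."""
    return x[0] - 0.5

def invariance_gap(f, x, thetas):
    """Largest |f(T_theta x) - f(x)| over the tested parameter values."""
    return max(abs(f(rotate(x, t)) - f(x)) for t in thetas)

def local_sensitivity(f, x, eps=1e-6):
    """Central finite-difference estimate of d/dtheta f(T_theta x) at theta = 0."""
    return (f(rotate(x, eps)) - f(rotate(x, -eps))) / (2.0 * eps)

x = (0.3, 0.4)
thetas = [k * math.pi / 6 for k in range(12)]

gap_invariant = invariance_gap(f_invariant, x, thetas)   # close to zero
gap_other = invariance_gap(f_not_invariant, x, thetas)   # clearly nonzero
slope_invariant = local_sensitivity(f_invariant, x)      # close to zero
slope_other = local_sensitivity(f_not_invariant, x)      # clearly nonzero
```

Such a check only probes the sampled points and parameter values; it does not by itself enforce invariance during training.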

Another approach is to consider class-invariance with respect to a "domain of the input space" instead of a transformation. In this case, the problem becomes finding the classifier output $f$ so that

$f(x) = y_{\mathcal{P}}, \quad \forall x \in \mathcal{P},$

where $y_{\mathcal{P}}$ is the membership class of the region $\mathcal{P}$ of the input space.
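
One way to see how well a given classifier respects such knowledge is to sample points inside the region and count how many receive the expected class. This is a rough sketch under stated assumptions: the region is an axis-aligned box, and `classify` is a hypothetical linear classifier chosen only for the example.

```python
import random

def agreement_on_region(classify, region, region_label, n_samples=1000, seed=0):
    """Fraction of points sampled uniformly in a box region that receive region_label."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        point = tuple(rng.uniform(low, high) for low, high in region)
        hits += classify(point) == region_label
    return hits / n_samples

def classify(x):
    """Hypothetical linear classifier: class +1 above the line x0 + x1 = 1, else -1."""
    return 1 if x[0] + x[1] > 1.0 else -1

# Hypothetical prior knowledge: every point of the box [1, 2] x [1, 2] belongs to class +1.
region = [(1.0, 2.0), (1.0, 2.0)]
agreement_inside = agreement_on_region(classify, region, 1)

# A box straddling the decision boundary is only partly classified as +1.
agreement_straddling = agreement_on_region(classify, [(0.0, 1.0), (0.0, 1.0)], 1)
```

A sampled agreement below 1 indicates that the classifier violates the region constraint somewhere; an agreement of 1 is evidence, not proof, that it holds.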

A different type of class-invariance found in pattern recognition is permutation-invariance, i.e. invariance of the class to a permutation of elements in a structured input. A typical application of this type of prior knowledge is a classifier invariant to permutations of rows in matrix inputs.
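
A simple way to obtain invariance to row permutations, sketched below, is to map every input matrix to a canonical form before classification. This is one possible construction among others, and the scoring rule inside `classify_matrix` is a hypothetical stand-in for a trained classifier.

```python
def canonical_rows(matrix):
    """Sort the rows so that any row permutation of the input maps to the same result."""
    return sorted(tuple(row) for row in matrix)

def classify_matrix(matrix):
    """Hypothetical classifier applied to the canonical form of the input."""
    rows = canonical_rows(matrix)
    # Order-dependent score: invariance comes only from the sorting step above.
    score = sum((i + 1) * sum(row) for i, row in enumerate(rows))
    return 1 if score > 20 else 0

a = [[1, 2], [3, 4], [0, 5]]
a_permuted = [[0, 5], [1, 2], [3, 4]]  # same rows in a different order

same_class = classify_matrix(a) == classify_matrix(a_permuted)
```

Because both matrices share the same canonical form, any classifier applied after `canonical_rows` gives them the same class.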

Knowledge of the data

Forms of prior knowledge other than class-invariance concern the data more specifically and are thus of particular interest for real-world applications. The three particular cases that most often occur when gathering data are:

Unlabeled samples are available with supposed class-memberships;
Imbalance of the training set due to a high proportion of samples of a class;
Quality of the data may vary from one sample to another.

Prior knowledge of these can enhance the quality of the recognition if included in the learning. Moreover, not taking into account the poor quality of some data or a large imbalance between the classes can mislead the decision of a classifier.
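
Imbalance and varying data quality are often handled by giving each training sample a weight in the learning criterion. The sketch below is an assumption-laden illustration, not a prescribed method: it uses the common heuristic of weighting each class by n_samples / (n_classes * class_count), and multiplies it by a hypothetical per-sample quality score in [0, 1].

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}

def sample_weights(labels, quality):
    """Combine the class weight with a per-sample quality score in [0, 1]."""
    class_w = balanced_class_weights(labels)
    return [class_w[y] * q for y, q in zip(labels, quality)]

# Hypothetical data: class "neg" is four times more frequent than class "pos".
labels = ["neg"] * 8 + ["pos"] * 2
# Hypothetical quality scores; the last sample is assumed to be less reliable.
quality = [1.0] * 9 + [0.5]

class_w = balanced_class_weights(labels)
weights = sample_weights(labels, quality)
```

With these weights, both classes contribute equally to a weighted training criterion when all samples are fully reliable, and a sample of doubtful quality counts for less.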

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.