Boosting methods for object categorization
Given images containing various known objects in the world, a classifier can be learned from them and then used to automatically categorize the objects in future images. Simple classifiers built from a single image feature of the object tend to be weak in categorization performance. Boosting methods for object categorization combine many such weak classifiers so that the resulting ensemble categorizes far more accurately than any individual one.

Problem of object categorization

Object categorization is a typical computer vision task that involves determining whether or not an image contains some specific category of object. The idea is closely related to recognition, identification, and detection. Appearance-based object categorization typically consists of feature extraction, learning a classifier, and applying the classifier to new examples. A category of objects can be represented in many ways, e.g., by shape analysis, bag-of-words models, or local descriptors such as SIFT. Examples of supervised classifiers are naive Bayes classifiers, SVMs, mixtures of Gaussians, and neural networks. However, recent research has shown that object categories and their locations in images can be discovered in an unsupervised manner as well.
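To make the appearance-based pipeline above concrete, here is a minimal sketch assuming a toy setup: raw pixel patches stand in for local descriptors such as SIFT, a k-means codebook turns them into bag-of-words histograms, and a linear SVM plays the role of the supervised classifier. The data, the patch descriptor, and all parameter choices are illustrative, not taken from any particular published system.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def patch_descriptors(img, patch=8, step=8):
    """Stand-in for a local descriptor such as SIFT: raw pixel patches sampled
    on a grid and flattened (purely for illustration)."""
    H, W = img.shape
    return np.array([img[y:y + patch, x:x + patch].ravel()
                     for y in range(0, H - patch + 1, step)
                     for x in range(0, W - patch + 1, step)])

def bow_histogram(descriptors, codebook):
    """Quantize each descriptor to its nearest visual word and count occurrences."""
    words = codebook.predict(descriptors)
    return np.bincount(words, minlength=codebook.n_clusters).astype(float)

# Toy usage: two "categories" of random 32x32 images.
rng = np.random.default_rng(0)
images = [rng.random((32, 32)) + (i % 2) * 0.3 for i in range(40)]
labels = [i % 2 for i in range(40)]

all_desc = np.vstack([patch_descriptors(im) for im in images])
codebook = KMeans(n_clusters=20, n_init=5, random_state=0).fit(all_desc)
X = np.array([bow_histogram(patch_descriptors(im), codebook) for im in images])
clf = LinearSVC().fit(X, labels)   # any supervised classifier from the list above would do
print("training accuracy:", clf.score(X, labels))
```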

Status quo for object categorization

The recognition of object categories in images is a challenging problem in computer vision, especially when the number of categories is large. This is due to high intra-class variability and the need to generalize across variations of objects within the same category. Objects within one category may look quite different, and even the same object may appear different under changes in viewpoint, scale, and illumination. Background clutter and partial occlusion add further difficulties to recognition. Humans are able to recognize thousands of object types, whereas most existing object recognition systems are trained to recognize only a few, e.g., human faces, cars, and other simple objects. Research on handling more categories and enabling incremental addition of new categories has been very active, and although the general problem remains unsolved, several multi-category object detectors (for around 20 categories) that work in cluttered scenes have been developed. One means is feature sharing combined with boosting.

Boosting methods in machine learning

Boosting is a general method for improving the accuracy of any given learning algorithm. A typical application of AdaBoost, one of the most popular boosting algorithms, is the fast face detection method of P. Viola and M. Jones, the Viola–Jones object detection framework. There, AdaBoost is used both to select good features (very simple rectangle features) and to turn weak learners into a final strong classifier.
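As an illustration of what such a rectangle feature looks like, here is a minimal sketch of a two-rectangle Haar-like feature evaluated via an integral image, which is what makes these features cheap to compute. The 24x24 window and the helper names are assumptions made for this example only.

```python
import numpy as np

def integral_image(img):
    """Integral image: ii[y, x] = sum of all pixels above and to the left of (y, x),
    inclusive. Any rectangle sum then needs only four lookups."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    A = ii[y - 1, x - 1] if x > 0 and y > 0 else 0.0
    B = ii[y - 1, x + w - 1] if y > 0 else 0.0
    C = ii[y + h - 1, x - 1] if x > 0 else 0.0
    D = ii[y + h - 1, x + w - 1]
    return D - B - C + A

def two_rect_feature(ii, x, y, w, h):
    """A two-rectangle, edge-like feature: the sum of the left w-by-h rectangle
    minus the sum of the adjacent right w-by-h rectangle."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)

# Usage: evaluate one feature on a (hypothetical) 24x24 detection window.
window = np.random.rand(24, 24)
ii = integral_image(window)
print(two_rect_feature(ii, x=0, y=0, w=12, h=24))
```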

Boosting for binary categorization

We use AdaBoost for face detection as an example of binary categorization. The two categories are faces versus background. The general algorithm is as follows (a minimal code sketch appears after the list):
  1. Form a large set of simple features
  2. Initialize weights for the training images
  3. For T rounds:
    1. Normalize the weights
    2. For each available feature in the set, train a classifier using that single feature and evaluate its training error
    3. Choose the classifier with the lowest error
    4. Update the weights of the training images: increase the weights of images misclassified by this classifier and decrease the weights of those classified correctly
  4. Form the final strong classifier as a linear combination of the T classifiers (with larger coefficients for classifiers with smaller training error)
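Below is a minimal sketch of this loop, assuming decision stumps over precomputed feature values as the weak classifiers and the standard exponential AdaBoost re-weighting. The toy data and all parameter choices are illustrative; the full Viola–Jones detector additionally relies on Haar-like features and a cascade, which are omitted here.

```python
import numpy as np

def best_stump(X, y, w):
    """Steps 3.2-3.3: for every feature, threshold, and polarity, evaluate the
    weighted error of a single-feature decision stump and return the best one."""
    n, d = X.shape
    best = None
    for j in range(d):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thresh) > 0, 1, -1)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thresh, polarity)
    return best

def adaboost(X, y, T=10):
    """Steps 2-4: initialize weights, run T boosting rounds, and return the list
    of (alpha, feature, threshold, polarity) defining the strong classifier."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                    # step 2: uniform initial weights
    ensemble = []
    for _ in range(T):
        w = w / w.sum()                        # step 3.1: normalize the weights
        err, j, thresh, polarity = best_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # coefficient: larger for smaller error
        pred = np.where(polarity * (X[:, j] - thresh) > 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)      # step 3.4: raise weights of mistakes
        ensemble.append((alpha, j, thresh, polarity))
    return ensemble

def strong_classify(ensemble, X):
    """Step 4: sign of the alpha-weighted vote of the selected weak classifiers."""
    score = np.zeros(X.shape[0])
    for alpha, j, thresh, polarity in ensemble:
        score += alpha * np.where(polarity * (X[:, j] - thresh) > 0, 1, -1)
    return np.sign(score)

# Toy usage: 200 examples, 20 feature columns, labels +1 (face) / -1 (background).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = np.where(X[:, 3] + 0.5 * X[:, 7] > 0, 1, -1)
model = adaboost(X, y, T=10)
print("training accuracy:", np.mean(strong_classify(model, X) == y))
```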

After boosting, a classifier constructed from 200 features could yield a 95% detection rate at a low false positive rate.

Another application of boosting for binary categorization is a system that detects pedestrians using patterns of motion and appearance. This work was the first to combine motion information and appearance information as features to detect a walking person, and it takes an approach similar to the face detection work of Viola and Jones.
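A minimal sketch of that idea follows, under the simplifying assumption that the motion channel is just the absolute difference between two consecutive frames; the published system builds richer rectangle filters over several difference images, which is omitted here.

```python
import numpy as np

def motion_and_appearance_features(prev_frame, cur_frame, boxes):
    """Illustrative feature pool: for each candidate box, one appearance feature
    (mean intensity in the current frame) and one motion feature (mean absolute
    difference between consecutive frames). Boosting then selects from this
    combined pool exactly as in the single-image case."""
    diff = np.abs(cur_frame.astype(float) - prev_frame.astype(float))
    feats = []
    for (y, x, h, w) in boxes:
        feats.append(cur_frame[y:y + h, x:x + w].mean())  # appearance channel
        feats.append(diff[y:y + h, x:x + w].mean())       # motion channel
    return np.array(feats)

# Usage on two random frames and two candidate boxes.
rng = np.random.default_rng(0)
f0, f1 = rng.random((64, 64)), rng.random((64, 64))
print(motion_and_appearance_features(f0, f1, [(0, 0, 32, 16), (16, 8, 32, 16)]))
```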

Boosting for multi-class categorization

Compared with binary categorization, multi-class categorization looks for common features that can be shared across the categories at the same time. These tend to be more generic, edge-like features. During learning, the detectors for all categories can be trained jointly. Compared with training them separately, joint training generalizes better, needs less training data, and requires fewer features to achieve the same performance.

The main flow of the algorithm is similar to the binary case. What differs is that a measure of the joint training error must be defined in advance. During each iteration the algorithm chooses a classifier based on a single feature (features that can be shared by more categories are encouraged). This can be done by converting the multi-class classification into a binary one (a set of categories versus the rest), or by introducing a penalty error from the categories that do not share the chosen feature.
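Here is a minimal sketch of one such round, assuming the "set of categories versus the rest" formulation: each subset of classes, each feature, and a few candidate thresholds are scored by a joint weighted error summed over all classes, with the classes outside the subset contributing a penalty for the positive examples the shared stump leaves unexplained. The data layout and the exhaustive subset search are simplifying assumptions; published procedures such as JointBoost are considerably more refined.

```python
import numpy as np
from itertools import combinations

def shared_stump_round(X, Y, W):
    """One round of shared-feature selection (simplified sketch).

    X: (n, d) feature values; Y: (n, C) per-class labels in {+1, -1};
    W: (n, C) per-class example weights.
    Returns the (joint error, class subset, feature, threshold) that minimizes
    the weighted error summed over all classes."""
    n, d = X.shape
    C = Y.shape[1]
    best = None
    for size in range(1, C + 1):
        for S in combinations(range(C), size):
            for j in range(d):
                for thresh in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                    fired = X[:, j] > thresh
                    joint_err = 0.0
                    for c in range(C):
                        if c in S:
                            pred = np.where(fired, 1, -1)  # classes sharing the feature use the stump
                        else:
                            pred = -np.ones(n)             # other classes abstain; their positives add a penalty
                        joint_err += np.sum(W[:, c][pred != Y[:, c]])
                    if best is None or joint_err < best[0]:
                        best = (joint_err, S, j, thresh)
    return best

# Toy usage: 3 classes, 100 examples, 8 feature columns, uniform weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
labels = rng.integers(0, 3, size=100)
Y = np.where(np.arange(3)[None, :] == labels[:, None], 1, -1)
W = np.full(Y.shape, 1.0 / Y.size)
print(shared_stump_round(X, Y, W))
```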

In the paper "Sharing visual features for multiclass and multiview object detection", A. Torralba et al. used GentleBoost for boosting and showed that, when training data is limited, learning via shared features does a much better job than learning without sharing, given the same number of boosting rounds. Also, for a given performance level, the total number of features required (and therefore the run-time cost of the classifier) is observed to scale approximately logarithmically with the number of classes for the feature-sharing detectors, i.e., more slowly than the linear growth seen in the non-sharing case. Similar results are shown in the paper "Incremental learning of object detectors using a visual shape alphabet", although the authors used AdaBoost for boosting.
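For reference, the sketch below shows the step that distinguishes GentleBoost from discrete AdaBoost: instead of a {-1, +1} vote weighted by a coefficient, each round fits a regression stump to the labels by weighted least squares and adds its real-valued output directly to the score. The feature column, threshold, and toy data are illustrative assumptions.

```python
import numpy as np

def gentleboost_round(X, y, w, j, thresh):
    """Fit a regression stump f(x) = a if x[j] > thresh else b by weighted least
    squares (the GentleBoost weak learner), then return its per-example output
    and the updated, renormalized example weights."""
    fired = X[:, j] > thresh
    a = np.sum(w[fired] * y[fired]) / max(np.sum(w[fired]), 1e-12)
    b = np.sum(w[~fired] * y[~fired]) / max(np.sum(w[~fired]), 1e-12)
    f = np.where(fired, a, b)        # real-valued weak hypothesis
    w_new = w * np.exp(-y * f)       # gentle re-weighting
    return f, w_new / w_new.sum()

# Toy usage: one round on a single feature column.
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
y = np.where(X[:, 1] > 0, 1, -1)
w = np.full(50, 1.0 / 50)
f, w = gentleboost_round(X, y, w, j=1, thresh=0.0)
print("round output on first five examples:", f[:5])
```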