Autoencoder
An auto-encoder is an artificial neural network used for learning efficient codings. The aim of an auto-encoder is to learn a compressed representation (encoding) for a set of data. This means it is used for dimensionality reduction; more specifically, it is a feature extraction method.
Auto-encoders use three or more layers:
  • An input layer. For example, in a face recognition task, the neurons in the input layer could map to pixels in the photograph.
  • A number of considerably smaller hidden layers, which will form the encoding.
  • An output layer, where each neuron has the same meaning as in the input layer.
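The three-layer structure above can be sketched as follows. This is a minimal, untrained illustration; the layer sizes, random initialization and sigmoid nonlinearity are assumptions made for the example, not details from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

n_input = 64   # e.g. an 8x8 patch of pixels from a photograph
n_hidden = 16  # considerably smaller: this forms the encoding

# Randomly initialized (untrained) weights and biases.
W_enc = rng.normal(0, 0.1, size=(n_input, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(0, 0.1, size=(n_hidden, n_input))
b_dec = np.zeros(n_input)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x):
    # Compress the input into the smaller hidden representation.
    return sigmoid(x @ W_enc + b_enc)

def decode(h):
    # Produce an output where each unit has the same meaning
    # as the corresponding input unit.
    return sigmoid(h @ W_dec + b_dec)

x = rng.random(n_input)   # a toy "pixel" input
code = encode(x)          # the compressed encoding
x_hat = decode(code)      # the reconstruction

print(code.shape, x_hat.shape)
```

The output layer has the same dimension as the input layer precisely because the network is trained to reproduce its own input from the smaller code.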


If linear neurons are used, then an auto-encoder is very similar to principal component analysis (PCA).
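A sketch of this connection: a linear auto-encoder with k hidden units that minimizes squared reconstruction error ends up spanning the same subspace as the top k principal components. Below, the PCA encode/decode step is computed directly via the SVD; the toy data and the choice k = 2 are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
X = X - X.mean(axis=0)       # PCA assumes centred data

k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
components = Vt[:k]          # top-k principal directions

codes = X @ components.T     # "encode": project onto the subspace
X_hat = codes @ components   # "decode": reconstruct from the codes

err = np.mean((X - X_hat) ** 2)
print(codes.shape, round(err, 4))
```

The reconstruction error here equals the energy in the discarded singular values, which is the best any linear k-dimensional bottleneck can do under squared error.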

Training

An auto-encoder is often trained using one of the many variants of backpropagation (conjugate gradient method, steepest descent, etc.). Though often reasonably effective, there are fundamental problems with using backpropagation to train networks with many hidden layers: by the time the errors have been backpropagated to the first few layers, they are minuscule and largely ineffectual, so the network almost always learns to reconstruct the average of all the training data. Though more advanced backpropagation methods (such as the conjugate gradient method) help with this to some degree, learning remains very slow and the solutions poor. This problem is remedied by using initial weights that approximate the final solution. The process of finding these initial weights is often called pretraining.

A pretraining technique developed by Geoffrey Hinton for training many-layered "deep" auto-encoders involves treating each neighboring pair of layers as a restricted Boltzmann machine, so that pretraining approximates a good solution, and then using backpropagation to fine-tune.
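The idea can be sketched for a single pair of layers: train it as a restricted Boltzmann machine with one step of contrastive divergence (CD-1). This is a simplified illustration of the layer-wise idea, not Hinton's exact recipe; the binary toy data and hyperparameters are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_vis, n_hid = 6, 4
W = rng.normal(0, 0.1, (n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

V0 = (rng.random((50, n_vis)) > 0.5).astype(float)  # binary toy data

lr = 0.1
for _ in range(100):
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(V0 @ W + b_hid)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one reconstruction step (CD-1).
    pv1 = sigmoid(h0 @ W.T + b_vis)
    ph1 = sigmoid(pv1 @ W + b_hid)

    # Nudge the parameters toward making the data more probable.
    W += lr * (V0.T @ ph0 - pv1.T @ ph1) / len(V0)
    b_vis += lr * (V0 - pv1).mean(axis=0)
    b_hid += lr * (ph0 - ph1).mean(axis=0)

print(W.shape)
```

After pretraining, W would initialize one encoder layer of the deep auto-encoder (with its transpose initializing the matching decoder layer); the same procedure is repeated for each successive pair of layers before backpropagation fine-tunes the whole stack.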

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.