Targeted projection pursuit
Encyclopedia
Targeted projection pursuit is a type of statistical technique used for exploratory data analysis
Exploratory data analysis
In statistics, exploratory data analysis is an approach to analysing data sets to summarize their main characteristics in easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis...

, information visualization
Information visualization
Information visualization is the interdisciplinary study of "the visual representation of large-scale collections of non-numerical information, such as files and lines of code in software systems, library and bibliographic databases, networks of relations on the internet, and so forth".- Overview...

, and feature selection
Feature selection
In machine learning and statistics, feature selection, also known as variable selection, feature reduction, attribute selection or variable subset selection, is the technique of selecting a subset of relevant features for building robust learning models...

. It allows the user to interactively explore very complex data (typically having tens to hundreds of attributes) to find features or patterns of potential interest.

Conventional, or 'blind', projection pursuit
Projection pursuit
Projection pursuit is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a Normal distribution are considered to be more interesting...

, finds the most "interesting" possible projections in multidimensional data, using a search algorithm
Search algorithm
In computer science, a search algorithm is an algorithm for finding an item with specified properties among a collection of items. The items may be stored individually as records in a database; or may be elements of a search space defined by a mathematical formula or procedure, such as the roots...

 that optimizes some fixed criterion of "interestingness" – such as deviation from a normal distribution. In contrast, targeted projection pursuit allows the user to explore the space of projections by manipulating data points directly in an interactive scatter plot.

Targeted Projection Pursuit has found applications in DNA microarray
DNA microarray
A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome...

 data analysis, protein Sequence analysis
Sequence analysis
In bioinformatics, the term sequence analysis refers to the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. Methodologies used include sequence alignment, searches against biological...

, graph layout and Digital signal processing
Digital signal processing
Digital signal processing is concerned with the representation of discrete time signals by a sequence of numbers or symbols and the processing of these signals. Digital signal processing and analog signal processing are subfields of signal processing...

. It is available as a package for the WEKA
Weka
The Weka or woodhen is a flightless bird species of the rail family. It is endemic to New Zealand, where four subspecies are recognized. Weka are sturdy brown birds, about the size of a chicken. As omnivores, they feed mainly on invertebrates and fruit...

machine learning toolkit.

Further reading


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK