Semantic mapping (statistics)
Semantic mapping (SM) is a dimensionality reduction method that extracts new features by clustering the original features into semantic clusters and combining the features mapped to the same cluster to generate an extracted feature. Given a data set, the method constructs a projection matrix that can be used to map data elements from a high-dimensional space into a reduced-dimensional space. SM can be applied in the construction of text mining and information retrieval systems, as well as systems that manage vectors of high dimensionality.
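
The projection step described above can be illustrated with a minimal sketch. It rests on several assumptions that the article does not state: the assignment of features to clusters is taken as given (the list `assignments` is hypothetical), features in the same cluster are combined by summation, and the function names `build_projection_matrix` and `project` are invented for this illustration. It is not a reference implementation of the method.

```python
def build_projection_matrix(assignments, n_clusters):
    """Build an n_features x n_clusters 0/1 matrix.

    assignments[i] is the cluster index of original feature i.
    How the clusters are obtained is outside this sketch (assumption).
    """
    matrix = [[0.0] * n_clusters for _ in assignments]
    for feature_index, cluster_index in enumerate(assignments):
        matrix[feature_index][cluster_index] = 1.0
    return matrix


def project(data, matrix):
    """Map each row of data (n_samples x n_features) into the reduced space.

    With a 0/1 matrix, each extracted feature is the sum of the original
    features assigned to that cluster (summation is an assumption).
    """
    n_clusters = len(matrix[0])
    return [
        [sum(row[i] * matrix[i][k] for i in range(len(row)))
         for k in range(n_clusters)]
        for row in data
    ]


# Hypothetical example: 5 term-count features grouped into 2 clusters.
assignments = [0, 0, 1, 1, 1]
data = [
    [1, 2, 0, 0, 3],
    [0, 1, 4, 1, 0],
]
P = build_projection_matrix(assignments, n_clusters=2)
reduced = project(data, P)
print(reduced)  # [[3.0, 3.0], [1.0, 5.0]]
```

Each five-dimensional row is reduced to two extracted features, one per cluster. Other ways of combining the features in a cluster, such as averaging or weighting, would change only the entries of the projection matrix.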
SM is an alternative to the principal components analysis and latent semantic indexing methods.

See also

  • Dimensionality reduction

  • Principal components analysis

  • Latent semantic indexing


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 