Stochastic grammar
Encyclopedia
A stochastic grammar is a grammar framework with a probabilistic notion of grammaticality
Grammaticality
In theoretical linguistics, grammaticality is the quality of a linguistic utterance of being grammatically well-formed. An * before a form is a mark that the cited form is ungrammatical....

:
  • Stochastic context-free grammar
    Stochastic context-free grammar
    A stochastic context-free grammar is a context-free grammar in which each production is augmented with a probability...

  • Statistical parsing
    Statistical parsing
    Statistical parsing is a group of parsing methods within natural language processing. The methods have in common that they associate grammar rules with a probability. Grammar rules are traditionally viewed in computational linguistics as defining the valid sentences in a language...

  • Data-oriented parsing
    Data-oriented parsing
    Data-oriented parsing is a probabilistic grammar formalism in computational linguistics. DOP was conceived by Remko Scha in 1990 with the aim of developing a performance-oriented grammar framework...

  • Hidden Markov model
    Hidden Markov model
    A hidden Markov model is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved states. An HMM can be considered as the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L. E...

  • Estimation theory
    Estimation theory
    Estimation theory is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data that has a random component. The parameters describe an underlying physical setting in such a way that their value affects the distribution of the...



Statistical natural language processing
Natural language processing
Natural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages; it began as a branch of artificial intelligence....

 uses stochastic
Stochastic
Stochastic refers to systems whose behaviour is intrinsically non-deterministic. A stochastic process is one whose behavior is non-deterministic, in that a system's subsequent state is determined both by the process's predictable actions and by a random element. However, according to M. Kac and E...

, probabilistic and statistical methods, especially to resolve difficulties that arise because longer sentences are highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses. Methods for disambiguation often involve the use of corpora
Corpus linguistics
Corpus linguistics is the study of language as expressed in samples or "real world" text. This method represents a digestive approach to deriving a set of abstract rules by which a natural language is governed or else relates to another language. Originally done by hand, corpora are now largely...

 and Markov model
Markov model
In probability theory, a Markov model is a stochastic model that assumes the Markov property. Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable.-Introduction:...

s. "A probabilistic model consists of a non-probabilistic model plus some numerical quantities; it is not true that probabilistic models are inherently simpler or less structural than non-probabilistic models."

The technology for statistical NLP comes mainly from machine learning
Machine learning
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases...

 and data mining
Data mining
Data mining , a relatively young and interdisciplinary field of computer science is the process of discovering new patterns from large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics and database systems...

, both of which are fields of artificial intelligence
Artificial intelligence
Artificial intelligence is the intelligence of machines and the branch of computer science that aims to create it. AI textbooks define the field as "the study and design of intelligent agents" where an intelligent agent is a system that perceives its environment and takes actions that maximize its...

 that involve learning from data.

See also

  • Colorless green ideas sleep furiously
    Colorless green ideas sleep furiously
    "Colorless green ideas sleep furiously" is a sentence composed by Noam Chomsky in his 1957 Syntactic Structures as an example of a sentence that is grammatically correct but semantically nonsensical. The term was originally used in his 1955 thesis "Logical Structures of Linguistic Theory"...

  • Computational linguistics
    Computational linguistics
    Computational linguistics is an interdisciplinary field dealing with the statistical or rule-based modeling of natural language from a computational perspective....


Further reading

  • Christopher D. Manning, Hinrich Schütze: Foundations of Statistical Natural Language Processing, MIT Press (1999), ISBN 978-0262133609.
  • Stefan Wermter, Ellen Riloff, Gabriele Scheler (eds.): Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing, Springer (1996), ISBN 978-3540609254.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK