Constraint Grammar
Constraint Grammar (CG) is a methodological paradigm for natural language processing (NLP). Linguist-written, context-dependent rules are compiled into a grammar that assigns grammatical tags ("readings") to words or other tokens in running text. Typical tags address lemmatisation (lexeme or base form), inflexion, derivation, syntactic function, dependency, valency, case roles, semantic type etc. Each rule either adds, removes, selects or replaces a tag or a set of grammatical tags in a given sentence context. Context conditions can be linked to any tag or tag set of any word anywhere in the sentence, either locally (defined distances) or globally (undefined distances). Context conditions in the same rule may be linked, i.e. conditioned upon each other, negated, or blocked by interfering words or tags. Typical CGs consist of thousands of rules that are applied set-wise in progressive steps, covering ever more advanced levels of analysis. Within each level, safe rules are used before heuristic rules, and no rule is allowed to remove the last reading of a given kind, thus providing a high degree of robustness.
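
The rule mechanics described above can be illustrated with a minimal Python sketch. It is not the implementation of any actual CG compiler: the names Cohort, local, scan_left, remove and select are hypothetical, the tags are toy examples, and a context test is assumed to succeed when any remaining reading of the context word carries the tag. It shows a local condition (fixed distance), a global condition (unbounded search that an interfering tag can block), and the safeguard that a word never loses its last reading.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """One token plus its remaining candidate readings (each a set of tags)."""
    form: str
    readings: list


def has_tag(cohort, tag):
    """True if any remaining reading of the cohort carries the tag."""
    return any(tag in reading for reading in cohort.readings)


def local(offset, tag):
    """Local context test: look at a fixed distance (-1 = previous word)."""
    def test(sentence, i):
        j = i + offset
        return 0 <= j < len(sentence) and has_tag(sentence[j], tag)
    return test


def scan_left(tag, barrier=None):
    """Global context test: search leftwards at any distance for `tag`,
    giving up if a word carrying `barrier` is met first."""
    def test(sentence, i):
        for j in range(i - 1, -1, -1):
            if has_tag(sentence[j], tag):
                return True
            if barrier is not None and has_tag(sentence[j], barrier):
                return False
        return False
    return test


def remove(sentence, target, conditions):
    """Discard readings containing `target` wherever all conditions hold."""
    for i, cohort in enumerate(sentence):
        if all(cond(sentence, i) for cond in conditions):
            kept = [r for r in cohort.readings if target not in r]
            if kept:  # never remove the last remaining reading
                cohort.readings = kept


def select(sentence, target, conditions):
    """Keep only readings containing `target` wherever all conditions hold."""
    for i, cohort in enumerate(sentence):
        if all(cond(sentence, i) for cond in conditions):
            kept = [r for r in cohort.readings if target in r]
            if kept:  # do nothing if no reading carries the target
                cohort.readings = kept


# Toy input: "walk" is ambiguous between a noun and a verb reading.
sentence = [
    Cohort("the", [{"DET"}]),
    Cohort("walk", [{"N", "SG"}, {"V", "INF"}]),
]

# Remove the verb reading when the previous word is a determiner.
remove(sentence, "V", [local(-1, "DET")])
print(sentence[1].readings)
```

After the rule has applied, only the noun reading of "walk" is left. Had "walk" carried nothing but a verb reading, the rule would have left it untouched, which is the robustness property mentioned in the text.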

The Constraint Grammar concept was launched by Fred Karlsson in 1990 (Karlsson 1990; Karlsson et al., eds, 1995), and CG taggers and parsers have since been written for a large variety of languages, routinely achieving F-scores for PoS (word class) of over 99%. A number of syntactic CG systems have reported F-scores of around 95% for syntactic function labels. CG systems can be used to create full syntactic trees in other formalisms by adding small, non-terminal based phrase structure grammars or dependency grammars, and a number of corpus/treebank projects have used Constraint Grammar for automatic annotation. CG methodology has also been used in a number of language technology applications, such as spell checkers and machine translation systems.
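
As a rough illustration of how an F-score like those quoted above can be computed, the following Python sketch assumes one common evaluation set-up for taggers that may leave some ambiguity: recall is the share of tokens whose correct tag survives, and precision is the share of surviving tags that are correct. The function name evaluate and the four-token data set are invented for the example and do not come from any published CG evaluation.

```python
def evaluate(gold, predicted):
    """gold: one correct tag per token.
    predicted: the set of tags the tagger left on each token."""
    correct = sum(1 for g, p in zip(gold, predicted) if g in p)
    emitted = sum(len(p) for p in predicted)
    recall = correct / len(gold)
    precision = correct / emitted
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical toy data: the third token is left ambiguous (V or N).
gold = ["DET", "N", "V", "N"]
predicted = [{"DET"}, {"N"}, {"V", "N"}, {"N"}]

precision, recall, f_score = evaluate(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

Leaving a token ambiguous keeps recall high at the cost of precision, which is why a single F-score is a convenient summary for this kind of system.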

List of Constraint Grammar systems sorted by language

Free software
  • VISL CG-3 Constraint Grammar compiler/parser
  • North and Lule Sami, Faroese, Komi and Greenlandic from the University of Tromsø (more information, Northern Sami documentation)
    • Fred Karlsson's original Finnish FinCG is also available from the University of Tromsø as GPL, both in the original CG1 and in a converted CG3 version.
  • Estonian http://citeseer.ist.psu.edu/muurisep99determination.html
  • Norwegian Nynorsk and Bokmål online, Oslo-Bergen tagger (source code)
  • Breton, Welsh, Irish Gaelic and Norwegian (converted from the above) in Apertium (see CG in Apertium)

Non-free software

External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 