Realization (linguistics)
Realisation is a subtask of natural language generation (NLG), which involves
creating an actual text in a human language (English, French, etc.) from a syntactic
representation. There are a number of software packages available for realisation,
most of which have been developed by academic research groups in NLG.

Example

For example, the following Java code causes the simplenlg system (http://simplenlg.googlecode.com/) to print out the text "The women do not smoke.":


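// Setup, assuming the simplenlg version 4 API: factory and realiser share one lexicon.
Lexicon lexicon = Lexicon.getDefaultLexicon();
NLGFactory nlgFactory = new NLGFactory(lexicon);
Realiser realiser = new Realiser(lexicon);
NPPhraseSpec subject = nlgFactory.createNounPhrase("the", "woman");
subject.setPlural(true);                      // plural subject: "the women"
SPhraseSpec sentence = nlgFactory.createClause(subject, "smoke");
sentence.setFeature(Feature.NEGATED, true);   // negated clause: "do not smoke"
System.out.println(realiser.realiseSentence(sentence));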


In this example, the computer program has specified the linguistic constituents of the sentence (verb, subject), and also linguistic features (plural subject, negated), and from this information the realiser has constructed the actual sentence.

Processing

Realisation involves three kinds of processing:

Syntactic realisation: Using grammatical knowledge to choose inflections, add function words and also to decide the order of components. For example, in English the subject usually precedes the verb, and the negated form of smoke is do not smoke.

Morphological realisation: Computing inflected forms. For example, the plural form of woman is women (not womans).

Orthographic realisation: Dealing with casing, punctuation, and formatting. For example, capitalising The because it is the first word of the sentence.
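
The three kinds of processing can be illustrated with a small self-contained sketch. This is a toy written for this article, not how simplenlg or any other realiser is implemented: the class ToyRealiser, its method names, and the three-entry irregular-plural table are all hypothetical, and the grammar covers only a "the" + noun subject with a present-tense verb.

```java
import java.util.Map;

public class ToyRealiser {
    // Hypothetical irregular-plural table; a real realiser would consult a full lexicon.
    private static final Map<String, String> IRREGULAR_PLURALS =
            Map.of("woman", "women", "man", "men", "child", "children");

    // Morphological realisation: compute the plural form of a noun.
    static String pluralise(String noun) {
        String irregular = IRREGULAR_PLURALS.get(noun);
        return irregular != null ? irregular : noun + "s";
    }

    // Syntactic realisation: put the subject before the verb and add
    // function words ("the", "do not" / "does not").
    static String linearise(String noun, boolean plural, String verb, boolean negated) {
        String subject = "the " + (plural ? pluralise(noun) : noun);
        String verbGroup;
        if (negated) {
            verbGroup = (plural ? "do" : "does") + " not " + verb;
        } else {
            // Naive third-person singular inflection: just append "s".
            verbGroup = plural ? verb : verb + "s";
        }
        return subject + " " + verbGroup;
    }

    // Orthographic realisation: capitalise the first word and add a full stop.
    static String punctuate(String words) {
        return Character.toUpperCase(words.charAt(0)) + words.substring(1) + ".";
    }

    // Run the three stages in sequence.
    static String realise(String noun, boolean plural, String verb, boolean negated) {
        return punctuate(linearise(noun, plural, verb, negated));
    }

    public static void main(String[] args) {
        System.out.println(realise("woman", true, "smoke", true)); // The women do not smoke.
    }
}
```

Keeping the stages as separate functions mirrors the division described above: the irregular-plural lookup lives only in the morphological step, while capitalisation and the full stop are applied last, once the word order is fixed.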

The above examples are very basic; most realisers are capable of considerably more complex processing.

Systems

A number of realisers have been developed over the past 20 years. These systems differ in the complexity and sophistication of their processing, in their robustness in dealing with unusual cases, and in whether they are accessed programmatically via an API (like simplenlg) or take a textual representation of a syntactic structure as their input. There are also major differences in pragmatic factors such as documentation, support, licensing terms, speed and memory usage.
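
The difference between the two access styles can be sketched as follows. The "key=value; key=value" notation and the class TextualSpecParser are invented for this illustration and are not the input format of KPML, FUF/SURGE, OpenCCG or any other realiser; the point is only that a textual specification must be parsed into the same kind of structure that an API caller builds directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TextualSpecParser {
    // Parses a hypothetical "key=value; key=value" specification into a feature map.
    static Map<String, String> parse(String spec) {
        Map<String, String> features = new LinkedHashMap<>();
        for (String pair : spec.split(";")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray separators
            }
            String[] keyValue = trimmed.split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Expected key=value, got: " + trimmed);
            }
            features.put(keyValue[0].trim(), keyValue[1].trim());
        }
        return features;
    }

    public static void main(String[] args) {
        // Textual access: the caller hands over a string to be parsed.
        Map<String, String> viaText =
                parse("subject=woman; number=plural; verb=smoke; negated=true");

        // Programmatic access: the caller builds the same structure through method calls.
        Map<String, String> viaApi = new LinkedHashMap<>();
        viaApi.put("subject", "woman");
        viaApi.put("number", "plural");
        viaApi.put("verb", "smoke");
        viaApi.put("negated", "true");

        System.out.println(viaText.equals(viaApi));
    }
}
```

With textual input, errors such as a malformed pair surface only when the string is parsed at run time, whereas an API lets the compiler check method names and argument types.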

It is not possible to describe all realisers here, but a few of the more popular ones are:
  • KPML (http://www.purl.org/net/kpml): the oldest realiser, which has been under development in different guises since the 1980s. It comes with grammars for ten different languages.
  • FUF/SURGE (http://www.cs.bgu.ac.il/surge): a realiser which was widely used in the 1990s and is still used in some projects today.
  • OpenCCG (http://openccg.sourceforge.net): an open-source realiser which has a number of useful features, such as the ability to use statistical language models to make realisation decisions.
  • Simplenlg (http://simplenlg.googlecode.com/): a realiser which is intended to be simple to learn and use, at the cost of more limited functionality.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 