History of natural language processing
The history of natural language processing describes the advances of natural language processing. There is some overlap with the history of machine translation and the history of artificial intelligence.

Theoretical history

The history of machine translation dates back to the seventeenth century, when philosophers such as Leibniz and Descartes put forward proposals for codes which would relate words between languages. All of these proposals remained theoretical, and none resulted in the development of an actual machine.

The first patents for "translating machines" were applied for in the mid-1930s. One proposal, by Georges Artsrouni, was simply an automatic bilingual dictionary using paper tape. The other proposal, by Peter Troyanskii, a Russian, was more detailed. It included both the bilingual dictionary and a method for dealing with grammatical roles between languages, based on Esperanto.

In 1950, Alan Turing published his famous article "Computing Machinery and Intelligence", which proposed what is now called the Turing test as a criterion of intelligence. This criterion depends on the ability of a computer program to impersonate a human in a real-time written conversation with a human judge, sufficiently well that the judge is unable to distinguish reliably, on the basis of the conversational content alone, between the program and a real human.

In 1957, Noam Chomsky's Syntactic Structures revolutionized linguistics with 'universal grammar', a rule-based system of syntactic structures.

However, the real progress of NLP was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill expectations, funding was dramatically reduced internationally.

In 1969, Roger Schank introduced the conceptual dependency theory for natural language understanding. This model, partially influenced by the work of Sydney Lamb, was extensively used by Schank's students at Yale University, such as Robert Wilensky, Wendy Lehnert, and Janet Kolodner.

In 1970, William A. Woods introduced the augmented transition network (ATN) to represent natural language input. Instead of phrase structure rules, ATNs used an equivalent set of finite state automata that were called recursively. ATNs and their more general format called "generalized ATNs" continued to be used for a number of years.
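
The idea of finite state automata that call one another recursively can be illustrated with a small sketch. The code below is not Woods's system: it is a plain recursive transition network, that is, an ATN without the registers and tests that make it "augmented". The lexicon, the three networks (S, NP, VP), and the function names `traverse` and `accepts` are all assumptions made up for this illustration.

```python
# Toy lexicon and networks; every name and the grammar itself are
# illustrative assumptions, not taken from any historical ATN system.
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "big": "ADJ",
    "sees": "VERB", "chases": "VERB",
}

# Each network is a small finite state automaton. An arc label that names
# another network means "call that network recursively"; any other label
# is a word category that consumes exactly one token.
NETWORKS = {
    "S": {"start": "s0", "final": {"s2"},
          "arcs": {"s0": [("NP", "s1")], "s1": [("VP", "s2")]}},
    "NP": {"start": "n0", "final": {"n2"},
           "arcs": {"n0": [("DET", "n1")],
                    "n1": [("ADJ", "n1"), ("NOUN", "n2")]}},
    "VP": {"start": "v0", "final": {"v1", "v2"},
           "arcs": {"v0": [("VERB", "v1")], "v1": [("NP", "v2")]}},
}


def traverse(net_name, tokens, pos):
    """Yield each token position where network `net_name` can finish
    when it starts reading at position `pos`."""
    net = NETWORKS[net_name]
    stack = [(net["start"], pos)]
    while stack:
        state, i = stack.pop()
        if state in net["final"]:
            yield i
        for label, next_state in net["arcs"].get(state, []):
            if label in NETWORKS:
                # Recursive call into a sub-network.
                for j in traverse(label, tokens, i):
                    stack.append((next_state, j))
            elif i < len(tokens) and LEXICON.get(tokens[i]) == label:
                # Category arc: consume one matching token.
                stack.append((next_state, i + 1))


def accepts(sentence):
    """Return True if the S network can consume the whole sentence."""
    tokens = sentence.lower().split()
    return any(end == len(tokens) for end in traverse("S", tokens, 0))


if __name__ == "__main__":
    for text in ["the big dog chases a cat", "the dog sees", "dog the sees"]:
        print(text, "->", accepts(text))
```

The S network plays the role of the phrase structure rule "S → NP VP", but it is expressed as states and arcs. A full ATN would additionally attach registers and conditions to the arcs, which this sketch leaves out.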

Software

| Software | Year | Creator | Description |
|---|---|---|---|
| Georgetown experiment | 1954 | Georgetown University and IBM | Involved fully automatic translation of more than sixty Russian sentences into English. |
| STUDENT | 1964 | Daniel Bobrow | Could solve high school algebra word problems. |
| ELIZA | 1964 | Joseph Weizenbaum | A simulation of a Rogerian psychotherapist that rephrased the user's input using a few grammar rules. |
| SHRDLU | 1970 | Terry Winograd | A natural language system working in restricted "blocks worlds" with restricted vocabularies; it worked extremely well. |
| PARRY | 1972 | Kenneth Colby | A chatterbot. |
| KL-ONE | 1974 | Sondheimer et al. | A knowledge representation system in the tradition of semantic networks and frames; it is a frame language. |
| MARGIE | 1975 | Roger Schank | |
| TaleSpin (software) | 1976 | Meehan | |
| QUALM | | Lehnert | |
| LIFER/LADDER | 1978 | Hendrix | A natural language interface to a database of information about US Navy ships. |
| SAM (software) | 1978 | Cullingford | |
| PAM (software) | 1978 | Robert Wilensky | |
| Politics (software) | 1979 | Carbonell | |
| Plot Units (software) | 1981 | Lehnert | |
| Jabberwacky | 1982 | Rollo Carpenter | A chatterbot with the stated aim to "simulate natural human chat in an interesting, entertaining and humorous manner". |
| MUMBLE (software) | 1982 | McDonald | |
| Racter | 1983 | William Chamberlain and Thomas Etter | A chatterbot that generated English language prose at random. |
| MOPTRANS | 1984 | Lytinen | |
| KODIAK (software) | 1986 | Wilensky | |
| Absity (software) | 1987 | Hirst | |
| Watson (artificial intelligence software) | 2006 | IBM | A question answering system that won the Jeopardy! contest, defeating the best human players in February 2011. |
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 