WordNet
WordNet is a lexical database for the English language. It groups English words into sets of synonyms called synsets, provides short, general definitions, and records the various semantic relations between these synonym sets. The purpose is twofold: to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications. The database and software tools have been released under a BSD style license and can be downloaded and used freely. The database can also be browsed online.
WordNet was created and is being maintained at the Cognitive Science Laboratory of Princeton University under the direction of psychology professor George A. Miller. Development began in 1985. Over the years, the project received funding from government agencies interested in machine translation. As of 2009, the WordNet team includes the following members of the Cognitive Science Laboratory: George Armitage Miller, Christiane Fellbaum, Randee Tengi, Pamela Wakefield, Helen Langone and Benjamin R. Haskell. WordNet has been supported by grants from the National Science Foundation, DARPA, the Disruptive Technology Office (formerly the Advanced Research and Development Activity), and REFLEX. George Miller and Christiane Fellbaum were awarded the 2006 Antonio Zampolli Prize for their work with WordNet.

Database contents

WordNet's latest version is 3.1. The database contains 155,287 words organized in 117,659 synsets for a total of 206,941 word-sense pairs; in compressed form, it is about 12 megabytes in size.

WordNet distinguishes between nouns, verbs, adjectives and adverbs because they follow different grammatical rules; it does not include prepositions, determiners etc. Every synset contains a group of synonymous words or collocations (a collocation is a sequence of words that go together to form a specific meaning, such as "car pool"); different senses of a word are in different synsets. The meaning of the synsets is further clarified with short defining glosses (definitions and/or example sentences). A typical example synset with gloss is:
good, right, ripe – (most suitable or right for a particular purpose; "a good time to plant tomatoes"; "the right time to act"; "the time is ripe for great sociological changes")


Most synsets are connected to other synsets via a number of semantic relations. These relations vary based on the type of word, and include:
  • Nouns
    • hypernyms: Y is a hypernym of X if every X is a (kind of) Y (canine is a hypernym of dog)
    • hyponyms: Y is a hyponym of X if every Y is a (kind of) X (dog is a hyponym of canine)
    • coordinate terms: Y is a coordinate term of X if X and Y share a hypernym (wolf is a coordinate term of dog, and dog is a coordinate term of wolf)
    • holonym: Y is a holonym of X if X is a part of Y (building is a holonym of window)
    • meronym: Y is a meronym of X if Y is a part of X (window is a meronym of building)
  • Verbs
    • hypernym: the verb Y is a hypernym of the verb X if the activity X is a (kind of) Y (to perceive is a hypernym of to listen)
    • troponym: the verb Y is a troponym of the verb X if the activity Y is doing X in some manner (to lisp is a troponym of to talk)
    • entailment: the verb Y is entailed by X if by doing X you must be doing Y (to sleep is entailed by to snore)
    • coordinate terms: those verbs sharing a common hypernym (to lisp and to yell)
  • Adjectives
    • related nouns
    • similar to
    • participle of verb
  • Adverbs
    • root adjectives

Semantic relations apply to all members of a synset, because those members share a meaning and are mutually synonymous. Individual words can also be connected to other words through lexical relations, including antonymy (words that are opposites of each other) and derivational relatedness.

WordNet also provides the polysemy count of a word: the number of synsets that contain the word. If a word participates in several synsets (i.e. has several senses), then typically some senses are much more common than others. WordNet quantifies this with a frequency score: several sample texts have had all their words semantically tagged with the corresponding synset, and a count records how often a word appears in each specific sense.
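
As a rough sketch of these two numbers, the following Python computes a polysemy count and per-sense frequencies. The synset identifiers (S1, S2, S3), their members and the tagged sample are all invented for the example.

```python
from collections import Counter

# Hypothetical synsets: identifier -> member words (made up for this sketch).
SYNSETS = {
    "S1": {"good", "right", "ripe"},
    "S2": {"good", "goodness"},
    "S3": {"right"},
}

# Hypothetical sense-tagged sample text: (word, synset identifier) pairs.
TAGGED = [("good", "S1"), ("good", "S1"), ("good", "S2"), ("right", "S3")]


def polysemy_count(word):
    """Number of synsets that contain the word."""
    return sum(1 for members in SYNSETS.values() if word in members)


def sense_frequencies(word):
    """How often the word was tagged with each of its senses."""
    return Counter(sid for w, sid in TAGGED if w == word)
```

Here polysemy_count("good") is 2, and sense_frequencies("good") shows that the S1 sense occurs twice and the S2 sense once in the sample.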

The morphology functions of the software distributed with the database try to deduce the lemma or root form of a word from the user's input; only the root form is stored in the database unless it has irregular inflected forms.
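
One way to picture this lookup is a lemmatizer that first consults a list of irregular forms and then tries suffix-stripping rules, keeping a candidate only if it is a stored root form. The sketch below is a simplification under stated assumptions: the word list, the exception list and the four rules are invented for illustration and are far smaller than anything a real system would use.

```python
# Assumed for illustration: a tiny set of stored root forms, an explicit list
# of irregular forms, and a handful of suffix rules for nouns.
LEMMAS = {"dog", "box", "church", "fly", "goose"}
EXCEPTIONS = {"geese": "goose"}
SUFFIX_RULES = [("ies", "y"), ("ches", "ch"), ("xes", "x"), ("s", "")]


def lemmatize(word):
    """Return the stored root form for word, or None if none is found."""
    word = word.lower()
    if word in LEMMAS:                 # already a root form
        return word
    if word in EXCEPTIONS:             # irregular inflected form
        return EXCEPTIONS[word]
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            if candidate in LEMMAS:    # accept only forms that are stored
                return candidate
    return None
```

For example, lemmatize("flies") gives fly through the "ies" rule and lemmatize("geese") gives goose through the exception list, while a word with no stored root, such as "cats" in this toy setup, gives None.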

Knowledge structure

Both nouns and verbs are organized into hierarchies, defined by hypernym or IS A relationships. For instance, the first sense of the word dog has the following hypernym hierarchy. Words on the same line are synonyms of each other: some sense of dog is synonymous with some sense of domestic dog and of Canis familiaris, and so on. Each set of synonyms (synset) has a unique index, and its members share its properties, such as a gloss (dictionary definition).

dog, domestic dog, Canis familiaris
  => canine, canid
    => carnivore
      => placental, placental mammal, eutherian, eutherian mammal
        => mammal
          => vertebrate, craniate
            => chordate
              => animal, animate being, beast, brute, creature, fauna
                => ...
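
Walking such a hierarchy amounts to following hypernym links until none remain. The sketch below encodes the chain from the listing, keyed by the first word of each synset; it assumes a single hypernym per entry and stops at animal, where the listing itself is truncated. The names HYPERNYM, hypernym_path and is_a are chosen for this example.

```python
# The chain from the listing above, keyed by the first word of each synset.
# Simplification: one hypernym per entry; the chain stops at "animal"
# because the listing is truncated there.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "placental",
    "placental": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "chordate",
    "chordate": "animal",
}


def hypernym_path(word):
    """List the word followed by each successive hypernym."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def is_a(x, y):
    """True if y appears on the hypernym path of x (x IS A y)."""
    return y in hypernym_path(x)
```

With this data, is_a("dog", "mammal") holds while is_a("mammal", "dog") does not, which reflects the direction of the IS A relation.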

At the top level, these hierarchies are organized into base types, 25 primitive groups for nouns, and 15 for verbs. These groups form lexicographic files at a maintenance level. These primitive groups are connected to an abstract root node that has, for some time, been assumed by various applications that use WordNet.

In the case of adjectives, the organization is different. Two opposite 'head' senses work as binary poles, while 'satellite' synonyms connect to each of the heads via synonymy relations. Thus, the hierarchies, and the concept involved with lexicographic files, do not apply here the same way they do for nouns and verbs.

The network of nouns is far deeper than that of the other parts of speech. Verbs have a far bushier structure, and adjectives are organized into many distinct clusters. Adverbs are defined in terms of the adjectives they are derived from, and thus inherit their structure from that of the adjectives.

Psychological justification

The goal of WordNet was to develop a system that would be consistent with the knowledge acquired over the years about how human beings process language. Anomic aphasia, for example, creates a condition that seems to selectively encumber individuals' ability to name objects; this makes the decision to partition the parts of speech into distinct hierarchies more of a principled decision than an arbitrary one.

In the case of hyponymy, psychological experiments revealed that individuals can access properties of nouns more quickly depending on when a characteristic becomes a defining property. That is, individuals can quickly verify that canaries can sing because a canary is a songbird (only one level of hyponymy), but require slightly more time to verify that canaries can fly (two levels of hyponymy) and even more time to verify canaries have skin (multiple levels of hyponymy). This suggests that we too store semantic information in a way that is much like WordNet, because we only retain the most specific information needed to differentiate one particular concept from similar concepts.
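
The idea that a property is stored once, at the most general concept it applies to, can be illustrated with a small lookup that counts how many levels must be climbed before the property is found. The hierarchy and the property assignments below are assumptions made for this sketch; in particular, placing "has skin" three levels up is only one way to model "multiple levels".

```python
# Hypothetical hierarchy and property placement, chosen to mirror the
# canary example in the text.
PARENT = {"canary": "songbird", "songbird": "bird", "bird": "animal"}
PROPERTIES = {
    "songbird": {"can sing"},
    "bird": {"can fly"},
    "animal": {"has skin"},
}


def levels_to_property(concept, prop):
    """Count the hypernym levels climbed before prop is found, else None."""
    levels = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return levels
        node = PARENT.get(node)   # climb one level
        levels += 1
    return None
```

In this toy model, "can sing" is found after one level, "can fly" after two and "has skin" after three, matching the ordering of verification times described above.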

WordNet as an ontology

The hypernym/hyponym relationships among the noun synsets can be interpreted as specialization relations between conceptual categories. In other words, WordNet can be interpreted and used as a lexical ontology in the computer science sense. However, such an ontology should normally be corrected before being used since it contains hundreds of basic semantic inconsistencies such as (i) the existence of common specializations for exclusive categories and (ii) redundancies in the specialization hierarchy. Furthermore, transforming WordNet into a lexical ontology usable for knowledge representation should normally also involve (i) distinguishing the specialization relations into subtypeOf and instanceOf relations, and (ii) associating intuitive unique identifiers to each category. Although such corrections and transformations have been performed and documented as part of the integration of WordNet 1.7 into the cooperatively updatable knowledge base of WebKB-2, most projects claiming to re-use WordNet for knowledge-based applications (typically, knowledge-oriented information retrieval) simply re-use it directly.
WordNet has also been converted to a formal specification, by means of a hybrid bottom-up top-down methodology to automatically extract association relations from WordNet, and interpret these associations in terms of a set of conceptual relations, formally defined in the DOLCE foundational ontology.
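
To make the notion of a redundancy in the specialization hierarchy concrete, the following sketch flags a direct link from a category to a parent that is already reachable through another of its parents. It is a minimal illustration over invented data, not the procedure used in WebKB-2 or any other project mentioned here; PARENTS, ancestors and redundant_links are names chosen for the example.

```python
# Hypothetical hierarchy: category -> set of direct parent categories.
PARENTS = {
    "dog": {"canine", "carnivore"},   # direct link to "carnivore" is redundant
    "canine": {"carnivore"},
    "carnivore": {"mammal"},
}


def ancestors(node):
    """All categories reachable by following parent links from node."""
    seen = set()
    stack = list(PARENTS.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def redundant_links():
    """Direct (child, parent) links already implied by another parent."""
    redundant = []
    for child, parents in PARENTS.items():
        for parent in parents:
            others = parents - {parent}
            if any(parent in ancestors(other) for other in others):
                redundant.append((child, parent))
    return sorted(redundant)
```

On this data the only link reported is the one from dog to carnivore, since carnivore is already an ancestor of canine.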

Problems and limitations

Unlike other dictionaries, WordNet does not include information about etymology, pronunciation and the forms of irregular verbs and contains only limited information about usage.

The actual lexicographical and semantic information is maintained in lexicographer files, which are then processed by a tool called grind to produce the distributed database. Both grind and the lexicographer files are freely available in a separate distribution, but modifying and maintaining the database requires expertise.

Though WordNet contains a sufficiently wide range of common words, it does not cover special domain vocabulary. Since it is primarily designed to act as an underlying database for different applications, those applications cannot be used in specific domains that are not covered by WordNet.

In most works that claim to have integrated WordNet into other ontologies, the content of WordNet has not simply been corrected when semantic problems have been encountered; instead, WordNet has been used as an inspiration source but heavily re-interpreted and updated whenever suitable. This was the case when, for example, the top-level ontology of WordNet was re-structured according to the OntoClean based approach or when WordNet was used as a primary source for constructing the lower classes of the SENSUS ontology.

WordNet is the most commonly used computational lexicon of English for word sense disambiguation (WSD), a task aimed at assigning the most appropriate senses (i.e. synsets) to words in context. However, it has been argued that WordNet encodes sense distinctions that are too fine-grained even for humans. This issue prevents WSD systems from achieving high performance. The granularity issue has been tackled by proposing clustering methods that automatically group together similar senses of the same word.
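
As a deliberately naive illustration of grouping similar senses, the sketch below merges senses whose glosses share enough words, measured by Jaccard overlap. This heuristic is an assumption made for the example and is not one of the published clustering methods; the sense labels and glosses are invented.

```python
# Invented sense labels and glosses for the word "bank".
GLOSSES = {
    "bank#1": "a financial institution that accepts deposits",
    "bank#2": "a building of a financial institution that accepts deposits",
    "bank#3": "sloping land beside a river",
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    return len(a & b) / len(a | b)


def cluster_senses(glosses, threshold=0.5):
    """Greedily place each sense in the first cluster with a similar gloss."""
    bags = {sid: set(text.lower().split()) for sid, text in glosses.items()}
    clusters = []
    for sid in glosses:
        for cluster in clusters:
            if any(jaccard(bags[sid], bags[o]) >= threshold for o in cluster):
                cluster.append(sid)
                break
        else:
            clusters.append([sid])   # no similar cluster: start a new one
    return clusters
```

With the default threshold, the two financial senses end up in one cluster and the river sense in another. A coarser or finer grouping can be obtained by lowering or raising the threshold.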

Applications

WordNet has been used for a number of different purposes in information systems, including word sense disambiguation, information retrieval, automatic text classification, automatic text summarization, and even automatic crossword puzzle generation.

A project at Brown University called the Applied Cognition Lab, started by Jeff Stibel, James A. Anderson, Steve Reiss and others, created a disambiguator using WordNet in 1998. The project later morphed into a company called Simpli, which is now owned by ValueClick. George Miller joined the company as a member of the Advisory Board. Simpli built an Internet search engine that utilized a knowledge base principally based on WordNet to disambiguate and expand keywords and synsets to help retrieve information online. WordNet was expanded upon to add increased dimensionality, such as intentionality (used for x), people (Albert Einstein) and colloquial terminology more relevant to Internet search (e.g., blogging, ecommerce). Neural network algorithms searched the expanded WordNet for related terms to disambiguate search keywords (Java, in the sense of coffee) and expand the search synset (Coffee, Drink, Joe) to improve search engine results. Before the company was acquired, it performed searches across search engines such as Google, Yahoo!, Ask.com and others.
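
The disambiguate-then-expand idea can be sketched without any neural network. The example below swaps in plain cue-word overlap to pick a sense and then appends that sense's related terms to the query; it says nothing about how Simpli actually worked. The sense inventory, cue words and function names are all invented for illustration.

```python
# Hypothetical senses of "java": cue words used to pick a sense, and the
# related terms used to expand the query once a sense is chosen.
SENSES = {
    "java (coffee)": {
        "cues": {"cup", "drink", "brew", "coffee"},
        "expand": ["coffee", "drink", "joe"],
    },
    "java (island)": {
        "cues": {"indonesia", "island", "jakarta"},
        "expand": ["island", "Indonesia"],
    },
}


def disambiguate(context_words):
    """Pick the sense whose cue words overlap most with the context."""
    context = {w.lower() for w in context_words}
    # On a tie, max() returns the first sense in SENSES.
    return max(SENSES, key=lambda s: len(SENSES[s]["cues"] & context))


def expand_query(keyword, context_words):
    """Return the keyword followed by the chosen sense's related terms."""
    sense = disambiguate(context_words)
    return [keyword] + SENSES[sense]["expand"]
```

For a query such as "java" with the context words "brew a cup", the coffee sense wins and the query is expanded with coffee, drink and joe.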

Another prominent example of the use of WordNet is to determine the similarity between words. Various algorithms have been proposed, and these include considering the distance between the conceptual categories of words, as well as considering the hierarchical structure of the WordNet ontology. A number of these WordNet-based word similarity algorithms are implemented in a Perl package called WordNet::Similarity, and in a Python package called NLTK.
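
A distance-based measure of this kind can be sketched on a toy hierarchy: find the shortest path between two categories along hypernym links, then turn the length d into a score 1 / (1 + d). The graph below is hand-written, and the code is a standalone illustration rather than an excerpt from either package.

```python
from collections import deque

# Hand-written toy hierarchy: category -> its hypernym.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
}


def neighbours(node):
    """Categories one hypernym or hyponym link away from node."""
    linked = {c for c, p in HYPERNYM.items() if p == node}
    if node in HYPERNYM:
        linked.add(HYPERNYM[node])
    return linked


def path_length(a, b):
    """Length of the shortest path between a and b, or None if unconnected."""
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_similarity(a, b):
    """Score in (0, 1]: identical categories score 1, distant ones less."""
    dist = path_length(a, b)
    return None if dist is None else 1 / (1 + dist)
```

On this graph, dog and wolf are two links apart and dog and cat four links apart, so dog scores as more similar to wolf than to cat.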

Interfaces

Princeton maintains a list of related projects that includes links to some of the widely used application programming interfaces available for accessing WordNet using various programming languages and environments.

Related projects and extensions

WordNet is connected to several databases of the Semantic Web. WordNet is also commonly re-used via mappings between the WordNet categories (i.e. synsets) and the categories from other ontologies. Most often, only the top-level categories of WordNet are mapped.

Other languages

  • WOLF (WordNet Libre du Français), a French version of WordNet.
  • The MultiWordNet project, a multilingual WordNet aimed at producing an Italian WordNet strongly aligned with the Princeton WordNet.
  • The EuroWordNet project has produced WordNets for several European languages and linked them together; these are not freely available however. The Global Wordnet project attempts to coordinate the production and linking of "wordnets" for all languages. Oxford University Press, the publisher of the Oxford English Dictionary, has voiced plans to produce their own online competitor to WordNet.
  • The BalkaNet project has produced WordNets for six European languages (Bulgarian, Czech, Greek, Romanian, Turkish and Serbian). For this project, a freely available XML-based WordNet editor was developed. This editor, VisDic, is no longer in active development, but is still used for the creation of various WordNets. Its successor, DEBVisDic, is a client-server application and is currently used for the editing of several WordNets (Dutch in the Cornetto project, Polish, Hungarian, several African languages, Chinese).
  • UWN is an automatically constructed multilingual lexical knowledge base extending WordNet to cover over a million words in many different languages.
  • Projects such as BalkaNet and EuroWordNet made it feasible to create standalone wordnets linked to the original one. One such project is the Russian WordNet, sponsored by Petersburg State University of Means of Communication.
  • FinnWordNet is a Finnish version of the WordNet where all entries of the original English WordNet were translated.

Linked data

  • BabelNet, a very large multilingual semantic network with millions of concepts obtained from an integration of WordNet and Wikipedia based on an automatic mapping algorithm.
  • The SUMO (Suggested Upper Merged Ontology) ontology has produced a mapping between all of the WordNet synsets (including nouns, verbs, adjectives and adverbs) and SUMO classes. The most recent addition to the mappings provides links to all of the more specific terms in the Mid-Level Ontology (MILO), which extends SUMO.
  • OpenCyc, an open ontology and knowledge base of everyday common sense knowledge, has 12,000 terms linked to WordNet synonym sets.
  • DOLCE is the first module of the WonderWeb Foundational Ontologies Library (WFOL). This upper ontology has been developed in light of rigorous ontological principles inspired by the philosophical tradition, with a clear orientation toward language and cognition. OntoWordNet is the result of an experimental effort to align WordNet's upper level with DOLCE. It is suggested that such an alignment could lead to an "ontologically sweetened" WordNet, meant to be conceptually more rigorous, cognitively transparent, and efficiently exploitable in several applications.
  • DBpedia, a database of structured information, is also linked to WordNet.
  • The eXtended WordNet is a project at the University of Texas at Dallas which aims to improve WordNet by semantically parsing the glosses, thus making the information contained in these definitions available for automatic knowledge processing systems. It is also freely available under a license similar to WordNet's.
  • The GCIDE project produced a dictionary by combining a public domain Webster's Dictionary from 1913 with some WordNet definitions and material provided by volunteers. It was released under the GPL, a copyleft license.
  • ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. Currently it has an average of over five hundred images per node.
  • BioWordnet, a biomedical extension of WordNet, was abandoned due to issues with stability across versions.
  • WikiTax2WordNet, a mapping between WordNet synsets and Wikipedia categories.
  • WordNet++, a resource including millions of semantic edges harvested from Wikipedia and connecting pairs of WordNet synsets.
  • SentiWordNet, a resource for supporting opinion mining applications obtained by tagging all the WordNet 3.0 synsets according to their estimated degrees of positivity, negativity, and neutrality.
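
Several of the resources above work the same way: they keep WordNet's synsets and hierarchy and attach extra information to each synset, such as a link to an ontology class, a set of images, or sentiment scores. The following Python sketch illustrates that idea with invented toy data. The identifiers, link names and scores are hypothetical and are not taken from WordNet or from any of the listed resources.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Synset:
    """Toy synset: member words, an is-a link, and attached external data."""
    synset_id: str                       # hypothetical id, not a real WordNet offset
    lemmas: List[str]
    hypernym: Optional[str] = None       # id of the more general synset
    links: Dict[str, str] = field(default_factory=dict)   # resource name -> external id
    sentiment: Tuple[float, float, float] = (0.0, 0.0, 1.0)  # (positive, negative, neutral)


# Invented three-node hierarchy; all values are illustrative assumptions.
SYNSETS = {
    "entity": Synset("entity", ["entity"]),
    "animal": Synset("animal", ["animal", "beast"], hypernym="entity"),
    "dog": Synset(
        "dog",
        ["dog", "domestic dog"],
        hypernym="animal",
        links={"ontology_class": "Canine", "image_collection": "img-dog"},
        sentiment=(0.125, 0.0, 0.875),
    ),
}


def hypernym_chain(synsets: Dict[str, Synset], synset_id: str) -> List[str]:
    """Follow is-a links from a synset up to the root of the hierarchy."""
    chain = []
    current: Optional[str] = synset_id
    while current is not None:
        chain.append(current)
        current = synsets[current].hypernym
    return chain


def polarity(synset: Synset) -> float:
    """Collapse the positive and negative scores into one signed value."""
    positive, negative, _neutral = synset.sentiment
    return positive - negative


if __name__ == "__main__":
    print(hypernym_chain(SYNSETS, "dog"))
    print(SYNSETS["dog"].links)
    print(polarity(SYNSETS["dog"]))
```

Keying every attachment on the synset identifier is what lets independent resources be combined. It also means a change of identifiers between versions of the underlying database can invalidate the links.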

Other projects

  • FrameNet is a project similar to WordNet. It consists of a lexicon which is based on annotating over 100,000 sentences with their semantic properties. The unit in focus is the lexical frame, a type of state or event together with the properties associated with it.
  • A fledgling project titled wordNet (not WordNet) is an internet search engine containing maps of the internet: not only word mappings (like WordNet), but also phrase, concept, and web site mappings.
  • Lexical markup framework (LMF) is a work in progress within ISO/TC37 to define a common standardized framework for the construction of lexicons, including WordNet.
  • The UNL Programme is a project under the auspices of the United Nations that aims to consolidate lexicosemantic data of many languages for use in machine translation and information extraction systems.

Distributions

The WordNet database is distributed as a dictionary package (usually a single file) for the following software:
  • StarDict
  • Babylon
  • Lingoes

See also

  • Hyponym
  • Is-a
  • Machine-readable dictionary
  • Ontology (information science)
  • Semantic network
  • Semantic Web
  • Synonym ring
  • Taxonomy
  • ThoughtTreasure
  • Troponym
  • Word sense disambiguation

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 