Lemma (linguistics)



 
 








In linguistics, a lemma (plural lemmas or lemmata) has two distinct interpretations:
  1. morphology / lexicography: the canonical form or citation form of a set of forms (headword); e.g. in English, run, runs, ran and running are forms of the same lexeme, with run as the lemma.
  2. psycholinguistics: the abstract conceptual form that has been mentally selected for utterance in the early stages of speech production, but before any sounds are attached to it.


A lemma in morphology is the canonical form of a lexeme. Lexeme, in this context, refers to the set of all the forms that have the same meaning, and lemma refers to the particular form that is chosen by convention to represent the lexeme. In lexicography, this unit is usually also the citation form or headword by which it is indexed. Lemmas have special significance in highly inflected languages such as Czech. The process of determining the lemma for a given word is called lemmatisation.
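As a rough illustration of lemmatisation, the sketch below maps inflected forms to their lemma with a hand-written lookup table. The table `FORM_TO_LEMMA` and the function `lemmatise` are hypothetical names chosen for this example; practical lemmatisers typically combine much larger lexicons with morphological rules.

```python
# Hypothetical lookup table (illustration only): inflected form -> lemma.
FORM_TO_LEMMA = {
    "run": "run", "runs": "run", "ran": "run", "running": "run",
    "go": "go", "goes": "go", "going": "go", "went": "go", "gone": "go",
}


def lemmatise(word: str) -> str:
    """Return the lemma of a known form, or the lower-cased word if unknown."""
    key = word.lower()
    return FORM_TO_LEMMA.get(key, key)


print([lemmatise(w) for w in ["Ran", "running", "went", "table"]])
# → ['run', 'run', 'go', 'table']
```

Falling back to the word itself for unknown forms is only one possible design choice; a stricter tool might report unknown forms instead.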

The psycholinguistic interpretation refers to one of the more widely accepted psycholinguistic models of speech production, and specifically to an early stage in the mental preparation for an utterance. Here, the lemma is the abstract form of a word that arises after the word has been selected mentally, but before any information has been accessed about the sounds in it (and thus before the word can be pronounced). It therefore contains information concerning only meaning and the relation of this word to others in the sentence. This notion of lemma is similar to the Sanskrit sphota (6th c.), an invariant mental word, of which the sound is a feature.

Morphology / Lexicography


In a dictionary, the lemma "go" represents the inflected forms "go", "goes", "going", "went", and "gone". The relationship between an inflected form and its lemma is usually denoted by an angle bracket, e.g. "went" < "go". The disadvantage of such simplifications is that a declined or conjugated form of the word cannot be looked up directly, although some dictionaries, like Webster's, will list "went". Multilingual dictionaries vary in how they deal with this issue: the Langenscheidt dictionary of German does not list ging (< gehen); the Cassell does.

The form that is chosen to be the lemma is usually the least marked form, though there are occasional exceptions; e.g. in Finnish, dictionaries list verbs not under the verb root, but under the first infinitive marked with -(t)a, -(t)ä.

Lemmas or word stems are often used in corpus linguistics for determining word frequency. In such usage the specific definition of "lemma" is flexible, depending on the task it is being used for.
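A minimal sketch of lemma-based frequency counting follows. It assumes a small hypothetical mapping `LEMMAS` and naive whitespace tokenisation; a real corpus study would use a full lemmatiser and tokeniser.

```python
from collections import Counter

# Hypothetical form -> lemma mapping, for illustration only.
LEMMAS = {"goes": "go", "going": "go", "went": "go", "gone": "go", "mice": "mouse"}


def lemma_frequencies(tokens):
    """Count tokens after replacing each one with its lemma (if known)."""
    return Counter(LEMMAS.get(t.lower(), t.lower()) for t in tokens)


tokens = "She went home and he goes home while the mice go".split()
freq = lemma_frequencies(tokens)
print(freq["go"], freq["home"], freq["mouse"])
# → 3 2 1
```

Counting by surface form instead would split "went", "goes" and "go" into three separate entries of one occurrence each.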

Lemmas in different languages

In English, the citation form of a noun is the singular: e.g. mouse rather than mice. For multi-word lexemes which contain possessive adjectives or reflexive pronouns, the citation form uses a form of the indefinite pronoun one: e.g. do one's best, perjure oneself. In languages with grammatical gender, the citation form of regular adjectives and nouns is usually the masculine singular. If the language additionally has cases, the citation form is often the masculine singular nominative.

In many languages, the citation form of a verb is the infinitive: French aller, German gehen. In English it usually is the full infinitive (to go), but the bare infinitive for some defective verbs (must). In Latin, Ancient Greek, and Modern Greek (which has no infinitive), however, the first person singular present tense is normally used, though occasionally the infinitive may also be seen. (For contracted verbs in Greek, an uncontracted first person singular present tense is used to reveal the contract vowel, e.g. φιλέω philéō for φιλῶ philō "I love" [implying affection]; ἀγαπάω agapáō for ἀγαπῶ agapō "I love" [implying regard].) In Japanese, the non-past (present and future) tense is used.

In Arabic, which has no infinitives, the third person singular masculine of the past tense is the least-marked form, and is used for entries in modern dictionaries. In older dictionaries, which are still commonly used today, the triliteral root of the word, either a verb or a noun, is used. Hebrew often uses the 3rd person masculine qal perfect, e.g. ברא bara' create, כפר kaphar deny. For Korean, -da is attached to the stem.
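Two of the verb conventions above can be written as simple rules. This is only a sketch: the function names and the `DEFECTIVE_VERBS` set are hypothetical, and the rules cover nothing beyond the cases mentioned in the text (English full versus bare infinitive, Korean -da).

```python
# Hypothetical set of English defective verbs cited in bare form ("must" is the text's example).
DEFECTIVE_VERBS = {"must"}


def english_verb_citation(bare_form: str) -> str:
    """Full infinitive ("to go") unless the verb is defective ("must")."""
    return bare_form if bare_form in DEFECTIVE_VERBS else f"to {bare_form}"


def korean_verb_citation(stem: str) -> str:
    """Attach -da (다) to the verb stem."""
    return stem + "다"


print(english_verb_citation("go"))    # → to go
print(english_verb_citation("must"))  # → must
print(korean_verb_citation("가"))      # → 가다
```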

Some phrases are cited in a sort of lemma, e.g. Carthago delenda est (literally, "Carthage must be destroyed") is a common way of citing Cato, although what he said was more like, Ceterum censeo Carthaginem esse delendam ("As to the rest, I hold that Carthage must be destroyed").

Difference between stem and lemma

A stem is the part of the word that never changes even when morphologically inflected, whilst a lemma is the base form of the word. For example, given the word "produced", its lemma is "produce", but the stem is "produc", because there are related words such as "production".
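The contrast can be sketched by approximating the stem as the longest prefix shared by a set of related forms. This is a deliberately crude stand-in for real stemming algorithms, and `common_prefix` is a hypothetical helper written for this example.

```python
def common_prefix(words):
    """Return the longest prefix shared by every word in the list."""
    if not words:
        return ""
    prefix = words[0]
    for word in words[1:]:
        # Shorten the candidate until the current word starts with it.
        while not word.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


forms = ["produce", "produces", "produced", "producing", "production"]
lemma = "produce"            # the conventional citation form
stem = common_prefix(forms)  # the part shared by all listed forms

print(lemma, stem)
# → produce produc
```

Note that the stem "produc" is not itself a word, whereas the lemma always is.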

Psycholinguistics

When we produce a word, we are essentially turning our thoughts into sounds (a process known as lexicalisation). In many psycholinguistic models this is considered to be at least a two-stage process. The lemma is thus intermediate between the semantic level (where meaning is specified) and the phonological level (where the sounds of the word are specified). It is an abstract form containing syntactic information (about how the word can be used in a sentence), but no information about the pronunciation of the word. In this context, the lexeme is the phonologically specified form that is selected after the lemma.
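The two-stage idea can be pictured as a toy data model in which the lemma record carries syntactic information but no sounds, and the phonological form is attached only at the second step. Every name and value below (`Lemma`, `Lexeme`, the two stores, the concept label, the transcription) is invented for illustration and makes no claim about how any actual psycholinguistic model is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lemma:
    concept: str     # meaning selected at the semantic level
    word_class: str  # syntactic information; note: no pronunciation here


@dataclass(frozen=True)
class Lexeme:
    lemma: Lemma
    phonemes: str    # phonologically specified form


# Hypothetical toy stores standing in for the mental lexicon.
LEMMA_STORE = {"FELINE_PET": Lemma("FELINE_PET", "noun")}
PHONOLOGY_STORE = {"FELINE_PET": "/kæt/"}


def select_lemma(concept: str) -> Lemma:
    """Stage 1: go from a concept to a lemma (meaning plus syntax)."""
    return LEMMA_STORE[concept]


def retrieve_lexeme(lemma: Lemma) -> Lexeme:
    """Stage 2: attach the sound form to the already selected lemma."""
    return Lexeme(lemma, PHONOLOGY_STORE[lemma.concept])


lexeme = retrieve_lexeme(select_lemma("FELINE_PET"))
print(lexeme.lemma.word_class, lexeme.phonemes)
# → noun /kæt/
```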

This two-stage model is the most widely supported theory of speech production in psycholinguistics, although it has recently been challenged. For example, there is some evidence to indicate that the grammatical gender of a noun is retrieved from the word's phonological form (the lexeme) rather than from the lemma. This is easily explained by Caramazza's Independent Network model, which does not assume a distinct level between the semantic and the phonological stages (so there is no lemma representation); in this model, syntactic information about the word is activated at the semantic or phonological level (so gender would be activated at the latter).

See also

  • Linguistics
  • Corpus linguistics
  • Morphology
  • Psycholinguistics
  • Markedness
  • Principal parts
  • Root (linguistics)
  • Null morpheme
  • Lemmatisation
  • Lexeme
  • Uninflected word
  • Lexical Markup Framework

External links