Agglutination
In contemporary linguistics, agglutination usually refers to the kind of morphological derivation in which there is a one-to-one correspondence between affixes and syntactical categories. Languages that use agglutination widely are called agglutinative languages. For example, the Hungarian word hajókon `on ships' may be divided into a root hajó with two endings -k and -on expressing respectively the plural number (hajó-k `ships') and the location `on' something (hajó-n `on a ship'). Moreover, the ending -n is so regular that the Hungarian Wiktionary simply marks this case as "-on/-en/-ön" (in English it is called the superessive). In contrast, in the Czech translation v lodích, the location is expressed by a combination of a separate word (the preposition v `in') and the locative plural ending -ích, which is added to the stem loď `ship' and cannot be subdivided into a part expressing plural and a part expressing the locative case. Therefore Czech is not an agglutinative language.
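
The segmentation of hajókon described above can be sketched in a few lines of Python. This is only an illustration: the suffix table is a hypothetical toy covering just the two endings discussed here, not a description of Hungarian morphology, and the function name segment is an assumption made for the example.

```python
# Toy illustration of one-to-one affix/category correspondence.
# SUFFIXES is a hypothetical table with only the endings from the text;
# it is not a real account of Hungarian grammar.
SUFFIXES = [
    ("on", "superessive"),  # location `on' something, after a consonant
    ("n", "superessive"),   # the same case after a vowel-final stem
    ("k", "plural"),
]


def segment(word, root):
    """Peel known suffixes off the end of `word` until only `root` is left."""
    if not word.startswith(root):
        raise ValueError(f"{word!r} does not start with root {root!r}")
    rest = word[len(root):]
    found = []
    while rest:
        for form, category in SUFFIXES:
            if rest.endswith(form):
                # Suffixes are found right-to-left, so prepend each one.
                found.insert(0, (form, category))
                rest = rest[: -len(form)]
                break
        else:
            raise ValueError(f"unanalysable remainder {rest!r}")
    return [(root, "root")] + found


print(segment("hajókon", "hajó"))
print(segment("hajón", "hajó"))
```

Each suffix in the result carries exactly one category label, which is the property that the fused Czech ending -ích lacks.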

Agglutinative languages are often contrasted both with languages in which syntactic structure is expressed solely by means of word order and auxiliary words (isolating languages) and with languages in which a single affix typically expresses several syntactic categories and a single category may be expressed by several different affixes (as is the case in inflectional (fusional) languages). However, both fusional and isolating languages may use agglutination in the most-often-used constructs, and use agglutination heavily in certain contexts, such as word derivation. This is the case in English, which has an agglutinated plural marker -(e)s and derived words such as shame·less·ness.

Agglutinative suffixes are often inserted irrespective of syllabic boundaries, for example, by adding a consonant to the syllable coda as in English tie – ties. Agglutinative languages also have large inventories of enclitics, which can be and are separated from the word root by native speakers in daily usage.

Note that the term agglutination is sometimes used more generally to refer to the morphological process of adding suffixes or other morphemes to the base of a word. This is treated in more detail in the section on other uses of the term.

Examples of agglutinative languages

Examples of agglutinative languages include the Uralic languages, such as Finnish, Estonian, and Hungarian. These have highly agglutinated expressions in daily usage, and most words are bisyllabic or longer. Grammatical information expressed by adpositions in Western Indo-European languages is typically found in suffixes. For example, the Finnish word talossanikin (talo-ssa-ni-kin, literally house-in-my-too) means "in my house, too". Derivation can also be quite complex. For example, Finnish epäjärjestelmällisyys has the root järki 'logos', and consists of negative-root-causative-frequentative-nominalizer-adessive-"related to"-"property", and means "the property of being unsystematic," "unsystematicalness." The word has lots of stem changes, so Finnish is not the best example of an agglutinative language.

Hungarian uses extensive agglutination in almost every part of its grammar. The suffixes follow each other in a specific order and can be stacked in large numbers, resulting in words that convey complex meanings in a very compact form. An example is fiaiéi, where the root fi- means "son", the subsequent four vowels are all separate suffixes, and the whole word means "[properties] of his/her sons". The nested possessive structure and expression of plurals is quite remarkable (note that Hungarian has no grammatical gender).

Agglutination is used very heavily in some Native American languages, such as Nahuatl, Quechua, Tz'utujil, Kaqchikel, Cha'palaachi and K'iche', where one word can contain enough morphemes to convey the meaning of what would be a complex sentence in other languages.

Agglutination is also a common feature of Basque. The conjugations of verbs, for example, are formed by adding different prefixes or suffixes to the root of the verb: dakartzat, which means 'I bring them', is formed by da (indicates present tense), kar (root of the verb ekarri 'bring'), tza (indicates plural) and t (indicates subject, in this case, "I"). Another example is the declension: etxean = "in the house", where etxe = "house".

Almost all of the Philippine languages also belong to this category. This enables them, especially Filipino, to form new words from simple base forms. An example is nakakapagpabagabag, which means causing someone or something to be upset and is formed from the root bagabag, which means upset/upsetting.

Japanese is also an agglutinating language, adding information such as negation, passive voice, past tense, honorific degree and causality in the verb form. Common examples would be hatarakaseraretara (働かせられたら), which combines causative, passive or potential, and conditional conjugations to arrive at two meanings depending on context, "if (subject) had been made to work..." and "if (subject) could make (object) work", and tabetakunakatta (食べたくなかった), which combines desire, negation, and past tense conjugations to mean "(subject) did not want to eat".

Turkish is another agglutinating language: the expression Avustralyalılaştıramadıklarımızdanmışsınızcasına is pronounced as one word in Turkish, but it can be translated into English as "as if you were one of those whom we could not make resemble the Australian people."

All Dravidian languages, including Kannada, Telugu, Malayalam and Tamil, are agglutinative. Agglutination is used to very high degrees both in formal written forms in Tamil (e.g. sevvaanam "red sky") and in colloquial spoken forms of the language (e.g. sokkathangam "pure gold").

Esperanto is a constructed auxiliary language with highly regular grammar and agglutinative word morphology. See Esperanto vocabulary.

Whilst agglutination is characteristic of certain language families, it would be facile to jump to the conclusion that when several languages in a similar geographic area are all agglutinative, they necessarily have to be related in the phylogenetic sense. In particular, such a conclusion formerly led linguists to propose the so-called Ural–Altaic language family, which would (in the largest scope ever proposed) include Uralic and Turkic languages as well as Mongolian, Korean and Japanese. However, contemporary linguistics views this proposal as controversial.

On the other hand, it is also the case that some languages that have developed from agglutinative proto-languages have lost this feature. For example, contemporary Estonian, which is so closely related to Finnish that the two languages are to some extent mutually intelligible, has shifted towards the fusional type. (It has also lost other features typical of the Uralic family, such as vowel harmony.)

Slots

As noted above, it is a typical feature of agglutinative languages that there is a one-to-one correspondence between suffixes and syntactic categories. For example, a noun may have separate markers for number, case, possessive or conjunctive usage etc. The order of these affixes is fixed, so we may view any given noun or verb as a stem followed by several inflectional slots, i.e. positions in which inflectional suffixes may occur. It is often the case that the most common instance of a given grammatical category is unmarked, i.e. the corresponding affix is empty.

The number of slots for a given part of speech can be surprisingly high. For example, a finite Korean verb has seven slots (the brackets indicate parts of morphemes which may be omitted in some phonological environments):
  1. honorific: -(ǔ)si is used when the speaker is honouring the subject of the sentence
  2. tense: (ə)s for completed (past) action or state; when this slot is empty, the tense is interpreted as present
  3. experiential-contrastive aspect: (ə)s doubling the past tense marker means "the subject has had the experience described by the verb"
  4. modal: kes is used with first-person-subjects only for definite future and with second-or-third-person-subjects also for probable present or past
  5. formal: (sǔ)pni expresses politeness to the hearer
  6. retrospective aspect: tə indicates that the speaker recollects what he observed in the past and reports it in the present situation
  7. mood: ta for declarative, k'a for interrogative, la for imperative, ca for propositive, yo for polite declarative and a large number of other possible mood markers


Moreover, passive and causative verbal forms can be derived by adding suffixes to the base, which could be seen as the zeroth slot; however, passives are not as commonly used as in English and many verbs do not allow passivization at all.

Even though some combinations of suffixes are not possible (e.g. only one of the aspect slots may be filled with a non-empty suffix), over 400 verb forms may be formed from a single base. Here are a few examples formed from the word root ka `to go'; the numbers indicate which slots contain non-empty suffixes:
  • 7 (imperative mood marker): imperative suffix -la combines with the root ka- to express imperative: ka-la `Go!';
  • 7 (propositive mood marker): if we want to express proposition rather than command, we use the propositive mood marker -ca instead of -la: ka-ca `Let's go!'
  • 5 and 7: If the speaker wants to show respect for the hearer, he uses the politeness marker -pni (in slot 5); various mood markers may be simultaneously used (in slot 7, therefore after the politeness marker): ka-pni-ta `He is going.', ka-pni-k'a? `Is he going?'
  • 6: retrospective aspect: John i cip e ka-tə la `I observed that John was going home and now I am reporting that to you.'
  • 7: simple indicative: sənsæŋnim i cip e ka-n-ta `The teacher is going home. (not expressing respect or politeness)'
  • 5 and 7: politeness towards the hearer: sənsæŋnim i cip e ka-pni-ta or sənsæŋnim i cip e ka-yo `The teacher is going home.'
  • 1 and 7: respect towards the subject: sənsæŋnim i cip e ka-si-n-ta `The (respected) teacher is going home.'
  • 1, 5 and 7: two kinds of politeness in one sentence: sənsæŋnim i cip e ka-si-əyo or sənsæŋnim i cip e ka-si-pni-ta `The teacher is going home. (expressing respect both to the hearer and the teacher)'
  • 2, 3 and 7: past forms: John i hakkyo e ka-s'-ta `John has gone to school (and is there now).', John i hakkyo e ka-s'-əs'-ta `John has been to school (and has come back).'
  • 4 and 7: first person modal: næ ka næil ka-kes'-ta `I will go tomorrow.'
  • 4 and 7: third person modal: John ka næil ka-kes'-ta `I suppose that John will go tomorrow.', John ka ace ka-kes'-ta `I suppose that John left yesterday.'
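
The slot model above can be sketched as a small Python function that joins a root with whichever slots are filled, always in the fixed slot order. This is a minimal sketch under stated assumptions: the function name build_verb and the slot keys are hypothetical, the caller supplies each suffix in the surface form used in the examples above, and the sketch does not model allomorphy or the restrictions on which slots may co-occur.

```python
# Slot order for a finite Korean verb, following the seven-slot list above.
# The slot names are labels chosen for this sketch.
SLOT_ORDER = [
    "honorific",      # slot 1
    "tense",          # slot 2
    "experiential",   # slot 3
    "modal",          # slot 4
    "formal",         # slot 5
    "retrospective",  # slot 6
    "mood",           # slot 7
]


def build_verb(root, **slots):
    """Join the root and the non-empty slot fillers in fixed slot order."""
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    # Empty or missing slots are simply skipped (the "unmarked" case).
    morphs = [root] + [slots[name] for name in SLOT_ORDER if slots.get(name)]
    return "-".join(morphs)


print(build_verb("ka", mood="la"))
print(build_verb("ka", mood="ta", formal="pni", honorific="si"))
print(build_verb("ka", tense="s'", experiential="əs'", mood="ta"))
```

Because the order comes from SLOT_ORDER rather than from the order of the arguments, the second call still yields the honorific before the formal marker, mirroring the fixed ordering of affixes described in this section.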

Suffixing or prefixing

Whilst most agglutinative languages in Europe and Asia use predominantly suffixing, the Bantu languages of southern Africa are known for a highly complex mixture of prefixes, suffixes and reduplication. A typical feature of this language family is that nouns fall into noun classes. Each noun class has specific singular and plural prefixes, which also serve as markers of agreement between the subject and the verb. Moreover, the noun determines the prefixes of all words that modify it, and the subject determines the prefixes of other elements in the same verb phrase. For example, the Swahili nouns -toto `child' and -tu `person' fall into class 1, with singular prefix m- and plural prefix wa-, whilst -tabu `book' falls into class 7, with singular prefix ki- and plural prefix vi-. The following sentences may be formed:
  • m-toto a-li-fika `The child arrived.'
  • m-toto a-ta-fika `The child will arrive.'
  • wa-toto wa-li-fika `The children arrived.'
  • wa-toto wa-ta-fika `The children will arrive.'
  • m-tu a-li-lala `The person slept.'
  • m-tu a-ta-lala `The person will sleep.'
  • wa-tu wa-li-lala `The persons slept.'
  • wa-tu wa-ta-lala `The persons will sleep.'
  • ki-tabu ki-li-anguka `The book fell.'
  • ki-tabu ki-ta-anguka `The book will fall.'
  • vi-tabu vi-li-anguka `The books fell.'
  • vi-tabu vi-ta-anguka `The books will fall.'


  • yu-le     m-tu        m-moja   m-refu    a-li         y-e          ki-soma   ki-le     ki-tabu   ki-refu
    1sg-that  1sg-person  1sg-one  1sg-tall  1sg-he-past  7sg-rel.-it  7sg-read  7sg-that  7sg-book  7sg-long
    `That one tall person who read that long book.'

  • wa-le     wa-tu       wa-wili  wa-refu   wa-li        (w)-o        vi-soma   vi-le     vi-tabu   vi-refu
    1pl-that  1pl-person  1pl-two  1pl-tall  1pl-he-past  7pl-rel.-it  7pl-read  7pl-that  7pl-book  7pl-long
    `Those two tall people who read those long books.'
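
The agreement pattern in the simple Swahili sentences above can be sketched as a lookup from noun class to prefixes. The following Python sketch is an illustration only: it covers just the two noun classes, three noun stems and two tense markers that appear in the examples, and the table and function names are assumptions made for the sketch.

```python
# Noun prefixes per class, taken from the examples: (singular, plural).
NOUN_PREFIX = {1: ("m", "wa"), 7: ("ki", "vi")}
# Subject-agreement prefixes on the verb for the same classes.
VERB_PREFIX = {1: ("a", "wa"), 7: ("ki", "vi")}
# Noun class of each stem used in the examples.
NOUN_CLASS = {"toto": 1, "tu": 1, "tabu": 7}
# Tense markers seen in the examples.
TENSE = {"past": "li", "future": "ta"}


def sentence(noun_stem, verb_stem, tense, plural=False):
    """Build a hyphen-segmented noun + verb sentence with class agreement."""
    noun_class = NOUN_CLASS[noun_stem]
    index = 1 if plural else 0
    noun = f"{NOUN_PREFIX[noun_class][index]}-{noun_stem}"
    # The verb prefix is chosen by the class and number of the subject noun.
    verb = f"{VERB_PREFIX[noun_class][index]}-{TENSE[tense]}-{verb_stem}"
    return f"{noun} {verb}"


print(sentence("toto", "fika", "past"))
print(sentence("tabu", "anguka", "future", plural=True))
```

The verb never needs to know which noun it follows, only the noun's class and number, which is why a single table lookup is enough to reproduce all twelve sentences in the list.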

Agglutination in the context of quantitative linguistics

We have already mentioned the fact that most languages include inflectional, agglutinative and isolating constructions side by side. The American linguist Joseph Harold Greenberg, in his 1960 paper A quantitative approach to the morphological typology of language, proposed using the so-called agglutinative index to calculate a numerical value which would allow a researcher to compare the "degree of agglutinativeness" of various languages. For Greenberg, agglutination means that the morphs are joined with only slight or no modification. A morpheme is said to be automatic if it either takes a single surface form (morph), or if its surface form is determined by phonological rules that hold in all similar instances in that language. A morph juncture (a position in a word where two morphs meet) is considered agglutinative when both morphemes included are automatic. The index of agglutination is equal to the average ratio of the number of agglutinative junctures to the number of morph junctures. Languages with high values of the agglutinative index are agglutinative, and those with low values are fusional.

In the same paper, Greenberg proposed several other indices, many of which turn out to be relevant to the study of agglutination. The synthetic index is the average number of morphemes per word, with the lowest conceivable value equal to 1 for isolating (analytic) languages and real-life values rarely exceeding 3. The compounding index is equal to the average number of root morphemes per word (as opposed to derivational and inflectional morphemes). The derivational, inflectional, prefixial and suffixial indices correspond respectively to the average number of derivational morphemes, inflectional morphemes, prefixes and suffixes per word.
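
The agglutinative and synthetic indices can be computed mechanically once a text sample has been segmented and each morpheme has been judged automatic or not. The Python sketch below is one possible reading of the definitions: it pools all junctures in the sample before dividing, which is an assumption about what "average ratio" means, and the sample data, the flags in it and the function names are hypothetical.

```python
def synthetic_index(words):
    """Average number of morphemes per word in the sample."""
    return sum(len(word) for word in words) / len(words)


def agglutination_index(words):
    """Share of morph junctures whose two neighbouring morphemes are automatic."""
    junctures = 0
    agglutinative = 0
    for word in words:
        # Each adjacent pair of morphs in a word is one juncture.
        for left, right in zip(word, word[1:]):
            junctures += 1
            if left[1] and right[1]:
                agglutinative += 1
    if junctures == 0:
        raise ValueError("sample contains no morph junctures")
    return agglutinative / junctures


# Hypothetical hand-annotated sample: each word is a list of
# (morph, is_automatic) pairs. The flags are illustrative, not real data.
sample = [
    [("talo", True), ("ssa", True)],
    [("shame", True), ("less", True), ("ness", True)],
    [("stem", True), ("fused-ending", False)],
    [("go", True)],
]

print(agglutination_index(sample))
print(synthetic_index(sample))
```

The hard part in practice is the annotation itself, since deciding whether a morpheme is automatic requires a phonological analysis of the language; the arithmetic that follows is trivial.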

Here is a table of sample values:

                 agglutination  synthesis  compounding  derivation  inflection  prefixing  suffixing
Swahili          0.67           2.56       1.00         0.03        0.31        0.45       0.16
spoken Turkish   0.67           1.75       1.04         0.06        0.38        0.00       0.44
written Turkish  0.60           2.33       1.00         0.11        0.43        0.00       0.54
Yakut            0.51           2.17       1.02         0.16        0.38        0.00       0.53
Greek            0.40           1.82       1.02         0.07        0.37        0.02       0.42
English          0.30           1.67       1.00         0.09        0.32        0.02       0.38
Eskimo           0.03           3.70       1.00         0.34        0.47        0.00       0.73

Phonetics and agglutination

The one-to-one relationship between an affix and its grammatical function may be somewhat complicated by the phonological processes active in the given language. For example, the following two phonological phenomena appear in many of the Uralic languages, and the latter also in Altaic languages:
  • consonant gradation, meaning that there is alternation between certain pairs of consonant clusters such that one member of the pair appears at the beginning of an open syllable and the other at the beginning of a closed syllable;
  • vowel harmony, meaning that only specific subclasses of vowels coexist in a non-compounded word.


Several examples from Finnish will illustrate how these two rules and other phonological processes lead to deviations from the basic one-to-one relationship between morphs and their syntactic and semantic function. No phonological rule is applied in the declension of talo `house'. However, the second example illustrates several kinds of phonological phenomena.
  • talo `house', märkä paita `a wet shirt': the roots contain the consonant clusters -rk- and -t-.
  • talo-n `of the house', märä-n paida-n `of a wet shirt': consonant gradation; the genitive suffix -n closes the preceding syllable, so rk -> r and t -> d.
  • talo-ssa `in the house', märä-ssä paida-ssa `in a wet shirt': vowel harmony; a word containing ä may not contain the vowels a, o, u, so an allomorph of the inessive ending -ssa/-ssä is used.
  • talo-i-ssa `in the houses', mär-i-ssä paido-i-ssa `in wet shirts': phonological rules also imply different vowel changes when the plural marker -i- meets a stem-final vowel.
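
The choice between the inessive allomorphs -ssa and -ssä can be sketched as a simple vowel-harmony check. The Python sketch below is a simplification under stated assumptions: it looks only at whether the stem contains a back vowel, treats stems with no back vowel as front-harmonic, expects a stem that has already undergone consonant gradation (märä, paida), and ignores compounds and loanwords. The function name inessive is hypothetical.

```python
# Back vowels that trigger the -ssa variant in this simplified model.
BACK_VOWELS = set("aou")


def inessive(stem):
    """Attach -ssa or -ssä to an already-gradated stem by vowel harmony."""
    has_back_vowel = any(ch in BACK_VOWELS for ch in stem.lower())
    suffix = "ssa" if has_back_vowel else "ssä"
    return f"{stem}-{suffix}"


print(inessive("talo"))
print(inessive("märä"))
print(inessive("paida"))
```

One morpheme thus surfaces in two forms, but the choice is fully predictable from the stem, which is what makes such a morpheme automatic in Greenberg's sense.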

Extremes of agglutination

It is possible to construct artificial extreme examples of agglutination, which have no real use but illustrate the theoretical capability of the grammar to agglutinate. This is not a question of "long words", since some languages permit limitless combinations with compound words, negative clitics and the like, which can be (and are) expressed with an analytic structure in actual usage.

English is capable of agglutinating morphemes of solely Germanic origin, as in un-whole-some-ness, but generally speaking the longest words are assembled from forms of Latin or Ancient Greek origin. The classic example is antidisestablishmentarianism. Agglutinative languages often have more complex derivational agglutination than isolating languages, so they can do the same to a much larger extent. For example, in Hungarian, a word such as elnemzetietleníthetetlenségnek, which means "for [the purposes of] undenationalizationability", can find actual use. In the same way, there are words that have a meaning but are probably never used, such as legeslegmegszentségteleníttethetetlenebbjeitekként, which means "like the most of most undesecratable ones of you" but is hard for native speakers to decipher when heard. Using inflectional agglutination, these can be extended. For example, the official Guinness world record is Finnish epäjärjestelmällistyttämättömyydellänsäkäänköhän "I wonder if, even with his/her quality of not having been made unsystematized". It has the derived word epäjärjestelmällistyttämättömyys as the root and is lengthened with the inflectional endings -llänsäkäänköhän. However, this word is grammatically unusual, since -kään "also" is used only in negative clauses, but -kö (question) only in question clauses.
    A very popular Turkish agglutination is Çekoslovakyalılaştırabildiklerimizden miydiniz?, which is in fact a single word, although the question suffixes (miydiniz in this case) are written separately. With the ability suffix -abil-, it stands for "Were you one of those whom we were able to make Czechoslovakian?". This historical reference is used as a joke about individuals who are hard to change or who stick out in a group.

    On the other hand, Afyonkarahisarlılaştırabildiklerimizdenmişsinizcesine is a longer word and, since it contains no spaces, its status as a single word is not in doubt. It stands for "as if you were one of those whom we were able to make resemble people from Afyonkarahisar". A more recent claim is the Turkish word muvaffakiyetsizleştiricileştiriveremeyebileceklerimizdenmişsinizcesine, which means something like "(you are talking) as if you were one of those whom we cannot easily convert into an unsuccessful-person-maker" (someone who un-educates people to make them unsuccessful).

    Georgian is also a highly agglutinative language. For example, the word gadmosakontrrevolucieleblebisnairebisatvisaco (გადმოსაკონტრრევოლუციელებლებისნაირებისათვისაცო) would mean "(someone not specified) said that it is also for those who are like the ones who need to be counter-revolutionized again/back".

    Other uses of the words agglutination and agglutinative

    The words agglutination and agglutinative come from the Latin word agglutinare, `to glue together'. In linguistics, these words have been in use since 1836, when Wilhelm von Humboldt's posthumously published work Über die Verschiedenheit des menschlichen Sprachbaues und ihren Einfluß auf die geistige Entwicklung des Menschengeschlechts introduced the division of languages into isolating, inflectional, agglutinative and incorporating.

    Especially in some older literature, agglutinative is sometimes used as a synonym for synthetic. In that case, it embraces what we call agglutinative and inflectional languages, and it is an antonym of analytic or isolating. Besides the clear etymological motivation (after all, inflectional endings are also "glued" to the stems), this more general usage is justified by the fact that the distinction between agglutinative and inflectional languages is not a sharp one, as we have already seen.

    In the second half of the 19th century, many linguists believed that there is a natural cycle of language evolution: function words of the isolating type are glued to their head-words, so that the language becomes agglutinative; later morphs become merged through phonological processes, and what comes out is an inflectional language; finally inflectional endings are often dropped in quick speech, inflection is omitted and the language goes back to the isolating type.

    The following passage from Lord (1960) demonstrates well the whole range of meanings that the word agglutination may have.
    (Agglutination...) consists of the welding together of two or more terms constantly occurring as a syntagmatic group into a single unit, which becomes either difficult or impossible to analyse thereafter.

    Agglutination takes various forms. In French, welding becomes complete fusion. Latin hanc horam `at this hour' is the French adverbial unit encore. Old French tous jours becomes toujours, and dès jà (`since now') déjà (`already'). In English, on the other hand, apart from rare combinations such as good-bye from God be with you, walnut from Wales nut, window from wind-eye (O.N. vindauga), the units making up the agglutinated forms retain their identity. Words like blackbird and beefeater are a different kettle of fish; they retain their units but their ultimate meaning is not fully deducible from these units. (...)

    Saussure preferred to distinguish between compound words and truly synthesised or agglutinated combinations.

    Agglutinative languages in Natural Language Processing

    In natural language processing, languages with rich morphology pose problems of quite a different kind than isolating languages. In the case of agglutinative languages, the main obstacle lies in the large number of word forms that can be obtained from a single root. As we have already seen, the generation of these word forms is somewhat complicated by the phonological processes of the particular language. Although the basic one-to-one relationship between form and syntactic function is not broken in Finnish, the authoritative institution Kotimaisten kielten tutkimuskeskus (KOTUS, i.e. the Institute for the Languages of Finland) lists 51 declension types for Finnish nouns, adjectives, pronouns, and numerals.

    Even more problems occur with the recognition of word forms. Modern linguistic methods are largely based on the exploitation of corpora; however, when the number of possible word forms is large, any corpus will necessarily contain only a small fraction of them. Hajič (2010) claims that computer space and power are so cheap nowadays that all possible word forms may be generated beforehand and stored in the form of a lexicon listing all possible interpretations of any given word form. (The data structure of the lexicon has to be optimized so that the search is quick and efficient.) According to Hajič, it is the disambiguation of these word forms which is difficult (more so for inflective languages, where the ambiguity is high, than for agglutinative languages).
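    The idea of generating every word form in advance can be illustrated with a toy full-form lexicon in Python. This is only a sketch of the general approach, not Hajič's actual system: the tiny suffix inventory, the tag names and the helper functions are assumptions made for illustration, and the phonological alternations discussed earlier are ignored.

```python
from collections import defaultdict
from itertools import product

# Toy suffix inventory (an assumption for illustration): each slot lists
# (surface form, tag) pairs.
NUMBER = [("", "SG"), ("i", "PL")]
CASE = [("ssa", "INE"), ("sta", "ELA"), ("lla", "ADE")]


def build_lexicon(roots):
    """Generate every root + number + case form and index it by surface form."""
    lexicon = defaultdict(list)
    for root, (num, num_tag), (case, case_tag) in product(roots, NUMBER, CASE):
        # One surface form may receive several analyses, hence a list.
        lexicon[root + num + case].append((root, num_tag, case_tag))
    return dict(lexicon)


LEXICON = build_lexicon(["talo"])


def analyse(word):
    """Recognition is a plain dictionary lookup."""
    return LEXICON.get(word, [])


print(len(LEXICON))         # 6 forms for one root
print(analyse("taloissa"))  # [('talo', 'PL', 'INE')]
```

    The lexicon grows as the product of the sizes of all suffix slots for every root, which is why the space argument matters for languages with many slots.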

    Other authors do not share Hajič's view that space is no issue. Instead of listing all possible word forms in a lexicon, they implement word form analysis by modules which try to break up the surface form into a sequence of morphemes occurring in an order permissible by the language. The problem of such an analysis is the large number of morpheme boundaries typical for agglutinative languages. A word of an inflectional language has only one ending, and therefore the number of possible divisions of a word into the base and the ending is only linear in the length of the word. In an agglutinative language, where several suffixes are concatenated at the end of the word, the number of different divisions which have to be checked for consistency is large. This approach was used for example in the development of a system for Arabic, where agglutination occurs when articles, prepositions and conjunctions are joined with the following word and pronouns are joined with the preceding word. See Grefenstette et al. (2005) for more details.
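    The analysis-by-segmentation approach can be sketched as follows. The code is a minimal illustration and not the system cited above: the root list, the two suffix slots and all names are hypothetical, and a real analyser would also have to undo phonological changes at morpheme boundaries.

```python
# Hypothetical toy grammar: a root followed by optional suffixes, one per
# slot, in a fixed order (number before case).
ROOTS = {"talo"}
SLOTS = [
    {"i": "PL"},
    {"ssa": "INE", "sta": "ELA", "lla": "ADE"},
]


def parse_suffixes(rest, slot=0):
    """Yield a tag list for every way SLOTS[slot:] can consume all of `rest`."""
    if not rest:
        yield []
        return
    if slot == len(SLOTS):
        return  # leftover material that no slot can explain
    # Option 1: leave this slot empty.
    yield from parse_suffixes(rest, slot + 1)
    # Option 2: match one suffix of this slot, then continue with the next.
    for form, tag in SLOTS[slot].items():
        if rest.startswith(form):
            for tail in parse_suffixes(rest[len(form):], slot + 1):
                yield [tag] + tail


def segment(word):
    """Try every split point as a root boundary and collect consistent analyses."""
    analyses = []
    for cut in range(1, len(word) + 1):
        root, rest = word[:cut], word[cut:]
        if root in ROOTS:
            analyses.extend([root] + tags for tags in parse_suffixes(rest))
    return analyses


print(segment("taloissa"))  # [['talo', 'PL', 'INE']]
```

    Every additional slot adds another layer of candidate boundaries to test, which is the growth in the number of divisions described above.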
    The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
     