Home      Discussion      Topics      Dictionary      Almanac
Signup       Login
Compound (linguistics)

Compound (linguistics)

Discussion
Ask a question about 'Compound (linguistics)'
Start a new discussion about 'Compound (linguistics)'
Answer questions from other users
Full Discussion Forum
 
Encyclopedia
In linguistics
Linguistics
Linguistics is the scientific study of human language. Linguistics can be broadly broken into three categories or subfields of study: language form, language meaning, and language in context....

, a compound is a lexeme
Lexeme
A lexeme is an abstract unit of morphological analysis in linguistics, that roughly corresponds to a set of forms taken by a single word. For example, in the English language, run, runs, ran and running are forms of the same lexeme, conventionally written as RUN...

 (less precisely, a word
Word
In language, a word is the smallest free form that may be uttered in isolation with semantic or pragmatic content . This contrasts with a morpheme, which is the smallest unit of meaning but will not necessarily stand on its own...

) that consists of more than one stem
Word stem
In linguistics, a stem is a part of a word. The term is used with slightly different meanings.In one usage, a stem is a form to which affixes can be attached. Thus, in this usage, the English word friendships contains the stem friend, to which the derivational suffix -ship is attached to form a new...

. Compounding or composition is the word formation
Word formation
In linguistics, word formation is the creation of a new word. Word formation is sometimes contrasted with semantic change, which is a change in a single word's meaning...

 that creates compound lexemes (the other word-formation process being derivation
Derivation (linguistics)
In linguistics, derivation is the process of forming a new word on the basis of an existing word, e.g. happi-ness and un-happy from happy, or determination from determine...

). Compounding or Word-compounding refers to the faculty and device of language to form new words by combining or putting together old words. In other words, compound, compounding or word-compounding occurs when a person attaches two or more words together to make them one word. The meanings of the words interrelate in such a way that a new meaning comes out which is very different from the meanings of the words in isolation.

Formation of compounds


Compound formation rules vary widely across language types.

In a synthetic language
Synthetic language
In linguistic typology, a synthetic language is a language with a high morpheme-per-word ratio, as opposed to a low morpheme-per-word ratio in what is described as an isolating language...

, the relationship between the elements of a compound may be marked with a case or other morpheme. For example, the German
German language
German is a West Germanic language, related to and classified alongside English and Dutch. With an estimated 90 – 98 million native speakers, German is one of the world's major languages and is the most widely-spoken first language in the European Union....

 compound Kapitänspatent consists of the lexemes Kapitän (sea captain) and Patent (license) joined by an -s- (originally a genitive case
Genitive case
In grammar, genitive is the grammatical case that marks a noun as modifying another noun...

 suffix); and similarly, the Latin
Latin
Latin is an Italic language originally spoken in Latium and Ancient Rome. It, along with most European languages, is a descendant of the ancient Proto-Indo-European language. Although it is considered a dead language, a number of scholars and members of the Christian clergy speak it fluently, and...

 lexeme paterfamilias contains the archaic
Old Latin
Old Latin refers to the Latin language in the period before the age of Classical Latin; that is, all Latin before 75 BC...

 genitive form familias of the lexeme familia (family). Conversely, in the Hebrew language
Hebrew language
Hebrew is a Semitic language of the Afroasiatic language family. Culturally, is it considered by Jews and other religious groups as the language of the Jewish people, though other Jewish languages had originated among diaspora Jews, and the Hebrew language is also used by non-Jewish groups, such...

 compound, the word בֵּית סֵפֶר bet sefer (school), it is the head that is modified: the compound literally means "house-of book", with בַּיִת bayit (house) having entered the construct state to become בֵּית bet (house-of). This latter pattern is common throughout the Semitic languages
Semitic languages
The Semitic languages are a group of related languages whose living representatives are spoken by more than 270 million people across much of the Middle East, North Africa and the Horn of Africa...

, though in some it is combined with an explicit genitive case, so that both parts of the compound are marked.

Agglutinative language
Agglutinative language
An agglutinative language is a language that uses agglutination extensively: most words are formed by joining morphemes together. This term was introduced by Wilhelm von Humboldt in 1836 to classify languages from a morphological point of view...

s tend to create very long words with derivational morphemes. Compounds may or may not require the use of derivational morphemes also.
The longest compounds in the world may be found in the Finnish
Finnish language
Finnish is the language spoken by the majority of the population in Finland Primarily for use by restaurant menus and by ethnic Finns outside Finland. It is one of the two official languages of Finland and an official minority language in Sweden. In Sweden, both standard Finnish and Meänkieli, a...

 and Germanic languages
Germanic languages
The Germanic languages constitute a sub-branch of the Indo-European language family. The common ancestor of all of the languages in this branch is called Proto-Germanic , which was spoken in approximately the mid-1st millennium BC in Iron Age northern Europe...

. In German
German language
German is a West Germanic language, related to and classified alongside English and Dutch. With an estimated 90 – 98 million native speakers, German is one of the world's major languages and is the most widely-spoken first language in the European Union....

, extremely
extendable compound words can be found in the language of chemical compounds, where in the cases of biochemistry and polymers, they can be practically unlimited in length.
German examples include Farbfernsehgerät (color television set), Funkfernbedienung (radio remote control), and the jocular word Donaudampfschifffahrtsgesellschaftskapitänsmütze (Danube steamboat shipping company
Donaudampfschiffahrtsgesellschaft
The Erste Donau-Dampfschiffahrts-Gesellschaft, or Donaudampfschiffahrtsgesellschaft was a shipping company founded in 1829 by the Austrian government for transporting passengers and cargo on the Danube.In 1880, the DDSG was the world's largest river shipping company with more than 200 steamboat...

 Captain's hat).

In Finnish there is no theoretical limit to the length of compound words, but in practice words consisting of more than three components are rare. Even those can look mysterious to non-Finnish, take hätäuloskäytävä (emergency exit) as an example. Internet folklore sometimes suggests that lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas (Airplane jet turbine engine auxiliary mechanic non-commissioned officer student) would be the longest word in Finnish, but evidence of it actually being used is scant and anecdotic at best.

Compounds can be rather long when translating technical documents from English to some other language, for example, Swedish. "Motion estimation search range settings" can be directly translated to rörelseuppskattningssökintervallsinställningar; the length of the words are theoretically unlimited, especially in chemical terminology.

Semantic classification


A common semantic classification of compounds yields four types:
  • endocentric
  • exocentric (also bahuvrihi)
  • copulative (also dvandva)
  • appositional


An endocentric
Endocentric
In linguistics, an endocentric construction is a grammatical construction that fulfills the same linguistic function as one of its parts. An endocentric construction is not an exocentric construction, and an exocentric construction is not an endocentric construction...

 compound
consists of a head
Head (linguistics)
In linguistics, the head is the word that determines the syntactic type of the phrase of which it is a member, or analogously the stem that determines the semantic category of a compound of which it is a component. The other elements modify the head....

, i.e. the categorical part that contains the basic meaning of the whole compound, and modifiers, which restrict this meaning. For example, the English compound doghouse, where house is the head and dog is the modifier, is understood as a house intended for a dog. Endocentric compounds tend to be of the same part of speech (word class) as their head, as in the case of doghouse. (Such compounds were called tatpuruṣa
Tatpurusa
In Sanskrit grammar a ' compound is a dependent determinative compound, i.e. a compound XY meaning a type of Y which is related to X in a way corresponding to one of the grammatical cases of X....

in the Sanskrit tradition.)

Exocentric compounds (called a bahuvrihi
Bahuvrihi
A bahuvrihi compound is a type of compound that denotes a referent by specifying a certain characteristic or quality the referent possesses. A bahuvrihi is exocentric, so that the compound is not a hyponym of its head...

compound in the Sanskrit tradition) are hyponyms of some unexpressed semantic head (e.g. a person, a plant, an animal...), and their meaning often cannot be transparently guessed from its constituent parts. For example, the English compound white-collar is neither a kind of collar nor a white thing. In an exocentric compound, the word class is determined lexically, disregarding the class of the constituents. For example, a must-have is not a verb but a noun. The meaning of this type of compound can be glossed as "(one) whose B is A", where B is the second element of the compound and A the first. A bahuvrihi compound is one whose nature is expressed by neither of the words: thus a white-collar person is neither white nor a collar (the collar's colour is a metaphor for socioeconomic status). Other English examples include barefoot and Blackbeard.

Copulative compounds are compounds which have two semantic heads.

Appositional compounds refer to lexemes that have two (contrary) attributes which classify the compound.
Type Description Examples
endocentric A+B denotes a special kind of B darkroom, smalltalk
exocentric A+B denotes a special kind of an unexpressed semantic head skinhead, paleface (head: 'person')
copulative A+B denotes 'the sum' of what A and B denote bittersweet, sleepwalk
appositional A and B provide different descriptions for the same referent actor-director, maidservant

Noun–noun compounds


Most natural languages have compound nouns. The positioning of the words (i. e. the most common order of constituents in phrases where nouns are modified by adjectives, by possessors, by other nouns, etc.) varies according to the language. While Germanic languages, for example, are left-branching when it comes to noun phrases (the modifiers come before the head), the Romance languages are usually right-branching.

In French
French language
French is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the regions of Quebec and Acadia in Canada, and by various communities elsewhere. Second-language speakers of French are distributed throughout many parts...

, compound nouns are often formed by left-hand heads with prepositional components inserted before the modifier, as in chemin-de-fer 'railway' lit. 'road of iron' and moulin à vent 'windmill', lit. 'mill (that works)-by-means-of wind'.

In Turkish
Turkish language
Turkish is a language spoken as a native language by over 83 million people worldwide, making it the most commonly spoken of the Turkic languages. Its speakers are located predominantly in Turkey and Northern Cyprus with smaller groups in Iraq, Greece, Bulgaria, the Republic of Macedonia, Kosovo,...

, one way of forming compound nouns is as follows: yeldeğirmeni ‘windmill’ (yel: wind, değirmen-i:mill-possessive); demiryolu 'railway'(demir: iron, yol-u: road-possessive).

Verb–noun compounds


A type of compound that is fairly common in the Indo-European languages
Indo-European languages
The Indo-European languages are a family of several hundred related languages and dialects, including most major current languages of Europe, the Iranian plateau, and South Asia and also historically predominant in Anatolia...

 is formed of a verb and its object, and in effect transforms a simple verbal clause into a noun.

In Spanish
Spanish language
Spanish , also known as Castilian , is a Romance language in the Ibero-Romance group that evolved from several languages and dialects in central-northern Iberia around the 9th century and gradually spread with the expansion of the Kingdom of Castile into central and southern Iberia during the...

, for example, such compounds consist of a verb conjugated for third person singular, present tense, indicative mood followed by a noun (usually plural): e.g., rascacielos (modelled on "skyscraper", lit. 'scratches skies'), sacacorchos ('corkscrew', lit. 'removes corks'), guardarropas ('wardrobe', lit. 'stores clothing'). These compounds are formally invariable in the plural (but in many cases they have been reanalyzed as plural forms, and a singular form has appeared). French and Italian have these same compounds with the noun in the singular form: Italian grattacielo, 'skyscraper'; French grille-pain, 'toaster' (lit. 'toasts bread') and torche-cul 'ass-wipe' (Rabelais: See his "propos torcheculatifs").

This construction exists in English, generally with the verb and noun both in uninflected form: examples are spoilsport, killjoy, breakfast, cutthroat, pickpocket, dreadnought, and know-nothing.

Also common in English is another type of verb–noun (or noun–verb) compound, in which an argument of the verb is incorporated
Incorporation (linguistics)
Incorporation is a phenomenon by which a word, usually a verb, forms a kind of compound with, for instance, its direct object or adverbial modifier, while retaining its original syntactic function....

 into the verb, which is then usually turned into a gerund
Gerund
In linguistics* As applied to English, it refers to the usage of a verb as a noun ....

, such as breastfeeding, finger-pointing, etc. The noun is often an instrumental complement. From these gerunds new verbs can be made: (a mother) breastfeeds (a child) and from them new compounds mother-child breastfeeding, etc.

In the Australian Aboriginal language Jingulu, (a Pama–Nyungan language), it is claimed that all verbs are V+N compounds, such as "do a sleep", or "run a dive", and the language has only three basic verbs: do, make, and run.

A special kind of composition is incorporation
Incorporation (linguistics)
Incorporation is a phenomenon by which a word, usually a verb, forms a kind of compound with, for instance, its direct object or adverbial modifier, while retaining its original syntactic function....

, of which noun incorporation into a verbal root (as in English backstabbing, breastfeed, etc.) is most prevalent (see below).

Verb–verb compounds



Verb–verb compounds are sequences of more than one verb acting together to determine clause structure. They have two types:
  • In a serial verb
    Serial verb construction
    The serial verb construction, also known as serialization, is a syntactic phenomenon common to many African, Asian and New Guinean languages...

    , two actions, often sequential, are expressed in a single clause. For example, Ewe
    Ewe language
    Ewe is a Niger–Congo language spoken in Ghana, Togo and Benin by approximately six million people. Ewe is part of a cluster of related languages commonly called Gbe, spoken in southeastern Ghana, Togo, and parts of Benin. Other Gbe languages include Fon, Gen, Phla Phera, and Aja...

     trɔ dzo, lit. "turn leave", means "turn and leave", and Hindi
    Hindi
    Standard Hindi, or more precisely Modern Standard Hindi, also known as Manak Hindi , High Hindi, Nagari Hindi, and Literary Hindi, is a standardized and sanskritized register of the Hindustani language derived from the Khariboli dialect of Delhi...

      jā-kar dekh-o, lit. "go-CONJUNCTIVE PARTICIPLE see-IMPERATIVE", means "go and see". In each case, the two verbs together determine the semantics and argument structure.


Serial verb expressions in English may include What did you go and do that for?, or He just upped and left; this is however not quite a true compound since they are connected by a conjunction and the second missing arguments may be taken as a case of ellipsis
Elliptical construction
In linguistics, ellipsis or elliptical construction refers to the omission from a clause of one or more words that would otherwise be required by the remaining elements.-Overview:...

.
  • In a compound verb (or complex predicate), one of the verbs is the primary, and determines the primary semantics and also the argument structure. The secondary verb, often called a vector verb or explicator, provides fine distinctions, usually in temporality or aspect
    Grammatical aspect
    In linguistics, the grammatical aspect of a verb is a grammatical category that defines the temporal flow in a given action, event, or state, from the point of view of the speaker...

    , and also carries the inflection
    Inflection
    In grammar, inflection or inflexion is the modification of a word to express different grammatical categories such as tense, grammatical mood, grammatical voice, aspect, person, number, gender and case...

     (tense and/or agreement markers). The main verb usually appears in conjunctive participial (sometimes zero) form. For examples, Hindi
    Hindi
    Standard Hindi, or more precisely Modern Standard Hindi, also known as Manak Hindi , High Hindi, Nagari Hindi, and Literary Hindi, is a standardized and sanskritized register of the Hindustani language derived from the Khariboli dialect of Delhi...

      nikal gayā, lit. "exit went", means 'went out', while निकल पड़ा nikal paRā, lit. "exit fell", means 'departed' or 'was blurted out'. In these examples निकल nikal is the primary verb, and गया gayā and पड़ा paRā are the vector verbs. Similarly, in both English start reading and Japanese 読み始める yomihajimeru "start-CONJUNCTIVE-read" "start reading," the vector verbs start and 始める hajimeru "start" change according to tense, negation, and the like, while the main verbs reading and 読み yomi "reading" usually remain the same. An exception to this is the passive voice, in which both English and Japanese modify the main verb, i.e. start to be read and 読まれ始める yomarehajimeru lit. "read-PASSIVE-(CONJUNCTIVE)-start" start to be read. With a few exceptions all compound verbs alternate with their simple counterparts. That is, removing the vector does not affect grammaticality at all nor the meaning very much: निकला nikalā '(He) went out.' In a few languages both components of the compound verb can be finite forms: Kurukh
    Kurukh language
    Kurukh , also called Kurux, Kuṛux or Kuruḵẖ, is a Dravidian language spoken by the Oraon and Kisan tribal peoples of Bihar, Jharkhand, Orissa, Madhya Pradesh, Chhattisgarh, and West Bengal, India, as well as in northern Bangladesh. It is most closely related to Brahui and Malto...

     kecc-ar ker-ar lit. "died-3pl went-3pl" '(They) died.'

  • Compound verbs are very common in some languages, such as the northern Indo-Aryan languages
    Indo-Aryan languages
    The Indo-Aryan languages constitutes a branch of the Indo-Iranian languages, itself a branch of the Indo-European language family...

     Hindi-Urdu and Panjabi
    Punjabi language
    Punjabi is an Indo-Aryan language spoken by inhabitants of the historical Punjab region . For Sikhs, the Punjabi language stands as the official language in which all ceremonies take place. In Pakistan, Punjabi is the most widely spoken language...

     where as many as 20% of verb forms in running text are compound. They exist but are less common in Dravidian languages
    Dravidian languages
    The Dravidian language family includes approximately 85 genetically related languages, spoken by about 217 million people. They are mainly spoken in southern India and parts of eastern and central India as well as in northeastern Sri Lanka, Pakistan, Nepal, Bangladesh, Afghanistan, Iran, and...

     and in other Indo-Aryan languages like Marathi
    Marathi language
    Marathi is an Indo-Aryan language spoken by the Marathi people of western and central India. It is the official language of the state of Maharashtra. There are over 68 million fluent speakers worldwide. Marathi has the fourth largest number of native speakers in India and is the fifteenth most...

     and Nepali
    Nepali language
    Nepali or Nepalese is a language in the Indo-Aryan branch of the Indo-European language family.It is the official language and de facto lingua franca of Nepal and is also spoken in Bhutan, parts of India and parts of Myanmar...

    , in Tibeto-Burman languages
    Tibeto-Burman languages
    The Tibeto-Burman languages are the non-Chinese members of the Sino-Tibetan language family, over 400 of which are spoken thoughout the highlands of southeast Asia, as well as lowland areas in Burma ....

     like Limbu
    Limbu language
    Limbu is a Tibeto-Burman language spoken in Nepal, Bhutan, Sikkim, Kashmir and Darjeeling district, West Bengal, India, by the Limbu community. Virtually all Limbus are bilingual in Nepali....

     and Newari, in potentially Altaic languages
    Altaic languages
    Altaic is a proposed language family that includes the Turkic, Mongolic, Tungusic, and Japonic language families and the Korean language isolate. These languages are spoken in a wide arc stretching from northeast Asia through Central Asia to Anatolia and eastern Europe...

     like Turkish
    Turkish language
    Turkish is a language spoken as a native language by over 83 million people worldwide, making it the most commonly spoken of the Turkic languages. Its speakers are located predominantly in Turkey and Northern Cyprus with smaller groups in Iraq, Greece, Bulgaria, the Republic of Macedonia, Kosovo,...

    , Korean
    Korean language
    Korean is the official language of the country Korea, in both South and North. It is also one of the two official languages in the Yanbian Korean Autonomous Prefecture in People's Republic of China. There are about 78 million Korean speakers worldwide. In the 15th century, a national writing...

    , Japanese
    Japanese language
    is a language spoken by over 130 million people in Japan and in Japanese emigrant communities. It is a member of the Japonic language family, which has a number of proposed relationships with other languages, none of which has gained wide acceptance among historical linguists .Japanese is an...

    , Kazakh
    Kazakh language
    Kazakh is a Turkic language which belongs to the Kipchak branch of the Turkic languages, closely related to Nogai and Karakalpak....

    , Uzbek
    Uzbek language
    Uzbek is a Turkic language and the official language of Uzbekistan. It has about 25.5 million native speakers, and it is spoken by the Uzbeks in Uzbekistan and elsewhere in Central Asia...

    , and Kyrgyz
    Kyrgyz language
    Kyrgyz or Kirgiz, also Kirghiz, Kyrghiz, Qyrghiz is a Turkic language and, together with Russian, an official language of Kyrgyzstan...

    , and in northeast Caucasian languages like Tsez and Avar
    Avar language
    The modern Avar language belongs to the Avar–Andic group of the Northeast Caucasian language family....

    .
  • Under the influence of a Quichua
    Kichwa
    Kichwa is a Quechuan language, and includes all Quechua varieties spoken in Ecuador and Colombia by approximately 2,500,000 people...

     substrate speakers living in the Ecuadorian altiplano
    Altiplano
    The Altiplano , in west-central South America, where the Andes are at their widest, is the most extensive area of high plateau on Earth outside of Tibet...

     have innovated compound verbs in Spanish:
De rabia puso rompiendo la olla, 'In anger (he/she) smashed the pot.' (Lit. from anger put breaking the pot)
Botaremos matándote 'We will kill you.' (Cf. Quichua huañuchi-shpa shitashun, lit. kill-CP throw.1plFut, तेरे को मार डालेंगे )

  • Compound verb equivalents in English (examples from the internet):
What did you go and do that for?
If you are not giving away free information on your web site then a huge proportion of your business is just upping and leaving.
Big Pig, she took and built herself a house out of brush.
  • Caution: In descriptions of Persian
    Persian language
    Persian is an Iranian language within the Indo-Iranian branch of the Indo-European languages. It is primarily spoken in Iran, Afghanistan, Tajikistan and countries which historically came under Persian influence...

     and other Iranian languages
    Iranian languages
    The Iranian languages form a subfamily of the Indo-Iranian languages which in turn is a subgroup of Indo-European language family. They have been and are spoken by Iranian peoples....

     the term 'compound verb' refers to noun-plus-verb compounds, not to the verb–verb compounds discussed here.

Compound adpositions


Compound prepositions formed by prepositions and nouns are common in English and the Romance languages (consider English on top of, Spanish encima de, etc.). Japanese shows the same pattern, except the word order is the opposite (with postpositions): no naka (lit. "of inside", i.e. "on the inside of"). Hindi has a small number of simple (i.e., one-word) postpositions and a large number of compound postpositions, mostly consisting of simple postposition ke followed by a specific postposition (e.g., ke pas, "near"; ke nīche, "underneath").

Examples from different languages


Anishinaabemowin/Ojibwe:
  • mashkikiwaaboo 'tonic': mashkiki 'medicine' + waaboo 'liquid'
  • miskomin 'raspberry': misko 'red' + miin 'berry'
  • dibik-giizis 'moon': dibik 'night' + giizis 'sun'
  • gichi-mookomaan 'white person/American': gichi 'big' + mookomaan 'knife'


Chinese (Cantonese Jyutping
Jyutping
Jyutping is a romanization system for Cantonese developed by the Linguistic Society of Hong Kong in 1993. Its formal name is The Linguistic Society of Hong Kong Cantonese Romanization Scheme...

):
  • 學生 'student': 學 hok6 learn + 生 sang1 grow
  • 太空 'universe': 太 taai3 great + 空 hung1 emptiness
  • 摩天樓 'skyscraper': 摩 mo1 touch + 天 tin1 sky + 樓 lau2 building (with more than 1 storey)
  • 打印機 'printer': 打 daa2 strike + 印 yan3 stamp/print + 機 gei1 machine
  • 百科全書 'encyclopaedia': 百 baak3 100 + 科 fo1 (branch of) study + 全 cyun4 entire/complete + 書 syu1 book


Dutch:
  • Arbeidsongeschiktheidsverzekering 'disability insurance': arbeid 'labour', + ongeschiktheid 'inaptitude', + verzekering 'insurance'.
  • Rioolwaterzuiveringsinstallatie 'wastewater treatment plant': riool 'sewer', + water 'water', + zuivering 'cleaning', + installatie 'installation'.
  • Verjaardagskalender 'birthday calendar': verjaardag 'birthday', + kalender 'calendar'.
  • Klantenservicemedewerker 'customer service representative': klanten 'customers', + service 'service', + medewerker 'worker'.
  • Universiteitsbibliotheek 'university library': universiteit 'university', + bibliotheek 'library'.
  • Doorgroeimogelijkheden 'possibilities for advancement': door 'through', + groei 'grow', + mogelijkheden 'possibilities'.


Finnish:
  • sanakirja 'dictionary': sana 'word', + kirja 'book'
  • tietokone 'computer': tieto 'knowledge, data', + kone 'machine'
  • keskiviikko 'Wednesday': keski 'middle', + viikko 'week'
  • maailma 'world': maa 'land', + ilma 'air'
  • rautatieasema 'railway station': rauta 'iron' + tie 'road' + asema 'station'
  • suihkuturbiiniapumekaanikkoaliupseerioppilas: 'Jet engine assistant mechanic NCO student'
  • atomiydinenergiareaktorigeneraattorilauhduttajaturbiiniratasvaihde: some part of a nuclear plant

German:
  • Wolkenkratzer 'skyscraper': wolken 'clouds', + kratzer 'scraper'
  • Eisenbahn 'railway': Eisen 'iron', + bahn 'track'
  • Kraftfahrzeug 'automobile': Kraft 'power', + fahren/fahr 'drive', + zeug 'machinery'
  • Stacheldraht 'barbed wire': stachel 'barb/barbed', + draht 'wire'
  • Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz
    Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz
    Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz is a law of the German state of Mecklenburg-Vorpommern of 2000, dealing with the supervision of the labeling of beef.The name is an example of the virtually unlimited compounding of nouns that is...

    : literally, Cattle marking and beef labeling supervision duties delegation law


Icelandic:
  • járnbraut 'railway': járn 'iron', + braut 'path' or 'way'
  • farartæki 'vehicle': farar 'journey', + tæki 'apparatus'
  • alfræðiorðabók 'encyclopædia': al 'everything', + fræði 'study' or 'knowledge', + orða 'words', + bók 'book'
  • símtal 'telephone conversation': sím 'telephone', + tal 'dialogue'


Italian:
  • Millepiedi 'centipede': mille 'thousand', + piedi 'feet'
  • Ferrovia 'railway': ferro 'iron', + via 'way'
  • Tergicristallo 'windscreen wiper': tergere 'to wash', + cristallo 'crystal, (pane of) glass'


Japanese:
  • 目覚まし(時計) mezamashi(dokei) 'alarm clock': 目 me 'eye' + 覚まし samashi (-zamashi) 'awakening (someone)' (+ 時計 tokei (-dokei) clock)
  • お好み焼き okonomiyaki
    Okonomiyaki
    is a Japanese dish containing a variety of ingredients. The name is derived from the word okonomi, meaning "what you like" or "what you want", and yaki meaning "grilled" or "cooked" . Okonomiyaki is mainly associated with Kansai or Hiroshima areas of Japan, but is widely available throughout the...

    : お好み okonomi 'preference' + 焼き yaki 'cooking'
  • 日帰り higaeri 'day trip': 日 hi 'day' + 帰り kaeri (-gaeri) 'returning (home)'
  • 国会議事堂 kokkaigijidō 'national diet building': 国会 kokkai 'national diet' + 議事 giji 'proceedings' + 堂 'hall'


Korean:
  • 안팎 anpak 'inside and outside': 안 an 'inside' + 밖 bak 'outside' (As two nouns compound, the consonant sound 'b' fortifies into 'p,' becoming 안팎 anpak rather than 안밖 anbak)


Spanish:
  • Ciencia-ficción 'science fiction': ciencia, 'science', + ficción, 'fiction' (This word is a calque
    Calque
    In linguistics, a calque or loan translation is a word or phrase borrowed from another language by literal, word-for-word or root-for-root translation.-Calque:...

     from the English expression science fiction
    Science fiction
    Science fiction is a genre of fiction dealing with imaginary but more or less plausible content such as future settings, futuristic science and technology, space travel, aliens, and paranormal abilities...

    . In English, the head of a compound word is the last morpheme: science fiction. Conversely, the Spanish head is located at the front, so ciencia ficción sounds like a kind of fictional science rather than scientific fiction.)
  • Ciempiés 'centipede': cien 'hundred', + pies 'feet'
  • Ferrocarril 'railway': ferro 'iron', + carril 'lane'
  • Paraguas 'umbrella': para 'to stop, stops' + aguas '(the) water'
  • Cabizbajo 'keeping the head low, in a bad mood'
  • Subibaja 'seesaw'

Germanic languages


In Germanic languages, compound words are formed by prepending a descriptive word in front of the main word. For example, "starfish" is a specific "fish" with a "star" shape. Likewise, the noun phrase "star shape" means a "star"like "shape" (whatever a star is). Whereas "starfish" has an explicit definition, this is not required, as compounds like "star shape" and "starlike" can be composed when needed and understood by their implicit meaning. The compound word is understood as a word in itself. Therefore, it may in turn be used in new compound words, so forming an arbitrarily long word is trivial. This contrasts to Romance languages, where prepositions are more used to specify word relationships instead of concatenating the words.

As a member of the Germanic family of languages, English is special in that compound words are usually written in their separate parts. Although English does not form compound nouns to the extent of Dutch or German, noun phrases like "Girl Scout troop", "city council member", and "cellar door" are arguably compound nouns and used as such in speech. Writing them as separate words is merely an orthographic convention, possibly a result of influence from French.

Russian language


In the Russian language
Russian language
Russian is a Slavic language used primarily in Russia, Belarus, Uzbekistan, Kazakhstan, Tajikistan and Kyrgyzstan. It is an unofficial but widely spoken language in Ukraine, Moldova, Latvia, Turkmenistan and Estonia and, to a lesser extent, the other countries that were once constituent republics...

 compounding is a common type of word formation
Word formation
In linguistics, word formation is the creation of a new word. Word formation is sometimes contrasted with semantic change, which is a change in a single word's meaning...

, and several types of compounds exist, both in terms of compounded parts of speech and of the way of the formation of a compound.

Compound nouns may be agglutinative compounds, hyphenated compounds (стол-книга 'folding table' lit. 'table-book', i.e., "book-like table"), or abbreviated compounds (portmanteaux: колхоз 'kolkhoz
Kolkhoz
A kolkhoz , plural kolkhozy, was a form of collective farming in the Soviet Union that existed along with state farms . The word is a contraction of коллекти́вное хозя́йство, or "collective farm", while sovkhoz is a contraction of советское хозяйство...

'). Some compounds look like portmanteaux, while in fact they are an agglutinations of type stem + word: Академгородок 'Akademgorodok
Akademgorodok
Akademgorodok , is a part of the Russian city Novosibirsk, located 20 km south of the city center. It is the educational and scientific centre of Siberia...

' (from akademichesky gorodok 'academic village'). In agglutinative compound nouns, an agglutinating infix is typically used: пароход 'steamship': пар + о + ход. Compound nouns may be created as noun+noun, adjective + noun, noun + adjective (rare), noun + verb (or, rather, noun + verbal noun
Verbal noun
In linguistics, the verbal noun turns a verb into a noun and corresponds to the infinitive in English language usage. In English the infinitive form of the verb is formed when preceded by to, e.g...

).

Compound adjectives may be formed either per se, e.g., бело-розовый 'white-pink', or as a result of compounding during the derivation of an adjective from a multiword term: Каменноостровский проспект ([kəmʲɪnnʌʌˈstrovskʲɪj prʌˈspʲɛkt]) 'Stone Island Avenue', a street in St.Petersburg.

Reduplication in Russian language is also a source of compounds.

Quite a few Russian words are borrowed from other languages in an already compounded form, including numerous "classical compound
Classical compound
Classical compounds are compound words composed from Latin or Ancient Greek root words. A large portion of the technical and scientific lexicon of English and other Western European languages consists of classical compounds. For example, bio- combines with -graphy to form biography...

s" or internationalisms
Internationalism (linguistics)
In linguistics, an internationalism or international word is a loanword that occurs in several languages with the same or at least similar meaning and etymology. These words exist in "several different languages as a result of simultaneous or successive borrowings from the ultimate source"...

: автомобиль 'automobile'.

Sanskrit language



Sanskrit is very rich in compound formation with seven major compound types and as many as 55 sub-types. The compound formation process is an open-set, and it is not possible to list all Sanskrit compounds in a dictionary. Compounds of two or three words are more frequent, but longer compounds with some running through pages are not rare in Sanskrit literature. Some examples are below (hyphens below show individual word boundaries for ease of reading but are not required in original Sanskrit).
  • हिमालय (IAST Himālaya, decomposed as hima-ālaya): Name of the Himalaya mountain range. Literally the abode of snow. A compound of two words and four syllables.
  • प्रवर-मुकुट-मणि-मरीचि-मञ्जरी-चय-चर्चित-चरण-युगल (IAST pravara-mukuṭa-maṇi-marīci-mañjarī-caya-carcita-caraṇa-yugala): Literally, O the one whose dual feet are covered by the cluster of brilliant rays from the gems of the best crowns, from the Sanskrit work Panchatantra
    Panchatantra
    The Panchatantra is an ancient Indian inter-related collection of animal fables in verse and prose, in a frame story format. The original Sanskrit work, which some scholars believe was composed in the 3rd century BCE, is attributed to Vishnu Sharma...

    . A compound of nine words and 25 syllables.
  • कमला-कुच-कुङ्कुम-पिञ्जरीकृत-वक्षः-स्थल-विराजित-महा-कौस्तुभ-मणि-मरीचि-माला-निराकृत-त्रि-भुवन-तिमिर (IAST kamalā-kuca-kuṅkuma-piñjarīkṛta-vakṣaḥ-sthala-virājita-mahā-kaustubha-maṇi-marīci-mālā-nirākṛta-tri-bhuvana-timira): Literally O the one who dispels the darkness of three worlds by the shine of Kaustubha
    Kaustubha
    Kaustubh is a divine jewel - the most valuable stone "Mani" is in the possession of lord Vishnu who lives in the Ksheer Sagar - "the ocean of milk".-In History :...

     jewel hanging on the chest which has been made reddish-yellow by the saffron from the bosom of Kamalā (Lakshmi
    Lakshmi
    Lakshmi or Lakumi is the Hindu goddess of wealth, prosperity , light, wisdom, fortune, fertility, generosity and courage; and the embodiment of beauty, grace and charm. Representations of Lakshmi are also found in Jain monuments...

    )
    , an adjective of Rama
    Rama
    Rama or full name Ramachandra is considered to be the seventh avatar of Vishnu in Hinduism, and a king of Ayodhya in ancient Indian...

     in the Kakabhushundi Rāmāyaṇa
    Ramayana
    The Ramayana is an ancient Sanskrit epic. It is ascribed to the Hindu sage Valmiki and forms an important part of the Hindu canon , considered to be itihāsa. The Ramayana is one of the two great epics of India and Nepal, the other being the Mahabharata...

    . A compound of 16 words and 44 syllables.
  • साङ्ख्य-योग-न्याय-वैशेषिक-पूर्व-मीमांसा-वेदान्त-नारद-शाण्डिल्य-भक्ति-सूत्र-गीता-वाल्मीकीय-रामायण-भागवतादि-सिद्धान्त-बोध-पुरः-सर-समधिकृताशेष-तुलसी-दास-साहित्य-सौहित्य-स्वाध्याय-प्रवचन-व्याख्यान-परम-प्रवीणाः (IAST sāṅkhya-yoga-nyāya-vaiśeṣika-pūrva-mīmāṃsā-vedānta-nārada-śāṇḍilya-bhakti-sūtra-gītā-vālmīkīya-rāmāyaṇa-bhāgavatādi-siddhānta-bodha-puraḥ-sara-samadhikṛtāśeṣa-tulasī-dāsa-sāhitya-sauhitya-svādhyāya-pravacana-vyākhyāna-parama-pravīṇāḥ): Literally the acclaimed forerunner in understanding of the canons of Sāṅkhya, Yoga
    Yoga
    Yoga is a physical, mental, and spiritual discipline, originating in ancient India. The goal of yoga, or of the person practicing yoga, is the attainment of a state of perfect spiritual insight and tranquility while meditating on Supersoul...

    , Nyāya
    Nyaya
    ' is the name given to one of the six orthodox or astika schools of Hindu philosophy—specifically the school of logic...

    , Vaiśeṣika
    Vaisheshika
    Vaisheshika or ' is one of the six Hindu schools of philosophy of India. Historically, it has been closely associated with the Hindu school of logic, Nyaya....

    , Pūrva Mīmāṃsā, Vedānta
    Vedanta
    Vedānta was originally a word used in Hindu philosophy as a synonym for that part of the Veda texts known also as the Upanishads. The name is a morphophonological form of Veda-anta = "Veda-end" = "the appendix to the Vedic hymns." It is also speculated that "Vedānta" means "the purpose or goal...

    , Nārada Bhakti Sūtra, Śāṇḍilya Bhakti Sūtra, Bhagavad Gītā
    Bhagavad Gita
    The ' , also more simply known as Gita, is a 700-verse Hindu scripture that is part of the ancient Sanskrit epic, the Mahabharata, but is frequently treated as a freestanding text, and in particular, as an Upanishad in its own right, one of the several books that constitute general Vedic tradition...

    , the Ramayana of Vālmīki
    Valmiki
    Valmiki is celebrated as the poet harbinger in Sanskrit literature. He is the author of the epic Ramayana, based on the attribution in the text of the epic itself. He is revered as the Adi Kavi, which means First Poet, for he discovered the first śloka i.e...

    , Śrīmadbhāgavata
    Bhagavata purana
    The Bhāgavata Purāṇa is one of the "Maha" Puranic texts of Hindu literature, with its primary focus on bhakti to the incarnations of Vishnu, particularly Krishna...

    ; and the most skilled in comprehensive self-study, discoursing and expounding of the complete works of Gosvāmī Tulasīdāsa
    Tulsidas
    Tulsidas , was a Hindu poet-saint, reformer and philosopher renowned for his devotion for the god Rama...

    . An adjective used in a panegyric of Jagadguru Rambhadracharya. The hyphens show only those word boundaries where there is no Sandhi
    Sandhi
    Sandhi is a cover term for a wide variety of phonological processes that occur at morpheme or word boundaries . Examples include the fusion of sounds across word boundaries and the alteration of sounds due to neighboring sounds or due to the grammatical function of adjacent words...

    . On including word boundaries with Sandhis (vedānta=veda-anta, rāmāyaṇa=rāma-ayana, bhāgavatādi=bhāgavata-ādi, siddhānta=siddha-anta, samadhikṛtāśeṣa=samadhikṛta-aśeṣa, svādhyāya=sva-adhyāya), this is a compound of 35 words and 86 syllables.

Recent trends


Although there is no universally agreed-upon guideline regarding the use of compound words in the English language
English language
English is a West Germanic language that arose in the Anglo-Saxon kingdoms of England and spread into what was to become south-east Scotland under the influence of the Anglian medieval kingdom of Northumbria...

, in recent decades written English has displayed a noticeable trend towards increased use of compounds. Recently, many words have been made by taking syllables of words and compounding them, such as pixel (picture element) and bit (binary digit). This is called a syllabic abbreviation.

There is a trend in Scandinavian languages towards splitting compound words, known in Norwegian as "særskrivingsfeil" (separate writing error). Because the Norwegian language relies heavily on the distinction between the compound word and the sequence of the separate words it consists of, this has dangerous implications. For example "røykfritt" (smokefree, meaning no smoking) has been seen confused with "røyk fritt" (smoke freely).

The German spelling reform of 1996 introduced the option of hyphenating compound nouns when it enhances comprehensibility and readability. This is done mostly with very long compound words by separating them into two or more smaller compounds, like Eisenbahn-Unterführung (railway underpass) or Kraftfahrzeugs-Betriebsanleitung (car manual). Such practice is also permitted in Norwegian (Bokmål and Nynorsk), and encouraged between parts of the word that have very different pronunciation, such as when one part is a loan word or an acronym.

See also

  • Bracketing paradox
    Bracketing paradox
    In linguistic morphology, the term bracketing paradox refers to morphologically complex words which apparently have more than one incompatible analysis, or bracketing, simultaneously....

  • Incorporation (linguistics)
    Incorporation (linguistics)
    Incorporation is a phenomenon by which a word, usually a verb, forms a kind of compound with, for instance, its direct object or adverbial modifier, while retaining its original syntactic function....

  • Multiword expression
    Multiword expression
    A multiword expression is a lexeme made up of a sequence of two or more lexemes that has properties that are not predictable from the properties of the individual lexemes or their normal mode of combination....

  • Neologism
  • Noun adjunct
    Noun adjunct
    In grammar, a noun adjunct or attributive noun or noun premodifier is a noun that modifies another noun and is optional — meaning that it can be removed without changing the grammar of the sentence; it is a noun functioning as an adjective. For example, in the phrase "chicken soup" the noun adjunct...

  • Portmanteau compounds
  • Status constructus
    Status constructus
    The construct state or status constructus is a noun form occurring in Afro-Asiatic languages. It is particularly common in Semitic languages , Berber languages, and in the extinct Egyptian language...

  • Word formation
    Word formation
    In linguistics, word formation is the creation of a new word. Word formation is sometimes contrasted with semantic change, which is a change in a single word's meaning...

  • Syllabic abbreviation
  • Tweebuffelsmeteenskootmorsdoodgeskietfontein
    Tweebuffelsmeteenskootmorsdoodgeskietfontein
    Tweebuffelsmeteenskootmorsdoodgeskietfontein is a farm in the North West province of South Africa, located about 200 km west of Pretoria and 20 km east of Lichtenburg whose 44-character name has entered South African folklore in much the same way that...


External links