Word


 
 

A word is a unit of language that carries meaning, consists of one or more morphemes linked more or less tightly together, and has a phonetic value. Typically a word consists of a root or stem and zero or more affixes. Words can be combined to create phrases, clauses, and sentences. A word consisting of two or more stems joined together forms a compound. A word combined with another word or part of a word forms a portmanteau.

Etymology

English word comes directly from Old English word, and has cognates in all branches of Germanic, deriving from Proto-Germanic *wurda, which continues a virtual Proto-Indo-European (PIE) form. Cognates outside Germanic are found in Baltic and in Latin (verbum). The PIE stem is also found in Greek ερθει (φθεγγεται "speaks, utters", Hesychius). The PIE root means "say, speak" (also found in Greek ειρω, ρητωρ).

The original meaning of word is "utterance, speech, verbal expression". Until Early Modern English, it could more specifically refer to a name or title.

The technical meaning of "an element of speech" first arises in discussion of grammar (particularly Latin grammar), as in the prologue to Wyclif's Bible (ca. 1400):
"This word autem, either vero, mai stonde for forsothe, either for but."

Definitions

Depending on the language, words can be difficult to identify or delimit. Dictionaries take on the task of categorizing a language's lexicon into lemmas. These lemmas can be taken as an indication of what constitutes a "word" in the opinion of the dictionary's authors.

Word boundaries

In spoken language, individual words are usually distinguished by rhythm or accent, but short words are often run together. See clitic for phonologically dependent words. Spoken French has some of the features of a polysynthetic language: il y est allé ("He went there") is run together in pronunciation. As the majority of the world's languages are not written, the scientific determination of word boundaries becomes important.

There are five ways to determine where the word boundaries of spoken language should be placed:
Potential pause
A speaker is told to repeat a given sentence slowly, allowing for pauses. The speaker will tend to insert pauses at the word boundaries. However, this method is not foolproof: the speaker could easily break up polysyllabic words.

Indivisibility
A speaker is told to say a sentence out loud, and is then told to say the sentence again with extra words added to it. Thus, "I have lived in this village for ten years" might become "I and my family have lived in this little village for about ten or so years". These extra words will tend to be added at the word boundaries of the original sentence. However, some languages have infixes, which are put inside a word. Similarly, some have separable affixes; in the German sentence "Ich komme gut zu Hause an," the verb ankommen is separated.

Minimal free forms
This concept was proposed by Leonard Bloomfield. Words are thought of as the smallest meaningful units of speech that can stand by themselves. This correlates phonemes (units of sound) to lexemes (units of meaning). However, some written words are not minimal free forms, as they make no sense by themselves (for example, the and of).

Phonetic boundaries
Some languages have particular rules of pronunciation that make it easy to spot where a word boundary should be. For example, in a language that regularly stresses the last syllable of a word, a word boundary is likely to fall after each stressed syllable. Another example can be seen in a language that has vowel harmony (like Turkish): the vowels within a given word share the same quality, so a word boundary is likely to occur whenever the vowel quality changes. However, not all languages have such convenient phonetic rules, and even those that do present occasional exceptions.
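The vowel-harmony heuristic described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is already divided into syllables, only the front/back distinction of Turkish vowels is considered, and the function names (backness, guess_words) are invented for this example.

```python
# Toy word-boundary guesser based on front/back vowel harmony.
# Assumption: input is a list of lowercase syllables; only backness is checked.
FRONT = set("eiöü")
BACK = set("aıou")

def backness(syllable):
    """Return 'front' or 'back' for the first vowel found, else None."""
    for ch in syllable:
        if ch in FRONT:
            return "front"
        if ch in BACK:
            return "back"
    return None

def guess_words(syllables):
    """Start a new word whenever the vowel class changes."""
    words = []
    current = []
    current_class = None
    for syl in syllables:
        cls = backness(syl)
        changed = cls is not None and current_class is not None and cls != current_class
        if current and changed:
            words.append("".join(current))
            current = []
            current_class = None
        current.append(syl)
        if current_class is None and cls is not None:
            current_class = cls
    if current:
        words.append("".join(current))
    return words

# "evler okullar" (houses, schools) is split at the front/back change.
print(guess_words(["ev", "ler", "o", "kul", "lar"]))
# A word that does not obey harmony, such as "kitap", is wrongly split.
print(guess_words(["ki", "tap"]))
```

The second call shows the kind of exception mentioned above: a word whose vowels do not harmonize is cut in two by the heuristic, so it can only serve as one piece of evidence among several.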

Semantic units
Much like the minimal free forms mentioned above, this method breaks down a sentence into its smallest semantic units. However, language often contains words that have little semantic value (and often play a more grammatical role), or semantic units that are compound words.


A further criterion: pragmatics
As Plag suggests, whether a lexical item counts as a word should also be judged by pragmatic criteria. The word "hello", for example, does not exist outside the realm of greetings, and it is difficult to assign it a meaning apart from that use. Matters are more complex for "how do you do?": is it a word, a phrase, or an idiom?

In practice, linguists apply a mixture of all these methods to determine the word boundaries of any given sentence. Even with careful application of these methods, an exact definition of a word often remains elusive.

There are some words that seem very general but may truly have a technical definition, such as the word "soon," usually meaning within a week.

Orthography

In languages with a literary tradition, there is interrelation between orthography and the question of what is considered a single word. Word separators (typically space marks) are common in the modern orthography of languages using alphabetic scripts, but these are (excepting isolated precedents) a modern development (see also history of writing).

In English orthography, words may contain spaces if they are compounds, such as ice cream or air raid shelter, or proper nouns.
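The mismatch between space-delimited strings and compounds such as ice cream can be made concrete with a small Python sketch. It assumes a hand-written list of known compounds (the set below is hypothetical, not drawn from any dictionary), and the function names are invented for this illustration.

```python
# Compare purely orthographic "words" (split on spaces) with a count that
# treats listed multi-token compounds as single words.
def orthographic_words(text):
    """Split on whitespace, the way space marks delimit written words."""
    return text.split()

def merge_compounds(tokens, compounds):
    """Greedily merge the longest token sequence found in `compounds`."""
    max_len = max((len(c) for c in compounds), default=1)
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest possible compound first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in compounds:
                result.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            result.append(tokens[i])
            i += 1
    return result

# Hypothetical compound list, for illustration only.
COMPOUNDS = {("ice", "cream"), ("air", "raid", "shelter")}

tokens = orthographic_words("We ate ice cream in the air raid shelter")
print(len(tokens), tokens)
merged = merge_compounds(tokens, COMPOUNDS)
print(len(merged), merged)
```

Splitting on spaces yields nine items for this sentence, while treating the two compounds as units yields six, which is the sense in which an English word "may contain spaces".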

Vietnamese orthography, although it uses the Latin alphabet, delimits monosyllabic morphemes, not words. Conversely, synthetic languages often combine many lexical morphemes into single words, making it difficult to reduce them to the traditional sense of words found more easily in analytic languages. This is especially difficult for polysynthetic languages such as Inuktitut and Ubykh, where an entire sentence may consist of a single such word.

Logographic scripts use single signs (characters) to express a word. Most scripts in actual use are, however, only partly logographic, and combine logographic with phonetic signs. The most widespread logographic script in modern use is the Chinese script. While the Chinese script has some true logographs, the largest class of characters used in modern Chinese (some 90%) are so-called pictophonetic compounds. Characters of this sort are composed of two parts: a pictograph, which suggests the general meaning of the character, and a phonetic part, which is derived from a character pronounced in the same way as the word the new character represents. In this sense, the character for most Chinese words consists of a determiner and a syllabogram, similar to the approach used by cuneiform script and Egyptian hieroglyphs.

There is a tendency, informed by orthography, to identify a single Chinese character as corresponding to a single word in the Chinese language, parallel to the tendency to identify the letters between two space marks as a single word in the English language. In both cases, this leads to the identification of compound members as individual words, whereas in German orthography compound members are not separated by space marks and the tendency is thus to identify the entire compound as a single word. Compare, for example, English capital city with German Hauptstadt and Chinese 首都: all three are equivalent compounds, in the English case consisting of "two words" separated by a space mark, in the German case written as a "single word" without a space mark, and in the Chinese case consisting of two logographic characters.

Morphology

In synthetic languages, a single word stem (for example, love) may have a number of different forms (for example, loves, loving, and loved). However, these are not usually considered to be different words, but different forms of the same word. In these languages, words may be considered to be constructed from a number of morphemes.
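The distinction between word forms and the word they belong to can be illustrated with a short Python sketch. The inflection rule below is a deliberately crude assumption (it handles a final -e but ignores consonant doubling, irregular verbs, and everything else), and the function names are invented for this example.

```python
# Group inflected forms under their stem, so that loves, loving and loved
# count as forms of one word rather than as separate words.
def inflect(stem):
    """Generate a toy English paradigm for a regular verb stem."""
    base = stem[:-1] if stem.endswith("e") else stem
    return {stem, stem + "s", base + "ing", base + "ed"}

def build_form_index(stems):
    """Map every generated form back to its stem."""
    index = {}
    for stem in stems:
        for form in inflect(stem):
            index[form] = stem
    return index

def count_words(tokens, index):
    """Return (number of distinct forms, number of distinct words)."""
    forms = set(tokens)
    words = {index.get(t, t) for t in tokens}
    return len(forms), len(words)

index = build_form_index(["love", "walk"])
tokens = ["love", "loves", "loving", "loved", "walked"]
print(count_words(tokens, index))
```

For this token list the sketch reports five distinct forms but only two words, love and walk, which mirrors the way a dictionary lists one entry per word rather than one per form.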
In Indo-European languages in particular, the morphemes distinguished are
  • the root
  • optional suffixes
  • a desinence.

Thus, a Proto-Indo-European word form would be analysed as consisting of
  1. the zero grade of the root
  2. a root-extension (diachronically a suffix), resulting in a complex root
  3. the thematic suffix
  4. the neuter gender nominative or accusative singular desinence.

Classes

Grammar classifies a language's lexicon into several groups of words. The basic bipartite division possible for virtually every natural language is that of nouns vs. verbs.

The classification into such classes is in the tradition of Dionysius Thrax, who distinguished eight categories; the traditional list used for English is noun, verb, adjective, pronoun, preposition, adverb, conjunction, interjection.

In Indian grammatical tradition, Panini introduced a similar fundamental classification into a nominal (nama, suP) and a verbal (akhyata, tiN) class, based on the set of desinences taken by the word.

See also

  • Grammar
  • Utterance
  • Morphology (linguistics)
  • Lexeme
  • Lexicon
  • Lexis (linguistics)
  • Lexical item


External links

  • A working paper by Larry Trask, Department of Linguistics and English Language, University of Sussex.