A Swadesh list is one of several lists of vocabulary with basic meanings, developed by Morris Swadesh in the 1940s and 1950s, which is used in lexicostatistics (quantitative language relatedness assessment) and glottochronology (language divergence dating).
There are two basic versions of the Swadesh list, one with 200 meanings, the other with 100 meanings.
In the composite listing on this page, there are actually 207 meanings in total, since seven of the entries in the 100-meaning list (breast, fingernail, full, horn, knee, moon, round) were not in the original 200-meaning list. To see which words are in which lists, see Wiktionary's Swadesh list appendix.
Usage in lexicostatistics and glottochronology
The Swadesh word list is used in lexicostatistics and glottochronology to estimate the approximate date of first separation of genetically related languages, though other lists may be used. The closeness of the relationship between languages is taken to be roughly proportional to the number of cognate words present in the list. A fixed set of concepts is used, rather than a list of arbitrary words, because the basic vocabulary learned during early childhood is assumed to change very slowly over time. Counting the cognate words in the list is far from trivial and may be subject to dispute, because cognates do not necessarily look similar, and recognizing them presupposes knowledge of the sound laws of the respective languages. For example, English 'wheel' and Hindi 'chakra' are cognates, although they are not recognizable as such without knowledge of the history of both languages. Even where the number of cognates is undisputed, the use of Swadesh lists for dating is contested, because it rests on the assumption that the rate of replacement of basic vocabulary is constant over long periods of time. While Swadesh lists are a useful tool for getting a rough idea, mainstream historical linguistics is usually very sceptical about claims of relatedness based exclusively on Swadesh lists.
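The two quantities discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a vetted implementation: the function names are hypothetical, the cognate judgments are assumed to have been made by hand beforehand, and the dating step uses the classic glottochronological formula t = ln(c) / (2 ln(r)), where c is the fraction of shared cognates and r is an assumed retention rate per millennium (0.86 is a value conventionally quoted for the 100-meaning list, and it is exactly the constant-rate assumption that the text describes as disputed).

```python
import math


def shared_cognate_fraction(judgments):
    """Fraction of meanings judged cognate between two languages.

    `judgments` maps each meaning to True (cognate), False (not cognate)
    or None (missing or undecided); None entries are left out of the count.
    """
    decided = [v for v in judgments.values() if v is not None]
    if not decided:
        raise ValueError("no decided cognate judgments")
    return sum(decided) / len(decided)


def divergence_time_millennia(c, r=0.86):
    """Classic glottochronological estimate t = ln(c) / (2 ln(r)).

    `c` is the shared cognate fraction (0 < c <= 1); `r` is the assumed
    retention rate per millennium. The default r is an assumption, and the
    result is only as good as the constant-rate hypothesis behind it.
    """
    if not 0 < c <= 1:
        raise ValueError("c must be in the interval (0, 1]")
    if not 0 < r < 1:
        raise ValueError("r must be in the interval (0, 1)")
    return math.log(c) / (2 * math.log(r))


# Illustrative English-German judgments only, not a vetted data set.
example = {"hand": True, "water": True, "dog": False, "tree": None}
c = shared_cognate_fraction(example)
print(f"shared cognates: {c:.2f}")
print(f"estimated separation: {divergence_time_millennia(c):.2f} millennia")
```

With only three decided meanings the resulting figure is meaningless as a date; the sketch shows only how the count of cognates feeds into the estimate.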
The use of Swadesh lists in glottochronology was most popular during the 1960s and 1970s, after which enthusiasm waned and the discussion of the method's merit became emotional, leading to a temporary demise of the method. Refinements since the early 1970s include the incorporation of a geographical dimension into the equations, accounting for borrowing, and the use of robust statistical models borrowed from phylogenetics.
A recent example of the use of Swadesh lists for absolute dating is the study of Gray and Atkinson (2003), which calculates a tree of Indo-European languages with absolute dates for its nodes using Bayesian principles, dating the Proto-Indo-European language to ca. 7000 BC (see Indo-Hittite). The study, which begins with a merciless criticism of the earlier forms of glottochronology, is based on the set of 200-word Swadesh lists compiled by Isidore Dyen for 87 Indo-European languages. Swadesh himself abandoned the 200-word list early on, suspecting that it contained too many borrowed items, and it has additionally been shown to be very unreliable (cf. Embleton 1995); he later introduced a 100-item list which he considered more universal and culture-free. Because of this, and because of false underlying assumptions about rates of language change, the work is generally argued against by practitioners of historical linguistics (cf. e.g. Campbell 1998:177ff), although the criticism has very little concrete basis apart from verbal argument. Gray and Atkinson use models developed for the analysis of phylogenetic relationships in biology, and it remains unclear whether any critical assumptions of these models are violated in the course of language evolution. It remains to be seen whether the method will achieve wide acceptance in linguistics.
Swadesh list in English
Below is the Swadesh list of 207 words in the English language. For a Swadesh list that compares English, French, German, Italian, Spanish, Dutch, Esperanto, Swedish, and Latin (with links to other lists in other languages), see Wiktionary:Swadesh list.
- I
- you (singular)
- he
- we
- you (plural)
- they
- this
- that
- here
- there
- who
- what
- where
- when
- how
- not
- all
- many
- some
- few
- other
- one
- two
- three
- four
- five
- big
- long
- wide
- thick
- heavy
- small
- short
- narrow
- thin
- woman
- man (adult male)
- Man (human being)
- child
- wife
- husband
- mother
- father
- animal
- fish
- bird
- dog
- louse
- snake
- worm
- tree
- forest
- stick
- fruit
- seed
- leaf
- root
- bark
- flower
- grass
- rope
- skin
- meat
- blood
- bone
- fat (n.)
- egg
- horn
- tail
- feather
- hair
- head
- ear
- eye
- nose
- mouth
- tooth
- tongue
- fingernail
- foot
- leg
- knee
- hand
- wing
- belly
- guts
- neck
- back
- breast
- heart
- liver
- drink
- eat
- bite
- suck
- spit
- vomit
- blow
- breathe
- laugh
- see
- hear
- know
- think
- smell
- fear
- sleep
- live
- die
- kill
- fight
- hunt
- hit
- cut
- split
- stab
- scratch
- dig
- swim
- fly (v.)
- walk
- come
- lie
- sit
- stand
- turn
- fall
- give
- hold
- squeeze
- rub
- wash
- wipe
- pull
- push
- throw
- tie
- sew
- count
- say
- sing
- play
- float
- flow
- freeze
- swell
- sun
- moon
- star
- water
- rain
- river
- lake
- sea
- salt
- stone
- sand
- dust
- earth
- cloud
- fog
- sky
- wind
- snow
- ice
- smoke
- fire
- ashes
- burn
- road
- mountain
- red
- green
- yellow
- white
- black
- night
- day
- year
- warm
- cold
- full
- new
- old
- good
- bad
- rotten
- dirty
- straight
- round
- sharp
- dull
- smooth
- wet
- dry
- correct
- near
- far
- right
- left
- at
- in
- with
- and
- if
- because
- name
Shorter lists
The Swadesh–Yakhontov list is a 35-word subset of the Swadesh list posited as especially stable by Russian linguist Sergei Yakhontov (Starostin 1991). It has been used in lexicostatistics by linguists such as Sergei Starostin. With their Swadesh numbers, the words are:
- 1. I
- 2. you (singular)
- 7. this
- 11. who
- 12. what
- 22. one
- 23. two
- 45. fish
- 47. dog
- 48. louse
- 64. blood
- 65. bone
- 67. egg
- 68. horn
- 69. tail
- 73. ear
- 74. eye
- 75. nose
- 77. tooth
- 78. tongue
- 83. hand
- 103. know
- 109. die
- 128. give
- 147. sun
- 148. moon
- 150. water
- 155. salt
- 156. stone
- 163. wind
- 167. fire
- 179. year
- 182. full
- 183. new
- 207. name
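Because the subset above is identified by Swadesh numbers, it can be extracted from any 207-entry list that follows the same ordering. The sketch below is a minimal illustration: the numbers are copied from the list above, while the function name and the placeholder word list are hypothetical stand-ins (a real application would pass the actual 207 entries for the language in question).

```python
# Swadesh numbers of the 35 Swadesh-Yakhontov meanings, as listed above.
YAKHONTOV_NUMBERS = [
    1, 2, 7, 11, 12, 22, 23, 45, 47, 48, 64, 65, 67, 68, 69,
    73, 74, 75, 77, 78, 83, 103, 109, 128, 147, 148, 150,
    155, 156, 163, 167, 179, 182, 183, 207,
]


def select_by_swadesh_number(full_list, numbers):
    """Return the entries of a Swadesh list at the given 1-based numbers."""
    selected = []
    for n in numbers:
        if not 1 <= n <= len(full_list):
            raise IndexError(f"Swadesh number {n} is outside 1..{len(full_list)}")
        selected.append(full_list[n - 1])  # Swadesh numbers start at 1
    return selected


# Placeholder standing in for a real 207-entry list (hypothetical data).
placeholder_list = [f"meaning {n}" for n in range(1, 208)]
subset = select_by_swadesh_number(placeholder_list, YAKHONTOV_NUMBERS)
print(len(subset), subset[:3])
```

The explicit range check is a design choice: a list in a different numbering (for example the 100-meaning list) would otherwise silently return the wrong words or fail with a less helpful error.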
Holman et al. (2008) found that the Swadesh–Yakhontov list was less accurate than the Swadesh-100 list in identifying the relationships between Chinese dialects. However, they calculated the relative stability of the words by comparing retentions between languages in established language families, and found that a different 40-word list was just as accurate as the Swadesh-100 list. They found no statistically significant difference in the correlations between the families of the Old World and those of the New World. The ranked Swadesh-100 list, with Swadesh numbers and relative stability, is as follows (Holman et al., Appendix; asterisked words appear on the 40-word list):
- 22 *louse (42.8)
- 12 *two (39.8)
- 75 *water (37.4)
- 39 *ear (37.2)
- 61 *die (36.3)
- 1 *I (35.9)
- 53 *liver (35.7)
- 40 *eye (35.4)
- 48 *hand (34.9)
- 58 *hear (33.8)
- 23 *tree (33.6)
- 19 *fish (33.4)
- 100 *name (32.4)
- 77 *stone (32.1)
- 43 *tooth (30.7)
- 51 *breasts (30.7)
- 2 *you (30.6)
- 85 *path (30.2)
- 31 *bone (30.1)
- 44 *tongue (30.1)
- 28 *skin (29.6)
- 92 *night (29.6)
- 25 *leaf (29.4)
- 76 rain (29.3)
- 62 kill (29.2)
- 30 *blood (29.0)
- 34 *horn (28.8)
- 18 *person (28.7)
- 47 *knee (28.0)
- 11 *one (27.4)
- 41 *nose (27.3)
- 95 *full (26.9)
- 66 *come (26.8)
- 74 *star (26.6)
- 86 *mountain (26.2)
- 82 *fire (25.7)
- 3 *we (25.4)
- 54 *drink (25.0)
- 57 *see (24.7)
- 27 bark (24.5)
- 96 *new (24.3)
- 21 *dog (24.2)
- 72 *sun (24.2)
- 64 fly (24.1)
- 32 grease (23.4)
- 73 moon (23.4)
- 70 give (23.3)
- 52 heart (23.2)
- 36 feather (23.1)
- 90 white (22.7)
- 89 yellow (22.5)
- 20 bird (21.8)
- 38 head (21.7)
- 79 earth (21.7)
- 46 foot (21.6)
- 91 black (21.6)
- 42 mouth (21.5)
- 88 green (21.1)
- 60 sleep (21.0)
- 7 what (20.7)
- 26 root (20.5)
- 45 claw (20.5)
- 56 bite (20.5)
- 83 ash (20.3)
- 87 red (20.2)
- 55 eat (20.0)
- 33 egg (19.8)
- 6 who (19.0)
- 99 dry (18.9)
- 37 hair (18.6)
- 81 smoke (18.5)
- 8 not (18.3)
- 4 this (18.2)
- 24 seed (18.2)
- 16 woman (17.9)
- 98 round (17.9)
- 14 long (17.4)
- 69 stand (17.1)
- 97 good (16.9)
- 17 man (16.7)
- 94 cold (16.6)
- 29 flesh (16.4)
- 50 neck (16.0)
- 71 say (16.0)
- 84 burn (15.5)
- 35 tail (14.9)
- 78 sand (14.9)
- 5 that (14.7)
- 65 walk (14.4)
- 68 sit (14.3)
- 10 many (14.2)
- 9 all (14.1)
- 59 know (14.1)
- 80 cloud (13.9)
- 63 swim (13.6)
- 49 belly (13.5)
- 13 big (13.4)
- 93 hot (11.6)
- 67 lie (11.2)
- 15 small (6.3)
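For readers who want to work with the ranked list programmatically, the entries can be parsed into number, word, 40-word-list membership and stability score. The following is a minimal sketch that assumes the exact entry format used above ("number, optional asterisk, word, score in parentheses"); the function name and the dictionary keys are hypothetical, and the sample lines are copied from the list.

```python
import re

# Optional "- " bullet, Swadesh-100 number, optional asterisk, word, (score).
ENTRY = re.compile(r"^(?:-\s*)?(\d+)\s+(\*?)(.+?)\s+\(([\d.]+)\)$")


def parse_entry(line):
    """Parse one ranked-list entry such as '22 *louse (42.8)'."""
    match = ENTRY.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    number, star, word, stability = match.groups()
    return {
        "number": int(number),         # Swadesh-100 number
        "word": word,
        "in_40_list": star == "*",     # asterisk marks the 40-word list
        "stability": float(stability),
    }


# A few entries copied from the list above.
SAMPLE = [
    "- 22 *louse (42.8)",
    "- 12 *two (39.8)",
    "- 76 rain (29.3)",
    "- 30 *blood (29.0)",
    "- 15 small (6.3)",
]

entries = [parse_entry(line) for line in SAMPLE]
short_list = [e["word"] for e in entries if e["in_40_list"]]
print(short_list)
```

Membership in the 40-word list is read from the asterisk rather than inferred from rank, since in the list above "rain" (29.3) is ranked above "blood" (29.0) yet only "blood" is asterisked.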
See also
- A General Service List of English Words
- Swadesh lists for hundreds of languages, listed by individual language, on the Wiktionary Swadesh Lists Category pages
- Swadesh lists for hundreds of languages, grouped by language family, on the Wiktionary Appendix pages
- Lexicostatistics
- Glottochronology
- Mass lexical comparison
- Basic English
- Historical linguistics
- Proto-language
- Cognate
- Indo-European studies
- The (brief) Wiktionary entry for the term 'Swadesh lists'
External links