Mass lexical comparison
Mass comparison is a method developed by Joseph Greenberg to determine the level of genetic relatedness between languages. It is now usually called multilateral comparison. The method is generally rejected by linguists, though it has some supporters.

In spite of widespread skepticism about his method, some of the relationships established by Greenberg gradually came to be generally accepted (e.g. Afro-Asiatic and Niger–Congo). Others are widely accepted though disputed by some (e.g. Nilo-Saharan), others are predominantly rejected but have some defenders (e.g. Khoisan), while others continue to be widely rejected and have only a handful of defenders (e.g. Amerind).

The application of mass comparison led Greenberg not only to propose novel classifications but to break apart previously accepted ones. The best-known example is his rejection of the Hamitic language family.

Theory of mass comparison

Mass comparison involves setting up a table of basic vocabulary items and their forms in the languages to be compared. The table can also include common morphemes. The following table was used by Greenberg to illustrate the technique. It shows the forms of six items of basic vocabulary in nine different languages, identified by letters.
A B C D E F G H I
Head kar kar se kal tu tu to fi pi
Eye min ku min miŋ min min idi iri
Nose tor tör ni tol was waš was ik am
One mit kan kan kaŋ ha kan kεn he čak
Two ni ta ne kil ne ni ne gum gun
Blood kur sem sem šam i sem sem fik pix
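
The pairwise reading of such a table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Greenberg's procedure: "resemblance" is approximated here by an edit distance of at most one segment, which is a crude stand-in for a linguist's judgment of sound and meaning, and the function names are invented for the example. The "Eye" row is left out because it lists only eight forms for the nine languages.

```python
import unicodedata
from itertools import combinations

# Forms copied from the table above. The "Eye" row is omitted because it
# lists only eight forms for nine languages.
LANGS = "ABCDEFGHI"
TABLE = {
    "Head":  "kar kar se kal tu tu to fi pi",
    "Nose":  "tor tör ni tol was waš was ik am",
    "One":   "mit kan kan kaŋ ha kan kεn he čak",
    "Two":   "ni ta ne kil ne ni ne gum gun",
    "Blood": "kur sem sem šam i sem sem fik pix",
}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two forms, counted in code points."""
    a = unicodedata.normalize("NFC", a)
    b = unicodedata.normalize("NFC", b)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def resemblance_counts(table=TABLE, langs=LANGS, max_dist=1):
    """Count, for every language pair, the glosses whose forms resemble.

    ASSUMPTION: two forms "resemble" when their edit distance is at most
    max_dist. This threshold is an illustrative choice, not part of the method.
    """
    forms = {gloss: dict(zip(langs, row.split())) for gloss, row in table.items()}
    return {
        (x, y): sum(edit_distance(f[x], f[y]) <= max_dist for f in forms.values())
        for x, y in combinations(langs, 2)
    }


if __name__ == "__main__":
    counts = resemblance_counts()
    # List the pairs that share at least two resemblances, strongest first.
    for (x, y), n in sorted(counts.items(), key=lambda kv: -kv[1]):
        if n >= 2:
            print(x, y, n)
```

Even with so blunt a measure, pairs such as F and G agree on all five rows while pairs such as A and H agree on none, which is the kind of interlocking pattern the table is meant to make visible.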

The basic relationships can be determined without any experience in the case of languages that are fairly closely related. Knowing a bit about probable paths of sound change allows one to go farther faster. An experienced typologist (Greenberg was a pioneer in the field) can quickly accept or reject several potential cognates in this table as probable or improbable. For example, the path p > f is extremely frequent and the path f > p much less so, enabling one to hypothesize that fi : pi and fik : pix are indeed related and go back to protoforms *pi and *pik/x, while the knowledge that k > x is extremely frequent and x > k much less so enables one to choose *pik over *pix. Thus, while mass comparison does not attempt to produce reconstructions of protolanguages (according to Greenberg 2005:318, these belong to a later phase of study), phonological considerations come into play from the very beginning.
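
The directional reasoning in this example can be written out as a small Python sketch. It assumes a toy inventory containing only the two frequent paths named in the text (p > f and k > x); the names FREQUENT_PATHS, likely_source and guess_protoform are hypothetical and introduced purely for illustration.

```python
# ASSUMPTION: a toy inventory holding only the two "frequent" directions of
# change mentioned in the text, written as (older sound, newer sound).
FREQUENT_PATHS = {("p", "f"), ("k", "x")}


def likely_source(sound_a, sound_b):
    """Return the sound more likely to be original, or None if undecided."""
    if (sound_a, sound_b) in FREQUENT_PATHS:
        return sound_a
    if (sound_b, sound_a) in FREQUENT_PATHS:
        return sound_b
    return None


def guess_protoform(form_a, form_b):
    """Propose a protoform for two equal-length forms, segment by segment.

    Returns None when the forms differ in length or when a differing pair of
    segments is not covered by a known frequent path.
    """
    if len(form_a) != len(form_b):
        return None
    segments = []
    for a, b in zip(form_a, form_b):
        if a == b:
            segments.append(a)
            continue
        source = likely_source(a, b)
        if source is None:
            return None
        segments.append(source)
    return "*" + "".join(segments)


if __name__ == "__main__":
    print(guess_protoform("fi", "pi"))    # *pi
    print(guess_protoform("fik", "pix"))  # *pik
    print(guess_protoform("gum", "gun"))  # None: m/n is not in the toy inventory
```

A realistic inventory of sound-change paths would be far larger and weighted rather than binary; the sketch only shows how directionality narrows the choice between candidate protoforms.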

The tables used in actual research involve much larger numbers of items and languages. The items included may be either lexical, such as 'hand', 'sky', and 'go', or morphological, such as PLURAL and MASCULINE.

Detection of borrowings

Critics of mass comparison generally assume that it has no means to distinguish borrowed forms from inherited ones, unlike comparative reconstruction, which is able to do so through regular sound correspondences. These questions were addressed by Greenberg as early as the 1950s. According to him, the key points are as follows:
  • Basic vocabulary is much less readily borrowed than cultural vocabulary.

  • "[D]erivational, inflectional, and pronominal morphemes and morph alternations are the least subject of all to borrowing."

  • Any type of linguistic item may be borrowed "on occasion". However, "fundamental vocabulary is proof against mass borrowing."

  • Mass comparison does not possess means to distinguish borrowing in every instance: "in particular and infrequent instances the question of borrowing may be doubtful". However, it is always possible to detect whether borrowing is responsible for "a mass of resemblances" between languages: "Where a mass of resemblances is due to borrowing, they will tend to appear in cultural vocabulary and to cluster in certain semantic areas which reflect the cultural nature of the contact."

  • The technique of mass comparison, as opposed to bilateral comparison, provides a check on whether forms are borrowed or not:

Borrowing can never be an over-all explanation of a mass of recurrent basic resemblances in many languages occurring over a wide geographical area.... Since we find independent sets of resemblances between every pair of languages, among every group of three languages, and so on, each language would have to borrow from every other.

  • "[R]ecurrent sound correspondences" do not suffice to detect borrowing, since "where loans are numerous, they often show such correspondences".


Greenberg considered that the results achieved through this method approached certainty: "The presence of fundamental vocabulary resemblances and resemblances in items with grammatical function, particularly if recurrent through a number of languages, is a sure indication of genetic relationship."
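
The borrowing criterion quoted above (a mass of borrowed resemblances tends to fall in cultural vocabulary and to cluster in particular semantic areas) can be illustrated with a short Python sketch. Everything in it is hypothetical: the two resemblance sets are invented toy data, the category and area labels are assumptions made for the example, and no threshold is implied by the source.

```python
from collections import Counter

# HYPOTHETICAL toy data: each resemblance is (gloss, vocabulary kind, semantic area).
# Neither set comes from a real language pair.
INHERITED_LIKE = [
    ("hand", "basic", "body"),
    ("sky", "basic", "nature"),
    ("go", "basic", "motion"),
    ("PLURAL", "grammatical", "morphology"),
]
CONTACT_LIKE = [
    ("market", "cultural", "trade"),
    ("coin", "cultural", "trade"),
    ("price", "cultural", "trade"),
    ("hand", "basic", "body"),
]


def cultural_share(resemblances):
    """Fraction of the resemblances that fall in cultural vocabulary."""
    if not resemblances:
        return 0.0
    kinds = Counter(kind for _gloss, kind, _area in resemblances)
    return kinds["cultural"] / len(resemblances)


def dominant_area(resemblances):
    """Most frequent semantic area and the fraction of resemblances it covers."""
    areas = Counter(area for _gloss, _kind, area in resemblances)
    area, count = areas.most_common(1)[0]
    return area, count / len(resemblances)


if __name__ == "__main__":
    for name, data in [("inherited-like", INHERITED_LIKE),
                       ("contact-like", CONTACT_LIKE)]:
        print(name, cultural_share(data), dominant_area(data))
```

In the contact-like set three of the four resemblances are cultural and all three sit in one semantic area, which is the profile the criterion associates with borrowing; the inherited-like set shows neither property.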

The place of sound correspondences in the comparative method

It is often reported that Greenberg sought to replace the comparative method with a new method, mass comparison (or, among his less scrupulous critics, "mass lexical comparison"). He consistently rejected this characterization, stating for instance, "The methods outlined here do not conflict in any fashion with the traditional comparative method" (1957:44) and expressing wonderment at "the strange and widely disseminated notion that I seek to replace the comparative method with a new and strange invention of my own" (2002:2). According to Greenberg, mass comparison is the necessary "first step" in the comparative method (1957:44), and "once we have a well-established stock I go about comparing and reconstructing just like anyone else, as can be seen in my various contributions to historical linguistics" (1990, quoted in Ruhlen 1994:285). Reflecting the methodological empiricism also present in his typological work, he viewed facts as of greater weight than their interpretations, stating (1957:45):
[R]econstruction of an original sound system has the status of an explanatory theory to account for etymologies already strong on other grounds. Between the *vaida of Bopp and the *γwoidxe of Sturtevant lie more than a hundred years of the intensive development of Indo-European phonological reconstruction. What has remained constant has been the validity of the etymologic relationship among Sanskrit veda, Greek woida, Gothic wita, all meaning "I know", and many other unshakable etymologies both of root and of non-root morphemes recognized at the outset. And who will be bold enough to conjecture from what original the Indo-Europeanist one hundred years from now will derive these same forms?

Summary

The thesis of mass comparison, then, is that:
  • A group of languages is related when they show numerous resemblances in basic vocabulary, including pronouns, and morphemes, forming an interlocking pattern common to the group.

  • While mass comparison cannot identify every instance of borrowing, it can identify broad patterns of borrowing, which suffices in establishing genetic relationship.

  • The results achieved approach certainty.

  • It is unnecessary to establish sets of recurrent sound correspondences or reconstructed ancestral forms to identify genetic relationships. On the contrary, it is not possible to establish such correspondences or to reconstruct such forms until genetic relationships are identified.

The myth of ‘mass lexical comparison’

It is widely believed among linguists that Greenberg's method of language classification was limited to comparisons of words alone, to the neglect of grammatical elements, which could often provide more decisive evidence for language relationship or non-relationship. Greenberg's method, many of them state, is known as "mass lexical comparison".

In reality, Greenberg never used the phrase "mass lexical comparison". It is not found in any of his principal works on language classification or anywhere else in his works. It does not occur, for instance, in Greenberg 1955, 1957, 1960, 1971, 1987, 2000–2002, or 2005. Furthermore, the phrase "mass lexical comparison" is not used by any of Greenberg's supporters, for example John Bengtson, Harold C. Fleming, Paul Newman, Merritt Ruhlen, Timothy Usher, or William S.-Y. Wang.

Probably the most influential critique of mass comparison is that by Lyle Campbell. According to Campbell, Greenberg's method of language classification "relies on inspectional similarities in vocabulary alone":
The best-known of the approaches which rely on inspectional resemblances among lexical items is that advocated by Joseph Greenberg, called 'multilateral (or mass) comparison'. It is based on 'looking at ... many languages across a few words' rather than at 'a few languages across many words'. The lexical similarities determined by superficial visual inspection which are shared 'across many languages' alone are taken as evidence of genetic relationship. This approach stops where others begin, at the assembling of lexical similarities. These inspectional resemblances must be investigated to determine why they are similar, whether the similarity is due to inheritance from a common ancestor (the result of a distant genetic relationship) or to borrowing, accident, onomatopoeia, sound symbolism, nursery formations and the various things which we will consider in this chapter. Since multilateral comparison does not do this, its results are controversial and rejected by most mainstream historical linguists.

In short, no technique which relies on inspectional similarities in vocabulary alone has proven adequate for establishing distant family relationships.


Greenberg consistently rejected the assertion that he relied on lexical comparisons alone. He went so far as to publish the first volume of his Eurasiatic work, devoted to grammatical comparisons, two years before the second volume, devoted to vocabulary comparisons, in an attempt to get the point across (2000: vii):
This grammatical evidence is quite sufficient in itself to establish the validity of the Eurasiatic language family. I have chosen to present it first for several reasons. One of these ... is that, despite all the facts regarding the presentation of evidence for linguistic stocks in my previous work, the myth persists that I only take into account vocabulary evidence.


Typical of Greenberg's use of morphological comparisons is this (1955: 82, 83):
In phonology, the most important evidence of the basically Khoisan relationships of Hottentot is the frequency of the click sounds and the essential part they play in the economy of the language. As in other Khoisan languages, they only occur initially.

Hottentot shares with the Bushman languages the following distinctive method of root formation in which the clicks play a fundamental part. Verb, noun and adjective roots are mostly disyllabic, or can be reconstructed as once having been disyllabic. The roots begin most frequently with a click, sometimes with a non-click consonant. This is followed by a restricted group of vowels in the second position, basically o or a. The third position is either vacant or one of a small number of non-click consonants occur: r, m, n or a labial (in Nama Hottentot phonemically a p). In the fourth or final position we find the full set of vowels - a, e, i, o, or u.


Throughout his work on African classifications, Greenberg's method is the same: he begins with a discussion of morphology, and then follows this up with a few pages of lexical comparisons.

Comparisons of the number of pages devoted to grammatical versus lexical comparisons from the beginning, middle, and end of Greenberg's career show the same pattern: In Greenberg 1955 (Africa), there are 38 pages of grammatical comparisons, versus 25 of lexical comparisons. In Greenberg 1971 (Indo-Pacific), there are 21 pages of grammatical comparisons versus 34 of lexical comparisons. In Greenberg 2000-2002 (Eurasiatic), there are 179 pages of grammatical comparisons versus 181 of lexical comparisons.
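
The page counts just given can be turned into proportions with a few lines of Python. The figures are those stated in the paragraph above; the dictionary layout and the function name grammatical_share are choices made for this sketch.

```python
# Page counts as stated above: (grammatical pages, lexical pages).
PAGES = {
    "1955 (Africa)": (38, 25),
    "1971 (Indo-Pacific)": (21, 34),
    "2000-2002 (Eurasiatic)": (179, 181),
}


def grammatical_share(grammatical: int, lexical: int) -> float:
    """Fraction of the comparison pages devoted to grammatical evidence."""
    return grammatical / (grammatical + lexical)


if __name__ == "__main__":
    for work, (grammatical, lexical) in PAGES.items():
        share = grammatical_share(grammatical, lexical)
        print(f"{work}: {share:.1%} grammatical")
```

On these figures the grammatical share is roughly 60%, 38% and 50% respectively, so grammatical evidence occupies a substantial part of the comparisons in each work.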

Greenberg did not come by this position late in his career. As early as 1956, he stated (2005: 60):
Only those resemblances which involve both sound and meaning simultaneously are considered relevant for historical connections. When the morphemes involved are roots this is called lexical comparison, when they are affixes, grammatical. There is no contradiction in the results attained by lexical and grammatical comparison and both methods are employed as far as possible.


In sum, mainstream historical linguists believe that Joseph Greenberg advocated and practiced a technique limited to lexical comparison. Greenberg's published writings show that he advocated the use of both lexical and grammatical data and that he carried out this theoretical desideratum in practice. No such technique as "mass lexical comparison" has ever existed.

The disputed legacy of the comparative method

The conflict over mass comparison can be seen as a dispute over the legacy of the comparative method, developed in the 19th century, primarily by Danish and German linguists, in the study of Indo-European languages.

Position of Greenberg’s detractors

Since the development of comparative linguistics in the 19th century, a linguist who claims that two languages are related, whether or not there exists historical evidence, is expected to back up that claim by presenting general rules that describe the differences between their lexicons, morphologies, and grammars. The procedure is described in detail in the comparative method article.

For instance, one could prove that Spanish is related to Italian by showing that many words of the former can be mapped to corresponding words of the latter by a relatively small set of replacement rules, such as the correspondence of initial es- and s-, final -os and -i, etc. Many similar correspondences exist between the grammars of the two languages. Since those systematic correspondences are extremely unlikely to be random coincidences, the most likely explanation by far is that the two languages have evolved from a single ancestral tongue (Latin, in this case).
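
The two replacement rules named above can be sketched as a minimal rule-mapper in Python. The word pairs are illustrative examples supplied for this sketch, not taken from the source, and the restriction of the first rule to es- before a consonant is an added assumption; the function name map_spanish_to_italian is likewise hypothetical.

```python
import re

# The two correspondences named in the text, applied in order.
# ASSUMPTION: the es- rule is limited to es- followed by a consonant.
RULES = [
    (re.compile(r"^es(?=[^aeiou])"), "s"),  # initial es- : s-
    (re.compile(r"os$"), "i"),              # final -os : -i
]

# Illustrative (Spanish, Italian) pairs chosen for this sketch.
PAIRS = [
    ("espada", "spada"),
    ("escala", "scala"),
    ("libros", "libri"),
    ("muros", "muri"),
    ("escudos", "scudi"),
    ("perros", "cani"),   # not rule-mappable: perro does not come from Latin
]


def map_spanish_to_italian(word: str) -> str:
    """Apply each replacement rule once, in order, to a Spanish word."""
    for pattern, replacement in RULES:
        word = pattern.sub(replacement, word)
    return word


if __name__ == "__main__":
    for spanish, italian in PAIRS:
        predicted = map_spanish_to_italian(spanish)
        status = "match" if predicted == italian else "no match"
        print(f"{spanish} -> {predicted} (Italian {italian}): {status}")
```

Two rules account for five of the six pairs, which illustrates why a small rule set covering many words is hard to attribute to chance. A real demonstration would need many more rules and a far larger word list.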

All pre-historical language groupings that are widely accepted today, such as the Indo-European, Uralic, Algonquian, and Bantu families, have been proved in this way.

Response of Greenberg’s defenders

The actual development of the comparative method was a more gradual process than Greenberg's detractors suppose. It had three decisive moments. The first was Rasmus Rask's observation in 1818 of a possible regular sound change in Germanic consonants. The second was Jacob Grimm's extension of this observation into a general principle (Grimm's law) in 1822. The third was Karl Verner's resolution of an irregularity in this sound change (Verner's law) in 1875. Only in 1861 did August Schleicher, for the first time, present systematic reconstructions of Indo-European proto-forms (Lehmann 1993:26). Schleicher, however, viewed these reconstructions as extremely tentative (1874:8). He never claimed that they proved the existence of the Indo-European family, which he accepted as a given from previous research, primarily that of Franz Bopp, his great predecessor in Indo-European studies.

Karl Brugmann, who succeeded Schleicher as the leading authority on Indo-European, and the other Neogrammarians of the late 19th century distilled the work of these scholars into the famous (if often disputed) principle that "every sound change, insofar as it occurs automatically, takes place according to laws that admit of no exception" (Brugmann 1878; http://www.utexas.edu/cola/centers/lrc/books/read14.html).

The Neogrammarians did not, however, regard regular sound correspondences or comparative reconstructions as relevant to the proof of genetic relationship between languages. In fact, they made almost no statements on how languages are to be classified (Greenberg 2005:158). The only Neogrammarian to deal with this question was Berthold Delbrück, Brugmann’s collaborator on the Grundriß der vergleichenden Grammatik der indogermanischen Sprachen (Greenberg 2005:158-159, 288). According to Delbrück (1904:121-122, quoted in Greenberg 2005:159), Bopp had proved the existence of Indo-European in the following way:
The proof was produced by juxtaposing words and forms of similar meanings. When one considers that in these languages the formation of the inflectional forms of the verb, noun and pronoun agrees in essentials and likewise that an extraordinary number of inflected words agree in their lexical parts, the assumption of chance agreement must appear absurd.


Furthermore, Delbrück took the position later enunciated by Greenberg on the priority of etymologies to sound laws (1884:47, quoted in Greenberg 2005:288): "obvious etymologies are the material from which sound laws are drawn."

The opinion that sound correspondences or, in another version of the opinion, reconstruction of a proto-language are necessary to show relationship between languages thus dates from the 20th, not the 19th century, and was never a position of the Neogrammarians. Indo-European was recognized by scholars such as William Jones (1786) and Franz Bopp (1816) long before the development of the comparative method.

Furthermore, Indo-European was not the first language family to be recognized by students of language. Semitic had been recognized by European scholars in the 17th century, Finno-Ugric in the 18th. Dravidian was recognized in the mid-19th century by Robert Caldwell (1856), well before the publication of Schleicher's comparative reconstructions.

Finally, the supposition that all of the language families generally accepted by linguists today have been proved by the comparative method is untrue. For example, although Eskimo–Aleut has long been accepted as a valid family, "Proto-Eskimo–Aleut has not yet been reconstructed" (Bomhard 2008:209). Other families were accepted for decades before comparative reconstructions of them were put forward, for example Afro-Asiatic and Sino-Tibetan. Many languages are generally accepted as belonging to a language family even though no comparative reconstruction exists, often because the languages are only attested in fragmentary form, such as the Anatolian language Lydian (Greenberg 2005:161). Conversely, detailed comparative reconstructions exist for some language families which nonetheless remain controversial, such as Altaic and Nostratic. The two cases differ: Nostratic is a proposed macrofamily whose proto-language would itself be the ancestor of several proto-languages, while Altaic is a proposed family at a single level, whose member languages are widely accepted as typologically similar. Detractors of both proposals claim that the data collected to demonstrate them by the comparative method are scarce, erroneous, and insufficient. Establishing regular phonological correspondences requires large lexical lists to be prepared and compared, and such lists are lacking for both proposed families. Other specific problems also affect the comparative lists of both proposals, such as the late attestation of the Altaic languages and, for Nostratic, the comparison of uncertain proto-forms such as those of Proto-Kartvelian.

A continuation of earlier methods?

Greenberg claimed that he was at bottom merely continuing the simple but effective method of language classification that had resulted in the discovery of numerous language families prior to the elaboration of the comparative method (1955:1-2, 2005:75) and that had continued to do so thereafter, as in the classification of Hittite as Indo-European in 1917 (Greenberg 2005:160-161). This method consists essentially of two things: resemblances in basic vocabulary and resemblances in inflectional morphemes. If mass comparison differs from it in any obvious way, it would seem to be in the theorization of an approach that had previously been applied in a relatively ad hoc manner and in the following additions:
  • The explicit preference for basic vocabulary over cultural vocabulary.

  • The explicit emphasis on comparison of multiple languages rather than bilateral comparisons.

  • The very large number of languages simultaneously compared (up to several hundred).

  • The introduction of typologically based paths of sound change.


The positions of Greenberg and his critics therefore appear to provide a starkly contrasted alternative:
  • According to Greenberg, the identification of sound correspondences and the reconstruction of protolanguages arise from genetic classification.

  • According to Greenberg’s critics, genetic classification arises from the identification of sound correspondences or (others state) the reconstruction of protolanguages.

Time limits of the comparative method

Besides systematic changes, languages are also subject to random mutations (such as borrowings from other languages, irregular inflections, compounding, and abbreviation) that affect one word at a time, or small subsets of words. For example, Spanish perro (dog), which does not come from Latin, cannot be rule-mapped to its Italian equivalent cane (the Spanish word can would be the Latin-derived equivalent but is much less used in everyday conversations, being reserved for more formal purposes). As those sporadic changes accumulate, they will increasingly obscure the systematic ones — just as enough dirt and scratches on a photograph will eventually make the face unrecognizable.

On this point, Greenberg and his critics agree with each other, in contrast to the Moscow school, but they draw contrasting conclusions:
  • Greenberg’s critics argue that the comparative method has an inherent limit of 6,000 – 10,000 years (depending on the author), and that beyond this too many irregularities of sound change have accumulated for the method to function. Since according to them the identification of regular sound correspondences is necessary to establish genetic relationship, they conclude that genetic relationships older than 10,000 years (or less) cannot be determined. In consequence, it is not possible to go much beyond those genetic classifications that have already been arrived at (e.g. Ringe 1992:1).

  • Greenberg argued that cognates often remain recognizable even when recurrent sound changes have been overlaid by idiosyncratic ones or interrupted by analogy, citing the cases of English brother (2002:4), which is easily recognizable as a cognate of German Bruder even though it violates Verner’s law, and Latin quattuor (1957:45), easily recognizable as a reflex of Proto-Indo-European *kʷetwor even though the changes e > a and t > tt violate the usual sound changes from Proto-Indo-European to Latin. (In the case of brother, the sound changes are actually known, but intricate, and are only decipherable because the language is heavily documented from an early date. In the case of quattuor, the changes are genuinely irregular, and the form of the word can only be explained through means other than regular sound change, such as the operation of analogy.)

  • In contrast, the "Moscow school" of linguists, perhaps best known for its advocacy of the Nostratic hypothesis (though active in many other areas), has confidence in the traceability of regular sound changes at very great time depths, and believes that reconstructed proto-languages can be pyramided on top of each other so as to attain still earlier proto-languages, without violating the principles of the standard comparative method.

The mathematics of language comparison

An unexpected offshoot of the controversy over mass comparison has been a new debate over the mathematics of language classification.

Greenberg’s initial position

From an early date, Greenberg (esp. 1957:36-44) argued that mass comparison rests on a mathematical basis. Although "[t]he most straightforward method of eliminating chance would be the calculation of the expected number of chance resemblances between two languages", "[i]n practice, this proves extremely difficult" (1957:37). For one thing, "it requires ... a frequency weighting of phonemes", along with an evaluation of "the possibilities of phonemic combination" (ib.). Complicating the issue, it is often the case that languages with very different levels of relationship show "approximately the same" number of resemblances (ib.). The solution, according to Greenberg, is to consider multiple languages at once (ib.): the key lies in their overlapping relationships, not in their two-by-two resemblances. According to him (1957:38-39):
The following fundamental probability considerations apply. The likelihood of finding a resemblance in sound and meaning in three languages is the square of its probability in two languages. In general, the probability for a single language must be raised to the (n – 1)th power for n languages. Thus if five languages each showed a total of 8 per cent sound-meaning resemblance to one another, on a chance basis one would expect (0.08)⁴ or 0.00004096 resemblances in all five languages. This is approximately 1/25,000.
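
The arithmetic in this passage can be checked with a short calculation. The Python sketch below is an illustration only and is not part of Greenberg's text: the function name chance_resemblance_rate is hypothetical, the exponent is read as n – 1 (the value consistent with the figure 0.00004096 for five languages at 8 per cent), and resemblances are assumed, as in the quotation, to be independent across languages with the same pairwise probability.

```python
def chance_resemblance_rate(p_pair: float, n_languages: int) -> float:
    """Expected rate of a chance sound-meaning resemblance shared by all n languages.

    p_pair is the chance resemblance rate between two languages; following the
    quoted passage, it is raised to the (n - 1)th power for n languages.
    """
    if n_languages < 2:
        raise ValueError("at least two languages are needed for a comparison")
    return p_pair ** (n_languages - 1)


# Greenberg's example: five languages, 8 per cent pairwise resemblance.
rate = chance_resemblance_rate(0.08, 5)
print(f"{rate:.8f}")      # 0.00004096
print(round(1 / rate))    # 24414, i.e. roughly 1 in 25,000
```

Under these assumptions, the expected rate falls very quickly as languages are added, which is the point of Greenberg's argument for comparing many languages at once.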

Ringe’s attack on Greenberg

In a series of articles, Donald Ringe (1992, 1993, 1995, 1996) has attacked Amerind, Nostratic, and other proposals of "long-range" language classification on mathematical grounds. In his view "the remoter relationships cannot be demonstrated because the languages in question have diverged too much" (ib.). According to Ringe, sound change renders linguistic relationships untraceable over time, while the level of chance resemblances between languages is relatively high. Because Greenberg, in Ringe's view, did not consider the possibility of chance resemblances, the relationships he alleges are indistinguishable from coincidence.

Greenberg’s response

Greenberg replied that he had considered this very possibility in his previous work, which Ringe had failed to cite and thus, presumably, to read. This began an ongoing series of exchanges between scholars inclined to support mass comparison and those inclined to oppose it.

State of the question

Today there are three distinct views on the mathematics of language classification with respect to mass comparison.
  • One view, represented by Donald Ringe, holds that language classification can be mathematized and the results show that mass comparison is invalid, finding only resemblances that do not rise above the level of chance.

  • Another view, represented by William Baxter and Alexis Manaster Ramer (1999) and by Greenberg himself, holds that language classification can be mathematized and the results show that mass comparison is valid, finding resemblances that go far beyond what can be expected by chance.

  • A third view, represented by Johanna Laakso, holds that language classification cannot be reduced to mathematics. Laakso is unwilling to give up correspondences that appear to be "intuitive" but in fact represent logical perceptions that most people are unwilling to theorize because of their subtlety and fineness, yet which are quite real and indispensable. As she states in a review of Angela Marcantonio's The Uralic Language Family (http://homepage.univie.ac.at/Johanna.Laakso/am_rev.html):

[T]he traditional model of linguistic relatedness cannot be completely and exactly algorithmised; rather, it is a pattern explanation consisting of many interlinked parts, complex and yet tolerating gaps in its construction. In many details it seems to be based on intuition and fingertip feeling, but, actually, it is dependent of various external and internal background factors.

Toward a resolution of the conflict?

In spite of the apparently intractable nature of the conflict between Greenberg and his critics, a few linguists have begun to argue for its resolution. Edward Vajda, acclaimed for his recent demonstration of Dené–Yeniseian, attempts to stake out a position that is sympathetic both to Greenberg’s approach and to that of its critics, such as Lyle Campbell and Johanna Nichols (http://www.uaf.edu/anlc/docs/dy_vajda_perspective.pdf). George Starostin, a member of the Moscow school, argues that Greenberg’s work, while perhaps not going beyond inspection, presents interesting sets of forms that call for further scrutiny by comparative reconstruction, specifically with regard to the proposed Khoisan (http://starling.rinet.ru/Texts/khoisan.pdf) and Amerind (http://www.nostratic.ru/books/(316)gell-starostin-jlr1.pdf) families.

Works cited


  • Bomhard, Allan R. 2008. Reconstructing Proto-Nostratic: Comparative Phonology, Morphology, and Vocabulary, 2 volumes. Leiden: Brill.

  • Bopp, Franz. 1816. Über das Conjugationssystem der Sanskritsprache in Vergleichung mit jenem der griechischen, lateinischen, persischen und germanischen Sprache. Frankfurt-am-Main: Andreäischen Buchhandlung.

  • Brugmann, Karl. 1878. Preface to the first issue of Morphologische Untersuchungen auf dem Gebiete der indogermanischen Sprachen. Leipzig: S. Hirzel. (The preface is signed Hermann Osthoff and Karl Brugmann but was written by Brugmann alone.)

  • Brugmann, Karl and Berthold Delbrück. 1886-1893. Grundriß der vergleichenden Grammatik der indogermanischen Sprachen, 5 volumes (some multi-part, for a total of 8 volumes). Strassburg: Trübner.

  • Caldwell, Robert. 1856. A Comparative Grammar of the Dravidian or South-Indian Family of Languages. London: Harrison.

  • Delbrück, Berthold. 1884. Einleitung in das Sprachstudium, 2d edition. Leipzig: Breitkopf und Härtel.

  • Delbrück, Berthold. 1904. Einleitung in das Studium der indogermanischen Sprachen, 4th and renamed edition of Einleitung in das Sprachstudium, 1880. Leipzig: Breitkopf und Härtel.


(Photo-offset reprint of eight articles published in the Southwestern Journal of Anthropology from 1949 to 1954, with minor corrections.)

  • Greenberg, Joseph H. 1960. "The general classification of Central and South American languages." In Selected Papers of the Fifth International Congress of Anthropological and Ethnological Sciences, 1956, edited by Anthony F.C. Wallace, 791-94. Philadelphia: University of Pennsylvania Press. (Reprinted in Greenberg 2005, 59-64.)

(Heavily revised version of Greenberg 1955.) (From the same publisher: second, revised edition, 1966; third edition, 1970. All three editions simultaneously published at The Hague by Mouton & Co.)
  • Greenberg, Joseph H. 1971. "The Indo-Pacific hypothesis." Current Trends in Linguistics, Volume 8: Linguistics in Oceania, edited by Thomas F. Sebeok, 807-871. The Hague: Mouton. (Reprinted in Greenberg 2005.)

  • Laakso, Johanna. 2003. "Linguistic shadow-boxing." Review of The Uralic Language Family: Facts, Myths and Statistics by Angela Marcantonio.

  • Lehmann, Winfred P. 1993. Theoretical Bases of Indo-European Linguistics. London: Routledge.

  • Ringe, Donald. 1992. "On calculating the factor of chance in language comparison." American Philosophical Society, Transactions 82.1, 1-110.

  • Ringe, Donald. 1993. "A reply to Professor Greenberg." American Philosophical Society, Proceedings 137, 91-109.

  • Ringe, Donald A., Jr. 1995. "'Nostratic' and the factor of chance." Diachronica 12.1, 55-74.

  • Ringe, Donald A., Jr. 1996. "The mathematics of 'Amerind'." Diachronica 13, 135-54.


  • Ruhlen, Merritt. 1994. On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford: Stanford University Press.

  • Schleicher, August. 1861-1862. Compendium der vergleichenden Grammatik der indogermanischen Sprachen. Kurzer Abriss der indogermanischen Ursprache, des Altindischen, Altiranischen, Altgriechischen, Altitalischen, Altkeltischen, Altslawischen, Litauischen und Altdeutschen, 2 volumes. Weimar: H. Boehlau.

  • Schleicher, August. 1874. A Compendium of the Comparative Grammar of the Indo-European, Sanskrit, Greek, and Latin Languages, translated from the third German edition by Herbert Bendall. London: Trübner and Co. (An abridgement of the German original.)

Further reading

Anti-Greenbergian

  • Hock, Hans Henrich and Brian D. Joseph. 1996. Language History, Language Change, and Language Relationship: An Introduction to Historical and Comparative Linguistics. Berlin: Mouton de Gruyter.


  • Kessler, Brett and A. Lehtonen. 2006. "Multilateral comparison and significance testing of the Indo-Uralic question." In Phylogenetic Methods and the Prehistory of Languages, edited by Peter Forster and Colin Renfrew. McDonald Institute for Archaeological Research. (Also: Unofficial prepublication draft (2004).)

  • Matisoff, James. 1990. "On megalocomparison." Language 66, 109-20.



Greenbergian
  • Greenberg, Joseph H. 1990. "The American Indian language controversy." Review of Archaeology 11, 5-14.

  • Newman, Paul. 1995. On Being Right: Greenberg’s African Linguistic Classification and the Methodological Principles Which Underlie It. Bloomington: Institute for the Study of Nigerian Languages and Cultures, African Studies Program, Indiana University.

  • Ruhlen, Merritt. 1994. The Origin of Language: Tracing the Evolution of the Mother Tongue. New York: John Wiley and Sons.

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 