{{About|the comparative method in linguistics|other kinds of comparative methods|Comparative (disambiguation)}}
[[File:Romance-lg-classification-en.png|thumb|375px|Linguistic map representing a [[Tree model]] of the Romance languages based on the comparative method. Here the family tree has been rendered as a [[Venn diagram]] without overlapping subareas. The [[Wave model (linguistics)|wave model]] allows overlapping regions.]]
In [[linguistics]], the '''comparative method''' is a technique for studying the development of languages by performing a feature-by-feature comparison of two or more languages with common descent from a shared ancestor, as opposed to the method of [[internal reconstruction]], which analyzes the internal development of a single language over time. Ordinarily both methods are used together to reconstruct prehistoric phases of languages, to fill in gaps in the historical record of a language, to discover the development of phonological, morphological, and other linguistic systems, and to confirm or refute hypothesized relationships between languages.
The comparative method was developed over the 19th century. Key contributions were made by the Danish scholars [[Rasmus Christian Rask|Rasmus Rask]] and [[Karl Verner]] and the German scholar [[Jacob Grimm]]. The first linguist to offer [[Linguistic reconstruction|reconstructed forms]] from a [[proto-language]] was [[August Schleicher]], in his ''Compendium der vergleichenden Grammatik der indogermanischen Sprachen'', originally published in 1861. Here is Schleicher’s explanation of why he offered reconstructed forms:
In the present work an attempt is made to set forth the inferred Indo-European original language side by side with its really existent derived languages. Besides the advantages offered by such a plan, in setting immediately before the eyes of the student the final results of the investigation in a more concrete form, and thereby rendering easier his insight into the nature of particular Indo-European languages, there is, I think, another of no less importance gained by it, namely that it shows the baselessness of the assumption that the non-Indian Indo-European languages were derived from Old-Indian (Sanskrit).
[[File:Fi-ugr-turk-comparison.png|thumb|375px|Various linguists have seen these North Eurasian languages as part of:
- a [[Ural–Altaic]] [[language family]] (popular until 1960s)
- a [[Uralic]] and an [[Altaic]] family ([[Anna V. Dybo|Dybo]], [[Roy Andrew Miller|Miller]], [[Nicholas Poppe|Poppe]])
- separate Uralic, [[Turkic languages|Turkic]] and [[Mongolian language|Mongolian]] families ([[Gerard Clauson|Clauson]], [[Gerhard Doerfer|Doerfer]], [[Stefan Georg|Georg]])
- a [[Eurasiatic]] or [[Nostratic]] [[macrofamily]] ([[Joseph Greenberg|Greenberg]], [[Sergei Starostin|Starostin]], [[Allan Bomhard|Bomhard]])
]]
== Demonstrating genetic relationship ==
The comparative method aims to prove that two or more historically [[attested language]]s are descended from a single [[proto-language]] by comparing lists of [[cognate]] terms. From them, regular sound correspondences between the languages are established, and a sequence of regular [[sound change]]s can then be postulated, which allows the proto-language to be [[Linguistic reconstruction|reconstructed]]. Relation is deemed certain only if at least a partial reconstruction of the common ancestor is [[feasible]], and if regular sound correspondences can be established with chance similarities ruled out.
===Terminology===
''Descent'' is defined as transmission across the generations: children learn a language from the parents' generation and after being influenced by their peers transmit it to the next generation, and so on. For example, a continuous chain of speakers across the centuries links [[Vulgar Latin]] to all of its modern descendants.
Two languages are ''[[genetic (linguistics)|genetic]]ally related'' if they descended from the same [[Proto-language|ancestor language]]. For example, [[Spanish language|Spanish]] and [[French language|French]] both come from [[Latin]] and therefore belong to the same family, the [[Romance languages]].
Languages can, however, be related to different degrees. [[English language|English]], for example, is related to both [[German language|German]] and [[Russian language|Russian]], but is more closely related to the former than it is to the latter. Although all three languages share a common ancestor, [[Proto-Indo-European language|Proto-Indo-European]], English and German also share a more recent common ancestor, [[Proto-Germanic language|Proto-Germanic]], while Russian does not. English and German are therefore considered to belong to a subgroup that excludes Russian, the [[Germanic languages]].
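The idea of degrees of relatedness can be illustrated with a small sketch in which each language points to its immediate ancestor and the most recent common ancestor of two languages is looked up. The <code>PARENT</code> mapping and the function names are hypothetical, and intermediate stages between the proto-languages and the modern languages are deliberately left out.

```python
# Simplified ancestry: each language maps to its immediate ancestor.
# Intermediate stages are omitted; this is an illustration, not a full tree.
PARENT = {
    "English": "Proto-Germanic",
    "German": "Proto-Germanic",
    "Russian": "Proto-Slavic",
    "Proto-Germanic": "Proto-Indo-European",
    "Proto-Slavic": "Proto-Indo-European",
}


def lineage(language):
    """Return the language followed by its ancestors, nearest first."""
    chain = [language]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def most_recent_common_ancestor(a, b):
    """Return the nearest node shared by both lineages, or None."""
    ancestors_of_b = set(lineage(b))
    for node in lineage(a):
        if node in ancestors_of_b:
            return node
    return None


print(most_recent_common_ancestor("English", "German"))
print(most_recent_common_ancestor("English", "Russian"))
```

In this sketch English and German meet at Proto-Germanic, whereas English and Russian meet only at Proto-Indo-European, which is what "more closely related" means here.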
''Shared retentions'' from the parent language are not sufficient evidence of a sub-group. For example, as a result of heavy [[loanword|borrowing]] from [[Arabic language|Arabic]] into [[Persian language|Persian]], Modern Persian in fact takes more of its [[vocabulary]] from Arabic than from its direct ancestor, [[Proto-Indo-Iranian language|Proto-Indo-Iranian]]. The division of related languages into sub-groups is more certainly accomplished by finding ''shared linguistic innovations'' from the parent language.
===Origin and development of the method===
[[File:Sajnovics - Demonstratio.jpg|thumb|left|Title page of Sajnovics's 1770 work.]]
Languages have been compared since antiquity. For example, in the 1st century BC the Romans were aware of the similarities between Greek and Latin, which they explained mythologically, as the result of Rome being a Greek colony speaking a debased dialect. In the 9th or 10th century, [[Yehuda Ibn Quraysh]] compared the phonology and morphology of Hebrew, Aramaic, and Arabic, but attributed the resemblance to the Biblical story of Babel: Abraham, Isaac and Joseph retained Adam's language, while other languages, at various removes, became more altered from the original Hebrew.
In publications of 1647 and 1654, [[Marcus van Boxhorn]] first described a rigid methodology for historical linguistic comparisons and proposed the existence of an [[Indo-European]] proto-language (which he called "Scythian") unrelated to Hebrew, but ancestral to Germanic, Greek, Romance, Persian, Sanskrit, Slavic, Celtic and Baltic languages. The Scythian theory was further developed by [[Andreas Jäger]] (1686) and [[William Wotton]] (1713), who made the first forays into reconstructing this primitive common language. In 1710 and 1723, [[Lambert ten Kate]] first formulated the regularity of [[sound law]]s, introducing, among others, the term [[root vowel]].
Another early systematic attempt to prove the relationship between two languages on the basis of similarity of [[grammar]] and [[lexicon]] was made by the Hungarian [[János Sajnovics]] in 1770, when he attempted to demonstrate the relationship between [[Sami languages|Sami]] and [[Hungarian language|Hungarian]] (work that was later extended to the whole [[Finno-Ugric languages|Finno-Ugric language family]] in 1799 by his countryman [[Samuel Gyarmathi]]). However, the origin of modern [[historical linguistics]] is often traced back to [[William Jones (philologist)|Sir William Jones]], an English [[Philology|philologist]] living in [[India]], who in 1786 made his famous {{nowrap|observation:}}
“The [[Sanskrit|Sanscrit language]], whatever be its antiquity, is of a wonderful structure; more perfect than the [[Ancient Greek language|Greek]], more copious than the [[Latin language|Latin]], and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the [[Gothic language|Gothick]] and the [[Celtic languages|Celtick]], though blended with a very different idiom, had the same origin with the Sanscrit; and the [[Old Persian language|old Persian]] might be added to the same family.”
The comparative method developed out of attempts to reconstruct the proto-language mentioned by Jones, which he did not name, but subsequent linguists named [[Proto-Indo-European language|Proto-Indo-European]] (PIE). The first professional comparison between the [[Indo-European languages]] known then was made by the German linguist [[Franz Bopp]] in 1816. Though he did not attempt a reconstruction, he demonstrated that Greek, Latin and Sanskrit shared a common structure and a common lexicon. [[Karl Wilhelm Friedrich von Schlegel|Friedrich Schlegel]] in 1808 first stated the importance of using the oldest possible form of a language when trying to prove its relationships; in 1818, [[Rasmus Christian Rask]] developed the principle of regular sound changes to explain his observations of similarities between individual words in the Germanic languages and their cognates in Greek and {{nowrap|Latin.}} [[Jacob Grimm]], better known for his [[Grimm's Fairy Tales|''Fairy Tales'']], made use of the comparative method in ''Deutsche Grammatik'' (published 1819–37 in four volumes) in attempting to show the development of the [[Germanic languages]] from a common origin, the first systematic study of [[diachronic]] language change.
Both Rask and Grimm were unable to explain apparent exceptions to the sound laws that they had discovered. Although [[Hermann Grassmann]] explained one of these anomalies with the publication of [[Grassmann's law]] in 1863, it was [[Karl Verner]] who in 1875 made a methodological breakthrough when he identified a pattern now known as [[Verner's law]], the first sound law based on comparative evidence showing that a [[phonology|phonological]] change in one [[phoneme]] could depend on other factors within the same word, such as the neighbouring phonemes and the position of the [[Stress (linguistics)|accent]], now called ''conditioning environments''.
Similar discoveries made by the ''Junggrammatiker'' (usually translated as [[Neogrammarians]]) at the [[University of Leipzig]] in the late 19th century led them to conclude that all sound changes were ultimately regular, resulting in the famous statement by [[Karl Brugmann]] and [[Hermann Osthoff]] in 1878 that "sound laws have no exceptions". This idea is fundamental to the modern comparative method, since the method necessarily assumes regular correspondences between sounds in related languages, and consequently regular sound changes from the proto-language. This ''Neogrammarian Hypothesis'' led to application of the comparative method to reconstruct [[Proto-Indo-European language|Proto-Indo-European]], with [[Indo-European languages|Indo-European]] being at that time by far the best-studied language family. Linguists working with other families soon followed suit, and the comparative method quickly became the established method for uncovering linguistic relationships.
===Application===
{{IPA notice}}
There is no fixed set of steps to be followed in the application of the comparative method, but [[Lyle Campbell]] and [[Terry Crowley]], both authors of introductory texts in historical linguistics, each suggest some basic steps. The abbreviated summary below is based on their concepts of how to proceed.
====Step 1, assemble potential cognate lists====
This step involves making lists of words that are likely cognates among the languages being compared. If there is a regularly recurring match between the phonetic structure of basic words with similar meanings, a genetic kinship can probably be established. For example, looking at the [[Polynesian languages|Polynesian family]], linguists might come up with a list similar to the following (a list actually used by them would be much longer):
{| class=wikitable
! Gloss
! one
! two
! three
! four
! five
! man
! sea
! taboo
! octopus
! canoe
! enter
|-
| [[Tongan language|Tongan]]
| align=center | {{IPA|taha}}
| align=center | {{IPA|ua}}
| align=center | {{IPA|tolu}}
| align=center | {{IPA|fā}}
| align=center | {{IPA|nima}}
| align=center | {{IPA|taŋata}}
| align=center | {{IPA|tahi}}
| align=center | {{IPA|tapu}}
| align=center | {{IPA|feke}}
| align=center | {{IPA|vaka}}
| align=center | {{IPA|hū}}
|-
| [[Samoan language|Samoan]]
| align=center | {{IPA|tasi}}
| align=center | {{IPA|lua}}
| align=center | {{IPA|tolu}}
| align=center | {{IPA|fā}}
| align=center | {{IPA|lima}}
| align=center | {{IPA|taŋata}}
| align=center | {{IPA|tai}}
| align=center | {{IPA|tapu}}
| align=center | {{IPA|feʔe}}
| align=center | {{IPA|vaʔa}}
| align=center | {{IPA|ulu}}
|-
| [[Māori language|Māori]]
| align=center | {{IPA|tahi}}
| align=center | {{IPA|rua}}
| align=center | {{IPA|toru}}
| align=center | {{IPA|ɸā}}
| align=center | {{IPA|rima}}
| align=center | {{IPA|taŋata}}
| align=center | {{IPA|tai}}
| align=center | {{IPA|tapu}}
| align=center | {{IPA|ɸeke}}
| align=center | {{IPA|waka}}
| align=center | {{IPA|uru}}
|-
| [[Rapanui language|Rapanui]]
| align=center | {{IPA|-tahi}}
| align=center | {{IPA|-rua}}
| align=center | {{IPA|-toru}}
| align=center | {{IPA|-ha}}
| align=center | {{IPA|-rima}}
| align=center | {{IPA|taŋata}}
| align=center | {{IPA|tai}}
| align=center | {{IPA|tapu}}
| align=center | {{IPA|heke}}
| align=center | {{IPA|vaka}}
| align=center | {{IPA|uru}}
|-
| [[Rarotongan language|Rarotongan]]
| align=center | {{IPA|taʔi}}
| align=center | {{IPA|rua}}
| align=center | {{IPA|toru}}
| align=center | {{IPA|ʔā}}
| align=center | {{IPA|rima}}
| align=center | {{IPA|taŋata}}
| align=center | {{IPA|tai}}
| align=center | {{IPA|tapu}}
| align=center | {{IPA|ʔeke}}
| align=center | {{IPA|vaka}}
| align=center | {{IPA|uru}}
|-
| [[Hawaiian language|Hawaiian]]
| align=center | {{IPA|kahi}}
| align=center | {{IPA|lua}}
| align=center | {{IPA|kolu}}
| align=center | {{IPA|hā}}
| align=center | {{IPA|lima}}
| align=center | {{IPA|kanaka}}
| align=center | {{IPA|kai}}
| align=center | {{IPA|kapu}}
| align=center | {{IPA|heʔe}}
| align=center | {{IPA|waʔa}}
| align=center | {{IPA|ulu}}
|}
[[loanword|Borrowing]]s or [[false cognate]]s could skew or obscure the correct data. For example, English ''taboo'' ({{IPA|[tæbu]}}) is like the six Polynesian forms due to borrowing from Tongan into English, and not because of a genetic similarity. This problem can usually be overcome by using basic vocabulary such as kinship terms, numbers, body parts, pronouns, and other basic terms. Nonetheless, even basic vocabulary can sometimes be borrowed. [[Finnish language|Finnish]], for example, borrowed the word for "mother", ''äiti'', from [[Gothic language|Gothic]] ''aiþei''. While [[English language|English]] borrowed the pronouns "they", "them", and "their(s)" from [[Old Norse language|Norse]], Thomason and Everett argue that [[Pirahã language|Pirahã]], a [[Muran languages|Muran language]] of South America for which a number of controversial claims are made, borrowed all its [[pronoun]]s from [[Nhengatu]].
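As a rough illustration of how a cognate list such as the Polynesian table in this step might be held in machine-readable form, the following sketch stores three of its rows and lines the forms up segment by segment. The names <code>COGNATES</code> and <code>align_by_position</code> are hypothetical, and the alignment assumes one character per segment with no insertions or deletions, which happens to hold for these rows but not for cognates in general.

```python
LANGUAGES = ["Tongan", "Samoan", "Māori", "Rapanui", "Rarotongan", "Hawaiian"]

# Three rows of the Polynesian table, one list of forms per gloss,
# in the same language order as LANGUAGES.
COGNATES = {
    "taboo":   ["tapu", "tapu", "tapu", "tapu", "tapu", "kapu"],
    "octopus": ["feke", "feʔe", "ɸeke", "heke", "ʔeke", "heʔe"],
    "canoe":   ["vaka", "vaʔa", "waka", "vaka", "vaka", "waʔa"],
}


def align_by_position(forms):
    """Line up equal-length forms segment by segment.

    Assumes one character per segment and no insertions or deletions.
    """
    if len({len(form) for form in forms}) != 1:
        raise ValueError("forms differ in length; a real aligner is needed")
    return [tuple(column) for column in zip(*forms)]


for gloss, forms in COGNATES.items():
    columns = align_by_position(forms)
    print(gloss, [" : ".join(column) for column in columns])
```

Each printed column is a candidate sound correspondence across the six languages; whether it is regular can only be judged once it is seen to recur in many other cognate sets.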
====Step 2, establish correspondence sets====
The next step is to determine the regular sound correspondences exhibited by the potential cognate lists. Mere phonetic similarity, as between [[English language|English]] ''day'' and [[Latin]] ''dies'' (both with the same meaning), has no probative value. English initial ''d-'' does ''not'' regularly match {{nowrap|Latin ''d-'',}} and whatever sporadic matches can be observed are due either to chance (as in the above example) or to [[loanword|borrowing]] (for example, Latin ''diabolus'' and English ''devil'', both ultimately of Greek origin). English and Latin ''do'' exhibit a regular correspondence of ''t-'' : ''d-'' (where the notation "A : B" means "A corresponds to B"); for example,
{| class="wikitable"
| align=left | '''English'''
| align=center | '''t'''en
| align=center | '''t'''wo
| align=center | '''t'''ow
| align=center | '''t'''ongue
| align=center | '''t'''ooth
|-
| align=left | '''Latin'''
| align=center | '''d'''ecem
| align=center | '''d'''uo
| align=center | '''d'''ūco
| align=center | '''d'''ingua
| align=center | '''d'''ent-
|}
If there are many regular correspondence sets of this kind (the more the better), then a common origin becomes a virtual certainty, particularly if some of the correspondences are non-trivial or unusual.
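The difference between a recurring correspondence and an isolated resemblance can be sketched by tallying word-initial pairs over the English and Latin words discussed in this step. The function names and the cutoff of three occurrences are assumptions made for illustration; Latin forms are written without macrons, and <code>dent</code> stands for the stem ''dent-''.

```python
from collections import Counter

# Word pairs from this step; "day"/"dies" is the chance resemblance.
PAIRS = [
    ("ten", "decem"),
    ("two", "duo"),
    ("tow", "duco"),
    ("tongue", "dingua"),
    ("tooth", "dent"),
    ("day", "dies"),
]


def count_initial_correspondences(pairs):
    """Tally (English initial, Latin initial) over all word pairs."""
    return Counter((english[0], latin[0]) for english, latin in pairs)


def recurrent(counts, minimum=3):
    """Keep correspondences seen at least `minimum` times (arbitrary cutoff)."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


counts = count_initial_correspondences(PAIRS)
print(counts)
print(recurrent(counts))
```

On this small sample only ''t-'' : ''d-'' survives the cutoff. A count of this kind cannot by itself separate inherited words from loans, so borrowings still have to be excluded on other grounds.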
====Step 3, discover which sets are in complementary distribution====
During the late 18th to late 19th century, two major developments improved the method's effectiveness.
First, it was found that many sound changes are conditioned by a specific ''context''. For example, in both [[Ancient Greek|Greek]] and [[Sanskrit]], an [[Aspiration (phonetics)|aspirated]] [[Stop consonant|stop]] evolved into an unaspirated one, but only if a second aspirate occurred later in the same word; this is [[Grassmann's law]], first described for [[Sanskrit]] by [[Sanskrit grammarians|Sanskrit grammarian]] [[Pāṇini]] and promulgated by [[Hermann Grassmann]] in 1863.
Second, it was found that sometimes sound changes occurred in contexts that were later lost. For instance, in Sanskrit [[Velar consonant|velars]] (''k''-like sounds) were replaced by [[Palatal consonant|palatals]] (''ch''-like sounds) whenever the following vowel was ''*i'' or ''*e''. Subsequent to this change, all instances of ''*e'' were replaced by ''a''. The situation would have been unreconstructable, had not the original distribution of ''e'' and ''a'' been recoverable from the evidence of other [[Indo-European languages]]. For instance, [[Latin]] suffix ''que'', "and", preserves the original ''*e'' vowel that caused the consonant shift in Sanskrit:
{| class="wikitable"
| '''1.'''
| align=center | ''*ke''
| Pre-Sanskrit "and"
|-
| '''2.'''
| align=center | ''*ce''
| Velars replaced by palatals before ''*i'' and ''*e''
|-
| '''3.'''
| align=center | ''ca''
| The attested Sanskrit form. ''*e'' has become ''a''
|}
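The three stages in the table can be replayed as two ordered rewrite rules, which also shows why the conditioning vowel is no longer visible in the attested form. The rule format (a regular expression paired with a replacement) and the function name are illustrative assumptions, not a standard notation.

```python
import re

# Two sound changes from the table, each as (pattern, replacement).
PALATALIZATION = (r"k(?=[ie])", "c")  # velar becomes palatal before *i or *e
E_TO_A = (r"e", "a")                  # every *e becomes a


def apply_changes(form, rules):
    """Apply each sound change to the whole form, one rule after another."""
    for pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form


historical = apply_changes("ke", [PALATALIZATION, E_TO_A])
reversed_order = apply_changes("ke", [E_TO_A, PALATALIZATION])
print(historical, reversed_order)
```

Applied in the historical order, ''*ke'' passes through ''*ce'' to ''ca''; with the order reversed, the vowel changes first, the palatalization rule finds nothing to apply to, and the result is ''ka''.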
[[Verner's Law]], discovered by [[Karl Verner]] in about 1875, is a similar case: the [[Voice (phonetics)|voicing]] of consonants in [[Germanic languages]] underwent a change that was determined by the position of the old Indo-European [[Stress (linguistics)|accent]]. Following the change, the accent shifted to initial position. Verner solved the puzzle by comparing the Germanic voicing pattern with Greek and Sanskrit accent patterns.
This stage of the comparative method, therefore, involves examining the correspondence sets discovered in step 2 and seeing which of them apply only in certain contexts. If two (or more) sets apply in [[complementary distribution]], they can be assumed to reflect a single original [[phoneme]]: "some sound changes, particularly conditioned sound changes, can result in a proto-sound being associated with more than one correspondence set".
For example, the following potential cognate list can be established for [[Romance languages]], which descend from [[Latin]]:
{| class="wikitable"
!
! [[Italian language|Italian]]
! [[Spanish language|Spanish]]
! [[Portuguese language|Portuguese]]
! [[French language|French]]
! Gloss
|-
| '''1.'''
| align=center | corpo
| align=center | cuerpo
| align=center | corpo
| align=center | corps
| align=center | body
|-
| '''2.'''
| align=center | crudo
| align=center | crudo
| align=center | cru
| align=center | cru
| align=center | raw
|-
| '''3.'''
| align=center | catena
| align=center | cadena
| align=center | cadeia
| align=center | chaîne
| align=center | chain
|-
| '''4.'''
| align=center | cacciare
| align=center | cazar
| align=center | caçar
| align=center | chasser
| align=center | to hunt
|}
They evidence two correspondence sets, ''k : k'' and ''k : {{IPAlink|ʃ}}'':
{| class="wikitable"
!
! [[Italian language|Italian]]
! [[Spanish language|Spanish]]
! [[Portuguese language|Portuguese]]
! [[French language|French]]
|-
| '''1.'''
| align=center | k
| align=center | k
| align=center | k
| align=center | k
|-
| '''2.'''
| align=center | k
| align=center | k
| align=center | k
| align=center | {{IPA|ʃ}}
|}
Since French ''{{IPA|ʃ}}'' occurs only before ''a'' where the other languages also have ''a'', while French ''k'' occurs elsewhere, the difference is due to different environments (before ''a'' versus before any other sound), and the sets are complementary. They can therefore be assumed to reflect a single proto-phoneme (in this case ''*k'', spelled ''c'' in [[Latin language|Latin]]). The original words are ''corpus'', ''crudus'', ''catena'' and ''captiare'', all with an initial ''k'' sound. Given more evidence along these lines, one might conclude that the original ''k'' was altered in French because of the environment in which it stood.
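The test for complementary distribution can be sketched as a check that no environment is shared by two different reflexes. In the sketch below the environment of each French initial is approximated by the sound that follows the initial in the Italian cognate; that stand-in, like the names <code>OBSERVATIONS</code> and <code>in_complementary_distribution</code>, is an assumption made for illustration.

```python
from collections import defaultdict

# (French initial sound, sound following the initial in the Italian cognate)
# for the four rows of the Romance table.
OBSERVATIONS = [
    ("k", "o"),  # corps   / corpo
    ("k", "r"),  # cru     / crudo
    ("ʃ", "a"),  # chaîne  / catena
    ("ʃ", "a"),  # chasser / cacciare
]


def environments(observations):
    """Map each reflex to the set of environments in which it is found."""
    envs = defaultdict(set)
    for reflex, following in observations:
        envs[reflex].add(following)
    return dict(envs)


def in_complementary_distribution(observations):
    """True if no environment is shared by two different reflexes."""
    envs = list(environments(observations).values())
    return all(a.isdisjoint(b) for i, a in enumerate(envs) for b in envs[i + 1:])


print(environments(OBSERVATIONS))
print(in_complementary_distribution(OBSERVATIONS))
```

With only four words the result is suggestive rather than conclusive; a single French word with ''k'' before an inherited ''a'' would make the check fail and call for a separate proto-phoneme or a further conditioning factor.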
A more complex case involves consonant clusters in [[Proto-Algonquian language|Proto-Algonquian]]. The Algonquianist [[Leonard Bloomfield]] used the reflexes of the clusters in four of the daughter languages to reconstruct the following correspondence sets:
{| class="wikitable"
!
! [[Anishinaabe language|Ojibwe]]
! [[Fox language|Meskwaki]]
! [[Plains Cree language|Plains Cree]]
! [[Menominee language|Menomini]]
|-
| '''1.'''
| align=center | kk
| align=center | hk
| align=center | hk
| align=center | hk
|-
| '''2.'''
| align=center | kk
| align=center | hk
| align=center | sk
| align=center | hk
|-
| '''3.'''
| align=center | sk
| align=center | hk
| align=center | sk
| align=center | {{IPA|t͡ʃk}}
|-
| '''4.'''
| align=center | {{IPA|ʃk}}
| align=center | {{IPA|ʃk}}
| align=center | sk
| align=center | sk
|-
| '''5.'''
| align=center | sk
| align=center | {{IPA|ʃk}}
| align=center | hk
| align=center | hk
|}
Although all five correspondence sets overlap with one another in various places, they are not in complementary distribution, and so Bloomfield recognized that a different cluster must be reconstructed for each set; his reconstructions were, respectively, ''*hk'', ''*xk'', ''*čk'' (={{IPA|[t͡ʃk]}}), ''*šk'' (={{IPA|[ʃk]}}), and ''*çk'' (where ''‘x’'' and ''‘ç’'' are arbitrary symbols, not attempts to guess the phonetic value of the proto-phonemes).
====Step 4, reconstruct proto-phonemes====
Typology assists in deciding what reconstruction best fits the data. For example, the voicing of voiceless plosives between vowels is common, but not the devoicing of voiced plosives there. If a correspondence ''-t-'' : ''-d-'' between vowels is found in two languages, the proto-[[phoneme]] is more likely to be ''*-t-'', with a development to the voiced form in the second language. The opposite reconstruction would create a rare type.
However, unusual sound changes do occur. The [[Proto-Indo-European language|Proto-Indo-European]] word for ''two'', for example, is reconstructed as ''*dwō'', which is reflected in [[Classical Armenian]] as ''erku''. Several other cognates demonstrate a regular change ''*dw-'' → ''erk-'' in Armenian. Similarly, in Bearlake, a dialect of the [[Athabaskan languages|Athabaskan language]] of [[Slavey language|Slavey]], there has been a sound change of Proto-Athabaskan ''*ts'' → Bearlake ''{{IPA|kʷ}}''. It is very unlikely that ''*dw-'' changed directly into ''erk-'' and ''*ts'' into ''{{IPA|kʷ}}'', but instead they must have gone through several intermediate steps to arrive at the later forms. It is not phonetic similarity which matters when utilizing the comparative method, but regular sound correspondences.
By the [[Principle of Economy]], the reconstruction of a proto-phoneme should require as few sound changes as possible to arrive at the modern reflexes in the daughter languages. For example, [[Algonquian languages]] exhibit the following correspondence set:
{| class="wikitable"
! [[Anishinaabe language|Ojibwe]]
! [[Mi'kmaq language|Míkmaq]]
! [[Cree language|Cree]]
! [[Munsee language|Munsee]]
! [[Blackfoot language|Blackfoot]]
! [[Arapaho language|Arapaho]]
|-
| align=center | m
| align=center | m
| align=center | m
| align=center | m
| align=center | m
| align=center | b
|}
The simplest reconstruction for this set would be either ''*m'' or ''*b''. Both ''*m'' → ''b'' and ''*b'' → ''m'' are likely. Because ''m'' occurs in five of the languages, and ''b'' in only one, if ''*b'' is reconstructed, then it is necessary to assume five separate changes of ''*b'' → ''m'', whereas if ''*m'' is reconstructed, it is only necessary to assume a single change of ''*m'' → ''b''. Reconstructing ''*m'' is therefore the most economical choice.
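The counting argument can be written out as a short sketch. It follows the simplification made in the text, in which every daughter language is treated as having changed independently, and the function names are hypothetical.

```python
# Reflexes from the Algonquian correspondence set.
REFLEXES = {
    "Ojibwe": "m",
    "Míkmaq": "m",
    "Cree": "m",
    "Munsee": "m",
    "Blackfoot": "m",
    "Arapaho": "b",
}


def changes_required(candidate, reflexes):
    """Count the languages whose reflex differs from the candidate proto-sound."""
    return sum(1 for sound in reflexes.values() if sound != candidate)


def most_economical(reflexes):
    """Pick the attested sound that needs the fewest independent changes."""
    candidates = sorted(set(reflexes.values()))
    return min(candidates, key=lambda c: changes_required(c, reflexes))


for candidate in ("m", "b"):
    print(candidate, changes_required(candidate, REFLEXES))
print(most_economical(REFLEXES))
```

A fuller treatment would count changes along the branches of the family tree rather than per language, since a change shared by a whole subgroup counts only once.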
====Step 5, examine the reconstructed system typologically====
In the final step, the linguist checks to see how the proto-[[phoneme]]s fit the known [[linguistic typology|typological constraints]]. For example, in a hypothetical system,
{| class="wikitable"
! p
! t
! k
|-
! b
!
!
|-
!
! n
! ŋ
|-
!
! l
!
|}
there is only one [[Voiced bilabial plosive|voiced plosive]], ''*b'', and although there is an [[alveolar nasal|alveolar]] and a [[velar nasal]], ''*n'' and ''*ŋ'', there is no corresponding [[Bilabial nasal|labial nasal]]. However, languages generally (though not always) tend to maintain symmetry in their phonemic inventories. In this case, the linguist might attempt to find evidence that what was earlier reconstructed as ''*b'' is in fact ''*m'', or that the ''*n'' and ''*ŋ'' are in fact ''*d'' and ''*g''.
Even a symmetrical system can be typologically suspicious. For example, the traditional [[Proto-Indo-European language|Proto-Indo-European]] plosive inventory is:
{| class="wikitable"
!
! [[Labial consonant|Labial]]s
! [[Dental consonant|Dental]]s
! [[Velar consonant|Velar]]s
! [[labialized velar consonant|Labiovelars]]
! [[Palatovelar]]s
|-
! [[Voiceless consonant|Voiceless]]
| align=center|p
| align=center|t
| align=center|k
| align=center|{{IPA|kʷ}}
| align=center|{{IPA|kʲ}}
|-
! [[Voiced consonant|Voiced]]
| align=center|(b)
| align=center|d
| align=center|g
| align=center|{{IPA|ɡʷ}}
| align=center|{{IPA|ɡʲ}}
|-
! [[Voiced consonant|Voiced]] [[Aspiration (phonetics)|aspirated]]
| align=center|{{IPA|bʱ}}
| align=center|{{IPA|dʱ}}
| align=center|{{IPA|ɡʱ}}
| align=center|{{IPA|ɡʷʱ}}
| align=center|{{IPA|ɡʲʱ}}
|}
An earlier voiceless aspirated row was removed on grounds of insufficient evidence. Since the mid-20th century, a number of linguists have argued that this phonology is implausible, on the grounds that it is extremely unlikely for a language to have a voiced aspirated ([[breathy voice]]) series without a corresponding voiceless aspirated series. A potential solution was provided by [[Thomas Gamkrelidze]] and [[Vyacheslav V. Ivanov]], who argued that the series traditionally reconstructed as plain voiced should in fact be reconstructed as [[Glottalization|glottalized]], either [[Implosive consonant|implosive]] {{IPA|(ɓ, ɗ, ɠ)}} or [[Ejective consonant|ejective]] {{IPA|(pʼ, tʼ, kʼ)}}. The plain voiceless and voiced aspirated series would thus be replaced by just voiceless and voiced, with aspiration being a non-distinctive quality of both. This example of the application of linguistic typology to linguistic reconstruction has become known as the [[Glottalic theory|Glottalic Theory]]. It has a large number of proponents but is not generally accepted. An alternative solution is to restore the voiceless aspirated row.
The reconstruction of proto-sounds logically precedes the reconstruction of grammatical [[morpheme]]s (word-forming affixes and inflectional endings), patterns of [[declension]] and [[Grammatical conjugation|conjugation]], and so on. The full reconstruction of an unrecorded protolanguage is an open-ended task.
===Problems with the history of historical linguistics===
The limitations of the comparative method were recognized by the very linguists who developed it, but it is still seen as a valuable tool. In the case of Indo-European, the method seemed to at least partially validate the centuries-old search for an [[Ursprache]], the original language of the [[Garden of Eden]], from which descended all other languages not assigned by [[God]] in the confusion that followed the construction of the [[Tower of Babel]]. These others were presumed to be ordered in a [[family tree]], which became the [[Tree model]] of the [[neogrammarians]].
Archaeologists followed suit, attempting to find archaeological evidence of a culture or cultures that could be presumed to have spoken a [[proto-language]], as in [[Vere Gordon Childe]]'s ''The Aryans: a study of Indo-European origins'' (1926). Childe was a philologist turned archaeologist. These views culminated in the ''Siedlungsarchäologie'', or "settlement-archaeology", of [[Gustaf Kossinna]], becoming known as "Kossinna's Law." He asserted that cultures represent ethnic groups, including their languages. It was rejected as a law after World War II. The fall of Kossinna's Law removed the temporal and spatial framework previously applied to many proto-languages. Fox concludes:
The Comparative Method ''as such'' is not, in fact, historical; it provides evidence of linguistic relationships to which we may give a historical interpretation. ...[Our increased knowledge about the historical processes involved] has probably made historical linguists less prone to equate the idealizations required by the method with historical reality. ...Provided we keep [the interpretation of the results and the method itself] apart, the Comparative Method can continue to be used in the reconstruction of earlier stages of languages.
Proto-languages can be verified in many historical instances, such as Latin. Although no longer a law, settlement-archaeology is known to be essentially valid for some cultures that straddle history and prehistory, such as the Celtic Iron Age (mainly Celtic) and [[Mycenaean civilization]] (mainly Greek). None of these models can be or have been completely rejected, and yet none alone are sufficient.
===Problems with the neogrammarian hypothesis===
The foundation of the comparative method, and of comparative linguistics in general, is the [[Neogrammarian]]s' fundamental assumption that "sound laws have no exceptions." When it was initially proposed, critics of the Neogrammarians proposed an alternate position, summarized by the maxim "each word has its own history". Several types of change do in fact alter words in non-regular ways. Unless identified, they may hide or distort laws and cause false perceptions of relationship.
====Borrowing====
All languages [[loanword|borrow words]] from other languages in various contexts. Loanwords are likely to have followed the sound laws of the languages from which they were borrowed rather than those of the borrowing language.
====Areal diffusion====
Borrowing on a larger scale occurs in [[Areal feature (linguistics)|areal diffusion]], when features are adopted by contiguous languages over a geographical area. The borrowing may be [[Phonology|phonological]], [[Morpheme|morphological]] or [[Lexeme|lexical]]. A false proto-language over the area may be reconstructed for them or may be taken to be a third language serving as a source of diffused features.
Several areal features and other influences may converge to form a [[sprachbund]], a wider region sharing features that appear to be related but are diffusional. For instance, the [[East Asian languages|East Asian Sprachbund]] suggested several false classifications of such languages as [[Chinese languages|Chinese]], [[Korean language|Korean]], [[Japanese language|Japanese]], and [[Vietnamese language|Vietnamese]] before it was recognized.
====Random mutations====
Sporadic changes, such as irregular inflections, compounding, and abbreviation, do not follow any laws. For example, the [[Spanish language|Spanish]] words ''palabra'' ('word'), ''peligro'' ('danger') and ''milagro'' ('miracle') should have been ''parabla'', ''periglo'', ''miraglo'' by regular sound changes from the Latin ''parabŏla'', ''perīcŭlum'' and ''mīrācŭlum'', but the ''r'' and ''l'' changed places by sporadic [[Metathesis (linguistics)|metathesis]].
====Analogy====
[[Analogy#Linguistics|Analogy]] is the sporadic change of a feature to be like another feature in the same or a different language. It may affect a single word or be generalized to an entire class of features, such as a verb paradigm. For example, the [[Russian language|Russian]] word for ''nine'', by regular sound changes from [[Proto-Slavic language|Proto-Slavic]], should have been {{IPA|/nʲevʲatʲ/}}, but is in fact {{IPA|/dʲevʲatʲ/}}. It is believed that the initial ''{{IPA|nʲ-}}'' changed to ''{{IPA|dʲ-}}'' under influence of the word for "ten" in Russian, {{IPA|/dʲesʲatʲ/}}.
====Gradual application====
Students of contemporary language changes, such as [[William Labov]], note that even a systematic sound change is at first applied in an unsystematic fashion, with the percentage of its occurrence in a person's speech dependent on various social factors. The sound change gradually spreads, a process known as [[lexical diffusion]]. This gradual application does not invalidate the Neogrammarians' axiom that "sound laws have no exceptions", but it shows that sound laws do not always apply to all lexical items at the same time. Hock notes, "While it probably is true in the long run every word has its own history, it is not justified to conclude as some linguists have, that therefore the Neogrammarian position on the nature of linguistic change is falsified."
===Problems with the Tree Model===
The comparative method is used to construct a [[Tree model]] (German ''Stammbaum'') of language evolution, in which daughter languages are seen as branching from the [[proto-language]], gradually growing more distant from it through accumulated [[phonology|phonological]], [[morpho-syntactic]], and [[lexicon|lexical]] changes.
[[File:Uto-Aztecan Family Tree.jpg|thumb|720px|center|An example of the Tree Model, used to represent the [[Uto-Aztecan languages|Uto-Aztecan]] language family spoken throughout the southern and western [[United States]] and [[Mexico]]. Families are in '''bold''', individual languages in ''italics''. Not all branches and languages are shown.]]
====The presumption of a well-defined node====
[[File:Wave Model Schmidt.jpeg|thumb|300px|The [[Wave model (linguistics)|Wave Model]] has been proposed as an alternative model of language change. Each wave, an [[isogloss]], is a circle in the [[Venn diagram]], but the circles are not to be seen as simultaneous or extending over the same areas. The language must be found most certainly at the intersection of the greatest number of circles. It tapers off to intermediate times and locations. Some isoglosses may not even be found in languages of the same family. The tree model presumes that all the circles coincide in time and space.]]
The [[tree model]] features nodes that are presumed to be distinct proto-languages existing independently in distinct regions during distinct historical times. The reconstruction of unattested proto-languages lends itself to that illusion: they cannot be verified and the linguist is free to select whatever definite times and places for them seem best. Right from the outset of Indo-European studies, however, [[Thomas Young (scientist)|Thomas Young]] said:
It is not, however, very easy to say what the definition should be that should constitute a separate language, but it seems most natural to call those languages distinct, of which the one cannot be understood by common persons in the habit of speaking the other .... Still, however, it may remain doubtfull whether the Danes and the Swedes could not, in general, understand each other tolerably well ... nor is it possible to say if the twenty ways of pronouncing the sounds, belonging to the Chinese characters, ought or ought not to be considered as so many languages or dialects .... But, ... the languages so nearly allied must stand next to each other in a systematic order ....
The assumption of uniformity in a proto-language, implicit in the comparative method, is problematic. Even in small language communities there are always [[Dialect|dialect differences]], whether based on area, gender, class, or other factors. The [[Pirahã language]] of [[Brazil]] is spoken by only several hundred people, but it has at least two different dialects, one spoken by men and one by women. Campbell points out:
It is not so much that the comparative method 'assumes' no variation; rather, it is just that there is nothing built into the comparative method which would allow it to address variation directly....This assumption of uniformity is a reasonable idealization; it does no more damage to the understanding of the language than, say, modern reference grammars do which concentrate on a language's general structure, typically leaving out consideration of regional or social variation.
Different dialects, as they evolve into separate languages, remain in contact with one another and influence each other. Even after they are considered distinct, languages near to one another continue to influence each other, often sharing grammatical, phonological, and lexical innovations. A change in one language of a family may spread to neighboring languages, and multiple changes propagate like waves across language and dialect boundaries, each with its own randomly delimited range. If a language is divided into an inventory of features, each with its own time and range ([[isogloss]]es), the isoglosses do not all coincide. History and prehistory may not offer a time and place at which they distinctly coincide, as may be the case for [[Italic languages|proto-Italic]]; the proto-language is then only a concept. However, Hock observes:
The discovery in the late nineteenth century that [[isogloss]]es can cut across well-established linguistic boundaries at first created considerable attention and controversy. And it became fashionable to oppose a wave theory to a tree theory... Today, however, it is quite evident that the phenomena referred to by these two terms are complementary aspects of linguistic change...
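The point about non-coinciding isoglosses can be illustrated with a small Python sketch. The data are entirely hypothetical: the variety names ''A'' to ''E'' and the feature names are placeholders, and the two helper functions are assumptions made for this illustration. Each isogloss is treated as the set of varieties that show a feature. The tree model's ideal corresponds to all of these sets being identical, whereas the wave model allows them to overlap only partially.

```python
# Hypothetical isoglosses: each feature maps to the set of varieties showing it.
# Variety names A-E and feature names are placeholders, not real data.
isoglosses = {
    "feature_1": {"A", "B", "C"},
    "feature_2": {"B", "C", "D"},
    "feature_3": {"C", "D", "E"},
}


def coincide(ranges):
    """True if every isogloss covers exactly the same varieties (tree-model ideal)."""
    sets = list(ranges.values())
    return all(s == sets[0] for s in sets)


def shared_core(ranges):
    """Varieties lying inside every isogloss; an empty set if there are none."""
    sets = list(ranges.values())
    return set.intersection(*sets) if sets else set()


print(coincide(isoglosses))     # False
print(shared_core(isoglosses))  # {'C'}
```

In this toy data only variety ''C'' lies inside all three isoglosses, so no group of varieties is delimited by the same boundary for every feature, which is the situation the wave model is meant to describe.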
====Subjectivity of the reconstruction====
The reconstruction of unknown proto-languages is inherently subjective.{{Citation needed|date=April 2011}} In the [[Proto-Algonquian language|Proto-Algonquian]] example above, the choice of ''*m'' as the parent [[phoneme]] is only ''likely'', not ''certain''. It is conceivable that a Proto-Algonquian language with ''*b'' in those positions split into two branches, one that preserved ''*b'' and one that changed it to ''*m''; while the first branch developed only into [[Arapaho language|Arapaho]], the second spread more widely and developed into the languages of all the other [[Algonquian peoples|Algonquian]] tribes. It is also possible that the nearest common ancestor of the [[Algonquian languages]] used some other sound instead, such as ''*p'', which eventually changed to ''*b'' in one branch and to ''*m'' in the other. Since reconstruction involves many of these choices, some linguists prefer to view the reconstructed features as abstract representations of sound correspondences, rather than as objects with a historical time and place.
The existence of proto-languages and the validity of the comparative method are verifiable in cases where the reconstruction can be matched to a known language, which may be known only as a shadow in the [[loanword]]s of another language. For example, [[Finnic languages]] such as [[Finnish language|Finnish]] have borrowed many words from an early stage of [[Germanic languages|Germanic]], and the shape of the loans matches the forms that have been reconstructed for [[Proto-Germanic]]. Finnish ''kuningas'' 'king' and ''kaunis'' 'beautiful' match the Germanic reconstructions *''kuningaz'' and *''skauniz'' (> German ''König'' 'king', ''schön'' 'beautiful').
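The match between the two reconstructions and their Finnish reflexes can be checked mechanically in a minimal Python sketch. The function name <code>adapt_to_finnish</code> and its two adaptation rules (initial ''sk-'' simplified to ''k-'', final ''-z'' rendered as ''-s'') are assumptions introduced only to cover these two words; they are not a full account of how Germanic loans were adapted in Finnic.

```python
def adapt_to_finnish(reconstruction: str) -> str:
    """Apply two assumed, illustrative adaptation rules to a reconstructed form."""
    form = reconstruction.lstrip("*")  # drop the asterisk marking a reconstruction
    if form.startswith("sk"):
        form = form[1:]                # assumed rule: initial sk- simplified to k-
    if form.endswith("z"):
        form = form[:-1] + "s"         # assumed rule: final -z rendered as -s
    return form


# Reconstruction -> attested Finnish loan, both taken from the text above.
pairs = {"*kuningaz": "kuningas", "*skauniz": "kaunis"}

for proto, finnish in pairs.items():
    predicted = adapt_to_finnish(proto)
    print(proto, "->", predicted, "| matches Finnish:", predicted == finnish)
```

Both predicted forms agree with the attested Finnish words, which is the kind of independent check the paragraph describes, here on a sample of only two items.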
====Additional models====
Alternatives to the [[tree model]] include the [[Wave model (linguistics)|wave model]], which dates to the 19th century, and [[glottochronology]] and [[mass lexical comparison]], which date to the 20th. Most historical linguists consider the latter two methods flawed and unreliable.
==See also==
*[[Historical linguistics]]
*[[Comparative linguistics]]
*[[Proto-language]]
*[[Lexicostatistics]]
*[[Swadesh list]]