All Topics  
Comparative method

 
Comparative Method

   Email Print
   Bookmark   Link






 

Comparative method



 
 
In linguistics
Linguistics

Linguistics is the science study of natural language. Linguistics encompasses a number of sub-fields. An important topical division is between the study of language structure and the study of Meaning ....
, the comparative method is a technique for studying the development of languages. It requires the use of two or more languages. It is opposed to the method of internal reconstruction
Internal reconstruction

Internal reconstruction is a method of recovering information about a language's past from the characteristics of the language at a later date. Whereas the comparative method compares variations between languages ? such as in sets of cognates ? under the assumption that they descend from a single proto-language, internal reconstruction compar...
, which studies the internal development of a single language over time. Ordinarily both methods are used together. They constitute a powerful means to reconstruct prehistoric phases of languages, to fill in gaps in the historical record of a language, to study the development of phonological, morphological, and other linguistic systems, and to confirm or infirm hypothesized relationships between languages.

The comparative method was gradually developed during the course of the 19th century.






Discussion
Ask a question about 'Comparative method'
Start a new discussion about 'Comparative method'
Answer questions from other users
Full Discussion Forum



Encyclopedia


In linguistics
Linguistics

Linguistics is the science study of natural language. Linguistics encompasses a number of sub-fields. An important topical division is between the study of language structure and the study of Meaning ....
, the comparative method is a technique for studying the development of languages. It requires the use of two or more languages. It is opposed to the method of internal reconstruction
Internal reconstruction

Internal reconstruction is a method of recovering information about a language's past from the characteristics of the language at a later date. Whereas the comparative method compares variations between languages ? such as in sets of cognates ? under the assumption that they descend from a single proto-language, internal reconstruction compar...
, which studies the internal development of a single language over time. Ordinarily both methods are used together. They constitute a powerful means to reconstruct prehistoric phases of languages, to fill in gaps in the historical record of a language, to study the development of phonological, morphological, and other linguistic systems, and to confirm or infirm hypothesized relationships between languages.

The comparative method was gradually developed during the course of the 19th century. The key contributions were made by the Danish scholars Rasmus Rask
Rasmus Christian Rask

Rasmus Rask , Denmark scholar and philologist, was born at Br?ndekilde on the Danish island of Funen....
 and Karl Verner
Karl Verner

Karl Verner was a Denmark linguistics. He is remembered today for Verner's law, which he discovered in 1875.Verner, whose interest in languages was stimulated by reading about the work of Rasmus Christian Rask, began his university studies in 1864....
 and the German scholar Jacob Grimm
Jacob Grimm

Jacob Ludwig Carl Grimm , German Confederation philologist, jurist and mythology, was born at Hanau, in Hesse-Kassel . He is best known as the discoverer of Grimm's Law, the author of the monumental German Dictionary, his Deutsche Mythologie and more popularly, as one of the Brothers Grimm, as the editor of Grimm's Fairy Tales....
. The first linguist to offer reconstructed forms
Linguistic reconstruction

Linguistic reconstruction is the practice of establishing the features of the unattested ancestor of one or more given languages. There are two kinds of reconstruction....
 from a proto-language
Proto-language

A proto-language is the common ancestor of the languages that form a language family. Occasionally, the German language term Ursprache is used instead....
 was August Schleicher, in his Compendium der vergleichenden Grammatik der indogermanischen Sprachen, originally published in 1861.

Contrary to what is often assumed today, Schleicher and the other comparative linguists
Comparative linguistics

Comparative linguistics is a branch of historical linguistics that is concerned with comparing languages in order to establish their history relatedness....
 of the 19th century did not view the comparative method as a means to establish the validity of the language families they studied, which they considered to be already established through an interlocking web of lexical and morphological similarities. Characteristic is Schleicher’s explanation of why he took the then-radical step of offering reconstructed forms:

In the present work an attempt is made to set forth the inferred Indo-European original language side by side with its really existent derived languages. Besides the advantages offered by such a plan, in setting immediately before the eyes of the student the final results of the investigation in a more concrete form, and thereby rendering easier his insight into the nature of particular Indo-European languages, there is, I think, another of no less importance gained by it, namely that it shows the baselessness of the assumption that the non-Indian Indo-European languages were derived from Old-Indian (Sanskrit).


During the first half of the 20th century, the conviction gradually took hold that reconstructions arrived at through the comparative method were the only valid means to establish genetic (i.e. genealogical) relationship between languages. This has since remained the prevailing view among historical linguists
Historical linguistics

Historical linguistics is the study of language change. It has five main concerns:* to describe and account for observed changes in particular languages;...
. It is contested only by followers of Joseph Greenberg
Joseph Greenberg

Joseph Harold Greenberg was a prominent and controversial American linguistics, principally known for his work in two areas, linguistic typology and the genetic relationship of languages....
.

Demonstrating genetic relationship


In its application to the demonstration of genetic relationship, the comparative method aims to prove that two or more historically attested languages are descended from a single hypothethical proto-language
Proto-language

A proto-language is the common ancestor of the languages that form a language family. Occasionally, the German language term Ursprache is used instead....
 by comparing lists of cognate
Cognate

Cognates in linguistics are words that have a common etymology origin.An example of cognates within the same language would be English shirt vs....
 terms. From these cognate lists, regular sound correspondences between the languages are established, and a sequence of regular sound change
Sound change

Sound change includes any processes of language change that affect pronunciation or sound system structures . Sound change can consist of the replacement of one phoneme by another, the complete loss of the affected sound, or even the introduction of a new sound in a place where there previously was none....
s can then be postulated which allows the proto-language to be reconstructed
Linguistic reconstruction

Linguistic reconstruction is the practice of establishing the features of the unattested ancestor of one or more given languages. There are two kinds of reconstruction....
 from its daughter language
Daughter language

In historical linguistics, a daughter language is a language descended from another language through a process of Genetic descent....
s. Relation is deemed certain only if a partial reconstruction of the common ancestor is feasible, and if regular sound correspondences can be established with chance similarities ruled out.

Developed in the 19th century through the study of the Indo-European languages
Indo-European languages

The Indo-European languages are a Language family of several hundred related languages and dialects, including most major languages of Europe, the Iranian plateau , Central Asia and the Indian subcontinent ....
, the comparative method remains the standard by which mainstream linguists judge whether two languages are related, with alternative lexicostatistical
Lexicostatistics

Lexicostatistics is an approach to comparative linguistics that involves quantitative comparison of lexical cognates. Lexicostatistics is related to the comparative method but does not reconstruct a proto-language....
 methods such as glottochronology
Glottochronology

Glottochronology is an approach in historical linguistics for estimating the time at which languages diverged, based on the assumption that the basic vocabulary of a language changes at a constant average rate....
 generally considered to be unreliable as there are cases when lexicostatistics does not work at all. Potential problems with the comparative method have also arisen as a result of a number of advances in linguistic thought, in large part due to some of the "basic assumptions" of the comparative method. However, as Campbell (2004:146-7) observes, "What textbooks call the 'basic assumptions' of the comparative method might better be viewed as the consequences of how we reconstruct and of our views of sound change."

Terminology

In the present context, related has a specific meaning: two languages are genetic
Genetic (linguistics)

Genetic, in linguistics, means due to descent from a common ancestor language, rather than borrowing at some time in the past between languages that were not necessarily descended from a common ancestor....
ally related if they are descended from the same ancestor language
Proto-language

A proto-language is the common ancestor of the languages that form a language family. Occasionally, the German language term Ursprache is used instead....
. Thus, for example, Spanish
Spanish language

Spanish or Castilian is a Romance languages that originated in northern Spain, and gradually spread in the Kingdom of Castile and evolved into the principal language of government and trade....
 and French
French language

French is a Romance language spoken around the world by around 80 million people as first language, by 190 million as second language, and by about another 200 million people as an acquired tongue, with significant speakers in 54 countries....
 are both descended from Latin
Latin

Latin is an Italic language, historically spoken in Latium and Ancient Rome. Through the Military history of the Roman Empire, Latin spread throughout the Mediterranean and a large part of Europe....
. Therefore, French and Spanish are considered to belong to the same family of languages, the Romance languages
Romance languages

The Romance languages are a branch of the Indo-European languages comprising all the languages that descend from Latin language, the language of ancient Rome....
.

Descent, in turn, is defined in terms of transmission across the generations: children learn a language from the parents' generation and are then influenced by their peers; they then transmit it to the next generation, and so on (how and why changes are introduced is a complicated, unresolved issue). A continuous chain of speakers across the centuries links Vulgar Latin
Vulgar Latin

Vulgar Latin is a blanket term covering the popular dialects and sociolects of the Latin which diverged from each other in the early Middle Ages, evolving into the Romance languages by the 9th century....
 to all of its modern descendants.

However, it is possible for languages to have different degrees of relatedness. English
English language

English is a West Germanic language that originated in Anglo-Saxon England and has lingua franca status in many parts of the world as a result of the military, economic, scientific, political and cultural influence of the British Empire in the 18th, 19th and early 20th centuries and that of the United States from the mid 20th century onwa...
, for example, is related to both German
German language

German is a West Germanic languages, thus related to and classified alongside English language and Dutch language. It is one of the world's world language and the most widely spoken mother tongue in the European Union....
 and Russian
Russian language

Russian is the most geographically widespread language of Eurasia, the most widely spoken of the Slavic languages, and the largest native language in Europe....
, but is more closely related to the former than it is to the latter. The reason for this is that although all three languages share a common ancestor, Proto-Indo-European
Proto-Indo-European language

The Proto-Indo-European language is the unattested, linguistic reconstruction common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans....
, English and German also share, as a more recent common ancestor, one of the daughter languages of Proto-Indo-European – Proto-Germanic
Proto-Germanic language

Proto-Germanic, or Common Germanic, as it is sometimes known, is the hypothetical common ancestor of all the Germanic languages such as modern English language, Dutch language, German language, Danish language, Norwegian language, Icelandic language, Faroese language, and Swedish language....
 – while Russian does not. Therefore, English and German are considered to belong to a different subgroup of the Indo-European language
Indo-European languages

The Indo-European languages are a Language family of several hundred related languages and dialects, including most major languages of Europe, the Iranian plateau , Central Asia and the Indian subcontinent ....
 family (the Germanic languages
Germanic languages

The Germanic languages are a group of related languages that constitute a branch of the Indo-European languages language family. The common ancestor of all the languages in this branch is Proto-Germanic, spoken in approximately the mid-1st millennium BC in Pre-Roman Iron Age....
) than Russian (which belongs to the Slavic
Slavic languages

File:Slavic europe.svgThe Slavic languages , a group of closely related languages of the Slavic peoples and a subgroup of Indo-European languages, have speakers in most of Eastern Europe, in much of the Balkans, in parts of Central Europe, and in the northern part of Asia....
 group). The division of related languages into sub-groups by the comparative method is accomplished by finding languages with large numbers of shared linguistic innovations from the parent language; two languages having many shared retentions from the parent language is not sufficient evidence of a sub-group.

This definition of relatedness implies that, even if two languages are quite similar in their vocabularies, they are not necessarily closely related. As a result of heavy borrowing
Loanword

A loanword is a word directly taken into one language from another with little or no translation. By contrast, a calque or loan translation is a related concept whereby it is the Meaning or idiom that is borrowed rather than the lexical item itself....
 over the years from Arabic
Arabic language

Arabic is a Central Semitic language, thus related to and classified alongside other Semitic languages languages such as Hebrew language and Aramaic language....
 into Persian
Persian language

name=Persian|nativename=|pronunciation=[f??r'si]|image=|caption=Farsi in Perso-Arabic script |states= Iran, Afghanistan, Tajikistan, Uzbekistan, and Bahrain....
, Modern Persian in fact takes more of its vocabulary
Vocabulary

A person's vocabulary is the set of words they are familiar with in a language. A vocabulary usually grows and evolves with age, and serves as a useful and fundamental tool for communication and learning....
 from Arabic than from its direct ancestor, Proto-Indo-Iranian
Proto-Indo-Iranian language

Proto-Indo-Iranian, is the Linguistic reconstruction proto-language of the Indo-Iranian languages branch of Indo-European language. Its speakers, the hypothetical Proto-Indo-Iranians, are assumed to have lived in the late 3rd millennium BC, and are usually connected with the early Andronovo archaeological horizon....
. But under the definition just given, Persian is considered to be descended from Proto-Indo-Iranian, and not from Arabic.

The comparative method is a method for proving relatedness in the sense just given, as well as a method for reconstructing the sound system
Phonology

Phonology is the systematic use of sound to encode meaning in any spoken human language, or the field of linguistics studying this use. Just as a language has syntax and vocabulary, it also has a phonology in the sense of a sound system....
 and vocabulary of the common ancestral language and uncovering the sound changes the languages of a family have undergone.

Origin and development


The first known systematic attempt to prove the relationship between two languages on the basis of similarity of grammar
Grammar

Grammar is the field of linguistics that covers the conventions governing the use of any given natural language. It includes morphology and syntax, often complemented by phonetics, phonology, semantics, and pragmatics....
 and lexicon
Lexicon

In linguistics, the lexicon of a language is its vocabulary, including its words and expressions. More formally, it is a language's inventory of lexemes....
 was made by the Hungarian János Sajnovics
János Sajnovics

J?nos Sajnovics de Tordas et K?loz was a Hungary linguistics and Society of Jesus. He is best known for his pioneering work in comparative linguistics, particularly his systematic demonstratation of the relationship between Sami languages and Hungarian language....
 in 1770, when he attempted to demonstrate the relationship between Sami
Sami languages

Sami or Saami is a general name for a group of Uralic languages spoken by the Sami people in parts of northern Finland, Norway, Sweden and extreme northwestern Russia, in Northern Europe....
 and Hungarian
Hungarian language

Hungarian is a Uralic languages unrelated to most other languages in Europe. It is mainly spoken in Hungary and by the Hungarian minorities in the seven neighbouring countries....
 (work that was later extended to the whole Finno-Ugric language family
Finno-Ugric languages

Finno-Ugric is a group of languages in the Uralic languages family, comprising Finnish language, Estonian language, Hungarian language and related languages....
 in 1799 by his countryman Samuel Gyarmathi
Samuel Gyarmathi

S?muel Gyarmathi was a Hungary linguistics, born in Cluj-Napoca . He is best known for his systematic demonstration of the comparative linguistics of the Finno-Ugric languages in the book Affinitas linguae hungaricae cum linguis fennicae originis grammatice demonstrata which built on the earlier work of J?nos Sajnovics....
), but the origin of modern historical linguistics
Historical linguistics

Historical linguistics is the study of language change. It has five main concerns:* to describe and account for observed changes in particular languages;...
 is often traced back to Sir William Jones
William Jones (philologist)

Sir William Jones was an England Philology and student of ancient India, particularly known for his proposition of the existence of a relationship among Indo-European languages....
, an English philologist
Philology

Philology, derived from the Greek language considers both morphology and Meaning in linguistic expression, combining linguistics and literary studies....
 living in India
India

India, officially the Republic of India , is a country in South Asia. It is the List of countries and outlying territories by total area country by geographical area, the List of countries by population country, and the most populous liberal democracy in the world....
, who in 1782 made his famous observation:

“The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek
Ancient greek language

#REDIRECT Ancient Greek...
, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists. There is a similar reason, though not quite so forcible, for supposing that both the Gothick
Gothic language

Gothic is an extinct language Germanic language that was spoken by the Goths. It is known primarily from Codex Argenteus, a 6th century copy of a 4th century Bible translation, and is the only East Germanic languages with a sizable corpus....
 and the Celtick
Celtic languages

The Celtic languages are descended from Proto-Celtic, or "Common Celtic", a branch of the greater Indo-European languages language family. The term "Celtic" was used to describe this language group by Edward Lhuyd in 1707, having much earlier been used by Greek and Roman writers to describe tribes in central Gaul....
, though blended with a very different idiom, had the same origin with the Sanskrit; and the old Persian
Old Persian language

The Old Persian language is one of the two attested Iranian languages . Old Persian appears primarily in the inscriptions, clay tablets, seal s of the Achaemenid dynasty era ....
 might be added to the same family.” (Jones 1786, quoted in Lehman 1967 and Szemerényi 1996:4)


An insight often attributed to Jones is conceiving of the idea of a proto-language
Proto-language

A proto-language is the common ancestor of the languages that form a language family. Occasionally, the German language term Ursprache is used instead....
, and consequently of the type of "family tree" model of language development (one proto-language splitting into various daughter languages, some of those then splitting again into further languages), upon which the comparative method is based. However, Jones' role in the development of these ideas has recently been called into question. According to the comparative linguist Lyle Campbell
Lyle Campbell

Lyle Richard Campbell is a linguist and leading expert on American Indian languages?especially those of Mesoamerica?and on historical linguistics in general....
, the widely quoted passage from Jones has been removed from its proper context, and a reading of his work reveals his ideas of linguistic development as less clear. Many of the linguistic classifications proposed by Jones were also erroneous; for instance, he connected Austronesian languages with Sanskrit
Sanskrit

Sanskrit is a historical Indo-Aryan language, one of the liturgical languages of Hinduism and Buddhism, and one of the 22 official languages of India....
, and failed to include Slavic
Slavic languages

File:Slavic europe.svgThe Slavic languages , a group of closely related languages of the Slavic peoples and a subgroup of Indo-European languages, have speakers in most of Eastern Europe, in much of the Balkans, in parts of Central Europe, and in the northern part of Asia....
 in the Indo-European family.

The comparative method itself developed out of the attempts to reconstruct the proto-language which Jones had hypothesized about, known as Proto-Indo-European
Proto-Indo-European language

The Proto-Indo-European language is the unattested, linguistic reconstruction common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans....
 (PIE). The first attempt to analyse the relationships between the Indo-European languages
Indo-European languages

The Indo-European languages are a Language family of several hundred related languages and dialects, including most major languages of Europe, the Iranian plateau , Central Asia and the Indian subcontinent ....
 was made by the German linguist Franz Bopp
Franz Bopp

Franz Bopp was a Germany linguistics known for extensive comparative work on Indo-European languages....
 in 1816. Though he did not attempt a reconstruction, he tried to prove that Greek, Latin and Sanskrit were related by systematically demonstrating that they shared a both common structure and a common lexicon.

It was the German scholar Friedrich Schlegel
Karl Wilhelm Friedrich von Schlegel

Karl Wilhelm Friedrich Schlegel was a Germany poet, critic and scholar. He was the younger brother of August Wilhelm Schlegel....
 who in 1808 first stated the importance of using the eldest possible form of a language when trying to prove its relationships; then, in 1818, the Danish philologist Rasmus Christian Rask
Rasmus Christian Rask

Rasmus Rask , Denmark scholar and philologist, was born at Br?ndekilde on the Danish island of Funen....
  developed the principle of regular sound changes to explain his observations of similarities between individual words in the Germanic languages and their cognates in Greek and Latin. It was another German, Jacob Grimm
Jacob Grimm

Jacob Ludwig Carl Grimm , German Confederation philologist, jurist and mythology, was born at Hanau, in Hesse-Kassel . He is best known as the discoverer of Grimm's Law, the author of the monumental German Dictionary, his Deutsche Mythologie and more popularly, as one of the Brothers Grimm, as the editor of Grimm's Fairy Tales....
 - better known for his Fairy Tales
Grimm's Fairy Tales

Children's and Household Tales is a collection of Germany origin fairy tales first published in 1812 by Jacob Grimm and Wilhelm Grimm, the Brothers Grimm....
 - who in Deutsche Grammatik (published 1819-37 in four volumes) first made use of something resembling the modern comparative method in attempting to show the development of the Germanic languages
Germanic languages

The Germanic languages are a group of related languages that constitute a branch of the Indo-European languages language family. The common ancestor of all the languages in this branch is Proto-Germanic, spoken in approximately the mid-1st millennium BC in Pre-Roman Iron Age....
 from a common origin, the first systematic study of diachronic
Diachronic

Diachronic or Diachronous is a technical term for something happening over time. It is used in several fields of research.*Diachronic linguistics : see Historical linguistics...
 language change.

Both Rask and Grimm were unable to explain apparent exceptions to the sound laws that they had discovered. Though the German linguist Hermann Grassmann
Hermann Grassmann

Hermann G?nther Grassmann was a Germany polymath, renowned in his day as a linguistics and now admired as a mathematics. He was also a physics, Humanism, general scholar, and publisher....
 explained one of these anomalies with the publication of Grassmann's law
Grassmann's Law

Grassmann's law, named after its discoverer Hermann Grassmann, is a dissimilatory phonological process in Ancient Greek and Sanskrit which states that if an Aspiration d consonant is followed by another aspirated consonant in the next syllable, the first one loses the aspiration....
 in 1862, it was in 1875 that a Danish scholar, Karl Verner
Karl Verner

Karl Verner was a Denmark linguistics. He is remembered today for Verner's law, which he discovered in 1875.Verner, whose interest in languages was stimulated by reading about the work of Rasmus Christian Rask, began his university studies in 1864....
, made a methodological breakthrough when he identified a pattern now known as Verner's law
Verner's law

Verner's law, stated by Karl Verner in 1875, describes a historical sound change in the Proto-Germanic language whereby voiceless fricatives *f, *?, *s, *h , when immediately following an unstressed syllable in the same word, underwent voicing and became respectively the fricatives *b, *d, *z, *g ....
, the first sound law to use comparative evidence to show that a phonological
Phonology

Phonology is the systematic use of sound to encode meaning in any spoken human language, or the field of linguistics studying this use. Just as a language has syntax and vocabulary, it also has a phonology in the sense of a sound system....
 change in one phoneme
Phoneme

In human language, a phoneme is the smallest posited linguistically distinctive unit of sound. Phonemes carry no semantic content themselves. In theoretical terms, phonemes are not the physical segment s themselves, but cognitive abstractions or categorizations of them....
 could depend on other factors within the same word, such as the neighbouring phonemes and the position of the accent
Stress (linguistics)

In linguistics, stress is the relative emphasis that may be given to certain syllables in a word. The term is also used for similar patterns of phonetic prominence inside syllables....
: in other words, the modern concept of conditioning environments.

Similar discoveries were made by a group of radical young German academics at the University of Leipzig
University of Leipzig

The University of Leipzig , located in Leipzig in the Free State of Saxony, Germany, is one of the oldest University in Europeand currently the List_of_universities_in_Germany#Universities_by_age university in Germany....
 known as Junggrammatiker (usually rendered as Neogrammarians in English) in the late 1800s, leading them to conclude that all sound changes were ultimately regular, and resulting in the famous statement by Karl Brugmann
Karl Brugmann

Karl Brugmann was a German linguist. He is a towering figure in Indo-European linguistics.During most of his professional life , Brugmann was professor of Sanskrit and comparative linguistics at the University of Leipzig....
 and Hermann Osthoff
Hermann Osthoff

Hermann Osthoff was a German people linguist. He was involved in Indo-European studies and the Neogrammarian school. He had created "Osthoff's Law" ....
 in 1878 that "sound laws have no exceptions". This revolutionary idea is fundamental to the modern comparative method, since the method necessarily assumes regular correspondences between sounds in related languages, and consequently regular sound changes from the proto-language. It was this Neogrammarian Hypothesis which led to application of the comparative method to reconstruct Proto-Indo-European
Proto-Indo-European language

The Proto-Indo-European language is the unattested, linguistic reconstruction common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans....
, with Indo-European
Indo-European languages

The Indo-European languages are a Language family of several hundred related languages and dialects, including most major languages of Europe, the Iranian plateau , Central Asia and the Indian subcontinent ....
 being at that time by far the most well-studied language family. Linguists working with other families soon followed suit, and the comparative method quickly became the established method for uncovering linguistic relationships.

Application

There is no concrete set of steps to be followed in the application of the comparative method, but linguists generally agree on the basic steps, which are as follows:

Assemble potential cognate lists
Genetic kinship between two (or more) languages can be established if they show a number of regular correspondences in native vocabulary, which means that there is a regularly recurring match between the phonetic structure of basic words with similar meanings. Thus, this step simply involves making lists of words which are likely cognates among the languages being compared. For example, looking at the Polynesian family
Polynesian languages

The Polynesian languages are a language family spoken in the region known as Polynesia. They are classified as part of the Austronesian languages, belonging to the Eastern Eastern Malayo-Polynesian languages branch of that family....
 linguists would come up with a list similar to the following, although in practice a real list would be much longer:

Gloss  one   two   three   four   five   man   sea   taboo   octopus   canoe   enter 
 Tongan
Tongan language

Tongan is an Austronesian languages language spoken in Tonga. It has around 100,000 speakers and is a national language of Tonga. It is a Verb Subject Object language....
 Samoan
Samoan language

The Samoan or Samoan language is the traditional language of Samoa and American Samoa and is an official language—alongside English language—in both jurisdictions....
 Maori
Maori language

Maori or te reo Maori, also commonly shortened to te reo , functions as one of the official languages of New Zealand. Linguists classify it within the Eastern Polynesian languages as closely related to Cook Islands Maori, Tuamotuan language and Tahitian language; somewhat less closely to Hawaiian language and Marquesan language; a...
 Rapanui
 Rarotongan 
 Hawaiian
Hawaiian language

The Hawaiian language is an Austronesian languages that takes its name from Hawaii , the largest island in the tropical North Pacific archipelago where it developed....


Caution needs to be exercised to avoid including borrowing
Loanword

A loanword is a word directly taken into one language from another with little or no translation. By contrast, a calque or loan translation is a related concept whereby it is the Meaning or idiom that is borrowed rather than the lexical item itself....
s or false cognate
False cognate

False cognates are pairs of words in the same or different languages that are similar in form and meaning but have different root . That is, they appear to be or are sometimes considered cognates when in fact they are not....
s in the list, which could skew or obscure the correct data. For example, there is a similarity between English taboo and the six Polynesian forms. Though this may seem to be a cognate, showing that English is genetically related to the Polynesian languages, it is not, as the similarity is due to borrowing from Tongan into English. This problem can usually be overcome by using basic vocabulary such as kinship terms, numbers, body parts, pronouns, and other basic terms. Nonetheless, even basic vocabulary can be borrowed. Finnish
Finnish language

Finnish is the language spoken by the majority of the population in Finland and by Finnish people outside of Finland. It is one of the official languages of Finland and an official minority language in Sweden....
, for example, borrowed the word for "mother", äiti, from Gothic
Gothic language

Gothic is an extinct language Germanic language that was spoken by the Goths. It is known primarily from Codex Argenteus, a 6th century copy of a 4th century Bible translation, and is the only East Germanic languages with a sizable corpus....
 aiþei, while Pirahã
Pirahã language

Pirah? is a language spoken by the Pirah? people — an indigenous people of Amazonas , Brazil, who live along the Maici river, a tributary of the Amazon River....
, a Muran language
Muran languages

Muran is a small language family of Amazonas , Brazil....
 of South America, borrowed all its pronoun
Pronoun

In linguistics and grammar, a pronoun is a pro-form that substitutes for a noun with or without a Determiner , such as Wiktionary:you and Wiktionary:they in English language....
s from Nhengatu; likewise, English
English language

English is a West Germanic language that originated in Anglo-Saxon England and has lingua franca status in many parts of the world as a result of the military, economic, scientific, political and cultural influence of the British Empire in the 18th, 19th and early 20th centuries and that of the United States from the mid 20th century onwa...
 borrowed the pronouns "they", "them", and "their(s)" from Norse.

Establish correspondence sets
Once potential cognate lists are established, the next step is to determine the regular sound correspondences they exhibit. The notion of regular correspondence is very important here: mere phonetic similarity, as between English
English language

English is a West Germanic language that originated in Anglo-Saxon England and has lingua franca status in many parts of the world as a result of the military, economic, scientific, political and cultural influence of the British Empire in the 18th, 19th and early 20th centuries and that of the United States from the mid 20th century onwa...
 day and Latin
Latin

Latin is an Italic language, historically spoken in Latium and Ancient Rome. Through the Military history of the Roman Empire, Latin spread throughout the Mediterranean and a large part of Europe....
 dies (both with the same meaning), has no probative value. English initial d- does not regularly match Latin d-, and whatever sporadic matches can be observed are due either to chance (as in the above example) or to borrowing
Loanword

A loanword is a word directly taken into one language from another with little or no translation. By contrast, a calque or loan translation is a related concept whereby it is the Meaning or idiom that is borrowed rather than the lexical item itself....
 (e.g. Latin diabolus and English devil, both ultimately of Greek origin). The Neogrammarian
Neogrammarian

The Neogrammarians were a Germany school of linguistics, originally at the University of Leipzig, in the late 19th century who proposed the Neogrammarian hypothesis of the regularity of sound change....
s first emphasized this point in the late 1800s, and their motto, "sound laws have no exceptions", has remained a fundamental axiom in historical linguistics to this day.

For example, although the correspondence d- : d- (where the notation "A : B" means "A corresponds to B") in English and Latin day and dies above is not regular, English and Latin do exhibit a very regular correspondence of t- : d-. For example:

 English   ten   two   tow   tongue   tooth 
 Latin   decem   duo   duco   dingua   dent- 


Since a truly systematic correspondence is unlikely to be accidental, if alternative possibilities like massive borrowing can be ruled out, then the correspondence can be attributed to common descent. If there are many regular correspondence sets of this kind (the more the better), then common origin becomes a virtual certainty, particularly if some of the correspondences are non-trivial or unusual.

Discover which sets are in complementary distribution
During the time the comparative method was being developed (late 18th to late 19th century), two major developments occurred which improved the method's effectiveness.

First, it was found that many sound changes are conditioned by a particular context. Thus for example, in both Greek
Ancient Greek

Ancient Greek is the historical stage in the development of the Greek language spanning across the Archaic Greece , Classical Greece , and Hellenistic civilization periods of ancient Greece and the classical antiquity....
 and Sanskrit
Sanskrit

Sanskrit is a historical Indo-Aryan language, one of the liturgical languages of Hinduism and Buddhism, and one of the 22 official languages of India....
, an aspirated
Aspiration (phonetics)

In phonetics, aspiration is the strong burst of Earth's atmosphere that accompanies either the release or, in the case of preaspiration, the closure of some obstruents....
 stop
Stop consonant

A stop, plosive, or occlusive is a consonant sound produced by stopping the airflow in the vocal tract. The terms plosive and stop are usually used interchangeably, but they are not perfect synonyms....
 evolved into an unaspirated one, but only if a second aspirate occurred later on in the same word; this is Grassmann's law
Grassmann's Law

Grassmann's law, named after its discoverer Hermann Grassmann, is a dissimilatory phonological process in Ancient Greek and Sanskrit which states that if an Aspiration d consonant is followed by another aspirated consonant in the next syllable, the first one loses the aspiration....
, known to the Sanskrit grammarian Pa?ini
Pa?ini

was an Iron Age India Sanskrit grammarian from Pushkalavati, Gandhara .He is known for his Vyakarana, particularly for his formulation of the 3,959 rules of Sanskrit Morphology in the grammar known as 'Ashtadhyayi' , the foundational text of the grammatical branch of the Vedanga, the auxiliary scholarly disciplines of historical Ved...
 and promulgated as a historical discovery by Hermann Grassmann
Hermann Grassmann

Hermann G?nther Grassmann was a Germany polymath, renowned in his day as a linguistics and now admired as a mathematics. He was also a physics, Humanism, general scholar, and publisher....
 in 1863.

Second, it was found that sometimes sound changes occurred in contexts that were later lost. For instance, in Sanskrit velars
Velar consonant

Velars are consonants articulated with the back part of the tongue against the soft palate, the back part of the roof of the mouth, known also as the Soft palate)....
 (k-like sounds) were replaced by palatals
Palatal consonant

Palatal consonants are consonants articulated with the body of the tongue raised against the hard palate . Consonants with the tip of the tongue curled back against the palate are called retroflex consonant....
 (ch-like sounds) whenever the following vowel was *i or *e. Subsequent to this change, all instances of *e were replaced by a. The situation would have been unreconstructable, had not the original distribution of e and a been recoverable from the evidence of other Indo-European languages
Indo-European languages

The Indo-European languages are a Language family of several hundred related languages and dialects, including most major languages of Europe, the Iranian plateau , Central Asia and the Indian subcontinent ....
. Thus, for instance, Latin
Latin

Latin is an Italic language, historically spoken in Latium and Ancient Rome. Through the Military history of the Roman Empire, Latin spread throughout the Mediterranean and a large part of Europe....
 que, "and", preserves the original *e vowel that caused the consonant shift in Sanskrit:

 1.   *ke   Pre-Sanskrit "and" 
 2.   *ce   Velars replaced by palatals before *i and *e 
 3.   ca   *e becomes a 


Ca is the attested Sanskrit form for and. This finding was made independently by several scholars during the 1870s.

Verner's Law
Verner's law

Verner's law, stated by Karl Verner in 1875, describes a historical sound change in the Proto-Germanic language whereby voiceless fricatives *f, *?, *s, *h , when immediately following an unstressed syllable in the same word, underwent voicing and became respectively the fricatives *b, *d, *z, *g ....
, discovered by Karl Verner
Karl Verner

Karl Verner was a Denmark linguistics. He is remembered today for Verner's law, which he discovered in 1875.Verner, whose interest in languages was stimulated by reading about the work of Rasmus Christian Rask, began his university studies in 1864....
 in about 1875, is a similar case: the voicing
Voice (phonetics)

Voice or voicing is a term used in phonetics and phonology to characterize speech sound, with sounds described as either voiceless or voiced....
 of consonants in Germanic languages
Germanic languages

The Germanic languages are a group of related languages that constitute a branch of the Indo-European languages language family. The common ancestor of all the languages in this branch is Proto-Germanic, spoken in approximately the mid-1st millennium BC in Pre-Roman Iron Age....
 underwent a change that was determined by the position of the old Indo-European accent
Stress (linguistics)

In linguistics, stress is the relative emphasis that may be given to certain syllables in a word. The term is also used for similar patterns of phonetic prominence inside syllables....
. Following the change, the accent shifted across the board to initial position. Verner solved the puzzle by comparing the Germanic voicing pattern with data from Greek and Sanskrit accent.

This stage of the comparative method, therefore, involves examining the correspondence sets discovered in step 2 and seeing which of them apply only in certain contexts. If two (or more) sets involve identical or similar sounds, and apply in complementary distribution
Complementary distribution

Complementary distribution in linguistics is the relationship between two different elements, where one element is found in a particular environment and the other element is found in the opposite environment....
, then the sets can be assumed to reflect a single original phoneme
Phoneme

In human language, a phoneme is the smallest posited linguistically distinctive unit of sound. Phonemes carry no semantic content themselves. In theoretical terms, phonemes are not the physical segment s themselves, but cognitive abstractions or categorizations of them....
. This is because "some sound changes, particularly conditioned sound changes, can result in a proto-sound being associated with more than one correspondence set".

To take another example, the Romance languages
Romance languages

The Romance languages are a branch of the Indo-European languages comprising all the languages that descend from Latin language, the language of ancient Rome....
, descended from Latin
Latin

Latin is an Italic language, historically spoken in Latium and Ancient Rome. Through the Military history of the Roman Empire, Latin spread throughout the Mediterranean and a large part of Europe....
, exhibit two different correspondence sets which both involve k:

 Italian
Italian language

Italian is a Romance languages spoken by about 63 million people as a first language, primarily in Italy. In Switzerland, Italian is one of four Linguistic geography of Switzerlands....
 
 Spanish
Spanish language

Spanish or Castilian is a Romance languages that originated in northern Spain, and gradually spread in the Kingdom of Castile and evolved into the principal language of government and trade....
 
 Portuguese
Portuguese language

Portuguese is a Romance language that originated in what is now Galicia and Portugal. It is derived from the Latin language spoken by the Romanization Pre-Roman peoples of the Iberian Peninsula around 2000 years ago....
 
 French
French language

French is a Romance language spoken around the world by around 80 million people as first language, by 190 million as second language, and by about another 200 million people as an acquired tongue, with significant speakers in 54 countries....
 
 1.   k   k   k   k 
 2.   k   k   k   
Voiceless postalveolar fricative

The voiceless palato-alveolar fricative or domed postalveolar fricative is a type of consonantal sound, used in some Speech communication languages....
 


 Italian
Italian language

Italian is a Romance languages spoken by about 63 million people as a first language, primarily in Italy. In Switzerland, Italian is one of four Linguistic geography of Switzerlands....
 
 Spanish
Spanish language

Spanish or Castilian is a Romance languages that originated in northern Spain, and gradually spread in the Kingdom of Castile and evolved into the principal language of government and trade....
 
 Portuguese
Portuguese language

Portuguese is a Romance language that originated in what is now Galicia and Portugal. It is derived from the Latin language spoken by the Romanization Pre-Roman peoples of the Iberian Peninsula around 2000 years ago....
 
 French
French language

French is a Romance language spoken around the world by around 80 million people as first language, by 190 million as second language, and by about another 200 million people as an acquired tongue, with significant speakers in 54 countries....
 
 Gloss 
 1.   corpo   cuerpo   corpo   corps   body 
 2.   crudo   crudo   cru   cru   raw 
 3.   catena   cadena   cadeia   chaîne   chain 
 4.   cacciare   cazar   caçar   chasser   to hunt 


What linguists do in this situation is try to see if the two sets occur in complementary distribution, in which case they reflect a single proto-phoneme, or if both occur in identical environments, in which case they must both reflect separate proto-phonemes. In this case, French only occurs before a in the other languages (which becomes in French), while French k occurs elsewhere. Both sets 1 and 2 can therefore be assumed to reflect a single proto-phoneme (in this case *k, spelled in Latin).

A more complex case involves consonant clusters in Proto-Algonquian
Proto-Algonquian language

Proto-Algonquian is the name given to the posited proto-language of the languages of the Algonquian languages. One theory, first put forth by Frank Siebert in 1967, is that it was spoken between 2500 and 3000 years ago between Georgian Bay, Ontario and Lake Ontario, Ontario, in Canada, and at least as far south as Niagara Falls , although th...
, which have been notoriously difficult to reconstruct. The Algonquianist Leonard Bloomfield
Leonard Bloomfield

Leonard Bloomfield was an United States linguistics, whose influence dominated the development of structuralism#Structuralism in linguistics in America between the 1930s and the 1950s....
 used the reflexes of the clusters in four of the daughter languages of Proto-Algonquian to come up with the following correspondence sets:

 Ojibwe   Meskwaki
Fox language

Fox is an Algonquian languages Indigenous languages of the Americas language, spoken by around 1000 Fox , Sauk, and Kickapoo in various locations in the Midwest and in northern Mexico....
 
 Plains Cree
Plains Cree language

Plains Cree is an Algonquian languages, often considered a dialect of Cree language, spoken by about 34,000 people in Manitoba, Saskatchewan, Alberta, and Montana....
 
 Menomini
Menominee language

The Menominee language is an Algonquian language originally spoken by the Menominee people of northern Wisconsin and Michigan. It is still spoken on the Menominee Nation lands in Northern Wisconsin in the United States....
 
 1.   kk   hk   hk   hk 
 2.   kk   hk   sk   hk 
 3.   sk   hk   sk    
 4.         sk   sk 
 5.   sk      hk   hk 


Although all five correspondence sets overlap with one another in various places, they are not in complementary distribution, and so Bloomfield recognized that a different cluster must be reconstructed for each set; his reconstructions were, respectively, *hk, *xk, *ck (=), *šk (=), and çk (where ‘x’ and ‘ç’ are arbitrary symbols, not attempts to guess the phonetic value of the proto-phonemes).

Reconstruct proto-phonemes
This step tends to be much more subjective than the previous ones. A linguist here has to rely mostly on their general intuitions about what types of sound changes are likely and which are unlikely. For example, the voicing of voiceless plosives between vowels is an extremely common sound change, occurring in languages all over the world, whilst the devoicing of voiced plosives between vowels is extremely uncommon. Therefore, if a linguist were comparing two languages with a correspondence of -t- : -d- between vowels, they would reconstruct the proto-phoneme
Phoneme

In human language, a phoneme is the smallest posited linguistically distinctive unit of sound. Phonemes carry no semantic content themselves. In theoretical terms, phonemes are not the physical segment s themselves, but cognitive abstractions or categorizations of them....
 as being *-t-, and assume that it became voiced to -d- in the second language (unless they had a very good reason not to).

Sometimes, sound changes occur that are extremely unusual or unexpected. The Proto-Indo-European
Proto-Indo-European language

The Proto-Indo-European language is the unattested, linguistic reconstruction common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans....
 word for two, for example, is reconstructed as *dwo, which is reflected in Classical Armenian as erku. Several other cognates demonstrate that the change *dw- ? erk- in the history of Armenian was a regular one. Similarly, in Bearlake, a dialect of the Athabaskan language
Athabaskan languages

Athabaskan or Athabascan is the name of a large group of closely related Indigenous peoples of the Americas of North America, located in two main Southern and Northern groups in western North America, and of their language family....
 of Slavey
Slavey language

Slavey is an Athabaskan languages spoken among the Slavey First Nations of Canada in the Northwest Territories where it also has official language....
, there has been a sound change of Proto-Athabaskan *ts ? Bearlake . It is very unlikely that *dw- changed directly into erk- and *ts into , but instead they must have gone through several intermediate steps to arrive at the later forms. The lesson here is that with enough sound changes, a given sound can change into just about any other sound. This is why it is not phonetic similarity which matters when utilizing the comparative method, but regular sound correspondences.

Another assumption used in determining a proto-phoneme is that the reconstruction should ideally involve as few sound changes as possible to arrive at the modern reflexes in the daughter languages. In other words, unless there is persuasive evidence to the contrary, whatever value is the most common reflex in the daughter languages should be reconstructed as the value of the proto-phoneme. For example, Algonquian languages
Algonquian languages

The Algonquian languages are a subfamily of Native American languages that includes most of the languages in the Algic languages language family ....
 exhibit the following correspondence set:

 Ojibwe   Míkmaq
Mi'kmaq language

The M?kmaq or Mi'kmaq language is an Eastern Algonquian languages language spoken by nearly 11,000 Mi'kmaq in Canada and the United States out of a total ethnic M?kmaq population of roughly 20,000....
 
 Cree
Cree language

Cree is the name for a group of closely-related Algonquian languages spoken by approximately 117,000 people across Canada, from the Northwest Territories to Labrador, making it by far the most spoken Native American languages in Canada....
 
 Munsee
Munsee language

Munsee is an endangered language of the Eastern Algonquian languages subgroup of the Algonquian languages language family, itself a member of the Algic languages language family....
 
 Blackfoot
Blackfoot language

Blackfoot also known as Siksika , Pikanii, and Blackfeet, is the name of any of the Algonquian languages spoken by the Blackfoot tribe of Native Americans in the United States, who currently live in the northwestern plains of North America....
 
 Arapaho
Arapaho language

The Arapaho language or "hinono'eitiit" is a Plains Algonquian languages spoken almost entirely by elders in Wyoming, and to a much lesser extent in Oklahoma....
 
 m   m   m   m   m   b 


The simplest reconstruction for this set would be either *m or *b. Both *m ? b and *b ? m (where "*A ? B" means "*A becomes B") are conceivable sound changes, so the principle of reconstructing "likely" changes over "unlikely" ones is not useful here. Instead, because the reflex of this proto-phoneme is m in five of the languages compared here, and b in only one of them, if *b is reconstructed, then it is necessary to assume five separate changes of *b ? m, whereas if *m is reconstructed, it is only necessary to assume a single change of *m ? b in one language in the family. Since the assumption is that reconstructions should require the fewest number of changes possible to arrive at the modern reflexes, linguists would reconstruct *m here.

Examine the reconstructed system typologically
In the final step, the linguist takes all the proto-phoneme
Phoneme

In human language, a phoneme is the smallest posited linguistically distinctive unit of sound. Phonemes carry no semantic content themselves. In theoretical terms, phonemes are not the physical segment s themselves, but cognitive abstractions or categorizations of them....
s that have been reconstructed using steps 1-4, and checks to see how the system fits with what is currently known about typological constraints
Linguistic typology

Linguistic typology is a subfield of linguistics that studies and classifies languages according to their structural features. Its aim is to describe and explain the structural diversity of the world's languages....
. For example, if the reconstructed phonemes fit together in the following hypothetical system, the linguist would be suspicious, because languages generally (though not always) tend to maintain symmetry in their phonemic inventories:

  p     t     k  
  b  
  n     ?  
  l  


In this hypothetical reconstructed system, there is only one voiced plosive
Voiced bilabial plosive

The voiced bilabial plosive is a type of consonantal sound, used in some Speech communication languages. The symbol in the International Phonetic Alphabet that represents this sound is , and the equivalent X-SAMPA symbol is b....
, *b, and although there is an alveolar
Alveolar nasal

The alveolar nasal is a type of consonantal sound used in numerous spoken languages. The symbol in the International Phonetic Alphabet that represents dental consonant, alveolar consonant, and postalveolar consonant nasal consonant is , and the equivalent X-SAMPA symbol is n....
 and a velar nasal
Velar nasal

The velar nasal is a type of consonantal sound, used in some Speech communication languages. The symbol in the International Phonetic Alphabet that represents this sound is , and the equivalent X-SAMPA symbol is N....
, *n and *?, there is no corresponding labial nasal
Bilabial nasal

The bilabial nasal is a type of consonantal sound used in almost all spoken languages. The symbol in the International Phonetic Alphabet that represents this sound is , and the equivalent X-SAMPA symbol is m....
. In this case, the linguist would have to return to step 4 and reevaluate their earlier conclusions. In this case, they would try to figure out if there is any evidence to suggest that what was earlier reconstructed as *b is in fact *m, or evidence that what was earlier reconstructed as *n and *? are in fact *d and *g.

Even a symmetrical system can be typologically suspicious. For example, the Proto-Indo-European
Proto-Indo-European language

The Proto-Indo-European language is the unattested, linguistic reconstruction common ancestor of the Indo-European languages, spoken by the Proto-Indo-Europeans....
 plosive inventory, as traditionally reconstructed, is as follows:

 Labial
Labial consonant

Labials are consonants articulated either with both lips or with the lower lip and the upper teeth . English is a bilabial nasal consonant sonorant, and are bilabial stop consonant , and are labiodental fricative consonant....
 Dental
Dental consonant

In linguistics, a dental consonant or dental is a consonant that is articulated with the tongue against the upper teeth, such as , , , and in some languages....
 Velar
Velar consonant

Velars are consonants articulated with the back part of the tongue against the soft palate, the back part of the roof of the mouth, known also as the Soft palate)....
 Labiovelar
Labiovelar consonant

The term labiovelar is ambiguous. It may mean Labial-velar consonant , or it may mean labialization velar consonant .When the manner of articulation is a stop consonant, nasal consonant, or fricative consonant, these are quite different....
 Palatovelars 
 Voiceless ptk
 Voiced (b)dg
 Voiced aspirated
Aspiration (phonetics)

In phonetics, aspiration is the strong burst of Earth's atmosphere that accompanies either the release or, in the case of preaspiration, the closure of some obstruents....
 


Since the mid-20th century, a number of linguists have argued that this system is, at best, very suspicious typologically. They state that it is extremely unlikely, or maybe even impossible, for a language to have a voiced aspirated (breathy voice
Breathy voice

Breathy voice is a phonation in which the vocal cords vibrate, as they do in normal voicing, but are held further apart, so that a larger volume of air escapes between them....
) series without a corresponding voiceless aspirated series. These linguists therefore argue, on typological grounds, that it is necessary to reevaluate the traditional reconstruction of Proto-Indo-European. A potential solution was provided by Thomas Gamkrelidze and Vyacheslav V. Ivanov, who argued that the series traditionally reconstructed as plain voiced should in fact be reconstructed as glottalized
Glottalization

Glottalization is the complete or partial closure of the glottis during the articulation of another sound. Glottalization of vowels and voiced consonants is most often realized as creaky voice ....
 — either implosive
Implosive consonant

Implosive consonants are stop consonant with a mixed glottalic ingressive and pulmonic egressive airstream mechanism. That is, the airstream is controlled by moving the glottis downward in addition to expelling air from the lungs....
  or ejective
Ejective consonant

In phonetics, ejective consonants are voiceless consonants that are pronounced with simultaneous closure of the glottis. In the phonology of a particular language, ejectives may contrast with aspiration or tenuis consonants....
 . The plain voiceless and voiced aspirated series would thus be seen as just voiceless and voiced, with aspiration being a non-distinctive quality of both. This example of the application of linguistic typology to linguistic reconstruction has become known as the Glottalic Theory
Glottalic theory

The glottalic theory holds that Proto-Indo-European language had Ejective consonant stop consonant, , but not the breathy voice ones, , of traditional Proto-Indo-European reconstructions....
. It has a large number of proponents but is not generally accepted.

The reconstruction of proto-sounds and their historical transformations enables linguists to proceed further: they can compare grammatical morpheme
Morpheme

In morpheme-based morphology, a is the smallest linguistic unit that has semantics Meaning .In spoken language, morphemes are composed of phonemes , and in written language morphemes are composed of graphemes ....
s (word-forming affixes and inflectional endings), patterns of declension
Declension

In linguistics, declension is the occurrence of inflection in nouns, pronouns and adjectives, indicating such features as grammatical number , grammatical case , and grammatical gender....
 and conjugation
Conjugation

Conjugation may refer to:*Grammatical conjugation, the modification of a verb from its basic form, including:**Latin conjugation**Spanish conjugation...
, and so on. The full reconstruction of an unrecorded protolanguage can never be complete (for example, proto-syntax
Syntax

In linguistics, syntax is the study of the principles and rules for constructing Sentence s in natural languages. In addition to referring to the discipline, the term syntax is also used to refer directly to the rules and principles that govern the sentence structure of any individual language, as in "the Irish syntax"....
 is far more elusive than phonology
Phonology

Phonology is the systematic use of sound to encode meaning in any spoken human language, or the field of linguistics studying this use. Just as a language has syntax and vocabulary, it also has a phonology in the sense of a sound system....
 or morphology
Morphology (linguistics)

Morphology is the identification, analysis and description of structure of words . While words are generally accepted as being the smallest units of syntax, it is clear that in most languages, words can be related to other words by rules....
, and all elements of linguistic structure undergo inevitable erosion and gradual loss or replacement over time), but a consistent partial reconstruction can and must be attempted as proof of genetic relationship.

Limitations

A number of difficulties with aspects related to the method are now recognized, but the comparative method is still seen as being one of the most valuable tools in comparative linguistics, and linguists continue to use it widely; other proposed approaches to determining linguistic relationships and reconstructing proto-languages, such as glottochronology
Glottochronology

Glottochronology is an approach in historical linguistics for estimating the time at which languages diverged, based on the assumption that the basic vocabulary of a language changes at a constant average rate....
 and mass lexical comparison
Mass lexical comparison

Mass comparison is a method developed by Joseph Greenberg to determine the level of genetic relationship between languages. It is now usually called multilateral comparison....
, are considered flawed and unreliable by nearly all linguists. Linguists recognize, however, that results obtained with the comparative method are not historical fact. Fox (1997:141-2), for example, concludes:

“The Comparative Method as such is not, in fact, historical; it provides evidence of linguistic relationships to which we may give a historical interpretation. ...[Our increased knowledge about the historical processes involved] has probably made historical linguists less prone to equate the idealizations required by the method with historical reality. ...Provided we keep [the interpretation of the results and the method itself] apart, the Comparative Method can continue to be used in the reconstruction of earlier stages of languages.”


Neogrammarian Hypothesis
The foundation of the comparative method, and of comparative linguistics in general, is the Neogrammarian
Neogrammarian

The Neogrammarians were a Germany school of linguistics, originally at the University of Leipzig, in the late 19th century who proposed the Neogrammarian hypothesis of the regularity of sound change....
s' fundamental assumption that "sound laws have no exceptions." When it was initially proposed, critics of the Neogrammarians proposed an alternate position, summarized by the maxim "each word has its own history". The so-called Neogrammarian Hypothesis is now well-established and well-supported, though there remain some situations in which its application can yield faulty results.

Borrowings, areal diffusion and random mutations
Even the Neogrammarians recognized that, apart from the general sound change laws, languages are also subject to borrowing
Loanword

A loanword is a word directly taken into one language from another with little or no translation. By contrast, a calque or loan translation is a related concept whereby it is the Meaning or idiom that is borrowed rather than the lexical item itself....
s from other languages and other sporadic changes (such as irregular inflections, compounding, and abbreviation) that affect one word at a time, or small subsets of words.

While borrowed words should be excluded from the analysis, on the grounds that they are not genetic by definition, they do add noise to the data, and thus may hide systematic laws or distort their analysis. Moreover, there is the danger of circular reasoning — namely, of assuming that a word has been borrowed solely because it does not fit the current assumptions about the regular sound laws.

Attempts to apply the comparative method to languages which have been affected by the process of areal diffusion
Areal feature (linguistics)

In linguistics, an areal feature is any typology feature shared by languages within the same geographical area.Resemblances between two or more languages can be due to genetic relation , or due to loanword at some time in the past between languages that were not necessarily genetically related....
 can also be problematic. This is, in essence, a subtle form of borrowing, which can take place when a significant number of speakers of one language have some competence in another, possibly unrelated language. This may cause the languages to acquire phonological
Phonology

Phonology is the systematic use of sound to encode meaning in any spoken human language, or the field of linguistics studying this use. Just as a language has syntax and vocabulary, it also has a phonology in the sense of a sound system....
 characteristics from one another, sometimes even without the conscious borrowing of lexical
Lexeme

A lexeme is an abstract Unit of Morphology Semantic analysis in linguistics, that roughly corresponds to a set of forms taken by a single word....
 or morphological
Morpheme

In morpheme-based morphology, a is the smallest linguistic unit that has semantics Meaning .In spoken language, morphemes are composed of phonemes , and in written language morphemes are composed of graphemes ....
 forms, with the result that the two languages may end up appearing to be genetically related when in fact they are not. It is also possible that two or more unrelated languages may appear to be related as the result of independent areal diffusion from a third unrelated language. It becomes especially hard when several areal features and other influences converge to form a sprachbund
Sprachbund

A Sprachbund , from the German language word for ?language union?, also known as a linguistic area, convergence area, diffusion area or language crossroads, is a group of languages that have become similar in some way because of geographical proximity and language contact....
, making their identification all the more important; for instance, the East Asian Sprachbund
East Asian languages

East Asian languages describe two notional groupings of languages in East Asia and Southeast Asia Asia:* Languages which have been greatly influenced by Classical Chinese and the Written Chinese, in particular Chinese language, Japanese language, Korean language and Vietnamese language ....
 suggested several false classifications of such languages as Chinese, Korean
Korean language

Korean is the official language of North Korea and South Korea. It is also one of the two official languages in the Yanbian Korean Autonomous Prefecture in People's Republic of China....
, Japanese
Japanese language

IPA: [n?iho?go] is a language spoken by over 130 million people in Japan and in Japanese emigrant communities. It is related to the Ryukyuan languages....
, and Vietnamese
Vietnamese language

Vietnamese , formerly known under French colonization as Annamese , is the national language and official language language of Vietnam. It is the mother tongue of the Vietnamese people , who constitute 86% of Demographics of Vietnam, and of about three million overseas Vietnamese, most of whom live in the United States....
 before it was recognized.

The other exceptions to the sound laws are a more serious problem, because they occur in generic language transmission. Such a sporadic change, with no apparent logical reason, is illustrated by the Spanish
Spanish language

Spanish or Castilian is a Romance languages that originated in northern Spain, and gradually spread in the Kingdom of Castile and evolved into the principal language of government and trade....
 words palabra ('word'), peligro ('danger') and milagro ('miracle'). By regular sound changes from the Latin parabola, periculum and mirãculum, these should have become parabla, periglo, miraglo; but the r and l changed places by sporadic metathesis
Metathesis (linguistics)

Metathesis is a sound change that alters the order of phonemes in a word. The most common instance of metathesis is the reversal of the order of two adjacent phonemes, such as "comfterble" for comfortable ....
.

Analogy
A source of sporadic changes that was recognized by the Neogrammarians themselves was analogy
Analogy

Analogy is both the cognition process of transferring information from a particular subject to another particular subject , and a language expression corresponding to such a process....
, in which a word is sporadically changed to be closer to another word in the lexicon which is perceived as being somehow related to it. For example, the Russian
Russian language

Russian is the most geographically widespread language of Eurasia, the most widely spoken of the Slavic languages, and the largest native language in Europe....
 word for nine, by regular sound changes from Proto-Slavic
Proto-Slavic language

Proto-Slavic is the proto-language from which Slavic languages later emerged. It was spoken before the seventh century. As with all other proto-languages, no attested writings have been found; the language has been reconstructed by applying the comparative method to all the attested Slavic languages as well as other Indo-European languages....
, should have been , but is in fact . It is believed that the initial changed to under influence of the word for "ten" in Russian, . Analogy very often affects verbs whose paradigms have become irregular through conditional sound-changes.

Gradual application
More recently, William Labov
William Labov

William Labov is an United States linguist, widely regarded as the founder of the discipline of variationist sociolinguistics. He has been described as "an enormously original and influential figure who has created much of the methodology" of sociolinguistics....
 and other linguists who have studied contemporary language changes in detail have discovered that even a systematic sound change is at first applied in an unsystematic fashion, with the percentage of its occurrence in a person's speech dependent on various social factors. Often the sound change begins to affect some words in a language, and then gradually spreads to others, a process known as lexical diffusion
Lexical diffusion

In historical linguistics, lexical diffusion is both a phenomenon and a theory. The phenomenon is that by which a phoneme is modified in a subset of the lexicon, and spreads gradually to other lexical items....
. While not invalidating the Neogrammarians' axiom that "sound laws have no exceptions", this does seem to show that sound laws do not always apply to all lexical items at the same time. As Hock (1991:446-7) notes, "While it probably is true in the long run every word has its own history, it is not justified to conclude as some linguists have, that therefore the Neogrammarian position on the nature of linguistic change is falsified."

Problems with the Tree Model
Another weakness of the comparative method lies in its reliance on the Tree Model (German Stammbaum). In this model, daughter languages are seen as branching out from the proto-language, gradually growing more and more distant from the proto-language through accumulated phonological
Phonology

Phonology is the systematic use of sound to encode meaning in any spoken human language, or the field of linguistics studying this use. Just as a language has syntax and vocabulary, it also has a phonology in the sense of a sound system....
, morpho-syntactic, and lexical
Lexicon

In linguistics, the lexicon of a language is its vocabulary, including its words and expressions. More formally, it is a language's inventory of lexemes....
 changes; and possibly splitting into further daughter languages. This model is usually represented by upside-down tree-like diagrams. For example, here is a diagram of the Uto-Aztecan
Uto-Aztecan languages

Uto-Aztecan is a Indigenous languages of the Americas language family. It is one of the largest and most well-established linguistic families of the Americas....
 family of languages, spoken throughout the southern and western United States
United States

The United States of America is a Federal government constitutional republic comprising U.S. state and a federal district. The country is situated mostly in central North America, where its Contiguous United States and Washington, D.C., the Capital districts and territories, lie between the Pacific Ocean and Atlantic Oceans, Borders of the U...
 and Mexico
Mexico

The United Mexican States , commonly known as Mexico , is a federalism constitutionalism republic in North America. It is bordered on the north by the United States; on the south and west by the Pacific Ocean; on the southeast by Guatemala, Belize, and the Caribbean Sea; and on the east by the Gulf of Mexico....
:

Uto Aztecan Family Tree

Wave model

Since languages change gradually, there are long periods in which different dialects of a language, as they evolve into separate languages, remain in contact with one another and influence each other. Therefore, the Tree Model
Tree model

In historical linguistics, the Tree Model is a model of language change in which daughter languages are Genetic descended from a proto-language through a regular process of gradual change and is due in its most strict formulation to the Neogrammarians....
 does not reflect the reality of how languages change, as even once they are completely separated, languages which are near to one another will continue to influence each other, often sharing grammatical, phonological, and lexical innovations. A change in one language of a family will often spread to neighboring languages; and multiple waves of change may partially overlap like waves on the surface of a pond, across language and dialect boundaries, each with its own randomly delimited range. The following diagram illustrates this conception of language change, called the Wave Model
Wave model (linguistics)

In historical linguistics, the wave model or wave theory is a model of language change in which new features of a language spread from a central point in continuously weakening concentric circles, similar to the waves created when a stone is thrown into a body of water....
:

Wave Model Schmidt
However, Hock (1991:454) observes:

“The discovery in the late nineteenth century that isogloss
Isogloss

An isogloss is the geographical boundary or delineation of a certain linguistics feature, e.g. the pronunciation of a vowel, the meaning of a word, or use of some syntactic feature....
es can cut across well-established linguistic boundaries at first created considerable attention and controversy. And it became fashionable to oppose a wave theory to a tree theory... Today, however, it is quite evident that the phenomena referred to by these two terms are complementary aspects of linguistic change...


What seemed at the outstart as two incompatible conceptions of how languages change had already coalesced into one single explanatory theory. As demonstrated by Labov (2007), what needed to be reconciled within one framework of thinking were the transmission and the diffusion principles of linguistic change
Language change

Language change is the manner in which the Phonetics, Morphology , Semantics, Syntax, and other features of a language are modified over time. All languages are continually changing....
. The transmission of change within a speech community is characterized by incrementation within a faithfully reproduced pattern characteristic of the tree model, while diffusion across communities shows weakening of the original pattern and a loss of structural features. This is the result of the differences between the learning abilities of children and adults as intercommunal contacts are primarily between the latter.

Non-uniformity of the proto-language
Another assumption implicit in the methodology of the comparative method is that the proto-language is uniform. This is problematic, as even in extremely small language communities there are always dialect differences
Dialect

A dialect is a variety of a language that is characteristic of a particular group of the language's speakers. The term is applied most often to regional speech patterns, but a dialect may also be defined by other factors, such as social class....
, whether based on area, gender, class, or other factors (the Pirahã language
Pirahã language

Pirah? is a language spoken by the Pirah? people — an indigenous people of Amazonas , Brazil, who live along the Maici river, a tributary of the Amazon River....
 of Brazil
Brazil

Brazil , officially the Federative Republic of Brazil , is a country in South America. It is the List of countries and outlying territories by total area country by geographical area, occupying nearly half of South America, the List of countries by population country, and the fourth most populous democracy in the world....
 is spoken by only several hundred people, but has at least two different dialects, one spoken by men and one by women, for example). Therefore, the single proto-language reconstructed by the comparative method is an idealized language which never existed. This may not be as serious an issue as it at first appears, however; Campbell (2004:146-7) for instance, points out:

“It is not so much that the comparative method 'assumes' no variation; rather, it is just that there is nothing built into the comparative method which would allow it to address variation directly....This assumption of uniformity is a reasonable idealization; it does no more damage to the understanding of the language than, say, modern reference grammars do which concentrate on a language's general structure, typically leaving out consideration of regional or social variation.”


Subjectivity of the reconstruction
While the identification of systematic sound correspondences between known languages is fairly objective, the reconstruction of their common ancestral language is inherently subjective. In the Proto-Algonquian
Proto-Algonquian language

Proto-Algonquian is the name given to the posited proto-language of the languages of the Algonquian languages. One theory, first put forth by Frank Siebert in 1967, is that it was spoken between 2500 and 3000 years ago between Georgian Bay, Ontario and Lake Ontario, Ontario, in Canada, and at least as far south as Niagara Falls , although th...
 example above, the choice of *m as the parent phoneme
Phoneme

In human language, a phoneme is the smallest posited linguistically distinctive unit of sound. Phonemes carry no semantic content themselves. In theoretical terms, phonemes are not the physical segment s themselves, but cognitive abstractions or categorizations of them....
 is only likely, not certain. It is conceivable that a Proto-Algonquian language with *b in those positions split into two branches, one which preserved *b and one which changed it to *m instead; and while the first branch only developed into Arapaho
Arapaho language

The Arapaho language or "hinono'eitiit" is a Plains Algonquian languages spoken almost entirely by elders in Wyoming, and to a much lesser extent in Oklahoma....
, the second spread out wider and developed into all the other Algonquian tribes. It is also possible that the nearest common ancestor of the Algonquian languages
Algonquian languages

The Algonquian languages are a subfamily of Native American languages that includes most of the languages in the Algic languages language family ....
 used some other sound instead, such as *p, which eventually mutated to *b in one branch and to *m in the other.

Since the reconstruction of a proto-language involves many of these choices, some linguists prefer to view the proto-phonemes that are reconstructed as abstract representations of sound correspondences, rather than a literal guess about what sounds were present in the proto-language. On the other hand, there are a number of well-known cases where reconstructions have been confirmed as correct by independent evidence such as loanword
Loanword

A loanword is a word directly taken into one language from another with little or no translation. By contrast, a calque or loan translation is a related concept whereby it is the Meaning or idiom that is borrowed rather than the lexical item itself....
s. For example Finnic languages
Finnic languages

Finnic languages may refer to:*Finno-Permic languages*Finno-Volgaic languages*Baltic-Finnic languages and/or Volga-Finnic languages...
 such as Finnish
Finnish language

Finnish is the language spoken by the majority of the population in Finland and by Finnish people outside of Finland. It is one of the official languages of Finland and an official minority language in Sweden....
 have borrowed many words from an early stage of Germanic
Germanic languages

The Germanic languages are a group of related languages that constitute a branch of the Indo-European languages language family. The common ancestor of all the languages in this branch is Proto-Germanic, spoken in approximately the mid-1st millennium BC in Pre-Roman Iron Age....
, and the shape of the loans matches the forms that have been reconstructed for Proto-Germanic: compare, e.g., Finnish kuningas 'king' and kaunis 'beautiful' to the Germanic reconstructions *kuningaz and *skauniz (> German König 'king', schön 'beautiful').

See also

  • Historical linguistics
    Historical linguistics

    Historical linguistics is the study of language change. It has five main concerns:* to describe and account for observed changes in particular languages;...
  • Comparative linguistics
    Comparative linguistics

    Comparative linguistics is a branch of historical linguistics that is concerned with comparing languages in order to establish their history relatedness....
  • Proto-language
    Proto-language

    A proto-language is the common ancestor of the languages that form a language family. Occasionally, the German language term Ursprache is used instead....
  • Lexicostatistics
    Lexicostatistics

    Lexicostatistics is an approach to comparative linguistics that involves quantitative comparison of lexical cognates. Lexicostatistics is related to the comparative method but does not reconstruct a proto-language....
  • Swadesh list
    Swadesh list

    A Swadesh list is one of several lists of vocabulary with "basic" meanings, developed by Morris Swadesh in the 1940?50s, which is used in lexicostatistics and glottochronology ....


External links

  • (PDF)