Wolfgang von Kempelen's Speaking Machine
Wolfgang von Kempelen's Speaking Machine is a manually operated speech synthesizer whose development was begun in 1769 by the Hungarian author and inventor Wolfgang von Kempelen. It was in this same year that he completed his far more infamous contribution to history: The Turk, a chess-playing automaton later revealed to be an elaborate hoax, since a human chess player was concealed inside it.[4] But while the Turk’s construction was completed in six months, Kempelen’s Speaking Machine occupied the next twenty years of his life.[2] After two conceptual “dead ends” over the first five years of research, Kempelen’s third direction ultimately led him to the design he felt comfortable deeming “final”: a functional representational model of the human vocal tract.[3]

First Design

Kempelen’s first experiment with speech synthesis involved only the most rudimentary elements of the vocal tract necessary to produce speech-like sounds. A kitchen bellows, of the kind used to stoke fires in wood-burning stoves, served as a set of lungs to supply the airflow. A reed taken from a common bagpipe served as the glottis, the source of the raw fundamental sound in the vocal tract. The bell of a clarinet made for a sufficient mouth, despite its rigid form. This basic model could produce only simple vowel sounds, though some additional articulation was possible by positioning one’s hand at the bell opening to obstruct airflow. The hardware needed to produce the nasals, plosives and fricatives that most consonants require was not present, however. Kempelen, like many other early pioneers of phonetics, mistakenly attributed the perceived “higher frequencies” of certain sounds to the glottis rather than to the formants of the entire vocal tract, so he abandoned his single-reed design for a multiple-reed approach.[2][3]
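
The distinction between the reed's pitch and the tract's formants can be made concrete with a rough calculation. The sketch below is an illustration only and is not taken from Kempelen's work: it assumes the vocal tract behaves like a uniform tube closed at the reed end and open at the mouth, whose resonances fall at odd multiples of c/(4L). The function name, the speed of sound and the two tube lengths are assumptions chosen for the example.

```python
# Assumed value: speed of sound in dry air near 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def tube_resonances(length_m, count=3, speed=SPEED_OF_SOUND_M_S):
    """Return the first `count` resonances (Hz) of a uniform tube that is
    closed at one end and open at the other (quarter-wavelength model)."""
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    # Resonances sit at odd multiples of speed / (4 * length).
    return [(2 * n - 1) * speed / (4.0 * length_m) for n in range(1, count + 1)]


# Hypothetical tube lengths, for illustration only.
longer_tube = tube_resonances(0.17)   # lowest resonance is about 504 Hz
shorter_tube = tube_resonances(0.10)  # a shorter tube resonates higher
```

In this simplified model the resonances depend only on the tube length and the speed of sound, so changing the pitch of the reed would leave them where they are.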

Second Design

The second design involved a console, similar to that of a musical organ of the period, at which the operator worked a set of keys, one for each letter. The sounds were produced by a common bellows that fed air through various pipes, each with the shape and obstructions needed to produce its letter. Through experimentation, he found that the reed’s resonant length was not crucial to the creation of the high-frequency components of certain vowels and fricatives, so he tuned all the reeds to the same pitch for the sake of consistency between letters. While not all letters were represented at this point, Kempelen had developed the technology required to produce most vowels and several consonants, including the plosive “p” and the nasal “m”, and thus was in a position to begin forming syllables and short words. However, this immediately exposed the primary flaw of his second design: the parallel nature of the multiple reeds allowed more than one letter to be sounded at a time. In the process of building syllables and words, the sonic “overlap” (now referred to as co-articulation) produced sounds very uncharacteristic of human speech, undermining the intention of the design altogether. Kempelen comments:

“In order to continue my experiments it was necessary, above all, that I should have a perfect knowledge of what I wanted to imitate. I had to make a formal study of speech and continually consult nature as I conducted my experiments. In this way my talking machine and my theory concerning speech made equal progress, the one serving as guide to the other.”[3]

“It was possible, following the methods I’d been using, to invent separate letters, but never to combine them to form syllables, and that it was absolutely necessary to follow nature which has only one glottis and one mouth, through which every sound emerges and which gives a unity to them.”[2][3]

Thus, Kempelen began work on his third and ultimately final design, which was in many ways a “close-as-possible” representation of the physiology of the vocal tract.

Third Design

The third approach followed a design similar to the very first, which was conceptually closer to the natural design of the human vocal tract than the second. It consisted, as before, of a bellows, a reed and a simulated mouth (this time made of India rubber, for better creation of vowel sounds via manipulation by hand), but also included a “throat” to which a “nasal cavity” was attached (complete with two “nostrils” for making the “n” and “m” sounds), as well as several levers and tubes dedicated to “s” and “sh” sounds, a rod that would interfere with the reed’s vibration to make a rolling “r” sound, and a separate, smaller bellows that would allow air to pass the reed while the mouth was completely closed (a feature required for the “b” sound). At one point, a special valve intended to simulate the “f” fricative was included, but it was later removed when it was found that the same sound could be achieved by simply closing all of the orifices of the machine and allowing air to leak from the cracks. Similarly, at one point in the design there was an alternate “mouth” assembly consisting of a wooden box with a pair of hinged shutters that acted as lips. Inside the box was a hinged, wooden, string-operated flap that acted as a tongue. The purpose of this assembly was to mimic the mouth and tongue in the construction of plosives such as “b” and “d”, but it was later removed when Kempelen recognized that without a proper tongue, the machine would never be able to produce the “t”, “k”, “d” and “g” sounds. He worked around the problem by replacing the “t” and “k” sounds with the “p” sound, and the “d” and “g” sounds with the “b” sound (which itself was simply a slight variation of the “p” sound). In the context of a familiar word, listeners often ignored the mispronunciation altogether (a phenomenon later explored by researchers in the field of cognitive science). Kempelen believed that people were more forgiving of the errors made by his machine because the reed frequency and vocal tract resonant length he chose created a resonance much more like that of a young child than that of an adult.[2][3]
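
The substitution rule described above can be written out as a small lookup table. This is a minimal sketch for illustration, not a model of the machine itself: it assumes each list element stands for one speech sound, and the names `SUBSTITUTIONS` and `machine_rendering` are hypothetical.

```python
# Sounds the machine could not produce, mapped to the sound used in their place,
# as described in the text: "t" and "k" become "p"; "d" and "g" become "b".
SUBSTITUTIONS = {"t": "p", "k": "p", "d": "b", "g": "b"}


def machine_rendering(sounds):
    """Return the sequence of sounds with the unavailable plosives replaced.

    Sounds that are not in the substitution table pass through unchanged.
    """
    return [SUBSTITUTIONS.get(sound, sound) for sound in sounds]


# A word with two unavailable plosives: both collapse onto "b".
rendered = machine_rendering(["d", "o", "g"])  # ['b', 'o', 'b']
```

Because four distinct plosives collapse onto two, the mapping loses information, which is why the text notes that listeners relied on the context of a familiar word.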
This third design, unlike those before it, was capable of speaking complete phrases in French, Italian and English (German was possible, but required greater skill from the operator, due to the more frequent use of consonants in the German language). Its greatest limitation was the bellows, which, although it had six times the capacity of human lungs, ran out of air much faster than its human counterpart. Because the design was based on a single reed as the glottal sound source, it had none of the problems of co-articulation inherent in the second design. But that single reed also meant that the Speaking Machine “spoke” in monotone.[4] Kempelen spent some time trying to introduce several prosodic pitch-variation mechanisms into the reed assembly, but to no avail. He decided to leave the design to be improved upon by the next generation of experimenters.
All of these important additions to the third design came from Kempelen’s two decades of intensive research into the vocal tract in relation to spoken languages, in which the behavior of each crucial physiological element of speech production was scrutinized and replicated acoustically and/or mechanically.[3]

A Significant Contribution

Von Kempelen died in 1804, after the completion and exhibition of his Speaking Machine, though not before publishing an extremely comprehensive account of the previous twenty years of his research in phonetics. The 456-page book, Mechanismus der menschlichen Sprache nebst Beschreibung einer sprechenden Maschine (The Mechanism of Human Speech, with a Description of a Speaking Machine), published in 1791,[2][4] covered every technical aspect of both Kempelen’s construction of the Speaking Machine (including the preliminary designs) and his studies of the human vocal tract.[3]

In 1837, Sir Charles Wheatstone revived the work of Wolfgang von Kempelen, creating an improved replica of his Speaking Machine.[3][4] Using new technology developed over the previous 50 years, Wheatstone was able to further analyze and synthesize components of acoustic speech, giving rise to a second wave of scientific interest in phonetics. After viewing Wheatstone’s improved replica of the Speaking Machine at an exposition, a young Alexander Graham Bell set out to construct his own speaking machine with the help and encouragement of his father.[4][5] Bell’s experiments and research ultimately led to his invention of the telephone in 1876,[4] which revolutionized global communication.

In 1968, Marcel van den Broecke (University of Amsterdam) built a replica as part of an MA thesis. He reported on it in "Wolfgang von Kempelen's Speaking Machine as a Performer", chapter 2 (pp. 9-19) of "Sound Structures", edited by Marcel van den Broecke, Vincent van Heuven and Wim Zonneveld (Foris Publications, Dordrecht-Netherlands/Cinnaminson-USA, 1983).
Acoustic predictions, made by applying N-tube approximations of the vocal tract to the replica's characteristics, confirmed what had already been established perceptually, namely that the machine could produce only two vowel-like sounds: an /a/-like vowel and an /o/-like vowel. Of the consonants produced, the general-purpose plosive is very convincing. A general-purpose nasal can also easily be identified, but the sibilants and the rattling /r/ are as unpleasant as the eyewitness von Windisch reported two centuries earlier.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 