Pronunciation Lexicon Specification - AbsoluteAstronomy.com

The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation, which is designed to enable interoperable specification of pronunciation information for both speech recognition

Speech recognition

Speech recognition converts spoken words to text. The term "voice recognition" is sometimes used to refer to recognition systems that must be trained to a particular speaker—as is the case for most desktop recognition software...

and speech synthesis

Speech synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware...

engines within voice browsing applications. The language is intended to be easy to use by developers while supporting the accurate specification of pronunciation information for international use.

The language allows one or more pronunciations for a word or phrase to be specified using a standard pronunciation alphabet or if necessary using vendor specific alphabets. Pronunciations are grouped together into a PLS document which may be referenced from other markup languages, such as the Speech Recognition Grammar Specification SRGS and the Speech Synthesis Markup Language SSML

Speech Synthesis Markup Language

Speech Synthesis Markup Language is an XML-based markup language for speech synthesis applications. It is a recommendation of the W3C's voice browser working group. SSML is often embedded in VoiceXML scripts to drive interactive telephony systems. However, it also may be used alone, such as for...

Usage

Here is an example PLS document:

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-US">

judgment
judgement
ˈdʒʌdʒ.mənt

fiancé
fiance
fiˈɒns.eɪ

ˌfiː.ɑːnˈseɪ

which could be used to improve TTS

Speech synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware...

as shown in the following SSML 1.0

Speech Synthesis Markup Language

document:

xmlns="http://www.w3.org/2001/10/synthesis"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
xml:lang="en-US">

In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
I replied that I preferred Venice and didn't think the Venetian casino was an
acceptable compromise.

but also to improve ASR

ASR

ASR may refer to:* Asr, the daily afternoon prayer in Islam* ASR Records, a record label* Academy at Swift River, a coeducational therapeutic boarding school for teenagers...

in the following SRGS 1.0 grammar:

xmlns="http://www.w3.org/2001/06/grammar"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/06/grammar
http://www.w3.org/TR/speech-grammar/grammar.xsd"
xml:lang="en-US" root="movies" mode="voice">

Terminator 2: Judgment Day
My Big Fat Obnoxious Fiance
Pluto's Judgement Day

Multiple pronunciations for the same orthography

For ASR

ASR

ASR may refer to:* Asr, the daily afternoon prayer in Islam* ASR Records, a record label* Academy at Swift River, a coeducational therapeutic boarding school for teenagers...

systems it is common to rely on multiple pronunciations of the same word or phrase in order to cope with variations of pronunciation within a language. In the Pronunciation Lexicon language, multiple pronunciations are represented by more than one (or ) element within the same element.

In the following example the word "Newton" has two possible pronunciations.

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-GB">

Newton
ˈnjuːtən

ˈnuːtən

Multiple orthographies

In some situations there are alternative textual representations for the same word or phrase. This can arise due to a number of reasons. See Section 4.5 of PLS for details. Because these are representations that have the same meaning (as opposed to homophones), it is recommended that they be represented using a single element that contains multiple graphemes.

Here are two simple examples of multiple orthographies: alternative spelling of an English word and multiple writings of a Japanese word.

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-US">

colour
color
ˈkʌlər

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="jp">

nihongo
日本語
にほんご
ɲihoŋo

Homophones

Most languages have homophones, words with the same pronunciation but different meanings (and possibly different spellings), for instance "seed" and "cede". It is recommended that these be represented as different lexemes.

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-US">

cede
siːd

seed
siːd

Homographs

Most languages have words with different meanings but the same spelling (and sometimes different pronunciations), called homographs. For example, in English the word bass (fish) and the word bass (in music) have identical spellings but different meanings and pronunciations. Although it is recommended that these words be represented using separate elements that are distinguished by different values of the role attribute (see Section 4.4 of PLS 1.0), if a pronunciation lexicon author does not want to distinguish between the two words they could simply be represented as alternative pronunciations within the same element. In the latter case the TTS

Speech synthesis

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware...

processor will not be able to distinguish when to apply the first or the second transcription.

In this example the pronunciations of the homograph "bass" are shown.

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-US">

bass
bæs

beɪs

Note that English contains numerous examples of noun-verb pairs that can be treated either as homographs or as alternative pronunciations, depending on author preference. Two examples are the noun/verb "refuse" and the noun/verb "address".

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
xmlns:mypos="http://www.example.org/my_pos_namespace"
alphabet="ipa" xml:lang="en-US">

refuse
rɪˈfjuːz

refuse
ˈrefjuːs

Pronunciation by Orthography (Acronyms, Abbreviations, etc.)

For some words and phrases pronunciation can be expressed quickly and conveniently as a sequence of other orthographies. The developer is not required to have linguistic knowledge, but instead makes use of the pronunciations that are already expected to be available. To express pronunciations using other orthographies the element may be used.

This feature may be very useful to deal with acronym expansion.

xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
alphabet="ipa" xml:lang="en-US">

W3C
World Wide Web Consortium

101
one hundred and one

Thailand
tie land

BBC 1
be be sea one

External links

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.