MBROLA
MBROLA is an algorithm for speech synthesis, software that is distributed at no financial cost but in binary form only, and a worldwide collaborative project. The MBROLA project web page provides diphone databases for a large number of spoken languages.

The MBROLA software is not a complete text-to-speech system for all those languages; the text must first be transformed into phoneme and prosodic information in MBROLA's format. Separate software to do this is available for some but not all of MBROLA's languages, and it can require extra setup.
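
As an illustration of the phoneme and prosody input described above, the following is a minimal sketch in Python that writes lines in the commonly documented MBROLA `.pho` layout: a phoneme name, a duration in milliseconds, and optional pairs of (position in percent of the duration, pitch in Hz), with `_` denoting silence. The helper names `format_pho_line` and `to_pho` are hypothetical, and the phoneme symbols and timing values are made-up examples; the symbols a real voice accepts depend on its diphone database.

```python
def format_pho_line(phoneme, duration_ms, pitch_points=()):
    """Format one line: name, duration (ms), then optional (position %, pitch Hz) pairs."""
    fields = [phoneme, str(duration_ms)]
    for position_pct, pitch_hz in pitch_points:
        fields += [str(position_pct), str(pitch_hz)]
    return " ".join(fields)


def to_pho(segments):
    """Join a sequence of (phoneme, duration[, pitch_points]) tuples into .pho text."""
    return "\n".join(format_pho_line(*seg) for seg in segments) + "\n"


# Hypothetical phoneme symbols and values, for illustration only.
segments = [
    ("_", 100),                           # leading silence
    ("h", 60),
    ("@", 80, [(50, 120)]),               # one pitch target at mid-phoneme
    ("l", 70),
    ("@U", 200, [(0, 120), (100, 100)]),  # falling pitch across the phoneme
    ("_", 100),                           # trailing silence
]
pho_text = to_pho(segments)
print(pho_text)
```

Text produced this way would then be saved to a file and passed to the synthesizer together with a diphone database for the chosen language.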

Although MBROLA is diphone-based, the quality of its synthesis is considered to be higher than that of most diphone synthesisers. This is due in part to the fact that it is based on a preprocessing of diphones (imposing constant pitch and harmonic phases), which improves their concatenation while only slightly degrading their segmental quality.

MBROLA is a time-domain algorithm, like PSOLA, which implies a very low computational load at synthesis time. Unlike PSOLA, however, MBROLA does not require a preliminary marking of pitch periods. This feature has made it possible to develop the MBROLA project around the MBROLA algorithm: many speech research labs, companies, and individuals around the world have provided diphone databases for many languages and voices (the number of which is by far a world record for speech synthesis, although there are some notable omissions such as Chinese).
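
To give a feel for why a constant pitch period removes the need for pitch marking, here is a minimal sketch of generic time-domain overlap-add in Python. It is not MBROLA's actual algorithm, which is distributed in binary form only; the function `change_pitch` and all parameter values are assumptions chosen for illustration. Because the input is assumed to have a known, constant period, two-period windowed frames can be cut at fixed intervals and re-laid at a new spacing using only multiplications and additions.

```python
import math


def hann(n):
    """Periodic Hann window of length n (copies shifted by n/2 sum to 1)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def change_pitch(signal, period, new_period):
    """Overlap-add two-period frames of a constant-period signal at a new spacing."""
    window = hann(2 * period)
    n_frames = len(signal) // period - 1
    # Frames start at fixed multiples of `period`: no pitch marking is needed.
    frames = [
        [signal[k * period + i] * window[i] for i in range(2 * period)]
        for k in range(n_frames)
    ]
    out = [0.0] * len(signal)
    start = 0
    while start + 2 * period <= len(out):
        # Use the source frame nearest in time, so the duration is preserved.
        k = min(n_frames - 1, round(start / period))
        for i, value in enumerate(frames[k]):
            out[start + i] += value
        start += new_period
    return out


period = 50  # assumed constant pitch period, in samples
tone = [math.sin(2 * math.pi * i / period) for i in range(500)]
same = change_pitch(tone, period, period)   # unchanged spacing
higher = change_pitch(tone, period, 40)     # frames laid closer together
```

With unchanged spacing the windows sum to one, so the interior of `same` reproduces the input. With a different spacing the overlapping windows no longer sum to one, so a practical system would also need to compensate for the change in amplitude.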
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 