Julius (software)
Julius is an open source speech recognition engine.

Julius is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder for speech-related researchers and developers. Based on word 3-gram and context-dependent HMM, it can perform almost real-time decoding on most current PCs in a 60k-word dictation task. Major search techniques are fully incorporated. It is also carefully modularized to be independent of model structures, and various HMM types are supported, such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phones. Standard formats are adopted for compatibility with other free modeling toolkits. The main platforms are Linux and other Unix workstations, and it also works on Windows. Julius is open source and distributed under a revised BSD-style license.
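As an illustration of the word 3-gram idea mentioned above, the following minimal sketch estimates the probability of a word given the two words before it by simple counting. It is not taken from Julius: the function name `trigram_probability` and the toy corpus are hypothetical, and a real language model would add smoothing and back-off, which are omitted here.

```python
from collections import Counter


def trigram_probability(sentences, w1, w2, w3):
    """Maximum-likelihood estimate of P(w3 | w1, w2) from tokenized sentences.

    Hypothetical helper for illustration only; no smoothing or back-off.
    """
    trigrams = Counter()
    contexts = Counter()  # two-word contexts that are followed by a third word
    for words in sentences:
        for i in range(len(words) - 2):
            trigrams[tuple(words[i:i + 3])] += 1
            contexts[tuple(words[i:i + 2])] += 1
    seen = contexts[(w1, w2)]
    return trigrams[(w1, w2, w3)] / seen if seen else 0.0


# Toy corpus (made up for this example).
corpus = [
    "turn on the light".split(),
    "turn on the radio".split(),
    "turn off the light".split(),
]

p = trigram_probability(corpus, "on", "the", "light")
print(p)
```

In this toy corpus the context "on the" occurs twice and is followed by "light" once, so the estimate is one half.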

Julius has been developed as part of a free software toolkit for Japanese LVCSR research since 1997, and the work was continued at the Continuous Speech Recognition Consortium (CSRC), Japan, from 2000 to 2003.

From rev. 3.4, a grammar-based recognition parser named "Julian" is integrated into Julius. Julian is a modified version of Julius that uses a hand-designed DFA grammar as its language model. It can be used to build small-vocabulary voice command systems or various spoken dialog system tasks.
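To make the idea of a DFA grammar as a language model more concrete, here is a minimal sketch of a finite-state acceptor for a tiny voice command vocabulary. It does not use Julian's actual grammar file format, which is not described here; the transition table, state numbers, and the function `accepts` are all hypothetical.

```python
# Hypothetical toy grammar: TRANSITIONS[state][word] -> next state.
# It accepts commands such as "turn on the light" or "turn off the radio".
TRANSITIONS = {
    0: {"turn": 1},
    1: {"on": 2, "off": 2},
    2: {"the": 3},
    3: {"light": 4, "radio": 4},
}
ACCEPTING = {4}


def accepts(words, transitions=TRANSITIONS, accepting=ACCEPTING, start=0):
    """Return True if the word sequence drives the DFA to an accepting state."""
    state = start
    for word in words:
        state = transitions.get(state, {}).get(word)
        if state is None:  # no transition for this word: reject
            return False
    return state in accepting


print(accepts("turn on the light".split()))
print(accepts("turn the light".split()))
```

A grammar of this kind only allows word sequences that follow a path through the automaton, which is why it suits small, fixed command sets better than open dictation.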

About Models

To run the Julius recognizer, you need a language model and an acoustic model for your language.

Julius adopts acoustic models in HTK ASCII format, a pronunciation dictionary in HTK-like format, and word 3-gram language models in ARPA standard format (forward 2-gram and reverse 3-gram, as trained from a speech corpus with reversed word order).
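The reversed word order mentioned above can be sketched as a small preprocessing step: each training sentence is reversed word by word before the reverse 3-gram is trained on it. The function name `reverse_word_order` and the sample sentences are hypothetical, and the n-gram training itself, which would be done with a separate language modeling toolkit, is not shown.

```python
def reverse_word_order(lines):
    """Return each sentence with its whitespace-separated words reversed.

    Hypothetical preprocessing helper; handling of sentence boundary
    markers is left out and depends on the training toolkit.
    """
    return [" ".join(reversed(line.split())) for line in lines]


# Toy training text (made up for this example).
training_text = [
    "turn on the light",
    "turn off the radio",
]

reversed_text = reverse_word_order(training_text)
for line in reversed_text:
    print(line)
```

A 3-gram trained on the reversed text predicts a word from the two words that follow it in the original order, which fits a second decoding pass that runs from the end of the utterance back to the start.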

Although Julius is only distributed with Japanese models, the VoxForge project is working on creating English acoustic models for use with the Julius Speech Recognition Engine.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 