NaturallySpeaking

Overview
Dragon NaturallySpeaking is a speech recognition software package developed by Dragon Systems, and sold by Nuance Communications for Windows personal computers (PCs). It was among the first programs to make speech recognition practical on a PC.

NaturallySpeaking uses a minimal visual interface. Dictated words appear in a floating tooltip as they are spoken, and when the speaker pauses, the program transcribes the words into the active window at the location of the cursor.

Like other speech recognition software, NaturallySpeaking has three primary areas of functionality: dictation, in which spoken language is transcribed to written text; command and control, in which spoken language is recognized as a command to click widgets (controls); and text-to-speech, in which written text is converted to a synthesized audio stream. Early versions of the software had to be trained for approximately 10 minutes to recognize the user's voice, though in version 9 that requirement was dropped.

Voice profiles can be accessed from different computers in a networked environment; however, the audio hardware and configuration must be identical on the original and secondary machines.

Nuance claims that dictating a 900-word essay with NaturallySpeaking would take 6 minutes, whereas typing the same essay at 40 words per minute would take 22 minutes.
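
The arithmetic behind this comparison can be checked with a short sketch. The 900-word length, the 40 words-per-minute typing speed and the 6-minute dictation time are taken from the claim above; the function names are illustrative only and come from no Nuance material.

```python
def minutes_to_type(words: int, words_per_minute: float) -> float:
    """Time in minutes needed to produce `words` at a steady rate."""
    return words / words_per_minute


def implied_rate(words: int, minutes: float) -> float:
    """Average words per minute implied by finishing `words` in `minutes`."""
    return words / minutes


essay_words = 900  # essay length used in the claim

typing_minutes = minutes_to_type(essay_words, 40)  # typing at 40 wpm
dictation_wpm = implied_rate(essay_words, 6)       # dictating in 6 minutes

print(typing_minutes, dictation_wpm)
```

On these figures typing takes 22.5 minutes, which the claim rounds to 22, and the 6-minute dictation time corresponds to an average of 150 words per minute.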

Nuance released Dragon NaturallySpeaking 10.1, which supports 64-bit Windows Vista, at the end of March 2009.

History



NaturallySpeaking has passed through four companies and evolved considerably since its beginnings in the early 1980s as a research prototype called DRAGON. The married couple Dr. James Baker and Dr. Janet Baker founded Dragon Systems in 1982, deciding to commercialize DRAGON when their funding was cut by DARPA. Their first product, DragonDictate, was sold for a number of years. Dr. James Baker departed from conventional AI and was a pioneer in hidden Markov models, a way of using statistics for recognition of speech. Dr. Janet Baker developed the expert system named Hearsay.

In March 1990, Dragon Systems began selling DragonDictate (for DOS) at a cost of $9000 for a single-user license. As hardware became less expensive over the next several years the price decreased, and by 1997 the price of DragonDictate for Windows was about $2000. The hardware during this period was not yet powerful enough to address the difficult problem of word segmentation, and DragonDictate was unable to determine the boundaries of words in the continuous signal that constitutes human speech. Users had to pronounce one word at a time, each clearly separated by a small pause before the next. DragonDictate was based on a trigram model, and is known as a discrete speech recognition engine.
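
As a rough illustration of what a word-level trigram model counts, the sketch below extracts every run of three consecutive words from a sentence and estimates how often a given word follows a two-word context. It is a generic illustration, not DragonDictate's implementation; the function names and the sample sentences are assumptions made for the example.

```python
from collections import Counter


def word_trigrams(text: str) -> list[tuple[str, ...]]:
    """Return every run of three consecutive words in `text`."""
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def next_word_probability(text: str, context: tuple[str, str], word: str) -> float:
    """Estimate P(word | context) from trigram counts in `text` (hypothetical helper)."""
    counts = Counter(word_trigrams(text))
    # Total number of trigrams that begin with the two-word context.
    context_total = sum(n for (a, b, _), n in counts.items() if (a, b) == context)
    if context_total == 0:
        return 0.0
    return counts[(*context, word)] / context_total


sample = "the quick red fox jumps over the lazy brown dog"
print(word_trigrams(sample)[:2])
print(next_word_probability("the cat sat on the mat and the cat ran", ("the", "cat"), "sat"))
```

A ten-word sentence yields eight trigrams. In the second sample, "the cat" is followed once by "sat" and once by "ran", so the estimated probability of "sat" is 0.5.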

In 1997, advances in hardware technology allowed NaturallySpeaking version 1.0 to launch as the first available continuous dictation system. During this time the speech recognition industry enthusiastically promoted the notion that speech input was "the" natural modality that would eventually supersede more "primitive" methods such as keyboards. Trying to reach a mass market, vendors dropped prices to levels that were unsustainable.

Lernout & Hauspie bought Dragon Systems in June 2000 for stock then valued at about $600 million. The dictation system bubble burst in 2001, and Lernout & Hauspie went bankrupt. ScanSoft Inc. bought the rights to Dragon products. In 2005, ScanSoft merged with Nuance Communications, and changed the name of the combined entity to Nuance.

The software today is advertised as 99% accurate.

Issues

  • Dragon NaturallySpeaking (DNS) version 10.0 does not work on processors that do not support the SSE2 instruction set. That excludes all AMD Socket 462 processors, many of which are rated well above the supposed minimum requirements.
  • DNS 9.0 up to DNS 9.5 can be installed on Windows XP 64 and Windows Vista 64. There is a fix to force an install on Windows Vista (described on TechSideStories), but in the majority of cases the program will not work as expected.
  • Contacting Nuance technical support through their website requires a US$10 fee, charged to a credit card, before any useful dialogue can take place, even when the issue is an already well-known bug for which a fix has not yet been provided. Support is free in Europe, the Middle East and Africa.
  • During initial installation on a Windows XP machine, Dragon NaturallySpeaking (DNS) version 10 places its shared library data files in C:\Documents and Settings\\Local Settings\Temp\folder_with_a_random_name, instead of using a dedicated location outside a temporary files folder, where the shared libraries would be safe from third-party disk cleaning utilities that purge old and normally useless data. Nuance have not provided an updated installer to correct this problem.
  • A user who incautiously attempts to edit by voice while composing may confuse the system into considering the first phrase and the replacement phrase as the same, lowering accuracy.

Versions

Version  Release date       Editions
1.0      June 1997          Personal
2.0      November 1997      Standard, Preferred, Deluxe
3.0      October 1998       Point & Speak, Standard, Preferred, Professional (with optional Legal and Medical add-on products)
3.01                        Teens
4.0      August 4, 1999     Essentials, Standard, Preferred, Professional, Legal, Medical, Mobile
5.0      August 2000        Essentials, Standard, Preferred, Professional, Legal, Medical
6.0      November 15, 2001  Essentials, Standard, Preferred, Professional, Legal, Medical
7.0      March 2003         Essentials, Standard, Preferred, Professional, Legal, Medical
8.0      November 2004      Essentials, Standard, Preferred, Professional, Legal, Medical
9.0      July 2006          Standard, Preferred, Professional, Legal, Medical, SDK client, SDK server
9.1      ??                 Standard, Preferred, Professional, Legal, Medical, SDK client, SDK server
9.5      January 2007       Standard, Preferred, Professional, Legal, Medical, SDK client, SDK server
10.0     August 7, 2008     Standard, Preferred, Professional, Legal, Medical

External links

  • http://www.washingtonpost.com/wp-dyn/content/article/2008/09/29/AR2008092903046.html