Voice User Interface


Overview
A Voice User Interface (VUI) makes human interaction with computers possible through a voice/speech platform in order to initiate an automated service or process.

The VUI is the interface to any speech application. Controlling a machine by simply talking to it was science fiction only a short time ago. Until recently, this area was considered to be artificial intelligence. However, with advances in technology, VUIs have become more commonplace, and people are taking advantage of the value that these hands-free, eyes-free interfaces provide in many situations.

However, VUIs are not without their challenges. People have very little patience for a "machine that doesn't understand". Therefore, there is little room for error: VUIs need to respond to input reliably, or they will be rejected and often ridiculed by their users. Designing a good VUI requires interdisciplinary talent in computer science, linguistics and human-factors psychology, all of which are skills that are expensive and hard to come by. Even with advanced development tools, constructing an effective VUI requires an in-depth understanding of both the tasks to be performed and the target audience that will use the final system. The more closely the VUI matches the user's mental model of the task, the easier it will be to use with little or no training, resulting in both higher efficiency and higher user satisfaction.

The characteristics of the target audience are very important. For example, a VUI designed for the general public should emphasize ease of use and provide a lot of help and guidance for first-time callers. In contrast, a VUI designed for a small group of power users (including field service workers) should focus more on productivity and less on help and guidance. Such applications should streamline the call flows, minimize prompts, eliminate unnecessary iterations and allow elaborate "mixed-initiative dialogs", which enable callers to enter several pieces of information in a single utterance and in any order or combination. In short, speech applications have to be carefully crafted for the specific business process that is being automated.
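The idea of a mixed-initiative dialog can be illustrated with a minimal sketch. It assumes the caller's speech has already been converted to text, and it uses a hypothetical funds-transfer task; the slot names, regular expressions and prompts are invented for illustration and are not taken from any particular VUI platform.

```python
import re

# Hypothetical slots for a funds-transfer task; the patterns are illustrative only.
SLOT_PATTERNS = {
    "amount": re.compile(r"\b(\d+(?:\.\d+)?)\s*dollars\b"),
    "source": re.compile(r"\bfrom\s+(?:my\s+)?(checking|savings)\b"),
    "target": re.compile(r"\bto\s+(?:my\s+)?(checking|savings)\b"),
}

# Hypothetical prompts, asked only for information the caller has not yet given.
PROMPTS = {
    "amount": "How much would you like to transfer?",
    "source": "Which account should the money come from?",
    "target": "Which account should the money go to?",
}


def fill_slots(utterance, slots=None):
    """Fill every slot mentioned in the utterance, in whatever order it appears."""
    slots = dict(slots or {})
    text = utterance.lower()
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match and name not in slots:
            slots[name] = match.group(1)
    return slots


def next_prompt(slots):
    """Return the prompt for the first missing slot, or None when all are filled."""
    for name in SLOT_PATTERNS:
        if name not in slots:
            return PROMPTS[name]
    return None
```

A power user who says "Transfer 50 dollars to savings from my checking" fills all three slots at once and hears no further prompts, while a first-time caller who supplies only one piece of information is guided through the rest one prompt at a time.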

Not all business processes lend themselves equally well to speech automation. In general, the more complex the inquiries and transactions are, the more challenging they will be to automate, and the more likely they will be to fail with the general public. In some scenarios, automation is simply not applicable, so live agent assistance is the only option. A legal advice hotline, for example, would be very difficult to automate. On the flip side, speech is perfect for handling quick and routine transactions, like changing the status of a work order, completing a time or expense entry, or transferring funds between accounts.

Future Uses


Pocket-size devices, such as PDAs or mobile phones, currently rely on small buttons for user input. These are either built into the device or are part of a touch-screen interface, such as that of the Apple iPod Touch and iPhone. Extensive button-pressing on devices with such small buttons can be tedious and inaccurate, so an easy-to-use, accurate, and reliable VUI would potentially be a major breakthrough in the ease of their use. Such a VUI would also benefit users of laptop- and desktop-sized computers, as it would solve numerous problems currently associated with keyboard and mouse use, including repetitive-strain injuries such as carpal tunnel syndrome and slow typing speed on the part of inexperienced keyboard users. Moreover, keyboard use typically entails either sitting or standing stationary in front of the connected display; by contrast, a VUI would free the user to be far more mobile, as speech input eliminates the need to look at a keyboard.

Such developments could literally change the face of current machines and have far-reaching implications for how users interact with them. Hand-held devices would be designed with larger, easier-to-view screens, as no keyboard would be required. Touch-screen devices would no longer need to split the display between content and an on-screen keyboard, thus providing full-screen viewing of the content. Laptop computers could essentially be cut in half in terms of size, as the keyboard half would be eliminated and all internal components would be integrated behind the display, effectively resulting in a simple tablet computer. Desktop computers would consist of a CPU and screen, saving desktop space otherwise occupied by the keyboard and eliminating sliding keyboard rests built under the desk's surface. Television remote controls and keypads on dozens of other devices, from microwave ovens to photocopiers, could also be eliminated.

Numerous challenges would have to be overcome, however, for such developments to occur. First, the VUI would have to be sophisticated enough to distinguish between input, such as commands, and background conversation; otherwise, false input would be registered and the connected device would behave erratically. A standard prompt, such as the famous "Computer!" call by characters in science fiction TV shows and films such as Star Trek, could activate the VUI and prepare it to receive further input by the same speaker. Conceivably, the VUI could also include a human-like representation: a voice or even an on-screen character, for instance, that responds (e.g., "Yes, Samantha?") and continues to communicate back and forth with the user in order to clarify the input received and ensure accuracy.
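The activation prompt described above can be sketched as a simple gate. This is only an illustration under strong assumptions: speech has already been transcribed to text, each utterance arrives with a speaker label, and the class name WakeWordGate and its hear method are hypothetical. Real systems detect the prompt in the audio signal itself, which this sketch does not attempt.

```python
WAKE_WORD = "computer"  # assumed prompt, borrowed from the Star Trek example


class WakeWordGate:
    """Ignore transcribed speech until the wake word is heard, then accept
    one command from the speaker who said it."""

    def __init__(self, wake_word=WAKE_WORD):
        self.wake_word = wake_word
        self.active_speaker = None  # speaker who woke the device, if any

    def hear(self, speaker, transcript):
        """Return a command string, or None if the utterance should be ignored."""
        words = transcript.lower().strip(" .!?").split()
        if words and words[0].strip(",.!?") == self.wake_word:
            rest = " ".join(words[1:])
            if rest:
                # Wake word and command arrived in the same utterance.
                self.active_speaker = None
                return rest
            # Wake word alone: wait for the same speaker's next utterance.
            self.active_speaker = speaker
            return None
        if self.active_speaker is not None and speaker == self.active_speaker:
            self.active_speaker = None
            return " ".join(words)
        # Background conversation: no wake word and not the active speaker.
        return None
```

In this sketch, anything said before "Computer!" is dropped, and after the prompt only the next utterance from the same speaker is treated as a command, so a remark from someone else in the room is not registered as input.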

Second, the VUI would have to work in concert with highly sophisticated software in order to accurately process and find/retrieve information or carry out an action as per the particular user's preferences. For instance, if Samantha prefers information from a particular newspaper, and if she prefers that the information be summarized in point form, she might say, "Computer, find me some information about the flooding in southern China last night"; in response, the VUI that is familiar with her preferences would "find" facts about "flooding" in "southern China" from that source, convert them into point form, and deliver them to her on screen and/or in voice form, complete with a citation. Therefore, accurate speech-recognition software, along with some degree of artificial intelligence on the part of the machine associated with the VUI, would be required.
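Only the last step of that scenario, delivering results according to stored preferences, is simple enough to sketch here. The Preferences fields and the format_answer function below are hypothetical names; finding and summarizing the facts is assumed to have happened elsewhere and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Preferences:
    """Hypothetical per-user settings the VUI would remember."""
    source: str        # preferred publication
    point_form: bool   # deliver results as bullet points?


def format_answer(facts, prefs):
    """Render already-retrieved facts according to the user's preferences."""
    if prefs.point_form:
        body = "\n".join(f"- {fact}" for fact in facts)
    else:
        body = " ".join(facts)
    # The citation names the source the facts were retrieved from.
    return f"{body}\nSource: {prefs.source}"
```

The same text could then be shown on screen or passed to a speech synthesizer, matching the "on screen and/or in voice form" delivery described above.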

See also

  • User interface
  • User interface engineering
  • Speech recognition
  • List of speech recognition software
  • Voice browser
