Dialog system
A dialog system or conversational agent (CA) is a computer system intended to converse with a human, with a coherent structure. Dialog systems have employed text, speech, graphics, haptics, gestures and other modes for communication on both the input and output channel.

What does and does not constitute a dialog system may be debatable. The typical GUI wizard does engage in some sort of dialog, but it includes very few of the common dialog system components, and dialog state is trivial.

Components

There are many different architectures for dialog systems. Which components a dialog system includes, and how those components divide up responsibilities, differs from system to system. Central to any dialog system is the dialog manager, a component that manages the state of the dialog and the dialog strategy. A typical activity cycle in a dialog system contains the following phases:
  1. The user speaks, and the input is converted to plain text by the system's input recognizer/decoder, which may include:
    • automatic speech recognizer (ASR)
    • gesture recognizer
    • handwriting recognizer
  2. The text is analyzed by a natural language understanding (NLU) unit, which may include:
    • proper name identification
    • part-of-speech tagging
    • syntactic/semantic parser
  3. The semantic information is analyzed by the dialog manager (see section below), along with a task manager that has knowledge of the specific task domain.
  4. The dialog manager produces output using an output generator, which may include:
    • natural language generator
    • gesture generator
    • layout engine
  5. Finally, the output is rendered using an output renderer, which may include:
    • text-to-speech engine (TTS)
    • talking head
    • robot or avatar

Dialog systems that are based on a text-only interface (e.g. text-based chat) contain only stages 2-4.
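
The text-only cycle (stages 2-4) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference architecture: the keyword-matching `understand` function, the intent names, the response templates and the placeholder time value are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


def understand(text: str) -> dict:
    """Stage 2: toy NLU that maps keywords to an intent (hypothetical intents)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & {"hello", "hi"}:
        return {"intent": "greet"}
    if "time" in words:
        return {"intent": "ask_time"}
    return {"intent": "unknown"}


@dataclass
class DialogManager:
    """Stage 3: keeps the dialog history and chooses a response act."""
    history: list = field(default_factory=list)

    def decide(self, semantics: dict) -> dict:
        self.history.append(semantics)
        if semantics["intent"] == "greet":
            return {"act": "greet"}
        if semantics["intent"] == "ask_time":
            # Placeholder value; a task manager would supply real data.
            return {"act": "inform_time", "time": "12:00"}
        return {"act": "clarify"}


def generate(act: dict) -> str:
    """Stage 4: template-based output generator."""
    templates = {
        "greet": "Hello! How can I help?",
        "inform_time": "It is {time}.",
        "clarify": "Sorry, could you rephrase that?",
    }
    return templates[act["act"]].format(**act)


def respond(manager: DialogManager, text: str) -> str:
    """Run one activity cycle: understand, decide, generate."""
    return generate(manager.decide(understand(text)))
```

A spoken system would wrap `respond` with a speech recognizer before stage 2 and a text-to-speech engine after stage 4.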

Dialog manager

The dialog manager is the core component of the dialog system. It maintains the history of the dialog, adopts a certain dialog strategy (see below), retrieves content (stored in files or databases), and decides on the best response to the user. The dialog manager maintains the dialog flow.

The design of the dialog manager has evolved over time. Designs include:
  • finite-state machine
  • frame-based: The system has several slots to be filled. The slots can be filled in any order. This supports a mixed-initiative dialog strategy.
  • information-state based
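
The frame-based design can be illustrated with a small sketch. The flight-booking slots and prompts below are hypothetical, and the slot values are assumed to have already been extracted by an NLU step.

```python
from dataclasses import dataclass, field

# Hypothetical slots for a flight-booking frame; the prompts are illustrative.
PROMPTS = {
    "origin": "Where are you leaving from?",
    "destination": "Where are you going?",
    "date": "What day do you want to travel?",
}


@dataclass
class Frame:
    # Every slot starts empty (None).
    slots: dict = field(default_factory=lambda: dict.fromkeys(PROMPTS))

    def fill(self, values: dict) -> None:
        """Accept any subset of the slots, in any order; ignore unknown names."""
        for name, value in values.items():
            if name in self.slots:
                self.slots[name] = value

    def next_prompt(self):
        """Return the prompt for the first empty slot, or None when complete."""
        for name, value in self.slots.items():
            if value is None:
                return PROMPTS[name]
        return None
```

Because `fill` accepts slots in any order, a user who volunteers the date and destination first is only asked for the origin afterwards, which is what makes this design a natural fit for mixed initiative.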

The dialog flow can have the following strategies:
  • System-initiative dialog: The system is in control and guides the dialog at each step.
  • Mixed-initiative dialog: Users can barge in and change the dialog direction. The system follows the user request, but tries to direct the user back to the original course. This is the most commonly used dialog strategy in today's dialog systems.
  • User-initiative dialog: The user takes the lead, and the system responds to whatever the user directs.
  • Learned strategy: The system's next dialog action is chosen by an optimisation method such as reinforcement learning.
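
As a rough sketch, a system-initiative strategy can be built on a finite-state machine in which the system asks a fixed sequence of questions. The pizza-ordering states and prompts are invented for illustration, and the user's answers are passed in as a list rather than read interactively.

```python
# Hypothetical states for a pizza order: state -> (prompt, next state).
STATES = {
    "ask_size": ("What size would you like?", "ask_topping"),
    "ask_topping": ("Which topping?", "confirm"),
    "confirm": ("Shall I place the order?", "done"),
}


def run_dialog(answers):
    """Walk from 'ask_size' to 'done', consuming one answer per state."""
    state, transcript = "ask_size", []
    replies = iter(answers)
    while state != "done":
        prompt, next_state = STATES[state]
        # The system always speaks first; the user only answers the prompt.
        transcript.append((prompt, next(replies)))
        state = next_state
    return transcript
```

The user cannot change the order of the questions here, which is the defining trait of system initiative. A mixed-initiative system would need transitions that also depend on what the user says.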


The dialog manager can be connected with an expert system to give it the ability to respond with specific expertise.

Types of systems

Dialog systems fall into the following categories, which are listed here along a few dimensions. Many of the categories overlap and the distinctions may not be well established.
  • by modality
    • text-based
    • spoken dialog system
    • graphical user interface
    • multi-modal
  • by device
    • telephone-based systems
    • PDA systems
    • in-car systems
    • robot systems
    • desktop/laptop systems
      • native
      • in-browser systems
      • in-virtual machine
    • in-virtual environment
    • robots
  • by style
    • command-based
    • menu-driven
    • natural language
    • speech graffiti
  • by initiative
    • system initiative
    • user initiative
    • mixed initiative

Applications

Dialog systems can support a broad range of applications in business enterprises, education, government, healthcare, and entertainment. For example:
  • Responding to customers' questions about products and services via a company's website or intranet portal
  • Customer service agent knowledge base: Allows agents to type in a customer's question and guides them with a response
  • Guided selling: Facilitating transactions by providing answers and guidance in the sales process, particularly for complex products being sold to novice customers
  • Help desk: Responding to internal employee questions, e.g., HR questions
  • Website navigation: Guiding customers to relevant portions of complex websites (a website concierge)
  • Technical support: Responding to technical problems, such as diagnosing a problem with a product or device
  • Personalized service: Conversational agents can leverage internal and external databases to personalize interactions, such as answering questions about account balances, providing portfolio information, or delivering frequent flier or membership information
  • Training or education: Providing problem-solving advice while the user learns
  • Call centres: Simple dialog systems are widely used to decrease human workload in call centres. In this and other industrial telephony applications, the functionality provided by dialog systems is known as interactive voice response (IVR).


In some cases, conversational agents can interact with users using artificial characters. These agents are then referred to as embodied agents.

Toolkits and architectures

This section surveys frameworks, languages and technologies for defining dialog systems.
  • AIML
    • System type: chatterbot language
    • Description: XML dialect for creating natural language software agents
    • Affiliation: Richard Wallace
  • CSLU Toolkit
    • Description: a state-based speech interface prototyping environment
    • Affiliation: OGI School of Science and Engineering; M. McTear; Ron Cole
    • Comments: publications are from 1999
  • VXML (Voice XML)
    • System type: spoken dialog
    • Description: multimodal dialog markup language
    • Affiliation: developed initially by AT&T, then administered by an industry consortium, and finally a W3C specification
    • Comments: primarily for telephony
  • SALT (Speech Application Language Tags)
    • System type: markup language
    • Description: multimodal dialog markup language
    • Affiliation: Microsoft
    • Comments: "has not reached the level of maturity of VoiceXML in the standards process"
  • Quack.com - QXML
    • System type: development environment
    • Affiliation: company bought by AOL

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 