Action selection - AbsoluteAstronomy.com

Action selection is a way of characterizing the most basic problem of intelligent systems: what to do next. In artificial intelligence and computational cognitive science, "the action selection problem" is typically associated with intelligent agents and animats: artificial systems that exhibit complex behaviour in an agent environment. The term is also sometimes used in ethology or animal behavior.

One problem for understanding action selection is determining the level of abstraction used for specifying an "act". At the most basic level of abstraction, an atomic act could be anything from contracting a muscle cell to provoking a war. Typically for any one action-selection mechanism, the set of possible actions is predefined and fixed.

Most researchers working in this field place high demands on their agents:

The acting agent typically must select its action in dynamic and unpredictable environments.
The agents typically act in real time; therefore they must make decisions in a timely fashion.
The agents are normally created to perform several different tasks. These tasks may conflict for resource allocation (e.g. can the agent put out a fire and deliver a cup of coffee at the same time?).
The environment the agents operate in may include humans, who may make things more difficult for the agent (either intentionally or by attempting to assist).
The agents themselves are often intended to model animals and/or humans, and animal/human behaviour is quite complicated.

For these reasons action selection is not trivial and attracts a good deal of research.

Characteristics of the action selection problem

The main problem for action selection is complexity. Since all computation takes both time and space (in memory), agents cannot possibly consider every option available to them at every instant in time. Consequently, they must be biased, and constrain their search in some way. For AI, the question of action selection is: what is the best way to constrain this search? For biology and ethology, the question is: how do various types of animals constrain their search? Do all animals use the same approaches? Why do they use the ones they do?

One fundamental question about action selection is whether it is really a problem at all for an agent, or whether it is just a description of an emergent property of an intelligent agent's behaviour. However, if we consider how we are going to build an intelligent agent, then it becomes apparent there must be some mechanism for action selection. This mechanism may be highly distributed (as in the case of distributed organisms such as social insect colonies or slime mold) or it may be a special-purpose module.

The action selection mechanism (ASM) determines not only the agent's actions in terms of impact on the world, but also directs its perceptual attention and updates its memory. These egocentric sorts of actions may in turn result in modifying the agent's basic behavioural capacities, particularly in that updating memory implies some form of learning is possible. Ideally, action selection itself should also be able to learn and adapt, but there are many problems of combinatorial complexity and computational tractability that may require restricting the search space for learning.

In AI, an ASM is also sometimes referred to as an agent architecture, or thought of as a substantial part of one.

AI mechanisms of action selection

Generally, artificial action selection mechanisms can be divided into several categories: symbol-based systems (sometimes known as classical planning), distributed solutions, and reactive or dynamic planning. Some approaches do not fall neatly into any one of these categories. Others are really more about providing scientific models than practical AI control; these last are described further in the next section.

Symbolic approaches

Early in the history of artificial intelligence, it was assumed that the best way for an agent to choose what to do next would be to compute a provably optimal plan, and then execute that plan. This led to the physical symbol system hypothesis, that a physical agent that can manipulate symbols is necessary and sufficient for intelligence. Many software agents still use this approach for action selection. It normally requires describing all sensor readings, the world, all of one's actions and all of one's goals in some form of predicate logic. Critics of this approach complain that it is too slow for real-time planning and that, despite the proofs, it is still unlikely to produce optimal plans because reducing descriptions of reality to logic is a process prone to errors.

Satisficing is a decision-making strategy which attempts to meet criteria for adequacy, rather than identify an optimal solution. A satisficing strategy may often, in fact, be (near) optimal if the costs of the decision-making process itself, such as the cost of obtaining complete information, are considered in the outcome calculus.
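The contrast between satisficing and optimizing can be illustrated with a minimal sketch. The action names, utility values and threshold below are hypothetical and chosen only for illustration; the point is that a satisficer stops evaluating as soon as one option is adequate, while an optimizer must evaluate every option.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical utilities for three candidate actions (illustrative values only).
UTILITY = {"wander": 0.2, "recharge": 0.7, "explore": 0.9}

# Records which actions had their utility evaluated, to show the cost saved.
evaluations: List[str] = []


def counted_utility(action: str) -> float:
    """Look up the utility of an action and record that it was evaluated."""
    evaluations.append(action)
    return UTILITY[action]


def satisfice(actions: Iterable[str],
              utility: Callable[[str], float],
              threshold: float) -> Optional[str]:
    """Return the first action whose utility meets the adequacy threshold."""
    for action in actions:
        if utility(action) >= threshold:
            return action
    return None  # no action was adequate


def optimize(actions: Iterable[str], utility: Callable[[str], float]) -> str:
    """Return the action with the highest utility (evaluates every action)."""
    return max(actions, key=utility)


# The satisficer considers actions in dictionary order and stops early.
choice = satisfice(UTILITY, counted_utility, threshold=0.6)
```

Here the satisficer settles on the first adequate action without ever evaluating the remaining one, which is the sense in which it can come out ahead once evaluation costs are counted.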

Goal driven architectures - In these symbolic architectures, the agent's behaviour is typically described by a set of goals. Each goal can be achieved by a process or an activity, which is described by a prescripted plan. The agent must just decide which process to carry on to accomplish a given goal. The plan can expand to subgoals, which makes the process slightly recursive. Technically, the plans more or less exploit condition-rules. These architectures are reactive or hybrid. Classical examples of goal driven architectures are implementable refinements of the Belief-Desire-Intention architecture, such as JAM or IVE.
Excalibur was a research project led by Alexander Nareyek featuring any-time planning agents for computer games. The architecture is based on structural constraint satisfaction, which is an advanced artificial intelligence technique.

Distributed approaches

In contrast to the symbolic approach, distributed systems of action selection actually have no one "box" in the agent which decides the next action. At least in their idealized form, distributed systems have many modules running in parallel and determining the best action based on local expertise. In these idealized systems, overall coherence is expected to emerge somehow, possibly through careful design of the interacting components. This approach is often inspired by neural network research. In practice, there is almost always some centralised system determining which module is "the most active" or has the most salience. There is evidence real biological brains also have such executive decision systems which evaluate which of the competing systems deserves the most attention, or more properly, has its desired actions disinhibited.
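The practical arrangement described above, in which independent modules bid for control and a central selector picks the most salient one, can be sketched as follows. The module names, the percept keys and the salience formulas are all assumptions made for this illustration and are not taken from any particular system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Percept = Dict[str, float]


@dataclass
class Module:
    """A behaviour module that proposes an action with a salience value."""
    name: str
    propose: Callable[[Percept], Tuple[float, str]]  # returns (salience, action)


def select_action(modules: List[Module], percept: Percept) -> str:
    """Central arbiter: run every module, then pick the most salient proposal."""
    proposals = [module.propose(percept) for module in modules]
    _salience, action = max(proposals, key=lambda proposal: proposal[0])
    return action


# Hypothetical modules; each uses only its own "local expertise".
modules = [
    Module("feed", lambda p: (p["hunger"], "eat")),
    Module("flee", lambda p: (2.0 * p["threat"], "run_away")),
    Module("rest", lambda p: (p["fatigue"], "sleep")),
]
```

For example, `select_action(modules, {"hunger": 0.6, "threat": 0.1, "fatigue": 0.3})` returns `"eat"`, while raising `threat` to 0.5 makes the flee module the most salient.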

Spreading activation including Maes Nets (ANA)
Extended Rosenblatt & Payton is a spreading activation architecture developed by Toby Tyrrell in 1993. The agent's behaviour is stored in the form of a hierarchical connectionist network, which Tyrrell named a free-flow hierarchy. It has since been exploited, for example, by de Sevin & Thalmann (2005) and Kadleček (2001).
Behavior based AI was a response to the slow speed of robots using symbolic action selection techniques. In this form, separate modules respond to different stimuli and generate their own responses. In the original form, the subsumption architecture, these consisted of different layers which could monitor and suppress each other's inputs and outputs.
Creatures are virtual pets from a computer game driven by a three-layered neural network, which is adaptive. Their mechanism is reactive since the network at every time step determines the task that has to be performed by the pet. The network is described well in the paper of Grand et al. (1997) and in The Creatures Developer Resources. See also the Creatures Wiki.
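The layered suppression idea behind behavior based AI, listed above, can be sketched in a heavily simplified form. This is not the subsumption architecture itself, which wires suppression between individual inputs and outputs of layers; it only assumes a fixed priority order in which any layer that produces a command suppresses the layers after it. The layer names, sensor keys and thresholds are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Sensors = Dict[str, float]
Layer = Callable[[Sensors], Optional[str]]


def avoid(sensors: Sensors) -> Optional[str]:
    """Fires only when an obstacle is close."""
    return "turn_away" if sensors["obstacle_distance"] < 0.5 else None


def seek_light(sensors: Sensors) -> Optional[str]:
    """Fires only when enough light is detected."""
    return "steer_to_light" if sensors["light"] > 0.3 else None


def wander(sensors: Sensors) -> Optional[str]:
    """Default behaviour: always produces a command."""
    return "move_forward"


# Assumed priority order: earlier layers suppress the output of later ones.
LAYERS: List[Layer] = [avoid, seek_light, wander]


def arbitrate(layers: List[Layer], sensors: Sensors) -> str:
    """Return the command of the first layer that fires."""
    for layer in layers:
        command = layer(sensors)
        if command is not None:
            return command
    raise RuntimeError("no layer produced a command")
```

Each layer reacts only to its own stimuli; there is no shared plan or world model, which is what keeps this style of controller fast.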

Dynamic planning approaches

Because purely distributed systems are difficult to construct, many researchers have turned to using explicit hard-coded plans to determine the priorities of their system.

Dynamic or reactive planning methods compute just one next action in every instant based on the current context and pre-scripted plans. In contrast to classical planning methods, reactive or dynamic approaches do not suffer combinatorial explosion. On the other hand, they are sometimes seen as too rigid to be considered strong AI, since the plans are coded in advance. At the same time, natural intelligence can be rigid in some contexts although it is fluid and able to adapt in others.
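A minimal sketch of this idea is a hand-coded, ordered list of condition-action rules that is re-evaluated from the top at every instant, yielding exactly one next action. The rule contents and context keys below are hypothetical, and the ordering convention (most goal-advanced condition first) is an assumption of this sketch rather than a description of any specific system.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, bool]
Rule = Tuple[Callable[[Context], bool], str]

# Hypothetical pre-scripted plan for delivering a cup to a table.
# Rules are checked in order; the first one whose condition holds fires.
PLAN: List[Rule] = [
    (lambda c: c["holding_cup"] and c["at_table"], "put_down_cup"),
    (lambda c: c["holding_cup"], "go_to_table"),
    (lambda c: c["at_cup"], "grasp_cup"),
    (lambda c: True, "go_to_cup"),  # default rule, always applicable
]


def next_action(plan: List[Rule], context: Context) -> str:
    """Compute just one next action from the current context."""
    for condition, action in plan:
        if condition(context):
            return action
    raise ValueError("plan has no applicable rule")
```

Because each call scans a fixed list instead of searching over action sequences, the cost per decision is linear in the number of rules, and the agent recovers naturally if the world changes between calls (for example, if the cup is knocked out of its hand).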

Example dynamic planning mechanisms include:

Finite-state machines - These are reactive architectures used mostly for computer game agents, in particular for first-person shooter bots, or for virtual movie actors. Typically, the state machines are hierarchical. For concrete game examples, see the Halo 2 bots paper by Damian Isla (2005) or the Master's thesis about Quake III bots by Jean Paul van Waveren (2001). For a movie example, see Softimage.
Other structured reactive plans tend to look a little more like conventional plans, often with ways to represent hierarchical and sequential structure. Some, such as PRS's 'acts', have support for partial plans. Many agent architectures from the mid 1990s included such plans as a "middle layer" that provided organization for low-level behavior modules while being directed by a higher level real-time planner. Despite this supposed interoperability with automated planners, most structured reactive plans are hand coded (Bryson 2001, ch. 3).
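The hierarchical state machines mentioned in the list above can be sketched with a top-level machine whose states each own a sub-machine. The states and events here (`patrol`, `combat`, `enemy_seen` and so on) are invented for illustration and do not come from the cited game bots; resetting a sub-machine on entry is also just one possible design choice.

```python
from typing import Dict, Optional, Tuple


class StateMachine:
    """A flat finite-state machine driven by named events."""

    def __init__(self, initial: str, transitions: Dict[str, Dict[str, str]]) -> None:
        self.initial = initial
        self.state = initial
        self.transitions = transitions  # state -> {event: next_state}

    def handle(self, event: str) -> bool:
        """Follow a transition if one exists; report whether it did."""
        target: Optional[str] = self.transitions.get(self.state, {}).get(event)
        if target is None:
            return False
        self.state = target
        return True

    def reset(self) -> None:
        self.state = self.initial


class HierarchicalBot:
    """Two-level machine: top-level modes, each with its own sub-machine."""

    def __init__(self) -> None:
        self.top = StateMachine("patrol", {
            "patrol": {"enemy_seen": "combat"},
            "combat": {"enemy_lost": "patrol"},
        })
        self.sub = {
            "patrol": StateMachine("walk", {
                "walk": {"waypoint_reached": "look_around"},
                "look_around": {"timer": "walk"},
            }),
            "combat": StateMachine("attack", {
                "attack": {"low_health": "retreat"},
                "retreat": {"healed": "attack"},
            }),
        }

    def current(self) -> Tuple[str, str]:
        return (self.top.state, self.sub[self.top.state].state)

    def handle(self, event: str) -> Tuple[str, str]:
        """Top level gets the event first; otherwise the active sub-machine."""
        if self.top.handle(event):
            self.sub[self.top.state].reset()  # enter the new mode from its start
        else:
            self.sub[self.top.state].handle(event)
        return self.current()
```

The hierarchy keeps the transition tables small: the `combat` sub-machine never needs to mention patrol waypoints, and the top level never needs to know how an attack is carried out.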

Examples of structured reactive plans include James Firby's RAP System and Nils Nilsson's Teleo-reactive plans. PRS, RAPs & TRP are no longer developed or supported. One still-active (as of 2006) descendant of this approach is the Parallel-rooted Ordered Slip-stack Hierarchical (or POSH) action selection system, which is a part of Joanna Bryson's Behaviour Oriented Design.

Sometimes, in an attempt to address the perceived inflexibility of dynamic planning, hybrid techniques are used. In these, a more conventional AI planning system searches for new plans when the agent has spare time, and updates the dynamic plan library when it finds good solutions. The important aspect of any such system is that when the agent needs to select an action, some solution exists that can be used immediately (see further anytime algorithm).
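The hybrid scheme can be sketched as a planner that always holds a usable plan and improves it only when given spare time. Everything concrete here is an assumption for illustration: the one-dimensional waypoint positions, the travel-cost function, and the use of exhaustive permutation search as the "conventional" planner.

```python
import itertools
from typing import Callable, Iterator, Sequence, Tuple

Plan = Tuple[str, ...]

# Hypothetical waypoint positions on a line; the agent starts at 0.0.
POSITIONS = {"A": 5.0, "B": 1.0, "C": 3.0}


def travel_cost(plan: Sequence[str]) -> float:
    """Total distance travelled when visiting waypoints in the given order."""
    position, total = 0.0, 0.0
    for name in plan:
        total += abs(POSITIONS[name] - position)
        position = POSITIONS[name]
    return total


class AnytimePlanner:
    """Keeps the best plan found so far; can be interrupted between steps."""

    def __init__(self, fallback: Plan,
                 cost: Callable[[Sequence[str]], float],
                 candidates: Iterator[Plan]) -> None:
        self.cost = cost
        self.candidates = candidates
        self.best = fallback  # a hand-coded plan is available from the start
        self.best_cost = cost(fallback)

    def improve(self, budget: int) -> None:
        """Use spare time: examine at most `budget` further candidate plans."""
        for plan in itertools.islice(self.candidates, budget):
            plan_cost = self.cost(plan)
            if plan_cost < self.best_cost:
                self.best, self.best_cost = plan, plan_cost

    def current_plan(self) -> Plan:
        """Always answers immediately with the best plan known so far."""
        return self.best


planner = AnytimePlanner(("A", "B", "C"), travel_cost,
                         itertools.permutations(("A", "B", "C")))
```

Calling `planner.current_plan()` is valid at any moment, including before any call to `improve`; each additional `improve` call can only keep or lower the cost of the stored plan.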

Others

CogniTAO is a decision-making engine based on BDI (Belief-Desire-Intention); it includes built-in teamwork capabilities.
Soar is a symbolic cognitive architecture. It is based on condition-action rules known as productions. Programmers can use the Soar development toolkit for building both reactive and planning agents, or any compromise between these two extremes.
ACT-R is similar to Soar. It is less powerful as a programming language, but simpler to get working. It includes a Bayesian learning system to help prioritize the productions.
ABL/Hap
Fuzzy architectures - The fuzzy approach in action selection produces smoother behaviour than can be produced by architectures exploiting boolean condition-action rules (like Soar or POSH). These architectures are mostly reactive and symbolic. See the work of Alex Champandard.
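The difference between boolean and fuzzy rules noted in the fuzzy architectures entry above can be illustrated with a small speed controller. The membership function, the distances and the two rules are assumptions made up for this sketch; it uses a simple weighted average for defuzzification, which is only one of several common choices.

```python
from typing import List, Tuple


def clamp01(value: float) -> float:
    """Restrict a value to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def near(distance: float) -> float:
    """Degree to which an obstacle is 'near': 1 at <= 1.0, 0 at >= 3.0."""
    return clamp01((3.0 - distance) / 2.0)


def far(distance: float) -> float:
    """Degree to which an obstacle is 'far' (complement of 'near')."""
    return 1.0 - near(distance)


def boolean_speed(distance: float) -> float:
    """Boolean rule: stop if the obstacle is closer than 2.0, else full speed."""
    return 0.0 if distance < 2.0 else 1.0


def fuzzy_speed(distance: float) -> float:
    """Fuzzy rules: 'if near then stop' and 'if far then full speed', blended."""
    rules: List[Tuple[float, float]] = [
        (near(distance), 0.0),  # (rule strength, recommended speed)
        (far(distance), 1.0),
    ]
    total_strength = sum(strength for strength, _ in rules)
    return sum(strength * speed for strength, speed in rules) / total_strength
```

The boolean controller jumps from stopped to full speed at a single distance, whereas the fuzzy controller ramps up gradually between distances 1.0 and 3.0, which is the smoother behaviour the entry refers to.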

Theories of action selection in nature

Many dynamic models of artificial action selection were originally inspired by research in ethology. In particular, Konrad Lorenz and Nikolaas Tinbergen provided the idea of an innate releasing mechanism to explain instinctive behaviors (fixed action patterns). Influenced by the ideas of William McDougall, Lorenz developed this into a "psychohydraulic" model of the motivation of behavior. In ethology, these ideas were influential in the 1960s, but they are now regarded as outdated because of their use of an energy flow metaphor; the nervous system and the control of behavior are now normally treated as involving information transmission rather than energy flow. Dynamic plans and neural networks are more similar to information transmission, while spreading activation is more similar to the diffuse control of emotional / hormonal systems.

Stan Franklin has proposed that action selection is the right perspective to take in understanding the role and evolution of mind. See his page on the action selection paradigm.

AI models of neural action selection

Some researchers create elaborate models of neural action selection. See for example:

The Computational Cognitive Neuroscience Lab (CU Boulder).
The Adaptive Behaviour Research Group (Sheffield).