Hamilton-Jacobi-Bellman equation
The Hamilton–Jacobi–Bellman (HJB) equation is a partial differential equation which is central to optimal control theory. The solution of the HJB equation is the 'value function', which gives the optimal cost-to-go for a given dynamical system with an associated cost function.

When solved locally, the HJB is a necessary condition, but when solved over the whole of state space, the HJB equation is a necessary and sufficient condition for an optimum. The solution is open loop, but it also permits the solution of the closed loop problem. The HJB method can be generalized to stochastic systems as well.

Classical variational problems, for example the brachistochrone problem, can be solved using this method.

The equation is a result of the theory of dynamic programming, which was pioneered in the 1950s by Richard Bellman and coworkers. The corresponding discrete-time equation is usually referred to as the Bellman equation. In continuous time, the result can be seen as an extension of earlier work in classical physics on the Hamilton–Jacobi equation by William Rowan Hamilton and Carl Gustav Jacob Jacobi.

Optimal control problems

Consider the following problem in deterministic optimal control over the time period [0, T]:

    \min_u \left\{ \int_0^T C[x(t),u(t)]\,dt + D[x(T)] \right\}

where C[·] is the scalar cost rate function and D[·] is a function that gives the economic value or utility at the final state, x(t) is the system state vector, x(0) is assumed given, and u(t) for 0 ≤ t ≤ T is the control vector that we are trying to find.

The system must also be subject to

    \dot{x}(t) = F[x(t),u(t)]

where F[·] gives the vector determining physical evolution of the state vector over time.
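
For instance (a standard linear-quadratic illustration, not taken from the article), the template above is filled in by

    C[x,u] = x^\top Q x + u^\top R u, \qquad D[x] = x^\top Q_f x, \qquad F[x,u] = A x + B u,

where Q and Q_f are positive semidefinite weighting matrices, R is a positive definite weighting matrix, and A, B describe the linear dynamics.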

The partial differential equation

For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is

    \frac{\partial V(x,t)}{\partial t} + \min_u \left\{ \nabla V(x,t) \cdot F[x,u] + C[x,u] \right\} = 0

subject to the terminal condition

    V(x,T) = D[x],

where a \cdot b means the dot product of the vectors a and b and \nabla is the gradient operator.

The unknown scalar V(x,t) in the above PDE is the Bellman 'value function', which represents the cost incurred from starting in state x at time t and controlling the system optimally from then until time T.
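
As a worked illustration (continuing the linear-quadratic example above; the quadratic ansatz is standard, not taken from the article), substituting V(x,t) = x^\top P(t)\,x with P(t) symmetric into the HJB equation and carrying out the minimization, which yields u^* = -R^{-1} B^\top P(t)\,x, reduces the PDE to the continuous-time Riccati differential equation

    \dot{P}(t) + P(t)A + A^\top P(t) - P(t) B R^{-1} B^\top P(t) + Q = 0, \qquad P(T) = Q_f,

so in this special case the value function is known once an ordinary differential equation has been solved backwards from t = T.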

Deriving the equation

Intuitively, the HJB equation can be "derived" as follows. If V(x(t), t) is the optimal cost-to-go function (also called the 'value function'), then by Richard Bellman's principle of optimality, going from time t to t + dt, we have

    V(x(t),t) = \min_u \left\{ C(x(t),u(t))\,dt + V(x(t+dt),\,t+dt) \right\}.

Note that the Taylor expansion of the last term is

    V(x(t+dt),\,t+dt) = V(x(t),t) + \frac{\partial V(x,t)}{\partial t}\,dt + \nabla V(x(t),t) \cdot \dot{x}(t)\,dt + o(dt),

where o(dt) denotes the terms in the Taylor expansion of higher order than one in dt. Then if we cancel V(x(t), t) on both sides, divide by dt, and take the limit as dt approaches zero, we obtain the HJB equation defined above.
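
Spelled out, substituting \dot{x}(t) = F[x(t),u(t)] from the system dynamics, this limiting step gives

    0 = \min_u \left\{ C(x,u) + \frac{\partial V(x,t)}{\partial t} + \nabla V(x,t) \cdot F[x,u] \right\}
      = \frac{\partial V(x,t)}{\partial t} + \min_u \left\{ C(x,u) + \nabla V(x,t) \cdot F[x,u] \right\},

where the time-derivative term can be pulled out of the minimization because it does not depend on u; this is exactly the HJB equation stated above.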

Solving the equation

The HJB equation is usually solved backwards in time, starting from t = T and ending at t = 0.

When solved over the whole of state space, the HJB equation is a necessary and sufficient condition for an optimum. If we can solve for V, then we can find from it a control u that achieves the minimum cost.
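
A minimal numerical sketch of this backward recursion, for an illustrative scalar problem (dynamics \dot{x} = u, cost rate x^2 + u^2, terminal cost x^2; the grid, horizon, and cost are assumptions made here, not specified in the article), could look as follows:

    import numpy as np

    # Illustrative scalar problem: dynamics x' = u, cost rate C = x^2 + u^2,
    # terminal cost D = x^2, horizon T = 1.  All of these choices are assumptions.
    T, nt = 1.0, 100                    # horizon and number of time steps
    dt = T / nt
    xs = np.linspace(-2.0, 2.0, 201)    # state grid
    us = np.linspace(-2.0, 2.0, 81)     # candidate controls

    V = xs**2                           # terminal condition V(x, T) = D(x)

    # March backwards from t = T to t = 0 (dynamic programming recursion).
    for _ in range(nt):
        # For each candidate control: one-step cost plus the interpolated
        # cost-to-go at the successor state x + u*dt (clipped to the grid).
        Q = np.empty((us.size, xs.size))
        for i, u in enumerate(us):
            x_next = np.clip(xs + u * dt, xs[0], xs[-1])
            Q[i] = (xs**2 + u**2) * dt + np.interp(x_next, xs, V)
        V = Q.min(axis=0)               # Bellman minimization over controls

    print("approximate V(0, 0):", V[np.abs(xs).argmin()])

For these particular choices the exact value function is V(x,t) = x^2 (the scalar Riccati equation has the constant solution P(t) = 1), which provides a simple check on the output.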

In the general case, the HJB equation does not have a classical (smooth) solution. Several notions of generalized solutions have been developed to cover such situations, including the viscosity solution (Pierre-Louis Lions and Michael Crandall), the minimax solution (Andrei Izmailovich Subbotin), and others.

Extension to stochastic problems

The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider a problem similar to the one above:

    \min_u \; \mathbb{E}\left\{ \int_0^T C(t, X_t, u_t)\,dt + D(X_T) \right\}

now with (X_t)_{t \in [0,T]} the stochastic process to optimize and (u_t)_{t \in [0,T]} the steering. By first using Bellman's principle and then expanding V(X_t, t) with Itô's rule, one finds the deterministic HJB equation

    \min_u \left\{ \mathcal{A} V(x,t) + C(t,x,u) \right\} = 0,

where \mathcal{A} represents the stochastic differentiation operator (acting in both t and x), subject to the terminal condition

    V(x,T) = D(x).

Note that the randomness has disappeared. In this case a solution of the latter does not necessarily solve the primal problem; it is a candidate only, and a further verification argument is required. This technique is widely used in financial mathematics to determine optimal investment strategies in the market (see, for example, Merton's portfolio problem).
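
As an illustration of the Merton problem just mentioned (standard textbook form; the notation below is introduced here, not in the article), suppose wealth W_t is divided between a risk-free asset with rate r and a risky asset with drift \mu and volatility \sigma, with \pi_t the fraction of wealth held in the risky asset and c_t the consumption rate, so that

    dW_t = \bigl( r W_t + \pi_t(\mu - r) W_t - c_t \bigr)\,dt + \pi_t \sigma W_t\, dB_t.

Maximizing \mathbb{E}\int_0^T U(c_t)\,dt for a utility function U leads, by the same reasoning as above (with a maximum in place of the minimum), to the stochastic HJB equation

    \frac{\partial V}{\partial t} + \max_{\pi,\,c}\left\{ U(c) + \bigl( r w + \pi(\mu - r) w - c \bigr)\frac{\partial V}{\partial w} + \tfrac{1}{2}\pi^2 \sigma^2 w^2 \frac{\partial^2 V}{\partial w^2} \right\} = 0, \qquad V(w,T) = 0,

for the value function V(w,t); for power (CRRA) utility the value function turns out to be proportional to a power of wealth, which reduces the PDE to an ordinary differential equation in t.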

See also

  • Bellman equation, discrete-time counterpart of the Hamilton–Jacobi–Bellman equation
  • Pontryagin's minimum principle, a necessary but not sufficient condition for an optimum, obtained by minimizing a Hamiltonian; it has the advantage over the HJB equation of only needing to be satisfied over the single trajectory being considered
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.