Bellman equation

[Figure: Bellman flow chart.]

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming.[1] It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices.[2] This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman's "principle of optimality" prescribes.[3] The equation applies to algebraic structures with a total ordering; for algebraic structures with a partial ordering, the generic Bellman's equation can be used.[4]
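Concretely, for a discrete-time problem with state x, feasible action set Γ(x), per-period payoff F, transition function T, and discount factor β, this recursive decomposition is commonly written as follows (the notation is a standard convention, not taken from this article):

```latex
V(x) \;=\; \max_{a \,\in\, \Gamma(x)} \bigl\{ F(x,a) + \beta \, V\bigl(T(x,a)\bigr) \bigr\}
```

Here V(x), the value function, equals the best achievable sum of the immediate payoff F(x,a) and the discounted value of the remaining problem started from the next state T(x,a).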

The Bellman equation was first applied to engineering control theory and to other topics in applied mathematics, and subsequently became an important tool in economic theory, though the basic concepts of dynamic programming are prefigured in John von Neumann and Oskar Morgenstern's Theory of Games and Economic Behavior and in Abraham Wald's sequential analysis.[citation needed] The term "Bellman equation" usually refers to the dynamic programming equation (DPE) associated with discrete-time optimization problems.[5] In continuous-time optimization problems, the analogous equation is a partial differential equation called the Hamilton–Jacobi–Bellman equation.[6][7]

In discrete time, any multi-stage optimization problem can be solved by analyzing the appropriate Bellman equation. The appropriate Bellman equation can be found by introducing new state variables (state augmentation).[8] However, the resulting augmented-state multi-stage optimization problem has a higher-dimensional state space than the original problem, an issue that can render the augmented problem intractable due to the "curse of dimensionality". Alternatively, it has been shown that if the cost function of the multi-stage optimization problem satisfies a "backward separable" structure, then the appropriate Bellman equation can be found without state augmentation.[9]
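For a finite-horizon discrete-time problem, the Bellman equation can be solved by backward induction: start from the terminal values and apply the recursion V_t(s) = max_a [F(s,a) + V_{t+1}(T(s,a))] stage by stage. The sketch below illustrates this on a tiny invented two-state example; the states, payoffs, and transitions are hypothetical, chosen only to keep the code self-contained.

```python
def backward_induction(states, actions, payoff, transition, horizon):
    """Solve a finite-horizon problem via the Bellman recursion
    V_t(s) = max_a [ payoff(s, a) + V_{t+1}(transition(s, a)) ]."""
    V = {s: 0.0 for s in states}  # terminal condition V_T(s) = 0
    policy = {}
    for t in reversed(range(horizon)):
        V_next, V = V, {}
        for s in states:
            # pick the action maximizing immediate payoff plus continuation value
            best = max(actions,
                       key=lambda a: payoff(s, a) + V_next[transition(s, a)])
            V[s] = payoff(s, best) + V_next[transition(s, best)]
            policy[(t, s)] = best
    return V, policy  # V holds the stage-0 values

# Hypothetical example: two states, the payoff equals the current state,
# and the chosen action becomes the next state.
states = [0, 1]
actions = [0, 1]
payoff = lambda s, a: float(s)
transition = lambda s, a: a
V, policy = backward_induction(states, actions, payoff, transition, horizon=3)
```

Note that each stage only requires the value function of the following stage, which is exactly the decomposition into subproblems that the Bellman equation provides; the "curse of dimensionality" mentioned above appears when the set of states is too large to enumerate this way.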

  1. ^ Dixit, Avinash K. (1990). Optimization in Economic Theory (2nd ed.). Oxford University Press. p. 164. ISBN 0-19-877211-4.
  2. ^ "Bellman's principle of optimality". www.ques10.com. Retrieved 2023-08-17.
  3. ^ Kirk, Donald E. (1970). Optimal Control Theory: An Introduction. Prentice-Hall. p. 55. ISBN 0-13-638098-0.
  4. ^ Szcześniak, Ireneusz; Woźna-Szcześniak, Bożena (2023), "Generic Dijkstra: Correctness and tractability", NOMS 2023-2023 IEEE/IFIP Network Operations and Management Symposium, pp. 1–7, arXiv:2204.13547, doi:10.1109/NOMS56928.2023.10154322, ISBN 978-1-6654-7716-1, S2CID 248427020
  5. ^ Kirk 1970, p. 70
  6. ^ Kamien, Morton I.; Schwartz, Nancy L. (1991). Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management (Second ed.). Amsterdam: Elsevier. p. 261. ISBN 0-444-01609-0.
  7. ^ Kirk 1970, p. 88
  8. ^ Jones, Morgan; Peet, Matthew M. (2020). "Extensions of the Dynamic Programming Framework: Battery Scheduling, Demand Charges, and Renewable Integration". IEEE Transactions on Automatic Control. 66 (4): 1602–1617. arXiv:1812.00792. doi:10.1109/TAC.2020.3002235. S2CID 119622206.
  9. ^ Jones, Morgan; Peet, Matthew M. (2021). "A Generalization of Bellman's Equation with Application to Path Planning, Obstacle Avoidance and Invariant Set Estimation". Automatica. 127: 109510. arXiv:2006.08175. doi:10.1016/j.automatica.2021.109510. S2CID 222350370.