Bellman equation

[Figure: Bellman flow chart.]

A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming.[1] It writes the "value" of a decision problem at a certain point in time in terms of the payoff from some initial choices and the "value" of the remaining decision problem that results from those initial choices.[2] This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman's "principle of optimality" prescribes.[3] The equation applies to algebraic structures with a total ordering; for algebraic structures with a partial ordering, the generic Bellman's equation can be used.[4]
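Concretely, for a discrete-time problem with state x, feasible action set Γ(x), per-period payoff F, transition function T, and discount factor β, this recursive decomposition is commonly written as follows (the notation is a standard convention, not taken from this article):

```latex
V(x) \;=\; \max_{a \,\in\, \Gamma(x)} \bigl\{ F(x,a) + \beta \, V\bigl(T(x,a)\bigr) \bigr\}
```

Here V(x), the value function, equals the best achievable sum of the immediate payoff F(x,a) and the discounted value of the remaining problem started from the next state T(x,a).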

The Bellman equation was first applied to engineering control theory and to other topics in applied mathematics, and subsequently became an important tool in economic theory, though the basic concepts of dynamic programming are prefigured in John von Neumann and Oskar Morgenstern's Theory of Games and Economic Behavior and in Abraham Wald's sequential analysis.[citation needed] The term "Bellman equation" usually refers to the dynamic programming equation (DPE) associated with discrete-time optimization problems.[5] In continuous-time optimization problems, the analogous equation is a partial differential equation called the Hamilton–Jacobi–Bellman equation.[6][7]

In discrete time, any multi-stage optimization problem can be solved by analyzing the appropriate Bellman equation. The appropriate Bellman equation can be found by introducing new state variables (state augmentation).[8] However, the resulting augmented-state multi-stage optimization problem has a higher-dimensional state space than the original problem, an issue that can render the augmented problem intractable due to the "curse of dimensionality". Alternatively, it has been shown that if the cost function of the multi-stage optimization problem satisfies a "backward separable" structure, then the appropriate Bellman equation can be found without state augmentation.[9]
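For a finite-horizon discrete-time problem, the Bellman equation can be solved by backward induction: start from the terminal values and apply the recursion V_t(s) = max_a [F(s,a) + V_{t+1}(T(s,a))] stage by stage. The sketch below illustrates this on a tiny invented two-state example; the states, payoffs, and transitions are hypothetical, chosen only to keep the code self-contained.

```python
def backward_induction(states, actions, payoff, transition, horizon):
    """Solve a finite-horizon problem via the Bellman recursion
    V_t(s) = max_a [ payoff(s, a) + V_{t+1}(transition(s, a)) ]."""
    V = {s: 0.0 for s in states}  # terminal condition V_T(s) = 0
    policy = {}
    for t in reversed(range(horizon)):
        V_next, V = V, {}
        for s in states:
            # pick the action maximizing immediate payoff plus continuation value
            best = max(actions,
                       key=lambda a: payoff(s, a) + V_next[transition(s, a)])
            V[s] = payoff(s, best) + V_next[transition(s, best)]
            policy[(t, s)] = best
    return V, policy  # V holds the stage-0 values

# Hypothetical example: two states, the payoff equals the current state,
# and the chosen action becomes the next state.
states = [0, 1]
actions = [0, 1]
payoff = lambda s, a: float(s)
transition = lambda s, a: a
V, policy = backward_induction(states, actions, payoff, transition, horizon=3)
```

Note that each stage only requires the value function of the following stage, which is exactly the decomposition into subproblems that the Bellman equation provides; the "curse of dimensionality" mentioned above appears when the set of states is too large to enumerate this way.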

  1. ^ Dixit, Avinash K. (1990). Optimization in Economic Theory (2nd ed.). Oxford University Press. p. 164. ISBN 0-19-877211-4.
  2. ^ "Bellman's principle of optimality". www.ques10.com. Retrieved 2023-08-17.
  3. ^ Kirk, Donald E. (1970). Optimal Control Theory: An Introduction. Prentice-Hall. p. 55. ISBN 0-13-638098-0.
  4. ^ Szcześniak, Ireneusz; Woźna-Szcześniak, Bożena (2023), "Generic Dijkstra: Correctness and tractability", NOMS 2023-2023 IEEE/IFIP Network Operations and Management Symposium, pp. 1–7, arXiv:2204.13547, doi:10.1109/NOMS56928.2023.10154322, ISBN 978-1-6654-7716-1, S2CID 248427020
  5. ^ Kirk 1970, p. 70
  6. ^ Kamien, Morton I.; Schwartz, Nancy L. (1991). Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management (Second ed.). Amsterdam: Elsevier. p. 261. ISBN 0-444-01609-0.
  7. ^ Kirk 1970, p. 88
  8. ^ Jones, Morgan; Peet, Matthew M. (2020). "Extensions of the Dynamic Programming Framework: Battery Scheduling, Demand Charges, and Renewable Integration". IEEE Transactions on Automatic Control. 66 (4): 1602–1617. arXiv:1812.00792. doi:10.1109/TAC.2020.3002235. S2CID 119622206.
  9. ^ Jones, Morgan; Peet, Matthew M. (2021). "A Generalization of Bellman's Equation with Application to Path Planning, Obstacle Avoidance and Invariant Set Estimation". Automatica. 127: 109510. arXiv:2006.08175. doi:10.1016/j.automatica.2021.109510. S2CID 222350370.