Lecture 7 – Principle of Optimality
Lecture notes:
References to textbooks:
- Bertsekas, Dimitri: Dynamic Programming and Optimal Control.
- Bertsekas, Dimitri: Reinforcement Learning and Optimal Control.
- Sutton, Richard, and Barto, Andrew: Reinforcement Learning: An Introduction. (Available as PDF.) (Chapter 1 contains a history of reinforcement learning, also touching on the history of dynamic programming and optimal control.)
References to papers:
- Bellman, Richard. "The theory of dynamic programming." Bulletin of the American Mathematical Society 60.6 (1954): 503-515. (Introduces Principle of Optimality)
- Bellman, Richard. "Dynamic programming." Science 153.3731 (1966): 34-37. (Popular science description, no equations!)
- Dreyfus, Stuart. "Richard Bellman on the birth of dynamic programming." Operations Research 50.1 (2002): 48-51. (Excerpts from Bellman's autobiography regarding the birth of dynamic programming)
- Doyle, John C. "Guaranteed margins for LQG regulators." IEEE Transactions on Automatic Control 23.4 (1978): 756-757.
- Keel, Lee H., and Shankar P. Bhattacharyya. "Robust, fragile, or optimal?" IEEE Transactions on Automatic Control 42.8 (1997): 1098-1105.
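The principle of optimality introduced in Bellman (1954) can be illustrated with a minimal sketch (not from the reading list; the graph below is a made-up example): the optimal cost-to-go V satisfies the Bellman equation V(s) = min over successors t of [c(s, t) + V(t)], and because every tail of an optimal path is itself optimal, an optimal path can be recovered by acting greedily with respect to V.

```python
# Hypothetical shortest-path example: edge costs between nodes, goal node "G".
edges = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 1, "G": 5},
    "C": {"G": 1},
    "G": {},
}

def cost_to_go(edges, goal):
    """Compute V by repeatedly applying the Bellman equation (backward induction)."""
    V = {s: float("inf") for s in edges}
    V[goal] = 0.0
    for _ in range(len(edges)):  # |S| sweeps suffice on this small problem
        for s, successors in edges.items():
            for t, c in successors.items():
                V[s] = min(V[s], c + V[t])
    return V

def greedy_path(edges, V, start, goal):
    """Principle of optimality: following the greedy choice with respect to V
    at every step reconstructs an optimal path from start to goal."""
    path, s = [start], start
    while s != goal:
        s = min(edges[s], key=lambda t: edges[s][t] + V[t])
        path.append(s)
    return path

V = cost_to_go(edges, "G")
print(V["A"])                         # optimal cost from A to G: 3.0
print(greedy_path(edges, V, "A", "G"))  # ['A', 'B', 'C', 'G']
```

Note how the direct edge B→G (cost 5) is rejected in favour of the tail B→C→G (cost 2): the optimal route from A must contain an optimal route from B, which is exactly the principle of optimality.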
Some Youtube videos:
- Claude Shannon demonstrates his maze solver.
- CoastRunners 7: the optimal policy is not doing what it should.