Barto, A.G., Sutton, R.S., Watkins, C.J.C.H. "Learning and Sequential Decision Making." COINS Technical Report 89-95. Dept. of Computer and Information Science, University of Massachusetts, Amherst, MA.
2.
Bertsekas, D.P. Dynamic Programming and Optimal Control. Prentice Hall, Englewood Cliffs, NJ, 1995.
3.
Sutton, R.S. "Planning By Incremental Dynamic Programming." In Proceedings of the Eighth International Machine Learning Workshop, 1991.
4.
Sutton, R.S., Barto, A.G. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.
5.
Sutton, R.S., Barto, A.G., Williams, R.J. "Reinforcement Learning Is Direct Adaptive Optimal Control." IEEE Control Systems Magazine, Vol. 12, pp. 19-22, 1992.
6.
Talukdar, S., Ramesh, V.C., Quadrel, R., Christie, R. "Multiagent Organizations For Real-Time Operations." Proceedings of the IEEE, Vol. 80, No. 5, pp. 765-778, 1992.