== External links ==
* [http://www.cs.rhul.ac.uk/~chrisw/thesis.html Watkins, C.J.C.H. (1989). Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England.]
* [http://portal.acm.org/citation.cfm?id=1143955 Strehl, Li, Wiewiora, Langford, and Littman (2006). PAC model-free reinforcement learning.]
* [https://web.archive.org/web/20050806080008/http://www.cs.ualberta.ca/~sutton/book/the-book.html ''Reinforcement Learning: An Introduction''], an online textbook by Richard Sutton and Andrew S. Barto. See [https://web.archive.org/web/20081202105235/http://www.cs.ualberta.ca/~sutton/book/ebook/node65.html "6.5 Q-Learning: Off-Policy TD Control"].
* [http://sourceforge.net/projects/piqle/ Piqle: a Generic Java Platform for Reinforcement Learning]
* [http://ccl.northwestern.edu/netlogo/models/community/Reinforcement%20Learning%20Maze Reinforcement Learning Maze], a demonstration of guiding an ant through a maze using ''Q''-learning
* [http://www.research.ibm.com/infoecon/paps/html/ijcai99_qnn/node4.html ''Q''-learning work by Gerald Tesauro]

{{Artificial intelligence navbox}}

[[Category:Machine learning algorithms]]
[[Category:Reinforcement learning]]