This lecture provides an overview of machine learning techniques for decision making and control, focusing on bandit and reinforcement learning techniques. We will be reading from the second edition of Reinforcement Learning: An Introduction by Sutton and Barto (available online in PDF form). I encourage everyone to read the first chapter.

Required Readings:

  • Chapter 2: provides an overview of multi-armed bandits [Presented by Lynn]
  • Chapter 3: provides an overview of Markov decision processes [Presented by Karen]
  • Chapter 6: introduces temporal-difference (TD) learning, one of the fundamental methods in reinforcement learning. TD learning combines ideas from dynamic programming (Chapter 4) and Monte Carlo methods (Chapter 5) for learning to control. [Presented by Richard]
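
To make the Chapter 6 idea concrete before the presentation, here is a minimal sketch of tabular TD(0) value estimation. It is not taken from the readings verbatim: the five-state random-walk task, the function name td0_random_walk, and the step size, discount, and episode count are all assumptions chosen for illustration. The update samples one transition from experience, as Monte Carlo methods do, and bootstraps from the current estimate of the next state's value, as dynamic programming does.

```python
import random


def td0_random_walk(episodes=2000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values with tabular TD(0) on a small random walk.

    Assumed task (illustrative only): five non-terminal states 1..5 in a
    row, terminal states 0 and 6 at the ends. Each step moves left or
    right with equal probability. Reaching state 6 gives reward +1;
    every other transition gives reward 0.
    """
    rng = random.Random(seed)
    n_states = 5
    left_terminal, right_terminal = 0, n_states + 1

    # Terminal values stay at 0; non-terminal values start at 0.5.
    values = [0.0] * (n_states + 2)
    for s in range(1, n_states + 1):
        values[s] = 0.5

    for _ in range(episodes):
        state = (n_states + 1) // 2  # every episode starts in the middle
        while state not in (left_terminal, right_terminal):
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == right_terminal else 0.0

            # TD(0): move V(s) toward the one-step bootstrapped target
            # r + gamma * V(s'), using a single sampled transition.
            td_target = reward + gamma * values[next_state]
            values[state] += alpha * (td_target - values[state])

            state = next_state

    return values[1:n_states + 1]


if __name__ == "__main__":
    estimates = td0_random_walk()
    for s, v in enumerate(estimates, start=1):
        # For this task the true value of state s is s / 6.
        print(f"state {s}: estimate {v:.3f}, true {s / 6:.3f}")
```

With a constant step size the estimates keep fluctuating around the true values rather than converging exactly; a smaller alpha reduces the fluctuation at the cost of slower learning.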

Optional reading on Recent Advances:

  1. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis. Human-level control through deep reinforcement learning. Nature, 2015. See also the excellent one-page summary by Bernhard Schölkopf.

  2. David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel & Demis Hassabis. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.

  3. Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu & David Silver. Massively Parallel Methods for Deep Reinforcement Learning. arXiv, 2015.

  4. Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver & Koray Kavukcuoglu. Asynchronous Methods for Deep Reinforcement Learning. arXiv, 2016.

Questions