How Reinforcement Learning Algorithms Work: A High-Level Overview
Lesson Description and Notes
Get a high-level overview of how Reinforcement Learning algorithms work.
We need RL algorithms to solve RL problems. In this tutorial, we will discuss the general blueprint of all Reinforcement Learning algorithms, including Deep Q Network (DQN), AlphaZero and Proximal Policy Optimization (PPO).
At a high level, all RL algorithms start with a random policy (exploration), collect experience by acting in the environment, and use that experience to iteratively improve the policy (exploitation).
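The loop described above can be sketched in a few lines. This is only an illustrative sketch, not the implementation used by DQN, AlphaZero or PPO: it uses tabular Q-learning with an epsilon-greedy policy, and the five-state corridor environment, the `step` function and all hyperparameter values are assumptions made up for this example.

```python
import random

N_STATES = 5       # hypothetical toy corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)   # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: reward 1.0 on reaching the rightmost state, else 0.0."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # All action values start at zero, so the initial policy is effectively random.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Collect experience: act randomly sometimes (or on ties), else greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Improve the policy: move the estimate toward the observed target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action in each state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
```

After training, `policy` should choose action 1 (step right) in every non-terminal state, even though the agent started out acting at random.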
We also consider the special case of problems with large state spaces, where exploring the whole state space is impractical. Here, a Deep Neural Network is used to generalize the knowledge obtained from visited states to states the agent has never visited.
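To make the idea of generalization concrete, here is a minimal sketch that swaps the deep network for a much simpler linear model, purely to keep the example short. A lookup table has no entry for a state that was never visited, while a fitted function still returns an estimate for it. The toy data (values for even-numbered states only), the `feature` scaling and the learning-rate settings are all assumptions made up for illustration.

```python
# Hypothetical toy data: the agent visited only even-numbered states and
# estimated a value for each one (here the value grows with position).
visited = {s: s / 9 for s in range(0, 10, 2)}


def feature(state):
    """Scale the state index into [0, 1]."""
    return state / 9


def fit_linear_value(samples, lr=0.1, epochs=2000):
    """Fit v(s) = w * feature(s) + b with plain stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for state, target in samples.items():
            x = feature(state)
            error = (w * x + b) - target
            w -= lr * error * x
            b -= lr * error
    return w, b


w, b = fit_linear_value(visited)


def predict(state):
    return w * feature(state) + b


unvisited_state = 5
table_lookup = visited.get(unvisited_state)  # None: the table has no entry here
approx_value = predict(unvisited_state)      # the fitted model still gives an estimate
```

A deep network plays the same role as the linear model here, but it can represent far more complex relationships between states and values.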
We conclude with the fastest way to start using Deep RL algorithms: the production-grade implementations already available in Deep RL frameworks like Ray RLlib.