Completed The Fast Deep RL Course? Then you know how to solve simple problems using OpenAI Gym and Ray-RLlib. You also have a good basis in Deep RL.

The next obvious step is to apply this knowledge to real-world problems. At this juncture, it is common to face the following hurdles.

  • No premade Gym environment exists for the problem you want to attack. You must define and create your own custom environment.
  • After creating a custom environment, you apply an appropriate Deep RL algorithm with default framework settings. But the agent doesn't seem to learn anything, or it learns poorly.

In this course, you will gain the skills needed to overcome these hurdles and become effective in real-world applications.

You will learn the main ideas behind designing and implementing custom Gym environments for real-world problems.

You will also learn how to apply several performance-enhancing tricks, and how to run Ray Tune experiments to easily identify the most promising ones. Here is a list of tricks we will cover.

  • Redefining observations and actions so that the environment is as close as possible to a Markov Decision Process
  • Scaling observations, actions, and rewards to standard ranges preferred by Deep Neural Nets
  • Shaping reward functions to help the agent better distinguish between good and bad actions
  • Preprocessing the inputs to reduce complexity while preserving critical information
  • Tuning the network size and hyperparameters of the Ray-RLlib algorithms to improve performance
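To give a taste of the scaling trick above: observations (and similarly actions and rewards) can be mapped linearly into a range such as [-1, 1] that neural networks handle well. Here is a minimal sketch; the function name and the stock range are illustrative, not taken from the course material.

```python
def scale_to_unit_range(value, low, high):
    """Linearly map a raw value from [low, high] to [-1.0, 1.0]."""
    return 2.0 * (value - low) / (high - low) - 1.0

# Example: a shop's stock level between 0 and 500 units.
print(scale_to_unit_range(0, 0, 500))    # lowest stock  -> -1.0
print(scale_to_unit_range(250, 0, 500))  # midpoint      ->  0.0
print(scale_to_unit_range(500, 0, 500))  # highest stock ->  1.0
```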

Remember the shop inventory management problem from The Fast Deep RL Course? That's an example of a real-world problem.

By the end of the course, you will make a custom Gym environment for inventory management, shape rewards, normalize observations and actions, tune hyperparameters, and much more. By running Ray Tune experiments, you will find the best learning settings and create a Deep RL agent that performs better than classical inventory management techniques.

After solving this example problem, you will be able to use a step-by-step method to solve real-world Deep RL problems that you encounter in your industry.

If you liked The Fast Deep RL Course, I think you will like this course too. After all, real-world application is the next natural and exciting step.

I am looking forward to seeing you inside!


What will you learn?

Chapter 1

In Chapter 1, you will learn how to create custom Gym environments.

  • You will be able to design observations and actions such that the environment is close to a Markov Decision Process. This gives Deep RL algorithms the best chance for learning.
  • You will be able to implement custom Gym environments by inheriting the base environment class and defining the required methods.
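To illustrate the shape of such an environment: a custom Gym environment defines `reset()` and `step()`. The sketch below is a dependency-free stand-in for the course's inventory problem; in a real project you would subclass `gym.Env` and declare `observation_space` and `action_space`. All names, costs, and demand numbers here are illustrative assumptions, not the course's actual environment.

```python
import random

class InventoryEnvSketch:
    """Minimal Gym-style environment sketch for shop inventory management.

    A real implementation would subclass gym.Env and declare
    observation_space and action_space; this dependency-free version
    only mirrors the reset()/step() interface.
    """
    MAX_STOCK = 100

    def __init__(self, horizon=30):
        self.horizon = horizon

    def reset(self):
        self.day = 0
        self.stock = self.MAX_STOCK // 2
        return self._observation()

    def step(self, action):
        # action: number of units to order today (clipped to warehouse capacity).
        order = max(0, min(int(action), self.MAX_STOCK - self.stock))
        self.stock += order
        demand = random.randint(0, 20)
        sold = min(demand, self.stock)
        self.stock -= sold
        # Reward: revenue minus ordering and holding costs (illustrative numbers).
        reward = 5.0 * sold - 2.0 * order - 0.1 * self.stock
        self.day += 1
        done = self.day >= self.horizon
        return self._observation(), reward, done, {}

    def _observation(self):
        return (self.stock, self.day)

env = InventoryEnvSketch()
obs = env.reset()
obs, reward, done, info = env.step(10)
```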

Chapter 2

In Chapter 2, you will learn how to scale observations and actions, and shape rewards using Gym Wrappers.

  • You will be able to modify your custom environments further by using Gym Wrappers.
  • You will write wrappers for scaling observations and actions to ranges preferred by Deep Neural Nets.
  • You will write several wrappers to shape the reward function.
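The wrapper pattern described above works by delegating to a wrapped environment and transforming what passes through. The sketch below mimics `gym.ObservationWrapper` without depending on Gym; the toy environment, class names, and ranges are illustrative assumptions.

```python
class ToyEnv:
    """Stand-in environment: observations are raw stock levels in [0, 500]."""
    def reset(self):
        self.stock = 500
        return self.stock

    def step(self, action):
        self.stock = max(0, self.stock - 50)
        reward = float(self.stock)   # deliberately large-magnitude reward
        done = self.stock == 0
        return self.stock, reward, done, {}

class ScaleObservationWrapper:
    """Gym-Wrapper-style sketch: rescales observations to [-1, 1].

    A real version would subclass gym.ObservationWrapper; here we just
    delegate to the wrapped env, mirroring the same pattern.
    """
    def __init__(self, env, low=0.0, high=500.0):
        self.env, self.low, self.high = env, low, high

    def _scale(self, obs):
        return 2.0 * (obs - self.low) / (self.high - self.low) - 1.0

    def reset(self):
        return self._scale(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._scale(obs), reward, done, info

env = ScaleObservationWrapper(ToyEnv())
print(env.reset())   # 1.0  (raw 500 mapped to the top of [-1, 1])
```

Reward-shaping wrappers follow the same pattern, transforming `reward` instead of `obs` in `step()`.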

Chapter 3

In Chapter 3, you will learn how to try out various performance-boosting ideas quickly by running Ray Tune experiments.

  • You will be able to run experiments in parallel using custom environments and custom wrappers.
  • You will be able to allocate CPU and GPU resources efficiently in your local Ray server for the fastest experiment execution.
  • You will visualize experiment results using Tensorboard and pick the best learning settings.
  • You will be able to benchmark your best results against baselines.
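The core idea of such an experiment is to evaluate many trial configurations drawn from a search space and keep the best one. Ray Tune launches these trials in parallel and tracks results for Tensorboard; the sketch below is a sequential, dependency-free stand-in over a toy objective, just to show the shape of an experiment. The quadratic scoring formula is purely illustrative.

```python
import itertools

def train_agent(config):
    """Toy stand-in for an RLlib training run: returns a mock mean reward.

    A real experiment would train an agent in the custom environment
    and report episode reward; this formula merely peaks at lr=0.001,
    gamma=0.99 for demonstration.
    """
    lr, gamma = config["lr"], config["gamma"]
    return -((lr - 0.001) ** 2) * 1e6 - ((gamma - 0.99) ** 2) * 1e3

search_space = {
    "lr": [0.0001, 0.001, 0.01],
    "gamma": [0.9, 0.99, 0.999],
}

# Ray Tune would launch these trials in parallel; we loop sequentially.
trials = [dict(zip(search_space, values))
          for values in itertools.product(*search_space.values())]
best = max(trials, key=train_agent)
print(best)   # {'lr': 0.001, 'gamma': 0.99}
```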

Chapter 4

In Chapter 4, you will learn how to tune hyperparameters to give your agent another boost in performance.

  • You will know which hyperparameters you need to tune and their potential ranges.
  • You will be able to tune hyperparameters using simple methods like grid search and advanced methods like Population Based Training.
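To sketch the Population Based Training idea mentioned above: a population of configurations trains in parallel, and periodically the worst performers copy (exploit) a top performer's configuration and then perturb (explore) it. The simplified loop below runs on a toy objective; the objective, population size, and perturbation factors are illustrative assumptions, not Ray Tune's actual PBT scheduler.

```python
import random

def score(config):
    """Toy objective standing in for agent performance (illustrative only)."""
    return -abs(config["lr"] - 0.003)

def pbt_sketch(population_size=4, generations=20, seed=0):
    """Simplified Population Based Training loop.

    Each generation: rank all members by score, then the bottom half
    copies (exploits) a top-half member's config and perturbs (explores) it.
    """
    rng = random.Random(seed)
    population = [{"lr": rng.uniform(0.0001, 0.01)}
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        half = population_size // 2
        for i in range(half, population_size):
            parent = rng.choice(population[:half])
            # exploit: copy the better config; explore: perturb it.
            population[i] = {"lr": parent["lr"] * rng.choice([0.8, 1.2])}
    return max(population, key=score)

best = pbt_sketch()
```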

Course Curriculum

  Implementing Custom Gym Environments
  Observation/Action Normalization and Reward Scaling using Gym Wrappers
  Running Ray Tune Experiments
  Boosting Performance Further Using Hyperparameter Tuning


Easy to digest

Bite-sized video lessons with no fluff (on average 10 minutes long and rarely over 15).

The whole course can be completed in 8 hours (including exercises).

Easy to follow

All videos will have closed captions.

Learn by doing

Video lessons and demonstrations will be followed by coding exercises whenever possible.

Project based

The exercises will be part of an overarching project, where you will teach an agent to manage shop inventory.

Hi, I am Dibya, the instructor of this course 👋

  • I am a Senior Python Engineer based in Germany. I have worked on engineering projects for some of the biggest automotive companies and for the German government.
  • I founded the Artificial General Intelligence community and co-organize the Python developer community in Munich, Germany.
  • I like teaching what I know. I have trained thousands of Data Engineers/Scientists on Datacamp, the world's largest Data Science education platform.

"Dibya is one of the most fluent Python instructors in the community"

Anton Caceres - Python Software Foundation Fellow

"Dibya cares deeply about students and makes complex concepts easily accessible"

Hadrien Lacroix - Data Science Curriculum Manager, Datacamp

"No matter how difficult the task, Dibya made sure everyone left with a smile"

Olga Kupriyanova - PyLadies Organizer

Enrollment is risk free

Your purchase is protected by a 14-day money-back guarantee. If the course doesn't meet your needs for any reason, let me know within 14 days of the purchase, and you will get a full refund. No questions asked.

Have any other questions prior to enrollment? Please drop me a message.