December 5, 2019




Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow

repo name Curt-Park/rainbow-is-all-you-need
language Jupyter Notebook
size (curr.) 4856 kB
stars (curr.) 664
created 2019-06-10
license MIT License


Do you want an RL agent that moves nicely on Atari?

Rainbow is all you need!

This is a step-by-step tutorial from DQN to Rainbow. Every chapter contains both theoretical background and an object-oriented implementation. Just pick any topic you are interested in and learn! You can run the notebooks right away with Colab, even on your smartphone.

Please feel free to open an issue or a pull request if you have any ideas to make it better. :)

If you want a tutorial for policy gradient methods, please see PG is All You Need.


  1. DQN [NBViewer] [Colab]
  2. DoubleDQN [NBViewer] [Colab]
  3. PrioritizedExperienceReplay [NBViewer] [Colab]
  4. DuelingNet [NBViewer] [Colab]
  5. NoisyNet [NBViewer] [Colab]
  6. CategoricalDQN [NBViewer] [Colab]
  7. N-stepLearning [NBViewer] [Colab]
  8. Rainbow [NBViewer] [Colab]


This repository is tested in an Anaconda virtual environment with Python 3.6.1+.

$ conda create -n rainbow_is_all_you_need python=3.6.1
$ conda activate rainbow_is_all_you_need
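
Once the environment is active, it can help to confirm that the interpreter meets the tested minimum before installing anything. The snippet below is a minimal sketch, not part of the repository: the helper name `meets_minimum` and the constant `MIN_VERSION` are hypothetical, and the only fact taken from this page is the 3.6.1 lower bound.

```python
import sys

# Lower bound taken from the "Python 3.6.1+" note above.
# MIN_VERSION and meets_minimum are illustrative names, not part of the repo.
MIN_VERSION = (3, 6, 1)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor, micro) version is at least `minimum`."""
    return tuple(version_info[:3]) >= tuple(minimum)


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print("Python " + current + " satisfies the 3.6.1+ requirement.")
else:
    print("Python " + current + " is older than 3.6.1; recreate the environment.")
```

Tuple comparison is lexicographic, so `(3, 10, 0)` correctly ranks above `(3, 6, 1)`, which a plain string comparison of version numbers would get wrong.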


First, clone the repository.

git clone
cd rainbow-is-all-you-need

Second, install the packages required to execute the code. Just type:

make dep

  1. V. Mnih et al., “Human-level control through deep reinforcement learning.” Nature, 518 (7540):529–533, 2015.
  2. H. van Hasselt et al., “Deep Reinforcement Learning with Double Q-learning.” arXiv preprint arXiv:1509.06461, 2015.
  3. T. Schaul et al., “Prioritized Experience Replay.” arXiv preprint arXiv:1511.05952, 2015.
  4. Z. Wang et al., “Dueling Network Architectures for Deep Reinforcement Learning.” arXiv preprint arXiv:1511.06581, 2015.
  5. M. Fortunato et al., “Noisy Networks for Exploration.” arXiv preprint arXiv:1706.10295, 2017.
  6. M. G. Bellemare et al., “A Distributional Perspective on Reinforcement Learning.” arXiv preprint arXiv:1707.06887, 2017.
  7. R. S. Sutton, “Learning to predict by the methods of temporal differences.” Machine learning, 3(1):9–44, 1988.
  8. M. Hessel et al., “Rainbow: Combining Improvements in Deep Reinforcement Learning.” arXiv preprint arXiv:1710.02298, 2017.


Thanks goes to these wonderful people (emoji key):

This project follows the all-contributors specification. Contributions of any kind welcome!
