October 29, 2018

dennybritz/reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, TensorFlow. Exercises and Solutions to accompany Sutton’s Book and David Silver’s course.


repo name	dennybritz/reinforcement-learning
repo link	https://github.com/dennybritz/reinforcement-learning
homepage	http://www.wildml.com/2016/10/learning-reinforcement-learning/
language	Jupyter Notebook
size (curr.)	5373 kB
stars (curr.)	13845
created	2016-08-24
license	MIT License

Overview

This repository provides code, exercises and solutions for popular Reinforcement Learning algorithms. These are meant to serve as a learning tool to complement the theoretical materials from Sutton’s book and David Silver’s course.

Each folder corresponds to one or more chapters of the above textbook and/or course. In addition to exercises and solutions, each folder also contains a list of learning goals, a brief concept summary, and links to the relevant readings.

All code is written in Python 3 and uses RL environments from OpenAI Gym. Advanced techniques use TensorFlow for neural network implementations.

Table of Contents

Introduction to RL problems & OpenAI Gym
MDPs and Bellman Equations
Dynamic Programming: Model-Based RL, Policy Iteration and Value Iteration
Monte Carlo Model-Free Prediction & Control
Temporal Difference Model-Free Prediction & Control
Function Approximation
Deep Q Learning (WIP)
Policy Gradient Methods (WIP)
Learning and Planning (WIP)
Exploration and Exploitation (WIP)
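
To give a flavour of the Dynamic Programming chapter, here is a minimal sketch of value iteration. It is not taken from the repository: the 4-state chain MDP, the `step` function, and the constants `N_STATES`, `TERMINAL`, and `GAMMA` are all hypothetical, and the standard library is used instead of OpenAI Gym so the example runs on its own.

```python
# Hypothetical 4-state chain MDP (an assumption for illustration, not repo code).
# States 0..3, state 3 is terminal. Actions: 0 = left, 1 = right.
N_STATES = 4
TERMINAL = 3
ACTIONS = (0, 1)
GAMMA = 0.9  # discount factor


def step(state, action):
    """Deterministic model of the chain: return (next_state, reward)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(theta=1e-8):
    """Sweep over states, applying the Bellman optimality backup in place,
    until the largest change in a sweep falls below theta."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for state in range(TERMINAL):  # the terminal state keeps value 0
            backups = []
            for action in ACTIONS:
                next_state, reward = step(state, action)
                backups.append(reward + GAMMA * values[next_state])
            best = max(backups)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < theta:
            return values


def greedy_policy(values):
    """Pick, for each non-terminal state, the action with the best one-step return."""
    policy = []
    for state in range(TERMINAL):
        returns = []
        for action in ACTIONS:
            next_state, reward = step(state, action)
            returns.append(reward + GAMMA * values[next_state])
        policy.append(returns.index(max(returns)))
    return policy


values = value_iteration()
policy = greedy_policy(values)
print(values)
print(policy)
```

Because the model (`step`) is known and deterministic, the backup needs no sampling; the repository's own exercises work against Gym-style environments instead.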

List of Implemented Algorithms

Dynamic Programming Policy Evaluation
Dynamic Programming Policy Iteration
Dynamic Programming Value Iteration
Monte Carlo Prediction
Monte Carlo Control with Epsilon-Greedy Policies
Monte Carlo Off-Policy Control with Importance Sampling
SARSA (On Policy TD Learning)
Q-Learning (Off Policy TD Learning)
Q-Learning with Linear Function Approximation
Deep Q-Learning for Atari Games
Double Deep-Q Learning for Atari Games
Deep Q-Learning with Prioritized Experience Replay (WIP)
Policy Gradient: REINFORCE with Baseline
Policy Gradient: Actor Critic with Baseline
Policy Gradient: Actor Critic with Baseline for Continuous Action Spaces
Deterministic Policy Gradients for Continuous Action Spaces (WIP)
Deep Deterministic Policy Gradients (DDPG) (WIP)
Asynchronous Advantage Actor Critic (A3C)
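
As an illustration of one entry in this list, the following is a minimal sketch of tabular Q-Learning (off-policy TD learning). It is not the repository's implementation: the 4-state chain environment, the `step` function, and the hyperparameter values are assumptions made for this example, and the behavior policy is uniformly random rather than epsilon-greedy to keep the code short.

```python
import random

# Hypothetical 4-state chain environment (an assumption, not repo code).
# States 0..3, state 3 is terminal. Actions: 0 = left, 1 = right.
N_STATES = 4
TERMINAL = 3
ACTIONS = (0, 1)


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn Q-values from sampled transitions only (no model access)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            # Behavior policy: uniformly random (a simplification).
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            # Off-policy target: bootstrap from the greedy value of the next state.
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy with respect to the learned Q-values, for non-terminal states.
greedy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(TERMINAL)]
print(greedy)
```

The update bootstraps from `max(q[next_state])` regardless of which action the behavior policy takes next, which is what makes the method off-policy; SARSA would instead use the value of the action actually selected.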

Resources

Textbooks:

Reinforcement Learning: An Introduction (2nd Edition)

Classes:

Talks/Tutorials:

Other Projects:

Selected Papers:
