In this repository, I share my notes and work on reinforcement learning. I write blog posts and implement the algorithms in order to understand them well. I also link to articles that help explain the concepts.

Algorithms

Vanilla Policy Gradients (REINFORCE)

Check out the blog post for a detailed explanation.

Summary: Policy gradient algorithms optimize the policy directly. We sample trajectories from the environment by running the current policy. For each trajectory, we sum the gradients of the log-probabilities of the actions taken and compute the total reward. We multiply the two, average over the sampled trajectories, and update the policy parameters with gradient ascent.
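The summary above can be illustrated with a minimal, standard-library-only sketch. This is not the implementation in this repository (which builds on the course skeleton). The toy stateless environment (action 1 pays reward 1, action 0 pays nothing), the softmax policy over two logits, and every name and hyperparameter below (`sample_trajectory`, `reinforce`, `horizon`, `batch_size`, `learning_rate`) are assumptions made for illustration only.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_trajectory(theta, rng, horizon=5):
    """Run the policy for `horizon` steps in a hypothetical toy environment.

    Assumed environment: stateless, action 1 pays reward 1, action 0 pays 0.
    Returns the sum of grad log pi(a_t) over the steps and the total reward.
    """
    probs = softmax(theta)
    grad_sum = [0.0] * len(theta)
    total_reward = 0.0
    for _ in range(horizon):
        action = rng.choices(range(len(theta)), weights=probs)[0]
        total_reward += 1.0 if action == 1 else 0.0
        # For a softmax policy: d log pi(action) / d theta_k = 1[k == action] - probs[k]
        for k in range(len(theta)):
            grad_sum[k] += (1.0 if k == action else 0.0) - probs[k]
    return grad_sum, total_reward


def reinforce(iterations=200, batch_size=16, learning_rate=0.1, seed=0):
    """Vanilla policy gradient without a baseline; returns the learned logits."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # one logit per action, so the policy starts uniform
    for _ in range(iterations):
        grad_estimate = [0.0] * len(theta)
        for _ in range(batch_size):
            grad_sum, total_reward = sample_trajectory(theta, rng)
            # Multiply the summed gradients by the total reward, average over the batch
            for k in range(len(theta)):
                grad_estimate[k] += grad_sum[k] * total_reward / batch_size
        # Gradient ascent step on the expected total reward
        theta = [t + learning_rate * g for t, g in zip(theta, grad_estimate)]
    return theta


if __name__ == "__main__":
    learned = reinforce()
    print("action probabilities:", softmax(learned))
```

In this toy setting the probability of the rewarding action should rise well above one half as training proceeds. The sketch omits common variance-reduction techniques such as baselines and reward-to-go, which a fuller implementation would likely add.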

Code

The skeleton code for the implementation is taken from Assignment 2 of the Berkeley RL course, which can be found here. Also check out the course content here.