omerahmed12345elhussien/RL_Project_Car_Racing

Implementation of DQN, Double DQN, Dueling DDQN, and Actor-Critic Policy Gradient for the CarRacing-v0 game

This project trains DQN and A2C agents on the CarRacing-v0 environment, which is one of the Box2D environments.

In the video below, we present samples of interactions with the environment.

DQN, DDQN, and Dueling DDQN

Q-learning is a value-based reinforcement learning algorithm for finding the optimal policy of a Markov decision process (MDP). It iteratively updates a Q-table in which each entry (Q-value) estimates the expected cumulative discounted reward of taking a specific action in a particular state. The agent learns by exploring the environment and updating these estimates as exploration continues. DQN replaces the table with a neural network that approximates the Q-values, which makes the method usable with image observations such as those in Car Racing.
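As a minimal sketch of the tabular update described above (not the code used in this repository), the snippet below applies one Q-learning step. The function name `q_learning_update`, the dictionary-based table, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    q_table maps (state, action) pairs to Q-values; alpha is the learning
    rate and gamma the discount factor (both are illustrative defaults).
    """
    # Value of the best action in the next state; zero if the episode ended.
    if done:
        best_next = 0.0
    else:
        best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    # Temporal-difference target and error.
    td_target = reward + gamma * best_next
    td_error = td_target - q_table[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q_table[(state, action)] += alpha * td_error
    return q_table[(state, action)]


# Hypothetical two-state, two-action example with an all-zero initial table.
q_table = defaultdict(float)
new_q = q_learning_update(q_table, state=0, action=1, reward=1.0,
                          next_state=1, done=False, n_actions=2)
```

In DQN the same target is used, but the table lookup is replaced by a forward pass through a neural network and the update by a gradient step on the squared TD error.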

Double Deep Q-Learning (DDQN) is an enhancement of DQN that addresses the overestimation bias in Q-values. It uses two Q-value networks, an online network and a target network: the online network selects the greedy action for the next state, and the target network evaluates that action. As in DQN, experience replay is used to improve sample efficiency and stabilize training.
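The difference between the DQN and Double DQN targets can be sketched as follows. This is an illustrative example on plain Python lists rather than the repository's implementation; the function names and the sample Q-values are assumptions made for the demonstration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    # Greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Value of that action according to the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical next-state Q-values from the two networks (three actions).
next_q_online = [1.0, 3.0, 2.0]
next_q_target = [5.0, 0.5, 1.0]

plain = dqn_target(1.0, next_q_target, done=False, gamma=0.9)
double = double_dqn_target(1.0, next_q_online, next_q_target, done=False, gamma=0.9)
```

Here the target network overrates action 0, so the plain DQN target takes that inflated maximum, while the Double DQN target uses the target network's value for the action the online network prefers.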

The dueling network architecture changes the network used in Mnih et al.'s 2015 DQN paper. After the convolutional layers, the network splits into two estimators: one for the state value function and one for the state-dependent action advantage function. The two outputs are then combined to produce the Q-values.
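A small sketch of how the two estimators can be combined into Q-values is shown below. It assumes the mean-subtracted aggregation, Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)), proposed in the dueling architecture paper; the aggregation used in this repository may differ, and the function name `dueling_q_values` is hypothetical.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps the value and advantage streams
    identifiable (assumed aggregation; other variants subtract the max).
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


# Hypothetical outputs of the two streams for a single state with three actions.
q = dueling_q_values(state_value=2.0, advantages=[1.0, 0.0, -1.0])
```

In a neural network implementation, `state_value` and `advantages` would be the outputs of two separate fully connected heads that share the convolutional features.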

Click here to open in Colab

Actor-Critic (A2C)

References

[1] https://hiddenbeginner.github.io/study-notes/contents/tutorials/2023-04-20_CartRacing-v2_DQN.html

[2] https://github.com/google-deepmind/dqn_zoo

[3] https://github.com/hamishs/JAX-RL/tree/main