Deep RL algorithms implemented using Pytorch
Algo list:
- DQN
- Vanilla policy Gradient
- Deep Deterministic Policy Gradient
- Twin Delayed Deep Deterministic Policy Gradient
- Soft Actor Critic
- Proximal Policy Optimization - CLIP
Article on deeper Look into policy gradients
Experimental Results:
| Algorithm | Discrete Env: LunarLander-v2 | Continuous Env: Pendulum-v0 |
|---|---|---|
| DQN | - | |
| VPG | - | |
| DDPG | - | |
| TD3 | - | |
| SAC | - | |
| PPO | - |
Resources:
- RL course by David Silver
- Lecture slides for above course
- Spinning up by OpenAI
- More exhaustive RL guide by Deeny Britz
Installation:
Prerequisites:
Setup:
-
Clone the repository:
git clone https://github.com/akashe/DeepReinforcementLearning.git
cd DeepReinforcementLearning -
Set up Python environment:
# Option A: Using pyenv (recommended)
pyenv install 3.8.10
pyenv local 3.8.10
# Option B: Using system Python 3.8+
# Make sure you have Python 3.8+ installed -
Install system dependencies (for Box2D environments like LunarLander):
# macOS:
brew install swig
# Ubuntu/Debian:
sudo apt-get install swig
# Other systems: install swig through your package manager -
Create and activate virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate -
Install Python dependencies:
pip install -r requirements.txt
Usage:
Just run the file/algorithm directly. There is no common structures between algorithms as I implemented them as I learnt them. Different algorithms are inspired from different sources.
Examples:
python ddpg.py # Run DDPG on Pendulum-v1
python DQN.py # Run DQN on LunarLander-v2
python td3.py # Run TD3 on Pendulum-v1
python SoftActorCritic.py # Run SAC on Pendulum-v1
python DQN.py # Run DQN on LunarLander-v2
python td3.py # Run TD3 on Pendulum-v1
python SoftActorCritic.py # Run SAC on Pendulum-v1