meta-qlearning-humanoid

Meta QLearning experiments to optimize robot walking patterns

Overview:

This project implements Meta-Q-Learning (MQL) to optimize humanoid walking patterns and demonstrates its effectiveness in improving stability, efficiency, and adaptability. It also explores how well the meta-trained policy transfers to new tasks with minimal tuning.
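As a rough illustration of the core idea (a hedged sketch, not the repository's actual networks.py; all names and dimensions here are illustrative assumptions): MQL conditions a TD3-style actor-critic on a context vector summarizing the most recent transitions, so one set of weights can represent behavior across related walking tasks.

import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Summarizes the last few (state, action, reward) tuples into a context vector."""
    def __init__(self, obs_dim, act_dim, ctx_dim=32):
        super().__init__()
        self.gru = nn.GRU(obs_dim + act_dim + 1, ctx_dim, batch_first=True)

    def forward(self, recent_transitions):
        # recent_transitions: (batch, horizon, obs_dim + act_dim + 1)
        _, h = self.gru(recent_transitions)
        return h.squeeze(0)  # (batch, ctx_dim)

class ContextConditionedQ(nn.Module):
    """Q(s, a, z): a TD3-style critic that also sees the task context z."""
    def __init__(self, obs_dim, act_dim, ctx_dim=32, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act, ctx):
        return self.net(torch.cat([obs, act, ctx], dim=-1))

# Example shapes (obs_dim=40, act_dim=12 are made up for illustration):
enc = ContextEncoder(obs_dim=40, act_dim=12)
q = ContextConditionedQ(obs_dim=40, act_dim=12)
recent = torch.randn(4, 10, 40 + 12 + 1)   # batch of 4, context horizon of 10
values = q(torch.randn(4, 40), torch.randn(4, 12), enc(recent))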

Conducted experiments:

Learn Stepping using MQL

Test how adaptable the humanoid is by performing the tasks below (an illustrative adaptation loop follows the list):

  • Side stepping
  • Ascending and descending
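Adaptation to these held-out tasks follows the recipe from the MQL paper cited below: roll out the meta-trained policy briefly on the new task, then take a few gradient steps on that data (the paper additionally mixes in meta-training transitions reweighted by a propensity score). The sketch below is illustrative only; agent, env, and the update signature are assumptions, not this repo's API.

def adapt_to_new_task(agent, env, meta_buffer, n_rollouts=5, n_grad_steps=200):
    # 1. Collect a small amount of experience on the new task (e.g. side
    #    stepping) with the meta-trained policy; exploration comes from the
    #    TD3-style action noise inside agent.act().
    new_data = []
    for _ in range(n_rollouts):
        obs, done = env.reset(), False
        while not done:
            act = agent.act(obs)
            next_obs, rew, done, info = env.step(act)
            new_data.append((obs, act, rew, next_obs, done))
            obs = next_obs
    # 2. Fine-tune on the new-task data; the propensity-weighted reuse of
    #    meta-training transitions from the MQL paper is elided here.
    for _ in range(n_grad_steps):
        agent.update(new_data, extra_data=meta_buffer)
    return agent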

Setting up the environment:

This repository contains everything needed to set up the environment and get the simulation up and running.

Clone the repository:

git clone git@github.com:gokulp01/meta-qlearning-humanoid.git
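Then change into the cloned directory (git clone creates it with the repository's name):

cd meta-qlearning-humanoid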

Make sure the file structure is as follows:


+-- algs
| +-- MQL
| +-- buffer.py
| +-- mql.py
+-- configs
| +-- abl_envs.json
+-- Humanoid_environment
| +-- envs
| | +-- common
| | +-- jvrc
| +-- models
| | +-- cassie_mj_description
| | +-- jvrc_mj_description
| +-- scripts
| | +-- debug_stepper.py
| | +-- plot_logs.py
| +-- tasks
| | +-- __pycache__
| | | +-- rewards.cpython-37.pyc
| | | +-- stepping_task.cpython-37.pyc
| | | +-- walking_task.cpython-37.pyc
| | +-- rewards.py
| | +-- stepping_task.py
| | +-- walking_task.py
| +-- utils
| +-- footstep_plans.txt
+-- misc
| +-- env_meta.py
| +-- logger.py
| +-- runner_meta_offpolicy.py
| +-- runner_multi_snapshot.py
| +-- torch_utility.py
| +-- utils.py
+-- models
| +-- networks.py
| +-- run.py
+-- README.md
+-- run_script.py

Installing packages:

pip3 install -r requirements.txt

Training:

python3 run_script.py
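At a high level, MQL meta-training follows the loop sketched below. This is a schematic of the phase that run_script.py drives, not its actual contents; make_env, agent, and buffer_cls are injected stand-ins for the repo's real objects.

def meta_train(make_env, agent, buffer_cls, task_ids, iters=500, steps_per_iter=1000):
    envs = {t: make_env(t) for t in task_ids}        # one walking variant per task
    buffers = {t: buffer_cls() for t in task_ids}    # a replay buffer per task
    for _ in range(iters):
        for t in task_ids:
            # Roll out the current context-conditioned policy on task t.
            obs = envs[t].reset()
            for _ in range(steps_per_iter):
                act = agent.act(obs)
                next_obs, rew, done, info = envs[t].step(act)
                buffers[t].add(obs, act, rew, next_obs, done)
                obs = envs[t].reset() if done else next_obs
            # Ordinary off-policy (TD3-style) updates on this task's buffer.
            agent.update(buffers[t])
    return agent, buffers    # buffers are reused later during adaptation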

Inference:

This work was done as a fun project to learn RL and its applications, so I have not drawn many theoretical inferences. That said, here are some quantitative observations from the work:
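Training curves for these runs can be plotted with the bundled script listed in the file tree above; the exact command-line arguments are an assumption here:

python3 Humanoid_environment/scripts/plot_logs.py <path-to-log-dir>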

References:

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, and Alex Smola. Meta-Q-Learning. In ICLR 2020; also presented at Microsoft Research Reinforcement Learning Day 2021.

Some important notes:

  • The code is written to train on a GPU (a CPU-fallback sketch follows this list)
  • Training time: ~55 hours on an RTX 3080
  • Feel free to contact the author for a pre-trained model
  • The code is not very well documented (PRs are more than welcome!)
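Since the code assumes a CUDA GPU, anyone experimenting on CPU can use the standard PyTorch device-selection pattern shown below; this is a generic sketch, not necessarily how this repository handles devices:

import torch
import torch.nn as nn

# Pick CUDA when available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.Linear(8, 1).to(device)            # move every module to the device
batch = torch.randn(32, 8, device=device)   # create tensors on the same device
out = net(batch)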
