A framework for multi-agent reinforcement learning research and the implementation of the experiments in the paper "Shapley Q-value: A Local Reward Approach to Solve Global Reward Games".
Updated Nov 4, 2024 - Python
A deep reinforcement learning system for optimizing bridge maintenance decisions across municipal infrastructure fleets, implementing cross-subsidy budget sharing and cooperative multi-agent learning.
Adversarial Co-Evolution of RL and LLM Agents: A framework for training high-performance PPO agents against Large Language Models in Gin Rummy, utilizing curriculum learning and knowledge distillation.
Going through the Hugging Face Deep Reinforcement Learning course.
Deterministic hex-grid soccer environment with two adversarial agents. Implements Q-Learning, Minimax-Q (via LP), and Belief-Q with online belief updates; trains in SE2G/SE6G to reduce state space and evaluates behaviors in the full environment with comprehensive visualizations.
Project 3 of Udacity's Deep Reinforcement Learning Nanodegree Program
An engineering-focused multi-agent reinforcement learning system for Texas Hold'em using PettingZoo AEC and a custom PyTorch PPO self-play setup.
Hexapawn game engine: a proper 3x3 board with pawn movement. Strategic RL agents: Minimax with alpha-beta pruning (configurable depth 1-7) and Q-Learning with temporal-difference updates, experience replay for efficient learning, epsilon-greedy exploration with decay, and a multi-level decision hierarchy (immediate threats to strategic planning).
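The Q-Learning side of a project like this can be sketched in a few lines. This is a minimal illustration of a temporal-difference update with epsilon-greedy exploration, not code from the repository; all names and hyperparameters are illustrative.

```python
# Tabular Q-Learning sketch: TD update + epsilon-greedy action choice.
# Hyperparameters and names are illustrative assumptions.
import random
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.95                     # learning rate, discount factor
q_table = defaultdict(float)                 # (state, action) -> estimated value

def choose_action(state, actions, eps):
    """Epsilon-greedy: explore with probability eps, otherwise exploit."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: q_table[(state, a)])

def td_update(state, action, reward, next_state, next_actions):
    """Temporal-difference update: Q <- Q + alpha * (target - Q)."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
```

Epsilon decay is typically applied per episode (e.g. `eps = max(eps_min, eps * 0.995)`), so early play explores broadly and late play exploits the learned table.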
A Crazy Arcade clone + Reinforcement Learning (DQN, PPO)
Key features: 1. Flexible game configuration: adjustable grid size (3x3 up to 10x10) and a customizable win condition (e.g., 5-in-a-row on a 7x7 board). 2. Two competing RL agents: Agent 1 (Blue X) vs. Agent 2 (Red O), each with independent Q-Learning parameters; watch them evolve different strategies over time.
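A configurable k-in-a-row win condition like the one described can be checked by scanning each cell in four directions. This is a generic sketch under an assumed 2D-list board encoding, not the repository's actual implementation.

```python
# Sketch of a configurable k-in-a-row win check on an n-by-n grid.
# The board encoding (2D list of player marks) is an assumption.
def has_k_in_a_row(board, player, k):
    """Check rows, columns, and both diagonals for k marks in a line."""
    n = len(board)
    dirs = [(0, 1), (1, 0), (1, 1), (1, -1)]     # right, down, two diagonals
    for r in range(n):
        for c in range(n):
            for dr, dc in dirs:
                cells = [(r + i * dr, c + i * dc) for i in range(k)]
                if all(0 <= x < n and 0 <= y < n and board[x][y] == player
                       for x, y in cells):
                    return True
    return False
```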
Dark Zero Point Genesis: PPO latent world models under thermodynamic scarcity. 256 agents, 128-D latent manifolds, zero supervision. Agents use PPO-clipped surrogate objectives; survival = Predictive Error Coding (PEC) x energy efficiency across a 50/15 seasonal cycle.
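The PPO-clipped surrogate objective referenced here has a standard closed form. Below is a framework-free sketch (per-sample log-probabilities and advantages as plain lists); the clip range of 0.2 is a common default, not a value from this project.

```python
# Sketch of the PPO clipped surrogate loss: the policy ratio is clipped
# to [1 - eps, 1 + eps] and the pessimistic (min) term is kept.
import math

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Negative mean of min(ratio * A, clip(ratio) * A) over the batch."""
    terms = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)                         # pi_new / pi_old
        clipped = min(max(ratio, 1 - clip_eps), 1 + clip_eps)
        terms.append(min(ratio * adv, clipped * adv))     # pessimistic bound
    return -sum(terms) / len(terms)                       # negate for descent
```

With identical old and new policies the ratio is 1 everywhere, so the loss reduces to the negative mean advantage; large policy shifts are cut off by the clip, which is what keeps PPO updates stable.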
Research-grade Reinforcement Learning framework for single-agent and multi-agent warehouse navigation using Deep Q-Networks (DQN), PyTorch, replay buffer, target networks, logging, and full test suite. Built for PhD-level RL and autonomous systems research.
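The replay buffer at the core of a DQN setup like this is small enough to sketch in full. Capacity and field names below are illustrative assumptions, not taken from the repository.

```python
# Minimal uniform replay buffer sketch for DQN-style training:
# old transitions are evicted first, and minibatches are sampled
# uniformly to decorrelate consecutive gradient updates.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)   # oldest entries drop out

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniform random minibatch of stored transitions."""
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```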
Multi-Equipment CBM system using QR-DQN with advanced probability distribution analysis. Coordinated maintenance decision-making for 4 industrial equipment units with realistic anomaly rates (1.9-2.2%), comprehensive risk analysis (VaR/CVaR), and 51-quantile distribution visualization.
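Given a QR-DQN quantile estimate of the return distribution, VaR and CVaR can be read directly off the quantiles. The tail-averaging scheme below is a standard convention and an assumption about this project, not its actual code.

```python
# Sketch of VaR/CVaR from a list of return quantiles (e.g. the
# 51-quantile output of QR-DQN). alpha is the tail probability.
def var_cvar(quantiles, alpha=0.05):
    """VaR: the alpha-level quantile; CVaR: mean of the tail below it."""
    q = sorted(quantiles)
    cut = max(1, int(alpha * len(q)))      # number of tail quantiles
    tail = q[:cut]
    return q[cut - 1], sum(tail) / len(tail)
```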
Classic Nim rules: three customizable piles, take any number from one pile per turn, and the last player to take loses. Q-Learning agents: two independent agents that learn optimal strategy through self-play.
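The misère-Nim rules described (take any number from one pile; whoever takes the last object loses) fit in a few functions. This is a generic sketch of the environment side, with illustrative names.

```python
# Misère Nim sketch: piles is a tuple of pile sizes.
def legal_moves(piles):
    """All (pile_index, take_count) moves from the current position."""
    return [(i, k) for i, n in enumerate(piles) for k in range(1, n + 1)]

def apply_move(piles, move):
    """Remove `take_count` objects from one pile."""
    i, k = move
    nxt = list(piles)
    nxt[i] -= k
    return tuple(nxt)

def loser_moved_last(piles):
    """Misère rule: when all piles are empty, the player who just moved loses."""
    return all(n == 0 for n in piles)
```

Two tabular Q-learners can then alternate moves on this environment, each receiving +1/-1 terminal rewards with opposite signs, which is the usual self-play setup for Nim.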
A specialized Reinforcement Learning (RL) project focused on multi-task mastery across 10 distinct gaming environments. General-Gamer-AI-Lite implements a lightweight multi-task agent designed to learn shared representations and transfer knowledge between varied game mechanics, from classic arcade challenges to strategic grid worlds.
Coordinated multi-agent systems that learn to solve complex collaborative and competitive tasks.
Multi-Equipment CBM (Condition-Based Maintenance) optimization using Deep Q-Learning with cost leveling and scenario comparison. Advanced RL system with QR-DQN, N-step learning, and parallel environments for HVAC equipment predictive maintenance.
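The N-step learning mentioned here replaces the one-step TD target with a discounted sum of the next n rewards plus a bootstrapped tail value. A minimal sketch of that return computation (names are illustrative):

```python
# N-step return sketch: G = r_0 + g*r_1 + ... + g^(n-1)*r_{n-1} + g^n * V(s_n).
def n_step_return(rewards, gamma, bootstrap=0.0):
    """Fold rewards back-to-front, discounting at each step."""
    g = bootstrap
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```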
A comprehensive Qwirkle RL agent application using Monte Carlo Tree Search (MCTS) combined with Q-Learning, approaches well suited to tile-placement games with high branching factors.
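At the heart of MCTS is the selection rule that trades off exploitation and exploration at each tree node. A sketch of the standard UCB1/UCT score (the constant `c` is a tunable assumption, not a value from this project):

```python
# UCB1 selection score used in MCTS: average value plus an
# exploration bonus that shrinks as a child is visited more.
import math

def uct_score(child_value, child_visits, parent_visits, c=1.4):
    """Unvisited children get priority; otherwise value/visits + bonus."""
    if child_visits == 0:
        return float("inf")
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore
```

Combining this with Q-Learning typically means seeding leaf evaluations (or priors) from the learned Q-values instead of random rollouts, which helps in high-branching games like Qwirkle.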
Pure RL agents: Q-Learning agents that learn through self-play, playing against each other to improve without human supervision. Symmetry optimization: logic that treats a board mirrored left-to-right as the same position, roughly halving the learning time.
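The left-right symmetry trick amounts to mapping every board to a canonical key so that a position and its mirror share one Q-table entry. A minimal sketch, assuming a flat row-major board encoding (not the repository's actual code):

```python
# Canonicalize a board under left-right mirror symmetry so that
# mirrored positions index the same Q-table entry.
def mirror(board, width=3):
    """Reverse each row of a flat row-major board."""
    rows = [board[i:i + width] for i in range(0, len(board), width)]
    return tuple(cell for row in rows for cell in reversed(row))

def canonical(board, width=3):
    """Canonical key: the lexicographic min of the board and its mirror."""
    b = tuple(board)
    return min(b, mirror(b, width))
```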