
Locally Constrained Policy Optimization (LCPO): Online Reinforcement Learning in Non-stationary Context-Driven Environments

Overview

LCPO is an online reinforcement learning algorithm tailored for non-stationary context-driven environments. The repository provides implementations and experiments to advance the understanding and performance of agents operating under these challenging conditions.
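To make the core idea concrete, here is a minimal NumPy sketch of an LCPO-style objective as we understand it from the paper: maximize a policy-gradient surrogate on data from the current context, while penalizing divergence (here a KL term) from the old policy on "anchor" states drawn from other contexts, so that adapting to the present does not erase past behavior. All names, the linear policy, and the penalty form are illustrative assumptions, not the repository's API.

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    # KL(p || q) for rows of probability vectors
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def lcpo_objective(theta, theta_old, ctx_states, ctx_actions, advantages,
                   anchor_states, lam=1.0):
    """Illustrative LCPO-style objective (not the repo's implementation):
    policy-gradient surrogate on the current context, minus a KL penalty
    that anchors the policy on out-of-context states."""
    probs = softmax(ctx_states @ theta)          # linear policy, for illustration
    logp = np.log(probs[np.arange(len(ctx_actions)), ctx_actions])
    surrogate = np.mean(advantages * logp)
    p_old = softmax(anchor_states @ theta_old)
    p_new = softmax(anchor_states @ theta)
    anchor_kl = np.mean(kl(p_old, p_new))
    return surrogate - lam * anchor_kl
```

When `theta == theta_old` the anchor penalty vanishes, so the objective reduces to the ordinary surrogate; as the policy drifts on anchor states, the penalty pulls the objective down.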

Figure: Gymnasium normalized returns.

Reproducing Evaluations

There are four self-contained benchmarks used in the paper. To reproduce the paper's results, follow the instructions for each benchmark:

  • Windy-Gym Environments: A suite of environments based on Gymnasium and MuJoCo tasks that have been customized to include stochastic wind effects.

Windy Gymnasium Environments
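The wrapper pattern behind such a benchmark can be sketched as follows. This is an illustrative example, not the repository's code: a wrapper perturbs each action with a slowly drifting wind force, so the transition dynamics are non-stationary from the agent's point of view. The toy `Point1D` environment stands in for a MuJoCo task, and the bounded random-walk wind model is an assumption.

```python
import numpy as np

class Point1D:
    """Toy 1-D environment standing in for a MuJoCo task:
    apply a force, get rewarded for staying near the origin."""
    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, force):
        self.x += force
        reward = -abs(self.x)
        return self.x, reward, False, {}

class WindyWrapper:
    """Illustrative wrapper (not the repository's implementation):
    adds a stochastic wind term to every action."""
    def __init__(self, env, wind_scale=0.5, drift=0.05, seed=0):
        self.env = env
        self.wind = 0.0
        self.wind_scale = wind_scale
        self.drift = drift
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.wind = 0.0
        return self.env.reset()

    def step(self, action):
        # wind follows a bounded random walk (a hypothetical choice)
        self.wind = np.clip(self.wind + self.drift * self.rng.standard_normal(),
                            -self.wind_scale, self.wind_scale)
        return self.env.step(action + self.wind)
```

Because the wind drifts over time, an agent that treats the dynamics as fixed will see its policy degrade, which is exactly the non-stationarity these benchmarks probe.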

  • Straggler Mitigation: A load-balancing problem where the goal is to minimize latency by sending duplicate requests to multiple servers and using the fastest response.

Straggler Mitigation Environment
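The latency/cost trade-off at the heart of this benchmark can be simulated in a few lines. This is a toy model, not the benchmark's actual workload: server latencies are drawn from an exponential distribution, and duplicating a request to `k` servers keeps only the fastest response.

```python
import numpy as np

def duplicated_latency(k, n_requests=100_000, seed=0):
    """Send each request to k servers and keep the fastest response.
    Exponential latencies are an illustrative assumption; the
    benchmark's real traces differ."""
    rng = np.random.default_rng(seed)
    samples = rng.exponential(1.0, size=(n_requests, k))
    return samples.min(axis=1)

single = duplicated_latency(1)
dup = duplicated_latency(3)
# duplication sharply reduces both mean and tail latency in this model,
# at the cost of 3x the server load
```

The interesting RL problem arises because the right duplication level depends on the (changing) load and latency regime, which is what makes the environment context-driven.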

  • Grid World: A simple grid-based environment used to showcase the intuition behind LCPO.

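A grid world of this kind can be sketched in a few lines. The layout, rewards, and action encoding below are assumptions for illustration, not the repository's actual environment: the agent moves on an N x N grid and is rewarded for reaching a goal cell.

```python
class GridWorld:
    """Illustrative grid world (not the repository's implementation)."""
    MOVES = {0: (0, 1), 1: (0, -1), 2: (1, 0), 3: (-1, 0)}  # right, left, down, up

    def __init__(self, size=5, goal=(4, 4)):
        self.size = size
        self.goal = goal
        self.pos = (0, 0)

    def reset(self):
        self.pos = (0, 0)
        return self.pos

    def step(self, action):
        dr, dc = self.MOVES[action]
        # clamp the move to the grid boundaries
        r = min(max(self.pos[0] + dr, 0), self.size - 1)
        c = min(max(self.pos[1] + dc, 0), self.size - 1)
        self.pos = (r, c)
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01   # small per-step cost
        return self.pos, reward, done
```

The appeal of a grid world here is that policy changes are easy to visualize cell by cell, which makes it a natural vehicle for building intuition about locally constrained updates.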

  • Discrete Gym Environments: Gymnasium and MuJoCo environments in which the continuous action spaces are replaced with discretized versions. This benchmark shows that the discretization does not hinder RL algorithms from learning the original environments.

Discretized Gymnasium Environments
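One common way to discretize a continuous action space, sketched below, is to place evenly spaced values along each action dimension and take their Cartesian product; a discrete action index then maps to one continuous action vector. This scheme and the function name are illustrative assumptions; the benchmark's exact discretization may differ.

```python
import itertools
import numpy as np

def make_discrete_actions(low, high, bins_per_dim):
    """Map a continuous Box-style action space [low, high] to a finite
    action set: bins_per_dim evenly spaced values per dimension,
    combined via the Cartesian product."""
    axes = [np.linspace(l, h, bins_per_dim) for l, h in zip(low, high)]
    return np.array(list(itertools.product(*axes)))

# 3 values per dimension over 2 dimensions -> 9 discrete actions
actions = make_discrete_actions(low=[-1.0, -1.0], high=[1.0, 1.0], bins_per_dim=3)
```

The discrete policy then selects an index into `actions`, and the environment receives the corresponding continuous vector. Note that the action count grows exponentially with the number of dimensions, which is why coarse grids are typical.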

Installation

To set up the LCPO project on your local machine:

  1. Clone the Repository:

    git clone https://github.com/pouyahmdn/LCPO.git
  2. Navigate to the Project Directory:

    cd LCPO
  3. Install Dependencies:

    We use Python (tested with 3.8) for all experiments. Install PyTorch by following the instructions on the official PyTorch website, then install the remaining required packages via pip (or conda):

    pip install -r requirements.txt

License

This project is licensed under the MIT License. You are free to use, modify, and distribute this software in accordance with the license terms.

Citation

If you use LCPO for your research, please cite our paper:

@inproceedings{
   hamadanian2025online,
   title={Online Reinforcement Learning in Non-Stationary Context-Driven Environments},
   author={Pouya Hamadanian and Arash Nasr-Esfahany and Malte Schwarzkopf and Siddhartha Sen and Mohammad Alizadeh},
   booktitle={The Thirteenth International Conference on Learning Representations},
   year={2025},
   url={https://openreview.net/forum?id=l6QnSQizmN}
}

Contact

For questions, suggestions, or collaborations, please open an issue or contact the maintainer directly through GitHub.
