In this project, we replicated "Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow" (MEow), reimplementing the flow policy and training process ourselves.
Our main code resides in meow_ours.py, meow_ours_robust.py, flow_policy.py, and td3_ours_robust.py. We also improved the documentation of the code within ebflows, and wrote scripts in figure_creation that may be helpful for recreating our figures or generating future charts.
We wrote this section of the code for Python 3.9, to match the MEow paper.
To start, create a new conda environment (called meow) with Python 3.9 and activate it:
conda create -n meow python=3.9
conda activate meow
Then, from within the activated environment, run the setup script:
bash setup.bash
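For context, setup.bash is expected to install the project's dependencies; a typical dependency set for MuJoCo-based experiments looks like the line below, though these package names are assumptions on our part rather than the script's actual contents:
pip install torch "gymnasium[mujoco]" pyyaml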
MEow seeds (and other hyperparameters) are set in the corresponding configuration files rather than on the command line; edit those files to change them, as sketched below.
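For illustration, a seed entry in config_meow_antv4.yaml might look like the following; the key name is an assumption, so check the actual file for the exact fields:
seed: 1  # change this value to launch a run with a different seed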
To run Ant-v4 with MEow:
python meow_ours.py --config "config_meow_antv4.yaml"
To run Ant-v4 with SAC:
python sac_continuous_action.py --seed <SEED> --env-id Ant-v4 --total-timesteps 4000000 --tau 0.0001 --alpha 0.05 --learning_starts 5000
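Because SAC takes its seed on the command line, a small shell loop is a convenient way to launch several runs. A minimal sketch, with arbitrary seed values:
for SEED in 1 2 3 4 5; do
    python sac_continuous_action.py --seed $SEED --env-id Ant-v4 --total-timesteps 4000000 --tau 0.0001 --alpha 0.05 --learning_starts 5000
done
The same pattern applies to the Humanoid-v4 and HumanoidStandup-v4 commands below.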
To run Humanoid-v4 with MEow:
python meow_ours.py --config "config_meow_humanoidv4.yaml"
To run Humanoid-v4 with SAC:
python sac_continuous_action.py --seed <SEED> --env-id Humanoid-v4 --total-timesteps 5000000 --tau 0.0005 --alpha 0.125 --learning_starts 5000
To run HumanoidStandup-v4 with MEow:
python meow_ours.py --config "config_meow_humanoid_standupv4.yaml"
To run HumanoidStandup-v4 with SAC:
python sac_continuous_action.py --seed <SEED> --env-id HumanoidStandup-v4 --total-timesteps 2500000 --tau 0.0005 --alpha 0.125 --learning_starts 5000
We wrote this section of the code for Python 3.12, for compatibility with Adroit.
On Adroit, run the setup script:
bash setup.bash
As in the previous section, MEow and TD3 seeds (and other hyperparameters) are set in the corresponding configuration files; edit those files to change them.
To run AntRandom-v5 with MEow:
python meow_ours_robust.py --config "config_meow_ant_randomv5.yaml"
To run AntRandom-v5 with TD3:
python td3_ours_robust.py --config "config_td3_ant_randomv5.yaml"
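If you launch these runs on Adroit through SLURM, a batch script along the following lines can wrap either command. This is a minimal sketch: the job name, resource requests, and environment-setup step are assumptions to adapt to your allocation:
#!/bin/bash
#SBATCH --job-name=meow-ant-random
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G

# Activate whatever Python 3.12 environment setup.bash configured on your account,
# then launch the run.
python meow_ours_robust.py --config "config_meow_ant_randomv5.yaml"
Submit the script with sbatch.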