Paper: https://arxiv.org/pdf/2411.01894
Project page: https://sites.google.com/view/rnd-dagger/home

This codebase contains the code used to produce the experiments from the paper.
Create and activate a conda environment:
conda create -n env_imitation_game python=3.10
conda activate env_imitation_game
pip install -e environments/
pip install -e interactive_module/
pip install -e gametrackr/
git clone --recursive https://github.com/DLR-RM/rl-baselines3-zoo.git
pip install -e rl-baselines3-zoo/
git clone https://github.com/araffin/pybullet_envs_gymnasium.git
Replace the script pybullet_envs_gymnasium/robot_locomotors.py with environments/envs/halfcheetah/imitation/robot_locomotors.py.
On Linux/WSL:
mv environments/envs/halfcheetah/imitation/robot_locomotors.py pybullet_envs_gymnasium/robot_locomotors.py
On Windows (PowerShell):
Move-Item -Path .\environments\envs\halfcheetah\imitation\robot_locomotors.py -Destination .\pybullet_envs_gymnasium\robot_locomotors.py -Force
Then install the package:
pip install -e pybullet_envs_gymnasium/
Unzip the executables in the godot_exe folder.
pip install -e .\interactive_module
Edit interactive_module/interactive/configs/abs_path/abs_path.yaml so it points to the correct paths of your working folder.
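For illustration, the file might be shaped like the sketch below; the key names here are placeholders, so check the shipped abs_path.yaml for the actual fields:

```yaml
# Hypothetical layout -- verify the real key names in abs_path.yaml.
working_dir: /home/<user>/imitation_game            # root of your checkout
godot_exe_dir: /home/<user>/imitation_game/godot_exe  # unzipped executables
```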
Download datasets of initial interactions (from which the interactive session is started) from our website https://sites.google.com/view/rnd-dagger
The scripts specific to each environment are located inside environments/envs.
To perform an interaction session:
python environments/envs/<env_name>/interact/run.py
The associated configuration is in interactive_module/interactive/configs/run_interaction_<initials_of_env>.yaml
The configurations of the decision rules are in configs/decision_rules
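As a rough illustration, based only on the override keys used in the commands below, an RND decision-rule config could look like this (the nesting is inferred from the CLI overrides; any field not appearing in those overrides is a guess):

```yaml
# Sketch of configs/decision_rules/rnd.yaml, inferred from the README's
# Hydra overrides -- verify against the actual file.
parameters:
  context_length: 2        # number of stacked past observations
  threshold_factor: 2      # scales the intervention threshold
  padding_type: replicate  # how short contexts are padded
trainer:
  parameters:
    model:
      type: regular
      hidden_size: 32
      n_layers: 0
```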
To run automatic experiments (with oracles), set the oracle argument to true in the YAML files, and to false otherwise (oracles are only available for the RaceCar and Maze environments).
For instance, to launch an interactive session on RaceCar with a Human-Gated interactive approach (i.e. you decide when to take control):
python .\environments\envs\race_car\interact\run.py environment.run.headless=false oracle=false
To take control, use the gamepad (press the joystick button to take over; LT/RT for forward/backward). An older keyboard scheme, which may no longer work: ZQSD to steer the car, Spacebar to take control.
For example, to launch interactive training on Maze with RND-DAgger (tip: reduce max_epoch to a small value to shorten training before the interactive sessions and quickly check that everything works):
python environments/envs/maze/interact/run.py n_initial_episodes=25 sync.session_length=2000 decision_rule=rnd max_epoch=2500 seed=42 decision_rule.parameters.context_length=2 sync.num_sessions=8 sync.n_frame_stable=1 decision_rule.trainer.parameters.model.hidden_size=32 decision_rule.trainer.parameters.model.n_layers=0 decision_rule.parameters.threshold_factor=2 eval.first_to_eval=-8 eval.n_episodes=50 decision_rule.trainer.parameters.model.type=regular decision_rule.parameters.padding_type=replicate sync.max_session_length=100000
On RaceCar (multirun, for cluster launches):
python environments/envs/race_car/interact/run.py -m n_initial_episodes=1 sync.session_length=2000 decision_rule=rnd max_epoch=2500 seed=120,210,420,12,42,21,1200,2100,120,210,420,12,42,21,1200,2100 sync.num_sessions=8 sync.n_frame_stable=1 decision_rule.parameters.padding_type=replicate eval.first_to_eval=-8 decision_rule.parameters.context_length=10 decision_rule.parameters.threshold_factor=2 decision_rule.trainer.parameters.model.hidden_size=32 decision_rule.trainer.parameters.model.n_layers=0 sync.lazy_threshold_divisor=1
The results are available in:
working_dir/map/method/seed/expe_folder/interaction_data for the metrics needed to compute the results
working_dir/map/method/seed/expe_folder/evaluation for the performance of the BC models at each session
© 2026 Ubisoft Entertainment. All Rights Reserved.