# AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization

This repository contains code for the AutoQD algorithm, which automatically discovers behavioral descriptors for Quality-Diversity optimization in continuous control tasks. AutoQD was published at ICLR 2026.

Paper: [arXiv:2506.05634](https://arxiv.org/abs/2506.05634)
## Requirements

- Python 3.10
- Dependencies: install with

```bash
pip install -r requirements.txt
```
## Installation

```bash
# Clone the repo
git clone https://github.com/anonymous/autoqd.git
cd autoqd

# Create a virtual environment
python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

## Running Experiments

You can run the core algorithms using the scripts in the `scripts/` directory:
```bash
# Run AutoQD on all environments
./scripts/auto_qd.sh

# Run baselines (AURORA, LSTM-AURORA, Regular QD, SMERL)
./scripts/aurora.sh
./scripts/lstm_aurora.sh
./scripts/regular_qd.sh
./scripts/smerl.sh

# Run ablation studies
./scripts/ablations.sh
```

The scripts use seed 42 by default, but you can specify a different seed:
```bash
./scripts/auto_qd.sh 123
```

You can also run individual experiments with specific configurations.
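The per-experiment configuration is passed as Hydra-style `key=value` dotted overrides. As a rough, pure-Python illustration of how such an override sets a nested configuration key (a sketch of the semantics only, not Hydra's implementation):

```python
def apply_overrides(cfg: dict, overrides: list[str]) -> dict:
    """Set nested config keys from Hydra-style ``key=value`` override strings."""
    for item in overrides:
        key, raw = item.split("=", 1)
        node = cfg
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        # Interpret simple scalars roughly the way Hydra would.
        if raw in ("true", "false"):
            value = raw == "true"
        elif raw.lstrip("-").isdigit():
            value = int(raw)
        else:
            value = raw
        node[leaf] = value
    return cfg

# Mirrors overrides like `seed=42 algorithm.measures_dim=2 logging.wandb=false`.
cfg = {"algorithm": {"name": "auto_qd"}, "logging": {"wandb": True}}
apply_overrides(cfg, ["seed=42", "algorithm.measures_dim=2", "logging.wandb=false"])
```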
```bash
# Run AutoQD on the BipedalWalker environment with seed 42 and 2D measures (descriptors)
python -m src.main algorithm=auto_qd env=bipedal_walker seed=42 algorithm.measures_dim=2

# Disable wandb logging
python -m src.main algorithm=auto_qd env=bipedal_walker logging.wandb=false
```

## Evaluation

To evaluate a single experiment:
```bash
# Evaluate a specific experiment
python -m src.evaluation_suite.eval_single outputs/auto_qd_bipedal_walker_0411_1658
```

To evaluate all experiments in the outputs directory:
```bash
# Evaluate all experiments in the outputs directory
./eval_script.sh
```

For multi-seed evaluation and aggregation, use the `multi_seed_eval` module.
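Conceptually, multi-seed aggregation collects each metric across the per-seed output directories and reports summary statistics. A stand-alone sketch of that idea (the helper and numbers below are hypothetical, not the repo's code):

```python
import statistics

def aggregate_seeds(results_per_seed: dict[str, dict[str, float]]) -> dict[str, tuple[float, float]]:
    """Collect each metric across seed directories and return (mean, stdev) per metric."""
    collected: dict[str, list[float]] = {}
    for metrics in results_per_seed.values():
        for name, value in metrics.items():
            collected.setdefault(name, []).append(value)
    return {name: (statistics.mean(vals), statistics.stdev(vals))
            for name, vals in collected.items()}

# Toy numbers standing in for per-seed evaluation results.
summary = aggregate_seeds({
    "1_outputs": {"qd_score": 10.0},
    "2_outputs": {"qd_score": 12.0},
    "3_outputs": {"qd_score": 11.0},
})
# summary["qd_score"] → (11.0, 1.0)
```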
```bash
# Assuming you have results in 1_outputs/, 2_outputs/, and 3_outputs/
python -m src.evaluation_suite.multi_seed_eval
```

**Note:** When evaluating all experiments, the code expects results from DvD-ES to be provided in a JSON file (`dvd_logs.json`) in the outputs directory. DvD-ES must be trained separately using the authors' implementation (https://github.com/jparkerholder/DvD_ES). An evaluation script similar to `eval_single.py` can be used to evaluate DvD-ES policies.

**Note 2:** Before running certain evaluations, you may need to compute gamma values:
```bash
python -m src.evaluation_suite.compute_gammas
```

## Visualization

To visualize policies from a trained archive:
```bash
# Visualize policies from a checkpoint
python -m src.viz checkpoint_path=outputs/auto_qd_bipedal_walker_0411_1658/checkpoints/final.pkl env_id=BipedalWalker-v3
```

## Adaptation Experiments

To evaluate how policies adapt to modified environments:
```bash
python -m src.evaluation_suite.adaptation
```

**Important:** The `adaptation.py` script has hardcoded directory references. To run it with your own trained models, modify the following lines to point to your output directories:
```python
# In src/evaluation_suite/adaptation.py, modify:
for dir, algo_name in [
    ("auto_qd_bipedal_walker_0411_1658", "auto_qd"),
    ("aurora_bipedal_walker_0411_1233", "aurora"),
    ("lstm_aurora_bipedal_walker_0412_2233", "lstm_aurora"),
    ("regular_qd_bipedal_walker_0413_0554", "regular_qd"),
    ("smerl_bipedal_walker_0415_1044", "smerl"),
]:
```

## Repository Structure

- `src/`: Source code for the project
  - `algorithms/`: Implementation of AutoQD and baseline algorithms
  - `embeddings/`: Embedding methods for state/action encoding
  - `measure_maps/`: Methods for mapping embeddings to measure space
  - `qd/`: Quality-Diversity optimization components
  - `evaluation_suite/`: Scripts for evaluating algorithm performance
- `conf/`: Configuration files using Hydra
- `scripts/`: Shell scripts for running experiments
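As background for the adaptation experiments above: at a high level, adapting to a modified environment amounts to re-evaluating the archived policies under the new dynamics and keeping the best performer. A toy sketch of that selection step (all names and numbers are illustrative, not the repo's API):

```python
def best_policy_for_env(archive, evaluate):
    """Return the archive entry with the highest return under `evaluate`."""
    return max(archive, key=evaluate)

# Stand-in archive: each "policy" is just a 1-D parameter vector here.
archive = [(-1.0,), (0.5,), (2.0,)]

# Stand-in evaluation: the return depends on the modified environment's dynamics.
def evaluate(policy):
    return -abs(policy[0] - 0.7)

best = best_policy_for_env(archive, evaluate)
# best → (0.5,)
```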
## Citation

If you use this code in your research, please cite our paper:

```bibtex
@inproceedings{hedayatian2026autoqd,
  title={Auto{QD}: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization},
  author={Saeed Hedayatian and Stefanos Nikolaidis},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=FNnJIf4ymV}
}
```