Minimal RL project for training and visualizing a MuJoCo Ant agent (Gymnasium Ant-v4) with SAC from Stable-Baselines3.
This project requires Python 3.12.
If python3.12 is not installed yet (Ubuntu):
sudo apt update
sudo apt install -y python3.12 python3.12-venv

Create and activate the virtual environment with Python 3.12:
python3.12 -m venv myenv
source myenv/bin/activate
python -m pip install --upgrade pip
pip install "gymnasium[mujoco]" stable-baselines3 torch tensorboard

Train the agent:

python3 train.py \
--env Ant-v4 \
--timesteps 10000 \
--n-envs 8 \
--vec-env subproc \
--model-path models/ant \
--vecnorm-path models/ant_vecnormalize.pkl

Notes:
- Defaults: --device auto (uses CUDA if available), --mujoco-gl egl.
- Parallel rollout collection: --n-envs + --vec-env subproc.
- Outputs: <model-path>.zip and <vecnorm-path>.
- Training UI: clean one-line progress bar (use --no-progress to disable).
- Fast default: --timesteps 10000 (100x fewer than 1,000,000).
- Training ends when --timesteps is reached, then the model and normalization stats are saved.
- Post-train eval defaults to --eval-episodes 1 (--eval-episodes 0 skips eval).
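
The flags and defaults listed in the notes can be pictured as a small argparse setup. This is only a sketch, not the actual contents of train.py: the function names build_parser, output_paths and apply_gl_backend are hypothetical, the "dummy" choice for --vec-env is assumed, and the defaults for --env, --n-envs, --vec-env, --model-path and --vecnorm-path are copied from the example command rather than confirmed.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    p = argparse.ArgumentParser(description="Train SAC on a MuJoCo env (sketch)")
    # Defaults stated in the notes.
    p.add_argument("--timesteps", type=int, default=10_000)
    p.add_argument("--device", default="auto")
    p.add_argument("--mujoco-gl", default="egl")
    p.add_argument("--eval-episodes", type=int, default=1)
    p.add_argument("--no-progress", action="store_true")
    # Assumed defaults, taken from the example command (not confirmed).
    p.add_argument("--env", default="Ant-v4")
    p.add_argument("--n-envs", type=int, default=8)
    p.add_argument("--vec-env", choices=["dummy", "subproc"], default="subproc")
    p.add_argument("--model-path", default="models/ant")
    p.add_argument("--vecnorm-path", default="models/ant_vecnormalize.pkl")
    return p


def output_paths(args: argparse.Namespace) -> tuple[str, str]:
    """Return the two files a run is expected to write."""
    # Stable-Baselines3 appends ".zip" when the save path has no extension.
    model_file = args.model_path
    if not model_file.endswith(".zip"):
        model_file += ".zip"
    return model_file, args.vecnorm_path


def apply_gl_backend(args: argparse.Namespace) -> None:
    """Select the MuJoCo rendering backend; must run before importing mujoco."""
    os.environ["MUJOCO_GL"] = args.mujoco_gl


if __name__ == "__main__":
    args = build_parser().parse_args()
    apply_gl_backend(args)
    print(output_paths(args))
```

Keeping the backend selection in its own function makes the ordering constraint explicit: MuJoCo reads MUJOCO_GL when it is imported, so the variable has to be set first.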

Render the trained agent:

python3 render.py --env Ant-v4 --model-path models/ant --vecnorm-path models/ant_vecnormalize.pkl --episodes 3

Notes:
- render.py defaults to --render-mode auto:
  - Uses human when DISPLAY is available.
  - Uses rgb_array in headless environments.
- If artifacts are missing, run train.py first (or pass existing paths).
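
The auto render-mode rule and the missing-artifact check can be sketched with the standard library alone. The helper names resolve_render_mode and missing_artifacts are hypothetical, and render.py may implement both differently; the sketch only assumes that "available" means the DISPLAY variable is set and non-empty.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_render_mode(requested: str, environ: Mapping[str, str] = os.environ) -> str:
    """Map --render-mode auto to a concrete Gymnasium render mode."""
    if requested != "auto":
        return requested  # an explicit choice is passed through unchanged
    # Assumption: a non-empty DISPLAY means a window can be opened.
    return "human" if environ.get("DISPLAY") else "rgb_array"


def missing_artifacts(model_path: str, vecnorm_path: str) -> list[str]:
    """Return the expected training outputs that do not exist on disk."""
    # The model is saved as <model-path>.zip unless the path already ends in .zip.
    model_file = model_path if model_path.endswith(".zip") else model_path + ".zip"
    candidates = [Path(model_file), Path(vecnorm_path)]
    return [str(p) for p in candidates if not p.is_file()]
```

A script could call missing_artifacts("models/ant", "models/ant_vecnormalize.pkl") before loading anything and, if the list is non-empty, print a hint to run train.py first.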