SafeVerse is not only a real-to-sim reconstruction pipeline — it is a closed-loop training platform that connects reconstructed digital twins, adversarial scene editing, and online reinforcement learning.
This section describes how to launch the Minecraft-based training environment and enable agents to evolve under dynamic attack scenarios.
Unlike traditional static benchmarks, SafeVerse supports a “Reconstruction → Attack → Immunity Evolution” workflow. Agents are trained directly inside high-fidelity digital twins reconstructed from real-world videos and continuously exposed to dynamically edited adversarial conditions.
SafeVerse training requires prebuilt Minecraft environments generated by the reconstruction pipeline.
Please download the environment package from:
After downloading:
- Extract the dataset
- Copy the contents into: server/env{0-31}/
Each folder corresponds to a reconstructed, interactive digital twin scene.
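As a quick sanity check after copying, the folder layout above can be verified with a small shell function. This is a minimal sketch, not part of SafeVerse: the name `check_envs` is hypothetical, and it assumes that `env{0-31}` means 32 folders named `env0` through `env31`.

```shell
# Sketch: verify that <server_dir>/env0 ... <server_dir>/env31 exist and are non-empty.
# Assumption: "env{0-31}" denotes 32 folders named env0 through env31.
check_envs() {
    server_dir="${1:-./server}"
    missing=0
    i=0
    while [ "$i" -le 31 ]; do
        dir="$server_dir/env$i"
        # A folder counts as missing if it does not exist or has no entries.
        if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
            echo "missing or empty: $dir"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done
    echo "$missing of 32 environment folders missing or empty"
    [ "$missing" -eq 0 ]
}
```

Running `check_envs ./server` from the repository root lists any folder that still needs to be populated and returns a non-zero status if one is found.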
These environments are not static maps — they contain:
- Physically interactive objects
- Editable layouts
Some furniture assets achieve better visual realism with the TMEO resource pack.
If you would like enhanced visual fidelity during training visualization or demonstrations, you may optionally purchase and install TMEO:
This step is not mandatory for functionality but improves realism for qualitative evaluation and demos.
The Minecraft server acts as the interactive execution engine for SafeVerse.
It hosts reconstructed scenes and enables:
- Real-time environment manipulation
- Adversarial scene editing
- Agent–environment interaction
- Online RL feedback loop
Create and activate the server environment:
cd ./server
conda env create -f server_env.yml
conda activate <your_env_name>
Then start the server (the cd is needed only if you are not already inside ./server):
cd ./server
bash start_server.sh
After startup, the server becomes the execution backend for training. Record the server node's IP address; it is required in the training configuration.
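Before pointing the training job at the server, it can help to confirm that the recorded address actually accepts connections. The sketch below is an illustration only: `wait_for_server` is a hypothetical helper, it relies on bash's `/dev/tcp` redirection (so it needs bash, not a plain POSIX shell), and the default port 25565 is the Minecraft Java Edition default, which `start_server.sh` may or may not use.

```shell
# Sketch: poll a TCP port until it accepts a connection or the attempts run out.
# Assumptions: bash is the shell (for /dev/tcp), and the server listens on
# port 25565 unless another port is passed as the second argument.
wait_for_server() {
    host="$1"
    port="${2:-25565}"
    tries="${3:-30}"
    n=1
    while :; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
            return 0
        fi
        if [ "$n" -ge "$tries" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep 1
    done
}
```

For example, `wait_for_server <server_ip> && echo "server reachable"` retries once per second for up to 30 attempts. An unroutable address may block for longer than one second per attempt, since the sketch sets no connection timeout.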
SafeVerse enables online reinforcement learning within reconstructed real-world scenes.
Unlike traditional embodied training pipelines that rely on fixed datasets and static environments, SafeVerse introduces a continuously evolving training loop.
Agents trained in SafeVerse:
- Operate inside reconstructed real-world scenes
- Face dynamically edited adversarial conditions
- Adapt through online reinforcement learning
- Improve robustness under distribution shifts
Typical adversarial edits include:
- Blocking critical navigation paths
- Locking or modifying object interaction properties
- Rearranging furniture and spatial layouts
- Changing lighting or visibility conditions
This creates a dynamic curriculum instead of a closed-world benchmark.
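To make the edit types above concrete, the following sketch maps an episode number to one example edit written as a vanilla Minecraft console command (`fill`, `clone`, `time set`). It is purely illustrative: `adversarial_edit` is a hypothetical name, every coordinate is a placeholder, and the source does not say which commands SafeVerse issues or how they are delivered to the server.

```shell
# Sketch: pick one example adversarial edit per episode, cycling through three types.
# All coordinates are hypothetical placeholders, not values from SafeVerse.
adversarial_edit() {
    episode="$1"
    case $((episode % 3)) in
        # Fill a doorway-sized gap with stone to block a navigation path.
        0) echo "fill 10 64 10 12 66 10 minecraft:stone" ;;
        # Move a small volume (for example, a piece of furniture) to a new location.
        1) echo "clone 0 64 0 2 65 2 20 64 20 replace move" ;;
        # Switch to midnight to reduce ambient light.
        2) echo "time set midnight" ;;
    esac
}
```

Cycling by episode number keeps the sketch deterministic; a real curriculum would more likely sample edits at random or in response to the agent's current performance.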
cd ./train
conda env create -f train_env.yml
After successfully creating the training environment, activate it:
conda activate <your_env_name>
Make sure this environment matches the dependencies specified in train_env.yml.
This environment contains all required libraries for reinforcement learning, environment interaction, and experiment logging.
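One way to confirm that the environment was created is to look for its name in the output of `conda env list`. The helper below is a minimal sketch; `env_in_list` is a hypothetical name, and it assumes the usual `conda env list` layout in which each environment name is the first field on its line.

```shell
# Sketch: read `conda env list` output on stdin and succeed if the
# environment named in $1 appears as the first field of any line.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

Typical usage is `conda env list | env_in_list <your_env_name> && echo "environment found"`. The same check works for the server environment.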
After the Minecraft server is running and the training environment is activated, you can launch the SafeVerse training pipeline:
cd ./train
bash train.sh
Before launching training, make sure the following variables are properly configured inside train.sh:
export SERVER_IP=""
- The IP address of the running Minecraft server.
- The training process connects to this server for real-time environment interaction.
BASE_NAME=""
- Identifier for the current experiment.
- Used to name:
  - Training logs
  - Model checkpoints
  - Weights & Biases (WandB) runs
export SAVE_CHECKPOINT_DIR=""
- Directory where model checkpoints will be saved.
- Ensure the path exists and has sufficient storage space.
DATA=""
- Path to the dataset used for training initialization (if applicable).
- This may include pretrained weights or task configuration files.
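The variable requirements above can be checked before launching with a short preflight function. This is a sketch under stated assumptions: `preflight` is a hypothetical helper that is not part of train.sh, it expects the four variables to be set in the current shell, and it treats DATA as optional because the description marks it "if applicable".

```shell
# Sketch: validate the train.sh variables before launching training.
# Assumption: SERVER_IP, BASE_NAME, SAVE_CHECKPOINT_DIR, and DATA are set in this shell.
preflight() {
    status=0
    for var in SERVER_IP BASE_NAME SAVE_CHECKPOINT_DIR; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "unset or empty: $var"
            status=1
        fi
    done
    if [ -n "${SAVE_CHECKPOINT_DIR:-}" ]; then
        if [ -d "$SAVE_CHECKPOINT_DIR" ]; then
            # Column 4 of `df -Pk` is the available space in KiB.
            free_kib=$(df -Pk "$SAVE_CHECKPOINT_DIR" | awk 'NR == 2 { print $4 }')
            echo "free space in checkpoint directory: ${free_kib} KiB"
        else
            echo "checkpoint directory does not exist: $SAVE_CHECKPOINT_DIR"
            status=1
        fi
    fi
    # DATA is optional, so an empty value only produces a note.
    if [ -z "${DATA:-}" ]; then
        echo "note: DATA is empty; training starts without an initialization dataset"
    fi
    return "$status"
}
```

Calling `preflight && bash train.sh` stops early when a required variable is empty or the checkpoint directory is missing, which is cheaper than discovering the problem after the job has started.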