This project provides tools for extracting and visualizing data from StarCraft II replay files (.SC2Replay) for AI training purposes. The extracted data is stored in HDF5, a format well suited to large NumPy arrays.
- Extract feature screen data (height map, visibility map, unit types, etc.)
- Extract feature minimap data
- Extract RGB screen and minimap data (using placeholders when actual RGB data is not available)
- Extract player information (minerals, vespene, food, army count, etc.)
- Extract game information (map name, duration, version)
- Extract player actions
- Visualize extracted data
- Python 3.6+
- PySC2
- NumPy
- Matplotlib
- h5py
- Install the required packages:
    pip install pysc2 numpy matplotlib h5py
- Make sure you have StarCraft II installed and the replay files you want to analyze.
    python replay_extractor.py --replay_path=/path/to/replay.SC2Replay --output_dir=extracted_data --verbose
Options:
- --replay_path: Path to the replay file (required)
- --output_dir: Directory to save extracted data (default: "extracted_data")
- --step_mul: Game steps per observation (default: 8)
- --skip_frames: Number of frames to skip between extractions (default: 24)
- --max_steps: Maximum number of steps to extract (0 for all, default: 0)
- --extract_all: Extract all features (default: True)
- --extract_feature_screen: Extract feature screen (default: True)
- --extract_feature_minimap: Extract feature minimap (default: True)
- --extract_rgb_screen: Extract RGB screen (default: True)
- --extract_rgb_minimap: Extract RGB minimap (default: True)
- --extract_player_info: Extract player information (default: True)
- --extract_game_info: Extract game information (default: True)
- --extract_actions: Extract player actions (default: True)
- --verbose: Print verbose information (default: False)
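To process a whole folder of replays, the command line above can be assembled once per file. The following is a minimal sketch, not part of the project: `build_extractor_command` and `build_batch` are hypothetical helper names, and only flags listed above are used.

```python
from pathlib import Path


def build_extractor_command(replay_path, output_dir="extracted_data",
                            step_mul=8, verbose=False):
    """Return the argument list for one replay_extractor.py run.

    Hypothetical helper: it only formats the documented flags and does
    not start the extractor itself.
    """
    cmd = [
        "python", "replay_extractor.py",
        f"--replay_path={replay_path}",
        f"--output_dir={output_dir}",
        f"--step_mul={step_mul}",
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def build_batch(replay_dir, **kwargs):
    """Build one command per .SC2Replay file found directly in replay_dir."""
    replays = sorted(Path(replay_dir).glob("*.SC2Replay"))
    return [build_extractor_command(p, **kwargs) for p in replays]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the project directory, so that a failing replay stops the batch instead of being silently skipped.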
    python visualize_extracted_data.py extracted_data/replay_name.h5 --frame 0 --save
Options:
- data_path: Path to the extracted data file (.h5) (required)
- --frame: Frame index to visualize (default: 0)
- --save: Save visualization to file (default: False)
- --output: Output file path (default: None, will use the same name as the input file with "_visualization.png" suffix)
- --save-features: Save individual feature maps as images (default: False)
    python examine_data.py extracted_data/replay_name.h5 --frame 0
Options:
- data_path: Path to the extracted data file (.h5) (required)
- --frame: Frame index to examine in detail (default: 0)
    python dataset_utils.py
This script provides utilities for managing datasets, including:
- Listing replay files and extracted data files
- Merging multiple datasets
- Extracting features and targets
- Creating train-test splits
- Flattening features for use with scikit-learn models
- Saving and loading dataset splits
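Two of the steps listed above, flattening features and creating a train-test split, can be illustrated with NumPy alone. This is a sketch under stated assumptions and not the code in dataset_utils.py: the function names, the 4x4 stand-in feature maps, and the 20% test fraction are all made up for illustration.

```python
import numpy as np


def flatten_features(features):
    """Reshape (n_samples, ...) into (n_samples, n_features) for scikit-learn."""
    features = np.asarray(features)
    return features.reshape(features.shape[0], -1)


def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) from one seeded random permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]


# Stand-in data: 10 frames of a 4x4 feature map, one scalar target per frame.
X = np.arange(10 * 4 * 4, dtype=np.float32).reshape(10, 4, 4)
y = np.arange(10)

X_flat = flatten_features(X)  # shape (10, 16)
train_idx, test_idx = train_test_split_indices(len(X_flat))
X_train, X_test = X_flat[train_idx], X_flat[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
```

Consecutive frames of one replay tend to look alike, so when several replays are merged it may be safer to split by replay rather than by frame; the sketch splits by frame only to stay short.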
    python example.py
This script demonstrates how to use the extracted data to train a simple machine learning model using scikit-learn.
The extracted data is stored in a hierarchical structure using HDF5:
/
├── game_info/
│   ├── map_name (attribute)
│   ├── game_duration_loops (attribute)
│   ├── game_duration_seconds (attribute)
│   ├── game_version (attribute)
│   └── players/
│       ├── player_0/
│       │   ├── player_id (attribute)
│       │   ├── race (attribute)
│       │   └── result (attribute)
│       └── player_1/
│           ├── player_id (attribute)
│           ├── race (attribute)
│           └── result (attribute)
└── frames/
    ├── frame_0/
    │   ├── game_loop (attribute)
    │   ├── feature_screen/
    │   │   ├── height_map (dataset)
    │   │   ├── visibility_map (dataset)
    │   │   ├── unit_type (dataset)
    │   │   └── player_relative (dataset)
    │   ├── feature_minimap/
    │   │   ├── height_map (dataset)
    │   │   ├── visibility_map (dataset)
    │   │   ├── camera (dataset)
    │   │   └── player_relative (dataset)
    │   ├── rgb_screen (dataset)
    │   ├── rgb_minimap (dataset)
    │   ├── player_info/
    │   │   ├── player_id (attribute)
    │   │   ├── minerals (attribute)
    │   │   ├── vespene (attribute)
    │   │   ├── food_cap (attribute)
    │   │   ├── food_used (attribute)
    │   │   ├── food_army (attribute)
    │   │   ├── food_workers (attribute)
    │   │   ├── idle_worker_count (attribute)
    │   │   ├── army_count (attribute)
    │   │   ├── warp_gate_count (attribute)
    │   │   └── larva_count (attribute)
    │   └── actions/
    │       ├── action_0/
    │       │   ├── game_loop (attribute)
    │       │   ├── action_type (attribute)
    │       │   ├── target_unit_tag (attribute)
    │       │   └── target_point/
    │       │       ├── x (attribute)
    │       │       ├── y (attribute)
    │       │       └── z (attribute)
    │       └── action_1/
    │           └── ...
    └── frame_1/
        └── ...
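A file with this layout can be read back with h5py. The sketch below writes a tiny toy file that follows the structure above (all values, shapes, and the map name are made up) and then reads one attribute per frame. The helper names are hypothetical, and it assumes frame groups are named `frame_<index>`. Because group names sort as strings, `frame_10` would come before `frame_2`, so the frames are ordered by their numeric suffix.

```python
import os
import tempfile

import h5py
import numpy as np


def frame_index(name):
    """Numeric index of a group named like 'frame_12'."""
    return int(name.rsplit("_", 1)[1])


def write_toy_file(path):
    """Write a small file following the documented layout (made-up values)."""
    with h5py.File(path, "w") as f:
        info = f.create_group("game_info")
        info.attrs["map_name"] = "ToyMap"
        frames = f.create_group("frames")
        for i in (0, 1, 10):
            frame = frames.create_group(f"frame_{i}")
            frame.attrs["game_loop"] = i * 8
            screen = frame.create_group("feature_screen")
            screen.create_dataset(
                "unit_type", data=np.full((4, 4), i, dtype=np.int32))
            player = frame.create_group("player_info")
            player.attrs["minerals"] = 50 + i


def read_minerals(path):
    """Return (game_loop, minerals) per frame, in numeric frame order."""
    rows = []
    with h5py.File(path, "r") as f:
        frames = f["frames"]
        for name in sorted(frames, key=frame_index):
            frame = frames[name]
            rows.append((int(frame.attrs["game_loop"]),
                         int(frame["player_info"].attrs["minerals"])))
    return rows


def read_feature(path, frame, layer, name):
    """Load one dataset, e.g. frames/frame_0/feature_screen/unit_type."""
    with h5py.File(path, "r") as f:
        return f[f"frames/frame_{frame}/{layer}/{name}"][()]


path = os.path.join(tempfile.mkdtemp(), "toy_replay.h5")
write_toy_file(path)
print(read_minerals(path))  # [(0, 50), (8, 51), (80, 60)]
```

Attributes are read through `.attrs`, while datasets are loaded into NumPy arrays with `[()]`; a real extracted file may omit groups whose extraction was disabled, so production code would need to check for missing keys.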
- RGB data extraction may not work properly in headless environments or with certain versions of StarCraft II. In such cases, placeholder images are used.
- The extractor currently only supports 1v1 replays.
- The extractor may not work with very old replay files due to compatibility issues with the StarCraft II API.
- When feature screen or minimap data is not available, synthetic features can be generated from RGB data using the --save-features option of the visualization script.
This project is licensed under the MIT License - see the LICENSE file for details.