VR Experiment Data Processing Pipeline
An open-source toolkit for standardized XR behavioral research. Converts Unity/Meta Quest tracking data to BIDS (Brain Imaging Data Structure) compliant format.
- 🎯 Multi-system tracking support: Head, Hands, Eyes, Face, Body, Controllers
- 📊 BIDS-compliant output: Generates motion.tsv, channels.tsv, events.tsv, and JSON sidecars; LATENCY channels (per-system and global timeSinceStartup) for per-sample timing from recording onset
- ✅ Quality validation: Tracking loss, sampling irregularities; column-specific flagging (e.g. flag only left-hand columns when left hand loses tracking)
- 📈 HTML reports: Visual quality reports; quality flag times in global time (converted to timeSinceStartup and relative to recording onset)
- ⚙️ Configurable pipeline: YAML-based configuration; alternate time columns per system (e.g. Hands → Node_HandLeft_Time); optional quality masking (NaN replacement, no row deletion)
- 🔧 Multi-stream validation: Checks can declare required_streams and access other streams via session.get_stream()
- 🔧 CLI & programmatic API: Use from command line or import as a library

For users (minimal install):
git clone https://github.com/ResXR/ResXR.git
cd ResXR
conda env create -f environment.yml
conda activate resxr

For developers (includes dev tools):
git clone https://github.com/ResXR/ResXR.git
cd ResXR
conda env create -f environment-dev.yml
conda activate resxr_dev

The dev environment includes pytest and ruff, and installs the package in editable mode.
git clone https://github.com/ResXR/ResXR.git
cd ResXR
uv sync --all-extras

Run tests:
uv run pytest

- Python ≥3.10 (3.12+ recommended)
- pandas, numpy, pyyaml, pydantic, plotly, jinja2
Edit config/pipeline_config.yaml:
# Input/Output paths
input:
  data_dir: /path/to/your/session/data
  continuous_data_pattern: "*_ContinuousData.csv"
  face_data_pattern: "*_FaceExpressionData.csv"
  metadata_pattern: "*session_metadata.json"
  events_data_pattern: "*events*.csv"  # Optional: task/stimulus events
output:
  bids_root: /path/to/output
  dataset_name: "My VR Study"
  task_name: "VRtracking"
# Map source folders to BIDS subject/session IDs
session_mappings:
  - source_dir: "session_001"
    subject_id: "01"
    session_label: "01"
# Hardware metadata
device:
  manufacturer: "Meta"
  model_name: "Meta Quest"

CLI:
resxr -c config/pipeline_config.yaml

Python:
import resxr
resxr.run(config_path="config/pipeline_config.yaml")

Dry run:
resxr -c config/pipeline_config.yaml --dry-run

┌─────────────────────────────────────────────────────────────────┐
│                         ResXR Pipeline                          │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌──────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Load Data      │───▶│  Split by        │───▶│   Validate      │
│   (CSV + JSON)   │    │  Tracking        │    │   Quality       │
│                  │    │  System          │    │                 │
└──────────────────┘    └──────────────────┘    └─────────────────┘
                                                         │
                                                         ▼
┌──────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Generate       │◀───│  Export BIDS     │◀───│   Preprocess    │
│   Reports        │    │  (prepare_       │    │   (optional     │
│   (global time)  │    │  motion_data +   │    │   NaN masking)  │
│                  │    │  write)          │    │                 │
└──────────────────┘    └──────────────────┘    └─────────────────┘
Before writing, each stream is passed through prepare_motion_data: internal time columns (timestamp, timeSinceStartup) are converted into BIDS LATENCY channels (latency, latency_global — seconds from recording onset) and the originals are removed. The HTML report shows quality flag times in global time (timeSinceStartup), relative to recording onset.
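The time-column conversion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not ResXR's actual prepare_motion_data: the helper names first_nonzero and to_latency_channels are hypothetical, rows are represented as a list of dicts rather than a DataFrame, and onset is assumed to be the first non-zero value of each time column.

```python
def first_nonzero(values):
    """Return the first non-zero value, or 0.0 if there is none (assumed onset rule)."""
    for v in values:
        if v != 0:
            return v
    return 0.0


def to_latency_channels(rows):
    """Replace internal time columns with latency columns measured from onset.

    `rows` is a list of dicts that each contain 'timestamp' and 'timeSinceStartup'.
    The input rows are left untouched; new dicts are returned.
    """
    renames = {"timestamp": "latency", "timeSinceStartup": "latency_global"}
    onsets = {src: first_nonzero([r[src] for r in rows]) for src in renames}
    prepared = []
    for row in rows:
        # Keep every data column, drop the original time columns.
        out = {k: v for k, v in row.items() if k not in renames}
        for src, dst in renames.items():
            out[dst] = row[src] - onsets[src]
        prepared.append(out)
    return prepared


rows = [
    {"timestamp": 10.0, "timeSinceStartup": 100.0, "Node_Head_px": 0.1},
    {"timestamp": 10.5, "timeSinceStartup": 100.5, "Node_Head_px": 0.2},
]
prepared = to_latency_channels(rows)
print(prepared[1])
```

The first sample of each latency column is 0.0, and the original timestamp and timeSinceStartup keys no longer appear in the prepared rows.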
ResXR expects data recorded from Unity/Meta Quest with the following structure:
session_folder/
├── *_ContinuousData.csv # Main tracking data (head, hands, eyes, etc.)
├── *_FaceExpressionData.csv # Face tracking (FACS blendshapes) - optional
├── *events*.csv # Task/stimulus event markers - optional
└── *session_metadata.json # Recording configuration & timestamps
The pipeline automatically identifies tracking systems by column prefixes:
| System | Column Prefixes |
|---|---|
| Head | Node_Head_*, FocusedObject, RecenterCount, TrackingLost, UserPresent, timeSinceStartup, TrackingOriginChange_*, TrackingTransform_* |
| Hands | Node_HandLeft_*, Node_HandRight_*, LeftHand_*, RightHand_*, Left_XRHand_*, Right_XRHand_* |
| Eyes | EyeGazeHitPosition_*, RightEye_*, LeftEye_*, Node_EyeCenter_*, Eyes_Time |
| Face | Face_*, Brow_*, Cheek_*, Jaw_*, Lip_*, Lips_*, Tongue_*, etc. |
| Body | Body_* (e.g. Body_Time, Body_Confidence, Body_Chest_px, Body_Head_Flags_*) |
| Controllers | Node_ControllerLeft_*, Node_ControllerRight_* |
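Prefix-based classification like the table above could look roughly like the following. This is a sketch using shell-style patterns from the standard library; the names SYSTEM_PATTERNS, classify_column, and split_columns are hypothetical, only a subset of the prefixes is included, and ResXR's own matching logic may differ.

```python
from fnmatch import fnmatchcase

# A subset of the patterns from the table; the first matching system wins.
SYSTEM_PATTERNS = {
    "Head": ["Node_Head_*", "TrackingTransform_*", "timeSinceStartup"],
    "Hands": ["Node_HandLeft_*", "Node_HandRight_*", "Left_XRHand_*", "Right_XRHand_*"],
    "Eyes": ["EyeGazeHitPosition_*", "LeftEye_*", "RightEye_*", "Eyes_Time"],
    "Body": ["Body_*"],
    "Controllers": ["Node_ControllerLeft_*", "Node_ControllerRight_*"],
}


def classify_column(column):
    """Return the name of the first system whose pattern matches, else None."""
    for system, patterns in SYSTEM_PATTERNS.items():
        if any(fnmatchcase(column, p) for p in patterns):
            return system
    return None


def split_columns(columns):
    """Group column names by tracking system; unmatched columns go under None."""
    groups = {}
    for col in columns:
        groups.setdefault(classify_column(col), []).append(col)
    return groups


columns = ["Node_Head_px", "Left_XRHand_Wrist_x", "Eyes_Time", "Body_Confidence", "Unknown"]
groups = split_columns(columns)
print(groups)
```

Case-sensitive matching (fnmatchcase) is used so that results do not depend on the operating system.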
If using task/stimulus events, provide a CSV with these columns:
| Column | Type | Description | Example |
|---|---|---|---|
| onset | float | Event start time in seconds (relative to session start) | 2.5, 5.832 |
| duration | float | Event duration in seconds (use 0 for instantaneous events) | 1.5, 0 |
| name | string | Event type/label; exported as BIDS trial_type | "stimulus_onset", "response", "trial_start" |

Example events CSV:
onset,duration,name
0.0,0,trial_start
2.5,1.5,stimulus_A
4.2,0,response
5.0,0,trial_end

Note: Duration of 0 indicates an instantaneous event (e.g., button press, stimulus onset).
At BIDS export, ResXR writes events to the session motion/ directory using the configured task name, for example sub-01_ses-01_task-VRtracking_events.tsv. The input name column is exported as BIDS trial_type, and the output events.tsv columns start with BIDS-required onset, duration, then trial_type.
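The column mapping described here (name becomes trial_type, with onset and duration first) can be illustrated with the standard csv module. The function events_csv_to_tsv is a hypothetical helper written for this example and is not part of the ResXR API.

```python
import csv
import io


def events_csv_to_tsv(csv_text):
    """Convert an events CSV (onset, duration, name) to BIDS-style TSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # BIDS-required columns come first; the input 'name' is written as 'trial_type'.
    writer.writerow(["onset", "duration", "trial_type"])
    for row in reader:
        writer.writerow([row["onset"], row["duration"], row["name"]])
    return out.getvalue()


csv_text = "onset,duration,name\n0.0,0,trial_start\n2.5,1.5,stimulus_A\n"
tsv_text = events_csv_to_tsv(csv_text)
print(tsv_text)
```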
bids_root/
├── dataset_description.json
├── participants.tsv
├── participants.json
├── .bidsignore
└── sub-01/
└── ses-01/
├── sub-01_ses-01_scans.tsv
└── motion/
├── sub-01_ses-01_task-VRtracking_tracksys-Head_motion.tsv
├── sub-01_ses-01_task-VRtracking_tracksys-Head_motion.json
├── sub-01_ses-01_task-VRtracking_tracksys-Head_channels.tsv
├── sub-01_ses-01_task-VRtracking_tracksys-Head_channels.json
├── sub-01_ses-01_task-VRtracking_tracksys-Hands_motion.tsv
├── sub-01_ses-01_task-VRtracking_events.tsv # Optional: task events
├── sub-01_ses-01_task-VRtracking_events.json
└── ... (similar for Eyes, Face, etc.)
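The file names in this tree follow a key-value pattern that is easy to reproduce. The sketch below is an illustration only; bids_motion_filename is a hypothetical helper, not the function in bids/naming.py.

```python
from pathlib import PurePosixPath


def bids_motion_filename(subject, session, task, tracksys=None, suffix="motion", ext=".tsv"):
    """Build a file name such as sub-01_ses-01_task-X_tracksys-Head_motion.tsv."""
    parts = [f"sub-{subject}", f"ses-{session}", f"task-{task}"]
    if tracksys is not None:
        # Events files carry no tracksys entity, so it is optional here.
        parts.append(f"tracksys-{tracksys}")
    parts.append(suffix)
    return "_".join(parts) + ext


motion_name = bids_motion_filename("01", "01", "VRtracking", tracksys="Head")
events_name = bids_motion_filename("01", "01", "VRtracking", suffix="events")
motion_path = PurePosixPath("sub-01") / "ses-01" / "motion" / motion_name
print(motion_path)
print(events_name)
```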
All BIDS metadata values are configurable:
bids:
  missing_values: "NaN"  # How NaN values are written
  dataset_type: "raw"  # BIDS dataset type
  license: "CC0"  # Dataset license
  authors: []  # List of authors
  reference_frame:  # Coordinate system
    description: "Global VR playspace coordinate system..."
    rotation_rule: "left-hand"
    rotation_order: "ZXY"
    spatial_axes: "RSA"

Expected sampling rates for each tracking system (Hz):
sampling_frequencies:
  Head: 72.0
  Hands: 90.0
  Eyes: 30.0
  Face: 30.0
  Body: 72.0
  Controllers: 90.0

Task descriptions for BIDS sidecar files:
system_descriptions:
  Head: "Head position and rotation tracking from Meta Quest VR headset"
  Hands: "Hand and finger tracking from Meta Quest VR headset"
  Eyes: "Eye gaze tracking from Meta Quest VR headset"
  Face: "Facial expression tracking using FACS blend shapes"
  Body: "Full body joint tracking from Meta Quest VR headset"
  Controllers: "VR controller position and rotation tracking"

| Option | Description | Required |
|---|---|---|
| data_dir | Root directory containing session data | Yes |
| continuous_data_pattern | Glob pattern for main tracking CSV | Yes |
| face_data_pattern | Glob pattern for face tracking CSV | Yes |
| metadata_pattern | Glob pattern for session metadata JSON | Yes |
| events_data_pattern | Glob pattern for optional events CSV | No |

| Option | Description | Required |
|---|---|---|
| bids_root | Output directory for BIDS data | Yes |
| dataset_name | Dataset name in dataset_description.json | Yes |
| bids_version | BIDS specification version | Yes |
| task_name | Task name in filenames | Yes |
| overwrite | Overwrite existing files | Yes |

device:
  manufacturer: "Meta"
  model_name: "Meta Quest"

Enable/disable specific tracking systems:
systems:
  Head: { enabled: true }
  Hands: { enabled: true }
  Eyes: { enabled: true }
  Face: { enabled: true }
  Body: { enabled: false }
  Controllers: { enabled: false }

validation:
  sampling_rate_tolerance: 0.10  # Max deviation from expected rate (10%)
  sampling_cv_threshold: 0.5  # Max coefficient of variation (50%)
  eyes_closed_threshold: 0.9  # Blend shape value to consider eye closed (0-1)
  eyes_closed_min_duration: 0.1  # Min duration (seconds) to flag as closed
  enabled_checks:
    - hands_tracking_loss
    - sampling_rate
    - eyes_closed
    - stats_summary
  # Column groups for column-scoped checks
  column_groups:
    - name: "Left Hand"
      description: "Left hand wrist position and rotation"
      columns:
        - Left_XRHand_Wrist_x
        - Left_XRHand_Wrist_y
        - Left_XRHand_Wrist_z
    - name: "Right Hand"
      description: "Right hand wrist position and rotation"
      columns:
        - Right_XRHand_Wrist_x
        - Right_XRHand_Wrist_y
        - Right_XRHand_Wrist_z
  # Optional: assign which checks receive which groups (by group name)
  check_column_groups:
    hands_tracking_loss:
      - "Left Hand"
      - "Right Hand"

Column groups are defined by name, description, and columns. Each entry must have a name and a columns key (validated at config load time). check_column_groups optionally assigns which checks receive which groups (by group name). If column_groups is omitted, checks fall back to a single default group containing all columns in the stream.
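The two sampling thresholds in the validation settings (sampling_rate_tolerance and sampling_cv_threshold) can be read as in the sketch below. The exact formulas ResXR uses are not stated in this document, so this assumes that the effective rate is the inverse of the mean inter-sample interval and that the coefficient of variation is computed over those intervals. The function name sampling_rate_summary is hypothetical.

```python
from statistics import mean, pstdev


def sampling_rate_summary(times, expected_hz, tolerance=0.10, cv_threshold=0.5):
    """Compare the effective sampling rate and interval regularity to thresholds."""
    intervals = [b - a for a, b in zip(times, times[1:])]
    mean_dt = mean(intervals)
    effective_hz = 1.0 / mean_dt
    # Coefficient of variation: spread of the intervals relative to their mean.
    cv = pstdev(intervals) / mean_dt
    return {
        "effective_hz": effective_hz,
        "cv": cv,
        "rate_ok": abs(effective_hz - expected_hz) / expected_hz <= tolerance,
        "cv_ok": cv <= cv_threshold,
    }


# One second of perfectly regular samples at 72 Hz and at 60 Hz.
regular = sampling_rate_summary([i / 72.0 for i in range(73)], expected_hz=72.0)
slow = sampling_rate_summary([i / 60.0 for i in range(61)], expected_hz=72.0)
print(regular["rate_ok"], slow["rate_ok"])
```

With a 10% tolerance, a stream that actually runs at 60 Hz against an expected 72 Hz deviates by about 17% and would be flagged, even though its intervals are perfectly regular.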
Quality flags can optionally be applied as NaN masks: flagged values are replaced with NaN while all rows are preserved (no row deletion). This is disabled by default; the raw dataset is never modified. At BIDS export, prepare_motion_data converts internal time columns into BIDS LATENCY channels (seconds from recording onset) and strips the originals.
preprocessing:
  apply_quality_masking: false  # Enable NaN masking for flagged segments
  masking_checks: null  # null = all flags; or list specific checks e.g. ["tracking_loss"]
# Per-system time column. When set, splitter renames it to timestamp and keeps global as timeSinceStartup.
# At write time: timestamp → latency (per-system), timeSinceStartup → latency_global. Leave {} for global only.
alternate_time_columns:
  Hands: "Node_HandLeft_Time"
  # Eyes: "Eyes_Time"
  # Body: "Body_Time"

Alternate time columns: Some streams use their own time column (e.g. Node_HandLeft_Time for Hands). Map system name to exactly one column name; that column is used as the stream’s time axis and included in the stream’s columns.
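The optional quality masking described in this section (NaN replacement without row deletion) can be sketched as follows. This is an illustration, not ResXR's implementation: mask_flagged is a hypothetical helper, and each flag is assumed to be a (start, end, columns) tuple in the same time base as the timestamp column.

```python
import math


def mask_flagged(rows, flags, time_key="timestamp"):
    """Return copies of `rows` with flagged values set to NaN; no rows are dropped."""
    masked = [dict(r) for r in rows]  # copy so the input data stays unmodified
    for start, end, columns in flags:
        for row in masked:
            if start <= row[time_key] <= end:
                for col in columns:
                    row[col] = math.nan
    return masked


rows = [
    {"timestamp": 0.0, "Left_XRHand_Wrist_x": 0.1, "Right_XRHand_Wrist_x": 0.4},
    {"timestamp": 0.1, "Left_XRHand_Wrist_x": 0.2, "Right_XRHand_Wrist_x": 0.5},
    {"timestamp": 0.2, "Left_XRHand_Wrist_x": 0.3, "Right_XRHand_Wrist_x": 0.6},
]
# Left hand lost tracking between 0.1 s and 0.2 s; the right hand is unaffected.
flags = [(0.1, 0.2, ["Left_XRHand_Wrist_x"])]
masked = mask_flagged(rows, flags)
print(masked)
```

Because only the flagged columns are masked, the right-hand values in the same rows remain available for analysis.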
report:
  enabled: true
  output_dir: null  # null = auto (e.g. alongside BIDS output)

from resxr import run, PipelineConfig, Session
# Option 1: Run with config file
run(config_path="config/pipeline_config.yaml")
# Option 2: Load and inspect config
from resxr.core.config import PipelineConfig
config = PipelineConfig.from_yaml("config/pipeline_config.yaml")
print(f"Input: {config.input.data_dir}")
print(f"Output: {config.output.bids_root}")
# Option 3: Access data structures directly (using config from YAML)
from resxr.io.readers import load_session
from resxr.core.config import PipelineConfig
config = PipelineConfig.from_yaml("config/pipeline_config.yaml")
session = load_session("/path/to/session_dir", config.input)
print(f"Session: {session.session_id}")
print(f"Duration: {session.total_duration_seconds:.1f}s")

When enabled, ResXR generates HTML reports including:
- Session summary: Duration, streams, quality flags (including column-specific flags with group labels when column_groups is used)
- Timeline plot: Interactive Plotly timeline with flagged segments and optional event markers
- Per-stream stats: Row counts, channel counts, NaN percentages, expected vs effective sampling rates
- Quality flags table: All flag times are in global time (timeSinceStartup), converted from per-stream time and expressed relative to recording onset (single shared timeline from 0)

ResXR/
├── config/
│ └── pipeline_config.yaml # Pipeline configuration (all values here!)
├── src/resxr/
│ ├── __init__.py # Package exports
│ ├── cli.py # Command-line interface
│ ├── pipeline.py # Main orchestration (calls prepare_motion_data at write)
│ ├── core/ # Core data structures
│ │ ├── config.py # PipelineConfig, ColumnGroup, ValidationConfig, etc.
│ │ ├── constants.py # Enums & column patterns
│ │ ├── exceptions.py # Custom exceptions
│ │ ├── logger.py # Logging setup
│ │ └── session.py # Session & TrackingStream
│ ├── io/ # Input/Output
│ │ ├── readers.py # CSV/JSON loaders
│ │ ├── splitter.py # Split by tracking system (split-only; no LATENCY here)
│ │ ├── writers.py # TSV/JSON writers (motion.tsv expects prepared data)
│ │ └── column_maps.py # Column classification; LATENCY channel recognition
│ ├── bids/ # BIDS formatting
│ │ ├── layout.py # Directory structure
│ │ ├── metadata.py # JSON sidecar (uses prepared_data when provided)
│ │ ├── naming.py # BIDS filename conventions
│ │ └── channels.py # channels.tsv from prepared DataFrame
│ ├── preprocessing/ # Data cleaning & BIDS prep
│ │ └── stream_preprocessing.py # Masking + prepare_motion_data (LATENCY channels)
│ ├── validation/ # Quality checks
│ │ ├── registry.py # Check registration
│ │ └── checks/ # Individual validators (extensible!)
│ │ ├── hands_tracking_loss.py # Hand tracking loss detection
│ │ ├── sampling_rate.py # Sampling irregularity detection
│ │ ├── eyes_closed.py # Eye closure detection (face data)
│ │ └── stats.py # Per-stream/column statistics
│ ├── utils/ # Shared utilities
│ │ └── __init__.py # find_recording_onset (onset = first non-zero time)
│ └── visualization/ # Reporting
│ ├── report.py # HTML report (flag times in global timeSinceStartup)
│ └── templates/report.html
You can add custom validation checks in 3 steps:
- Create a check class in validation/checks/:
from resxr.core.config import ColumnGroup, ValidationConfig
from resxr.core.session import QualityFlag, Session, TrackingStream
from resxr.validation.registry import register_check


class MyCheck:
    name = "my_check"
    description = "Short description"
    required_streams = None  # None = per-stream; or [TrackingSystem.HANDS, ...] for multi-stream

    def __call__(self, stream, session, config):
        df = stream.data
        # Get column groups for this check (falls back to all columns if not configured)
        groups = config.get_column_groups(
            self.name,
            default_columns=[c for c in df.columns if c != "timestamp"],
        )
        flags = []
        for group in groups:
            # group.name - human-readable label
            # group.columns - list of column names
            # group.description - longer description
            ...
        return flags


# Register an instance of the check
register_check(MyCheck())
- Export in checks/__init__.py
- Enable in YAML:
validation:
  enabled_checks:
    - my_check

Column groups: Checks call config.get_column_groups(self.name) to get their assigned ColumnGroup objects. If no groups are configured in the YAML, passing default_columns provides an automatic fallback (a single group with all stream columns). Groups are defined in config.py as a simple dataclass with name, description, and columns fields.
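The fallback behavior can be illustrated with a small stand-alone sketch. It does not reproduce the real get_column_groups: the GroupSketch dataclass and resolve_groups function are hypothetical, and the choice to hand all configured groups to a check that has no entry in check_column_groups is an assumption.

```python
from dataclasses import dataclass


@dataclass
class GroupSketch:
    """Stand-in for a column group with name, columns, and description fields."""
    name: str
    columns: list
    description: str = ""


def resolve_groups(check_name, column_groups, check_column_groups, default_columns):
    """Pick the groups a check should receive, falling back to one catch-all group."""
    if not column_groups:
        # Nothing configured: a single default group with all stream columns.
        return [GroupSketch(name="all", columns=list(default_columns))]
    assigned = check_column_groups.get(check_name)
    if assigned is None:
        # Assumption: a check without an explicit assignment receives every group.
        return list(column_groups)
    return [g for g in column_groups if g.name in assigned]


configured = [
    GroupSketch("Left Hand", ["Left_XRHand_Wrist_x"]),
    GroupSketch("Right Hand", ["Right_XRHand_Wrist_x"]),
]
assignments = {"hands_tracking_loss": ["Left Hand"]}

fallback = resolve_groups("my_check", [], {}, ["a", "b"])
selected = resolve_groups("hands_tracking_loss", configured, assignments, ["a", "b"])
print([g.name for g in fallback], [g.name for g in selected])
```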
Multi-stream checks: Set required_streams = [TrackingSystem.X, TrackingSystem.Y]. The check runs once when processing the first stream in the list and can access other streams via session.get_stream(system). Use this when a check needs data from more than one tracking system.
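The run-once rule for multi-stream checks can be pictured with the dispatcher sketch below. It is a simplified model rather than the pipeline's real scheduling code: should_run, run_checks, FakeSession, and HandsEyesCheck are all hypothetical, and plain strings stand in for the TrackingSystem enum members.

```python
def should_run(check, stream_system):
    """Per-stream checks run for every stream; multi-stream checks only for the first listed."""
    required = getattr(check, "required_streams", None)
    if required is None:
        return True
    return stream_system == required[0]


class FakeSession:
    """Minimal stand-in exposing only get_stream()."""

    def __init__(self, streams):
        self._streams = streams

    def get_stream(self, system):
        return self._streams.get(system)


class HandsEyesCheck:
    name = "hands_eyes"
    required_streams = ["Hands", "Eyes"]

    def __call__(self, stream, session, config):
        # The check is called with the Hands stream and looks up Eyes itself.
        eyes = session.get_stream("Eyes")
        return [f"{self.name}: {len(stream)} hand rows, {len(eyes)} eye rows"]


def run_checks(checks, session, systems, config=None):
    results = []
    for system in systems:
        stream = session.get_stream(system)
        for check in checks:
            if should_run(check, system):
                results.extend(check(stream, session, config))
    return results


session = FakeSession({"Head": [1, 2, 3], "Hands": [1, 2], "Eyes": [1]})
results = run_checks([HandsEyesCheck()], session, ["Head", "Hands", "Eyes"])
print(results)
```

Even though three streams are processed, the check produces a single result because it is triggered only for the Hands stream.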
See the project README and source docstrings for details.
Apache License 2.0 - see LICENSE file for details.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
If you use ResXR in your research, please cite:
@software{resxr,
  title = {ResXR: XR Experiment Data Processing Pipeline},
  year = {2026},
  url = {https://github.com/ResXR/ResXR}
}