The only pandas-compatible DataFrame library that compiles to TorchScript and runs in pure C++ inference — no Python runtime required.
Write your trading strategy with familiar `pd.DataFrame` / `pd.Series` syntax, compile it with `torch.jit.script`, ship the `.pt` artifact to a C++ engine, and run at bare-metal speed.
- 🔁 **Drop-in replacement** — `import xpandas as pd` replaces `import pandas as pd`, zero rewrite
- ⚡ **TorchScript native** — every op is a registered `TORCH_LIBRARY` C++ op, fully `torch.jit.script`-compatible
- 🚀 **Pure C++ inference** — `torch::jit::load("alpha.pt")` + `dlopen(libxpandas_ops.so)`, no Python at runtime
- 📊 **35 ops** covering groupby, rolling, ewm, shift, fillna, rank, zscore, cumulative, datetime, and more
- 🏎️ 2–163× faster than pandas on element-wise and rolling ops (see Benchmarks)
| Scenario | Why xpandas? |
|---|---|
| High-frequency / quantitative trading | Prototype Alpha signals in Python pandas, deploy to sub-millisecond C++ engines with zero rewrite |
| Online model serving | Embed feature engineering (rolling stats, z-scores, pct_change) inside a TorchScript model served by torch::jit in C++ |
| Low-latency inference pipelines | Eliminate Python GIL and interpreter overhead — the entire signal path runs in compiled C++ |
| Edge / embedded deployment | Ship a single .pt file + shared library — no Python installation needed on the target machine |
| | xpandas | pandas | Polars | Modin | cuDF (RAPIDS) |
|---|---|---|---|---|---|
| Primary goal | TorchScript compilation + C++ inference | General data analysis | Fast DataFrame engine | Scale pandas with parallelism | GPU-accelerated DataFrames |
| `torch.jit.script` support | ✅ First-class — every op is a `TORCH_LIBRARY` custom op | ❌ | ❌ | ❌ | ❌ |
| Pure C++ inference | ✅ `torch::jit::load()` — no Python at runtime | ❌ Requires Python | ❌ Requires Rust runtime | ❌ Requires Python | ❌ Requires Python + CUDA |
| Deployment artifact | Single `.pt` file + `.so` | Python source + env | Python/Rust source + env | Python source + env | Python source + env |
| Python GIL-free execution | ✅ All ops run in C++ | ❌ | ✅ (Rust) | Partial (Ray) | ✅ (GPU) |
| API compatibility | pandas subset (35 ops) | Full pandas API | Own API (SQL-like) | Full pandas API | pandas subset |
| Best for | Quant signals → C++ prod | EDA, general analytics | Large-scale data processing | Scaling existing pandas code | GPU batch processing |
In short: Other libraries optimize how fast you can crunch data in Python.
xpandas solves a fundamentally different problem: getting your pandas logic out of Python entirely and into a compiled, deployable, GIL-free C++ artifact.
Quantitative trading strategies are often prototyped in Python using pandas. Deploying them to a low-latency C++ engine traditionally requires a full rewrite. xpandas bridges this gap:
- Replace `import pandas as pd` with `import xpandas as pd`
- `torch.jit.script(model)` compiles the module to TorchScript
- Load the `.pt` file in C++ — no Python runtime needed
```
Python side                           C++ side
-----------                           --------
import xpandas                        dlopen(libxpandas_ops.so)
model = Alpha()                       auto m = torch::jit::load("alpha.pt")
scripted = torch.jit.script(model)    m.get_method("on_bod")({ts, data})
scripted.save("alpha.pt")             auto sig = m.forward({ts, data})
```
Data model:
- Use `xpandas.DataFrame` exactly like `pandas.DataFrame` — same API, zero rewrite
- Columns are 1-D `float64` tensors (numeric) or `int64` tensors (enum-encoded strings)
- Internally, each pandas-like operation dispatches to a registered `torch.ops.xpandas.*` C++ op
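The data model can be made concrete with a small pure-Python sketch. `MiniFrame` below is a hypothetical stand-in, not part of the library: lists replace the float64 tensors, and the `zscore` method inlines the formula that xpandas dispatches to its C++ `zscore` op (sample std, ddof=1):

```python
# Hypothetical sketch only: lists stand in for 1-D float64 tensors.
class MiniFrame:
    def __init__(self, columns):
        self.columns = dict(columns)   # column name -> column, like Dict[str, Tensor]

    def __getitem__(self, name):
        # Mirrors the `lookup` op: df['col'] -> column
        return self.columns[name]

    def zscore(self, name):
        # Mirrors the zscore op: (x - mean) / std, with sample std (ddof=1)
        x = self.columns[name]
        n = len(x)
        mean = sum(x) / n
        std = (sum((v - mean) ** 2 for v in x) / (n - 1)) ** 0.5
        return [(v - mean) / std for v in x]

df = MiniFrame({'price': [1.0, 2.0, 3.0]})
df.zscore('price')  # -> [-1.0, 0.0, 1.0]
```

The real library keeps the same shape of API but stores tensors, which is what lets `torch.jit.script` compile the whole pipeline.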
```
xpandas/
    __init__.py               # Package init, loads C++ extension
    ops_meta.py               # FakeTensor kernels (for torch.compile)
csrc/ops/
    ops.h                     # Common header with op declarations
    register.cpp              # TORCH_LIBRARY schema + CPU dispatch
    groupby_resample_ohlc.cpp
    compare.cpp
    cast.cpp
    lookup.cpp
    breakout_signal.cpp
    rank.cpp                  # Example op (see CONTRIBUTING.md)
    to_datetime.cpp           # to_datetime + dt_floor
    groupby_agg.cpp           # groupby_sum/mean/count/std
    groupby_minmax.cpp        # groupby_min/max/first/last
    rolling.cpp               # rolling_sum/mean/std
    rolling_minmax.cpp        # rolling_min/max (O(n) monotonic deque)
    shift.cpp                 # shift (lag/lead)
    fillna.cpp                # fillna
    where.cpp                 # where_, masked_fill
    pct_change.cpp            # pct_change
    cumulative.cpp            # cumsum, cumprod
    clip.cpp                  # clip
    math_ops.cpp              # abs_, log_, zscore
    ewm.cpp                   # ewm_mean
    sort.cpp                  # sort_by
inference/
    main.cpp                  # Pure C++ inference driver
examples/
    alpha_original.py         # Original pandas-based Alpha (reference)
    alpha_ts.py               # TorchScript-compatible Alpha (breakout)
    alpha_vwap.py             # TorchScript VWAP mean-reversion Alpha
    alpha_momentum.py         # TorchScript momentum z-score Alpha
    trace_and_save.py         # Script + test + save alpha.pt
benchmarks/
    bench_ops.py              # xpandas vs pandas performance comparison
tests/
    test_ops.py               # Unit tests for each C++ op (110 tests)
    test_wrappers.py          # Wrapper API tests (233 tests)
    test_alpha_e2e.py         # End-to-end TorchScript tests (10 tests)
    test_alpha_xpandas_e2e.py # End-to-end xpandas wrapper tests (5 tests)
```
- Python >= 3.9
- PyTorch >= 2.0
- A C++ compiler with C++17 support
```bash
pip install --no-build-isolation -e .
```

Note: `--no-build-isolation` is required to ensure the C++ extension is compiled with the same ABI as your installed PyTorch.
```bash
# Run tests
pytest tests/ -v

# Script and save the example Alpha
python examples/trace_and_save.py   # produces alpha.pt

# Build and run the pure C++ inference driver
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')" ..
make -j
./alpha_infer ../alpha.pt ./libxpandas_ops.so
# Output: Signal: [+1.0, -1.0]
```

| Op | Schema | Pandas Equivalent |
|---|---|---|
| `lookup` | `(Dict(str, Tensor) table, str key) -> Tensor` | `df['col']` |
| `sort_by` | `(Dict(str, Tensor) table, str by, bool ascending) -> Dict(str, Tensor)` | `df.sort_values(by)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `groupby_resample_ohlc` | `(Tensor key, Tensor value) -> (Tensor, Tensor, Tensor, Tensor, Tensor)` | `df.groupby(key)[val].resample().{first,max,min,last}()` |
| `groupby_sum` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].sum()` |
| `groupby_mean` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].mean()` |
| `groupby_count` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].count()` |
| `groupby_std` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].std()` |
| `groupby_min` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].min()` |
| `groupby_max` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].max()` |
| `groupby_first` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].first()` |
| `groupby_last` | `(Tensor key, Tensor value) -> (Tensor, Tensor)` | `df.groupby(key)[val].last()` |
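The `(keys, values)` return convention and the sorted key order (the sorted `std::map` behavior noted under Benchmarks) can be sketched in pure Python. The function below is a list-based reference, not the real tensor op:

```python
from collections import defaultdict

def groupby_sum(keys, values):
    """List-based reference for groupby_sum semantics: aggregate values per
    key and emit keys in sorted order, mirroring the sorted std::map the
    C++ op uses for deterministic TorchScript output."""
    sums = defaultdict(float)
    for k, v in zip(keys, values):
        sums[k] += v
    out_keys = sorted(sums)                    # deterministic, sorted key order
    return out_keys, [sums[k] for k in out_keys]

keys, sums = groupby_sum([2, 1, 2, 1], [10.0, 20.0, 30.0, 40.0])
# keys == [1, 2], sums == [60.0, 40.0]
```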
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `compare_gt` | `(Tensor a, Tensor b) -> Tensor` | `series > series` |
| `compare_lt` | `(Tensor a, Tensor b) -> Tensor` | `series < series` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `bool_to_float` | `(Tensor x) -> Tensor` | `series.astype(float)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `breakout_signal` | `(Tensor price, Tensor high, Tensor low) -> Tensor` | `(price > high).float() - (price < low).float()` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `rank` | `(Tensor x) -> Tensor` | `series.rank(method='average')` |
| `zscore` | `(Tensor x) -> Tensor` | `(series - series.mean()) / series.std()` |
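`rank` with `method='average'` gives tied values the mean of the 1-based positions they would occupy in sorted order. A list-based pure-Python reference (the real op works on float64 tensors in C++):

```python
def rank_average(x):
    """Reference for rank(method='average'): ties receive the mean of the
    1-based positions they would occupy, like pandas."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j + 2) / 2.0            # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

rank_average([3.0, 1.0, 3.0, 2.0])  # -> [3.5, 1.0, 3.5, 2.0]
```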
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `to_datetime` | `(Tensor epochs, str unit) -> Tensor` | `pd.to_datetime(series, unit=...)` |
| `dt_floor` | `(Tensor dt_ns, int interval_ns) -> Tensor` | `series.dt.floor(freq)` |
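`dt_floor` reduces to snapping each int64 nanosecond timestamp down to the nearest multiple of the interval. A list-based pure-Python reference (the real op takes tensors; Python's `%` floors toward negative infinity, so pre-epoch timestamps floor correctly too):

```python
def dt_floor(dt_ns, interval_ns):
    """Floor nanosecond timestamps to a fixed interval: each timestamp is
    snapped down to the nearest multiple of interval_ns."""
    return [t - t % interval_ns for t in dt_ns]

DAY_NS = 24 * 60 * 60 * 10**9
dt_floor([DAY_NS + 123, 2 * DAY_NS - 1], DAY_NS)  # -> [DAY_NS, DAY_NS]
```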
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `rolling_sum` | `(Tensor x, int window) -> Tensor` | `series.rolling(window).sum()` |
| `rolling_mean` | `(Tensor x, int window) -> Tensor` | `series.rolling(window).mean()` |
| `rolling_std` | `(Tensor x, int window) -> Tensor` | `series.rolling(window).std()` |
| `rolling_min` | `(Tensor x, int window) -> Tensor` | `series.rolling(window).min()` |
| `rolling_max` | `(Tensor x, int window) -> Tensor` | `series.rolling(window).max()` |
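The O(n) monotonic-deque algorithm noted for `rolling_minmax.cpp` can be sketched in pure Python (lists stand in for tensors; NaN for positions before the first full window, matching the pandas equivalent):

```python
from collections import deque
import math

def rolling_max(x, window):
    """O(n) rolling max via a monotonic deque: the deque holds indices of
    surviving candidates in decreasing value order, so the front is always
    the max of the current window."""
    out, dq = [], deque()
    for i, v in enumerate(x):
        while dq and x[dq[-1]] <= v:      # drop candidates dominated by v
            dq.pop()
        dq.append(i)
        if dq[0] <= i - window:           # evict the index that left the window
            dq.popleft()
        out.append(x[dq[0]] if i >= window - 1 else math.nan)
    return out

rolling_max([1.0, 3.0, 2.0, 5.0, 4.0], 3)  # -> [nan, nan, 3.0, 5.0, 5.0]
```

Each index is pushed and popped at most once, which is what makes the whole pass O(n) rather than O(n·window).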
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `shift` | `(Tensor x, int periods) -> Tensor` | `series.shift(periods)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `fillna` | `(Tensor x, float fill_value) -> Tensor` | `series.fillna(value)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `where_` | `(Tensor cond, Tensor x, Tensor other) -> Tensor` | `series.where(cond, other)` |
| `masked_fill` | `(Tensor x, Tensor mask, float fill_value) -> Tensor` | `series.mask(mask, value)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `pct_change` | `(Tensor x, int periods) -> Tensor` | `series.pct_change(periods)` |
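`pct_change` computes the relative change against the value `periods` steps back; the first `periods` positions have no prior value and come out NaN. A list-based pure-Python reference (the real op works on float64 tensors):

```python
import math

def pct_change(x, periods=1):
    """Reference for pct_change: x[i] / x[i - periods] - 1, with NaN for the
    first `periods` positions."""
    return [math.nan if i < periods else x[i] / x[i - periods] - 1.0
            for i in range(len(x))]

pct_change([100.0, 110.0, 99.0])  # -> [nan, ~0.10, ~-0.10]
```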
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `cumsum` | `(Tensor x) -> Tensor` | `series.cumsum()` |
| `cumprod` | `(Tensor x) -> Tensor` | `series.cumprod()` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `clip` | `(Tensor x, float lower, float upper) -> Tensor` | `series.clip(lower, upper)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `abs_` | `(Tensor x) -> Tensor` | `series.abs()` |
| `log_` | `(Tensor x) -> Tensor` | `np.log(series)` |
| Op | Schema | Pandas Equivalent |
|---|---|---|
| `ewm_mean` | `(Tensor x, int span)` `-> Tensor` | `series.ewm(span=span, adjust=False).mean()` |
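For `adjust=False`, the exponentially weighted mean is the simple recurrence `y[0] = x[0]`, `y[t] = (1 - a) * y[t-1] + a * x[t]` with `a = 2 / (span + 1)`. A list-based pure-Python reference (the real op works on float64 tensors):

```python
def ewm_mean(x, span):
    """Reference for ewm(span=span, adjust=False).mean():
    y[0] = x[0]; y[t] = (1 - a) * y[t-1] + a * x[t], where a = 2 / (span + 1)."""
    a = 2.0 / (span + 1)
    out = []
    for v in x:
        out.append(v if not out else (1 - a) * out[-1] + a * v)
    return out

ewm_mean([1.0, 2.0, 3.0], 3)  # a = 0.5 -> [1.0, 1.5, 2.25]
```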
Run `python benchmarks/bench_ops.py` to compare xpandas ops against their pandas equivalents. Example output (N=10,000, 20 repeats, median time):

```
Op                  pandas (us)   xpandas (us)   speedup
--------------------------------------------------------------
clip                      481.2            5.6    85.91x >>>
to_datetime              2706.3           16.6   163.02x >>>
rolling_sum               142.1            9.1    15.66x >>>
breakout_signal           187.5           10.3    18.26x >>>
pct_change                207.7           17.3    11.99x >>>
...
--------------------------------------------------------------
Geometric mean speedup: 2.23x
Faster in 23/34 ops
```
Key wins: element-wise ops (`clip`, `compare_*`, `fillna`), rolling window ops (`rolling_sum`/`mean`/`std`), fused ops (`breakout_signal`), and datetime conversion (`to_datetime` — 163×). Groupby ops are slower because xpandas uses sorted `std::map` keys (for deterministic TorchScript output) vs pandas' optimized Cython hashmaps.
Run `python benchmarks/bench_wrappers.py` to measure Python wrapper overhead and end-to-end Alpha performance. Example output (N=10,000, 30 repeats, median time):
Part 1: Wrapper Overhead
| Operation | Direct (μs) | Wrapper (μs) | Overhead |
|---|---|---|---|
| Series.gt | 9.4 | 9.7 | +4% |
| Series.lt | 9.4 | 9.9 | +5% |
| Series.sub | 3.7 | 3.9 | +7% |
| Series.astype(float) | 9.9 | 10.2 | +3% |
| DataFrame.getattr | 0.1 | 0.7 | +596% |
| GroupBy→OHLC chain | 127.5 | 517.9 | +306% |
| OHLC×4 cached | 509.7 | 131.1 | -74% 🏆 |
Part 2: End-to-End Alpha (pandas vs xpandas, rolling mean crossover)
| Size | Instruments | Pandas (ms) | xpandas (ms) | Speedup |
|---|---|---|---|---|
| Small (10×50) | 10 | 0.3 | 0.0 | 11.9× |
| Medium (50×100) | 50 | 0.4 | 0.1 | 7.8× |
| Large (500×10,000) | 500 | 315.4 | 54.2 | 5.8× |
Wrapper overhead on element-wise ops is negligible (<10%). xpandas is consistently
faster than pandas at all tested scales. At production scale (500 instruments ×
10,000 ticks), xpandas completes a rolling mean crossover signal in 54 ms
vs pandas' 315 ms — a 5.8× speedup. The GroupBy→OHLC chain is an exception:
xpandas uses sorted std::map keys for deterministic TorchScript output, which is
slower than pandas' Cython hashmaps for groupby-heavy workloads.
The Python wrapper API (`import xpandas as pd`) provides pandas-compatible classes that dispatch to C++ ops under the hood.
| Class | Description | Key Methods |
|---|---|---|
| `pd.DataFrame` | Dict-backed DataFrame (`Dict[str, Tensor]`) | `__getitem__`, `__setitem__`, `columns`, `shape`, `dtypes`, `head()`, `tail()`, `drop()`, `rename()`, `sort_values()`, `merge()`, `describe()`, `apply()`, `groupby()` |
| `pd.Series` | 1-D Tensor wrapper | Arithmetic (`+`, `-`, `*`, `/`, `**`, `%`), comparison (`>`, `<`, `>=`, `<=`, `==`, `!=`), `abs()`, `log()`, `zscore()`, `rank()`, `fillna()`, `shift()`, `pct_change()`, `cumsum()`, `cumprod()`, `clip()`, `where()`, `mask()`, `rolling()`, `ewm()`, `expanding()`, `mean()`, `std()`, `sum()`, `min()`, `max()` |
| `pd.GroupBy` | GroupBy entry point | `__getitem__(col)` → `GroupByColumn` |
| `pd.GroupByColumn` | Single-column group aggregation | `sum()`, `mean()`, `count()`, `std()`, `min()`, `max()`, `first()`, `last()`, `resample(freq)` — returns `(keys, values)` tuples |
| `pd.Rolling` | Rolling window | `mean()`, `sum()`, `std()`, `min()`, `max()` |
| `pd.EWM` | Exponential weighted | `mean()` |
| `pd.Expanding` | Expanding window | `sum()`, `mean()` |
| `pd.Resampler` | OHLC resampling | `first()`, `max()`, `min()`, `last()` (cached — one C++ call for all four) |
| `pd.Index` | Index wrapper | `get_level_values()` |
| Function | Description |
|---|---|
| `pd.concat(items, axis=0)` | Concatenate Series (`axis=0`) or DataFrames (`axis=1`) |
| `pd.to_datetime(tensor, unit='s')` | Convert epoch timestamps to nanosecond datetime tensors |
| `pd.dt_floor(tensor, freq='1D')` | Floor datetime tensors to a frequency |
- All tensors must be `torch.double` (float64) for value columns, `torch.long` (int64) for groupby keys
- GroupBy returns `(keys_tensor, values_tensor)` tuples, not pandas-style grouped DataFrames
- DataFrame is `Dict[str, Tensor]` internally — column order depends on insertion order
- `to_datetime` and `dt_floor` are module-level functions, not Series methods
See `examples/wrapper_api_tour.py` for a complete working demo of every class and method.
Migrating from pandas to xpandas is straightforward — most code changes are mechanical.
```python
# Before
import pandas as pd

# After
import xpandas as pd
import torch
```

```python
# Before (pandas)
df = pd.DataFrame({'price': [100.0, 101.5, 99.8]})

# After (xpandas)
df = pd.DataFrame({'price': torch.tensor([100.0, 101.5, 99.8], dtype=torch.double)})
```

```python
# pandas: returns a DataFrame/Series with group index
result = df.groupby('sector')['price'].mean()
print(result['tech'])  # index-based access

# xpandas: returns (keys_tensor, values_tensor) tuple
keys, means = df.groupby('sector')['price'].mean()
print(keys, means)  # tensor([0, 1, 2]), tensor([100.5, 98.3, 105.1])
```

```python
# pandas
df['date'] = pd.to_datetime(df['epoch'], unit='s')
df['date_floor'] = df['date'].dt.floor('1D')

# xpandas
df['date'] = pd.to_datetime(df['epoch'], unit='s')
df['date_floor'] = pd.dt_floor(df['date'], freq='1D')
```

| pandas | xpandas | Notes |
|---|---|---|
| `df['col']` | `df['col']` | ✅ Same |
| `df.col` | `df.col` | ✅ Same |
| `series + series` | `series + series` | ✅ Same |
| `series.rolling(5).mean()` | `series.rolling(5).mean()` | ✅ Same |
| `series.ewm(span=10).mean()` | `series.ewm(span=10).mean()` | ✅ Same |
| `series.fillna(0)` | `series.fillna(0)` | ✅ Same |
| `df.sort_values('col')` | `df.sort_values(by='col')` | ✅ Same |
| `df.merge(other, on='key')` | `df.merge(other, on='key')` | ✅ Same |
| `series.where(cond, -1.0)` | `series.where(cond, tensor)` | `other` must be a Tensor |
| `df.groupby('k')['v'].sum()` | `keys, vals = df.groupby('k')['v'].sum()` | Returns a `(keys, values)` tuple |
| `pd.to_datetime(s, unit='s')` | `pd.to_datetime(t, unit='s')` | ✅ Same (module-level) |
| `s.dt.floor('1D')` | `pd.dt_floor(t, freq='1D')` | Module-level function |
See `examples/pandas_migration.py` for a fully runnable side-by-side comparison.
All xpandas value columns must be torch.double (float64). Check your tensor creation:
```python
# Wrong
t = torch.tensor([1.0, 2.0, 3.0])  # defaults to float32!

# Right
t = torch.tensor([1.0, 2.0, 3.0], dtype=torch.double)
```

GroupBy key columns must be `torch.long` (int64):
```python
df = pd.DataFrame({
    'group': torch.tensor([1, 2, 1, 2], dtype=torch.long),             # int64 keys
    'value': torch.tensor([10.0, 20.0, 30.0, 40.0], dtype=torch.double)
})
keys, sums = df.groupby('group')['value'].sum()
```

Unlike pandas, xpandas requires `other` to be a Tensor, not a scalar:
```python
# Wrong
result = series.where(cond, -1.0)

# Right
result = series.where(cond, torch.full_like(series.values, -1.0))
```

- Ensure all DataFrame columns are Tensors (no Python lists or NumPy arrays)
- GroupBy keys must be `torch.long`, values must be `torch.double`
- Use `pd.to_datetime()` and `pd.dt_floor()` as module-level calls, not methods
- Avoid Python-only constructs inside `@torch.jit.script` (list comprehensions, f-strings, etc.)
```bash
# 1. Script and save in Python
python examples/trace_and_save.py   # produces alpha.pt

# 2. Build C++ inference binary
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH="$(python -c 'import torch; print(torch.utils.cmake_prefix_path)')" ..
make -j

# 3. Run — no Python needed
./alpha_infer ../alpha.pt ./libxpandas_ops.so
```

See `inference/main.cpp` for the complete C++ driver code.
**Is groupby slower than pandas?** Yes — by design. xpandas groupby uses sorted `std::map` keys to guarantee deterministic output order in TorchScript. pandas uses optimized Cython hashmaps that are faster but non-deterministic. If groupby performance is critical, consider pre-sorting your data or reducing group cardinality.

**Does xpandas support `torch.compile`?** Basic support exists via FakeTensor kernels in `ops_meta.py`. However, the primary compilation target is `torch.jit.script`. For production deployment, use TorchScript.

**Is there GPU support?** Currently all ops are CPU-only. The ops dispatch through PyTorch's dispatcher, so adding CUDA kernels is architecturally possible but not yet implemented.
See CONTRIBUTING.md (in Chinese) for a step-by-step guide to adding a new op, using `rank` as a worked example.
Apache-2.0. See LICENSE.