Rainflow-aware DQN for economic battery storage dispatch with physically consistent market logic, reserve commitment, and economically calibrated degradation.
- Rainflow-exact degradation based on switching points instead of purely heuristic penalties
- Physically consistent step logic with capacity limits, efficiency, reserve activation, and a safety layer
- YAML-based configuration for battery, market, reward, features, agent, and evaluation
- Versioned sample data for arbitrage and load following
- Extended evaluation with economic KPIs, Rainflow cycle statistics, and baseline comparisons
The figures below were generated directly from the versioned sample dataset `data/sample_arbitrage_30d.csv` with `configs/default.yaml`. For the README assets, the repo intentionally uses deterministic baselines instead of a bundled checkpoint so the results stay reproducible within seconds:

```bash
python scripts/build_readme_assets.py
```

This writes `docs/readme_assets/*.png` and `docs/readme_assets/example_metrics.json`.
The demo uses 720 hourly steps and combines arbitrage prices, reserve compensation, and an FR signal. That makes the market mechanics of the environment easy to inspect without needing an extra dataset.
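The shape of these inputs can be sketched with plain NumPy (an illustrative signal model only, not the repo's `generate_sample_data.py`):

```python
import numpy as np

rng = np.random.default_rng(42)
hours = np.arange(720)  # 30 days x 24 hourly steps

# Daily price cycle with noise (EUR/MWh) -- illustrative, not the repo's generator
price = 60 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, size=720)

# Zero-mean frequency-regulation signal, clipped to [-1, 1]
fr_signal = np.clip(rng.normal(0, 0.3, size=720), -1.0, 1.0)

# Flat reserve compensation (EUR/MW per step)
reserve_price = np.full(720, 1.5)
```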
| Agent | Net Profit [EUR] | Reserve [EUR] | Degradation [EUR] | Throughput [MWh] | Cycles/Day |
|---|---|---|---|---|---|
| MovingAvg | 232.15 | 0.00 | 13.24 | 11.84 | 0.197 |
| MovingAvg-Reserve | 228.46 | 2.10 | 13.97 | 12.06 | 0.201 |
| Threshold | 205.66 | 0.00 | 11.02 | 6.65 | 0.111 |
| Rule-Based | 202.73 | 3.27 | 13.27 | 12.34 | 0.206 |
| Quantile | 126.13 | 5.65 | 10.17 | 9.20 | 0.153 |
This example highlights two recurring tradeoffs in the repo:
- More market activity usually increases gross profit, but it also raises throughput and degradation cost.
- Reserve revenue alone does not guarantee the best net dispatch; on this sample, `MovingAvg` wins without reserve commitment.
The current best reproducible README run is `MovingAvg` on `data/sample_arbitrage_30d.csv`:

- Net profit: 232.15 EUR
- Arbitrage revenue: 245.39 EUR
- Degradation cost: 13.24 EUR
- Throughput: 11.84 MWh
- Rainflow cycles: 63
- Mean cycle depth: 0.069
- Final SoH: 99.989 %
If you want to document a trained policy instead of baselines, you can train on the same data with `train.py` and then evaluate it with `evaluate.py`.
```bash
git clone <repo-url>
cd "BESS Pricing"
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux / macOS
source .venv/bin/activate
pip install -r requirements.txt
```

```bash
# Train on the 30-day arbitrage sample
python train.py --data data/sample_arbitrage_30d.csv

# Smoke test with fewer timesteps
python train.py --data data/sample_arbitrage_30d.csv --timesteps 50000

# Load following with a custom config
python train.py --config configs/default.yaml --data data/sample_load_following_30d.csv
```

```bash
# Default run with synthetic data
python train.py

# Generate fresh sample data
python generate_sample_data.py --days 30
```

```bash
# Evaluate a trained model
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv

# Save plots without opening windows
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv --no-plot --output evaluation_results

# Compare against baselines
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv --baseline
```

```bash
python scripts/build_readme_assets.py
```

Generated files:

- `docs/readme_assets/market_inputs.png`
- `docs/readme_assets/baseline_comparison.png`
- `docs/readme_assets/dispatch_example.png`
- `docs/readme_assets/example_metrics.json`
```
BESS Pricing/
├── configs/
│   └── default.yaml
├── data/
│   ├── sample_arbitrage.csv
│   ├── sample_arbitrage_30d.csv
│   ├── sample_load_following.csv
│   └── sample_load_following_30d.csv
├── docs/
│   └── readme_assets/            # Reproducible README figures
├── tests/
├── scripts/
│   └── build_readme_assets.py    # Generates the figures used in this README
├── baselines.py                  # Deterministic comparison agents
├── config_loader.py              # YAML config loader
├── data_loader.py                # CSV and synthetic data loader
├── evaluate.py                   # Evaluation, KPIs, and plots
├── features.py                   # Time features and observation stacking
├── generate_sample_data.py       # Data generator
├── market_env.py                 # Gymnasium environment
├── rainflow_sp.py                # Switching points and degradation
├── train.py                      # DQN training
└── Readme.md
```
The main configuration file is `configs/default.yaml`. Key parameter blocks:
```yaml
task: arbitrage

battery:
  capacity_mwh: 1.0
  p_max_mw: 0.25
  eta_charge: 0.95
  eta_discharge: 0.95
  soc_min: 0.1
  soc_max: 0.9

degradation:
  alpha_d: 0.0045
  beta: 1.3
  use_economic_degradation: true
  reference_dod: 0.8
  cycle_life: 6000.0
  replacement_cost_eur_per_mwh: 120000.0

env:
  dt_hours: 1.0
  n_power_levels: 17
  n_reserve_levels: 4
  reserve_max_fraction: 0.3
  stack_k: 4

agent:
  learning_rate: 0.00048
  batch_size: 256
  gamma: 0.975

training:
  total_timesteps: 250000
  seed: 42
```

CLI overrides are still supported:
```bash
python train.py --data data/sample_arbitrage_30d.csv --timesteps 100000 --seed 123
```

| Column | Description | Required |
|---|---|---|
| `timestamp` | Timestamp in ISO format | Optional |
| `price` | Power price in EUR/MWh | Yes |
| `fr_signal` | Frequency regulation signal in [-1, 1] | Optional |
| `reserve_price` | Reserve compensation in EUR/MW per step | Optional |
| `temperature` | Temperature in °C | Optional |
Example:
```csv
timestamp,price,fr_signal,reserve_price
2024-01-01 00:00:00,45.2,0.12,1.5
2024-01-01 01:00:00,42.8,-0.08,1.5
```

Included in the repo:

- `data/sample_arbitrage.csv` for fast 48h smoke tests
- `data/sample_arbitrage_30d.csv` for more realistic dispatch and backtest runs
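A quick sanity check of a CSV against this column layout could look like the following pandas sketch (not a repo utility; only `price` is mandatory per the table above):

```python
import io

import pandas as pd

# Inline stand-in for a real file such as data/sample_arbitrage.csv
csv_text = """timestamp,price,fr_signal,reserve_price
2024-01-01 00:00:00,45.2,0.12,1.5
2024-01-01 01:00:00,42.8,-0.08,1.5
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

# The only required column is `price`
assert "price" in df.columns, "the `price` column is mandatory"

# Optional columns may be absent
for optional in ("fr_signal", "reserve_price", "temperature"):
    if optional not in df.columns:
        print(f"optional column missing: {optional}")

# If an FR signal is present, it should stay within [-1, 1]
if "fr_signal" in df.columns:
    assert df["fr_signal"].between(-1, 1).all()
```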
Additional columns:
| Column | Description |
|---|---|
| `demand` | Load demand in MW |
| `re_gen` | Renewable generation in MW |
Included in the repo:

- `data/sample_load_following.csv`
- `data/sample_load_following_30d.csv`
Degradation depends on the cycle depth of the SoC trajectory. Using the last three switching points (c₀, c₁, c₂), the repo computes a per-step increment that remains consistent with Rainflow cycle costs:
h_t^d = α_d · exp(β|c_t + b_t - c₂|) - α_d · exp(β|c_t - c₂|)
The repo then calibrates these units to reference cycle life and replacement cost. That makes degradation directly interpretable as economic asset wear in EUR.
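As a sketch (function and variable names are illustrative, not the actual `rainflow_sp.py` API), the per-step increment and the economic anchor implied by the default config can be written as:

```python
import math

def degradation_increment(c_t, b_t, c2, alpha_d=0.0045, beta=1.3):
    """Per-step degradation increment h_t^d from the formula above.

    c_t : current SoC, b_t : planned SoC change, c2 : last switching point.
    The increment is the change in exponential cycle-depth cost caused by
    moving from c_t to c_t + b_t relative to the last switching point.
    """
    return (alpha_d * math.exp(beta * abs(c_t + b_t - c2))
            - alpha_d * math.exp(beta * abs(c_t - c2)))

# No movement means no degradation increment
assert degradation_increment(0.5, 0.0, 0.5) == 0.0

# Economic anchor implied by the default config: the replacement value of
# one reference cycle (1.0 MWh, 6000 cycles, 120000 EUR/MWh)
capacity_mwh = 1.0
cycle_life = 6000.0
replacement_cost_eur_per_mwh = 120000.0
cost_per_reference_cycle = replacement_cost_eur_per_mwh * capacity_mwh / cycle_life
print(cost_per_reference_cycle)  # 20.0 EUR per reference cycle
```

The scaling that maps the raw exponential units onto this EUR anchor is part of the repo's calibration and is not reproduced here.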
Each action is split into two components:
- planned energy position for arbitrage
- reserve ratio that preserves headroom in both directions
The realized SoC change then accounts for:
- SoC limits
- charge and discharge efficiency
- reserve activation through the FR signal
- safety-layer projection into the feasible action space
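A minimal sketch of such a feasibility projection, assuming the sign convention below and the default limits from `configs/default.yaml` (the repo's actual safety layer lives in `market_env.py` and may differ in detail):

```python
import numpy as np

def project_action(soc, p_plan_mw, r_mw, dt_h=1.0,
                   cap_mwh=1.0, soc_min=0.1, soc_max=0.9,
                   p_max_mw=0.25, eta_c=0.95, eta_d=0.95):
    """Project a planned power/reserve pair into the feasible set.

    Sign convention (assumed): p > 0 charges, p < 0 discharges.
    Committed reserve r_mw must leave power headroom in both directions.
    """
    # Reserve commitment symmetrically shrinks the usable power range
    p_lo = -(p_max_mw - r_mw)
    p_hi = p_max_mw - r_mw
    p = float(np.clip(p_plan_mw, p_lo, p_hi))

    # SoC limits, accounting for one-way efficiencies
    if p >= 0:  # charging: SoC rises by p * eta_c * dt / cap
        p = min(p, (soc_max - soc) * cap_mwh / (eta_c * dt_h))
    else:       # discharging: SoC falls by |p| / eta_d * dt / cap
        p = max(p, -(soc - soc_min) * cap_mwh * eta_d / dt_h)
    return p
```

For example, near a full battery a large planned charge is cut back to the remaining SoC headroom rather than rejected outright.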
- Discrete actions map naturally to power levels and reserve levels with a safety layer.
- Time features and observation stacking are already built into the environment.
- `evaluate.py` can compare trained DQN policies directly against baselines.
- Checkpoints: `checkpoints/bess_dqn_*.zip`
- Best model: `checkpoints/best_model.zip`
- TensorBoard logs: `tensorboard/`
- Config snapshots: `checkpoints/config_*.yaml`
Start TensorBoard:
```bash
tensorboard --logdir tensorboard
```

- KPI JSON: `evaluation_results/evaluation_*.json`
- Evaluation plot: `evaluation_results/evaluation_plot_*.png`
- Cycle plot: `evaluation_results/cycles_*.png`
- Agent comparison: `evaluation_results/comparison_*.png`
Included metrics:
- Net profit and arbitrage revenue
- Reserve revenue and reserve compliance
- Throughput, EFC, and cycles per day
- Profit per throughput
- Daily risk metrics such as Sharpe, VaR, and drawdown
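One plausible way to compute the daily risk metrics from a per-day profit series is sketched below (the exact definitions in `evaluate.py` may differ, e.g. in the VaR level or any annualization):

```python
import numpy as np

def daily_risk_metrics(daily_profit):
    """Sharpe-style ratio, 5% historical VaR, and max drawdown of daily profits (EUR)."""
    p = np.asarray(daily_profit, dtype=float)
    std = p.std(ddof=1)
    sharpe = p.mean() / std if std > 0 else np.nan
    var_5 = -np.quantile(p, 0.05)  # loss magnitude at the 5% quantile
    cum = np.cumsum(p)             # cumulative profit curve
    drawdown = np.max(np.maximum.accumulate(cum) - cum)
    return {"sharpe": sharpe, "var_5": var_5, "max_drawdown": drawdown}

metrics = daily_risk_metrics([12.0, -3.0, 8.0, 15.0, -6.0, 9.0])
print(metrics)
```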
```python
from market_env import BessMultiMarketEnv
import numpy as np

price = np.random.uniform(30, 100, size=24 * 7)

env = BessMultiMarketEnv(
    price=price,
    dt_hours=1.0,
    c_min=0.1,
    c_max=0.9,
    c_init=0.5,
    b_max=0.08,
    alpha_d=4.5e-3,
    beta=1.3,
)

obs, info = env.reset()
for _ in range(100):
    action = env.action_space.sample()
    obs, reward, done, trunc, info = env.step(action)
    print(f"SoC: {info['soc']:.2f}, Reward: {reward:.2f}")
    if done:
        break
```

```python
from data_loader import load_csv_data, generate_synthetic_data

data = load_csv_data(
    path="data/my_data.csv",
    columns={"price": "price", "fr_signal": "fr_signal"},
    task="arbitrage",
)

synthetic = generate_synthetic_data(days=30, seed=42)
print(synthetic)
```

- The degradation model uses an exponential DoD and Rainflow formulation and should be calibrated to the actual cell chemistry.
- The README figures show reproducible baseline runs, not the performance of a bundled pretrained DQN checkpoint.
- For fair model comparisons, training and evaluation data should be separated in time.
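A time-based split can be sketched with pandas (hypothetical column and split point; not a repo utility):

```python
import numpy as np
import pandas as pd

# Hypothetical 30-day hourly series; in practice this would come from a
# dataset such as data/sample_arbitrage_30d.csv
idx = pd.date_range("2024-01-01", periods=720, freq="h")
df = pd.DataFrame({"price": np.random.uniform(30, 100, size=720)}, index=idx)

# Time-based split: first 24 days for training, last 6 for evaluation,
# so the evaluation window never overlaps the training window
split = idx[0] + pd.Timedelta(days=24)
train = df[df.index < split]
test = df[df.index >= split]

print(len(train), len(test))  # 576 144
```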
- Kwon & Zhu: Rainflow-exact degradation in MDPs with DQN
- Comparative DRL studies on BESS dispatch with cyclical time features and observation stacking
MIT License


