cmkdropdrop/BESS-Pricing

BESS Pricing

Rainflow-aware DQN for economic battery storage dispatch with physically consistent market logic, reserve commitment, and economically calibrated degradation.

Overview

  • Rainflow-exact degradation based on switching points instead of purely heuristic penalties
  • Physically consistent step logic with capacity limits, efficiency, reserve activation, and a safety layer
  • YAML-based configuration for battery, market, reward, features, agent, and evaluation
  • Versioned sample data for arbitrage and load following
  • Extended evaluation with economic KPIs, Rainflow cycle statistics, and baseline comparisons

Example Calculations from This Repo

The figures below were generated directly from the versioned sample dataset data/sample_arbitrage_30d.csv with configs/default.yaml. For the README assets, the repo intentionally uses deterministic baselines instead of a bundled checkpoint so the results stay reproducible within seconds:

python scripts/build_readme_assets.py

This writes docs/readme_assets/*.png and docs/readme_assets/example_metrics.json.

1. Input Data for the 30-Day Demo

[Figure: Versioned 30-day example data]

The demo uses 720 hourly steps and combines arbitrage prices, reserve compensation, and an FR signal. That makes the market mechanics of the environment easy to inspect without needing an extra dataset.

2. Comparison of the Included Baselines

[Figure: Baseline comparison on the 30-day sample data]

| Agent | Net Profit [EUR] | Reserve [EUR] | Degradation [EUR] | Throughput [MWh] | Cycles/Day |
|---|---|---|---|---|---|
| MovingAvg | 232.15 | 0.00 | 13.24 | 11.84 | 0.197 |
| MovingAvg-Reserve | 228.46 | 2.10 | 13.97 | 12.06 | 0.201 |
| Threshold | 205.66 | 0.00 | 11.02 | 6.65 | 0.111 |
| Rule-Based | 202.73 | 3.27 | 13.27 | 12.34 | 0.206 |
| Quantile | 126.13 | 5.65 | 10.17 | 9.20 | 0.153 |

This example highlights two recurring tradeoffs in the repo:

  • More market activity usually increases gross profit, but it also raises throughput and degradation cost.
  • Reserve revenue alone does not guarantee the best net dispatch; on this sample, MovingAvg wins without reserve commitment.
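
One way to make the first tradeoff concrete is to relate degradation cost to throughput. The sketch below does that arithmetic in Python with figures copied from the table above. The ratio is only an illustration and is not a KPI reported by the repo.

```python
# Figures copied from the baseline table: (degradation [EUR], throughput [MWh])
baselines = {
    "MovingAvg": (13.24, 11.84),
    "MovingAvg-Reserve": (13.97, 12.06),
    "Threshold": (11.02, 6.65),
    "Rule-Based": (13.27, 12.34),
    "Quantile": (10.17, 9.20),
}

# Degradation cost per MWh of throughput (illustrative ratio, not a repo metric)
deg_per_mwh = {name: deg / thr for name, (deg, thr) in baselines.items()}

# Print the agents from cheapest to most expensive wear per MWh
for name, value in sorted(deg_per_mwh.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} {value:.2f} EUR/MWh")
```

On this sample the ratio ranges from about 1.08 EUR/MWh (Rule-Based) to about 1.66 EUR/MWh (Threshold), so the agent with the lowest throughput pays the most degradation per MWh moved.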

3. Dispatch Example from the Current Best README Run

[Figure: Dispatch example of the MovingAvg agent]

The current best reproducible README run is MovingAvg on data/sample_arbitrage_30d.csv:

  • Net profit: 232.15 EUR
  • Arbitrage revenue: 245.39 EUR
  • Degradation cost: 13.24 EUR
  • Throughput: 11.84 MWh
  • Rainflow cycles: 63
  • Mean cycle depth: 0.069
  • Final SoH: 99.989 %
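
The figures above are mutually consistent under two assumptions that the following Python arithmetic makes explicit: net profit is arbitrage revenue minus degradation cost (MovingAvg earns no reserve revenue), and one cycle per day means one equivalent full cycle, i.e. charging and discharging the full 1.0 MWh capacity once. Both definitions are inferred from the numbers, not taken from evaluate.py.

```python
arbitrage_revenue = 245.39  # EUR, from the list above
degradation_cost = 13.24    # EUR, from the list above
throughput_mwh = 11.84      # MWh, from the list above
capacity_mwh = 1.0          # battery.capacity_mwh in configs/default.yaml
days = 30                   # length of the sample dataset

# Assumption: net profit = arbitrage revenue - degradation cost (no reserve income here)
net_profit = arbitrage_revenue - degradation_cost

# Assumption: one equivalent full cycle = one full charge plus one full discharge
cycles_per_day = throughput_mwh / (2 * capacity_mwh * days)

print(f"net profit:  {net_profit:.2f} EUR")
print(f"cycles/day:  {cycles_per_day:.3f}")
```

Under these assumptions the script reproduces the 232.15 EUR net profit and the 0.197 cycles per day shown in the baseline table.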

If you want to document a trained policy instead of baselines, you can train on the same data with train.py and then evaluate it with evaluate.py.

Quick Start

1. Installation

git clone <repo-url>
cd BESS-Pricing

python -m venv .venv

# Windows
.venv\Scripts\activate

# Linux / macOS
source .venv/bin/activate

pip install -r requirements.txt

2. Training with the Repo Sample Data

# Train on the 30-day arbitrage sample
python train.py --data data/sample_arbitrage_30d.csv

# Smoke test with fewer timesteps
python train.py --data data/sample_arbitrage_30d.csv --timesteps 50000

# Load following with a custom config
python train.py --config configs/default.yaml --data data/sample_load_following_30d.csv

3. Training with Synthetic Data

# Default run with synthetic data
python train.py

# Generate fresh sample data
python generate_sample_data.py --days 30

4. Evaluation

# Evaluate a trained model
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv

# Save plots without opening windows
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv --no-plot --output evaluation_results

# Compare against baselines
python evaluate.py checkpoints/best_model.zip --data data/sample_arbitrage_30d.csv --baseline

Regenerate README Figures

python scripts/build_readme_assets.py

Generated files:

  • docs/readme_assets/market_inputs.png
  • docs/readme_assets/baseline_comparison.png
  • docs/readme_assets/dispatch_example.png
  • docs/readme_assets/example_metrics.json

Project Structure

BESS Pricing/
├── configs/
│   └── default.yaml
├── data/
│   ├── sample_arbitrage.csv
│   ├── sample_arbitrage_30d.csv
│   ├── sample_load_following.csv
│   └── sample_load_following_30d.csv
├── docs/
│   └── readme_assets/              # Reproducible README figures
├── tests/
├── scripts/
│   └── build_readme_assets.py      # Generates the figures used in this README
├── baselines.py                    # Deterministic comparison agents
├── config_loader.py                # YAML config loader
├── data_loader.py                  # CSV and synthetic data loader
├── evaluate.py                     # Evaluation, KPIs, and plots
├── features.py                     # Time features and observation stacking
├── generate_sample_data.py         # Data generator
├── market_env.py                   # Gymnasium environment
├── rainflow_sp.py                  # Switching points and degradation
├── train.py                        # DQN training
└── Readme.md

Configuration

The main configuration file is configs/default.yaml. Key parameter blocks:

task: arbitrage

battery:
  capacity_mwh: 1.0
  p_max_mw: 0.25
  eta_charge: 0.95
  eta_discharge: 0.95
  soc_min: 0.1
  soc_max: 0.9

degradation:
  alpha_d: 0.0045
  beta: 1.3
  use_economic_degradation: true
  reference_dod: 0.8
  cycle_life: 6000.0
  replacement_cost_eur_per_mwh: 120000.0

env:
  dt_hours: 1.0
  n_power_levels: 17
  n_reserve_levels: 4
  reserve_max_fraction: 0.3
  stack_k: 4

agent:
  learning_rate: 0.00048
  batch_size: 256
  gamma: 0.975

training:
  total_timesteps: 250000
  seed: 42

Command-line flags override the corresponding values from the config file:

python train.py --data data/sample_arbitrage_30d.csv --timesteps 100000 --seed 123
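
The env block of the configuration fixes the size of the discrete action set. The Python sketch below shows how a single DQN action index could map to a power level and a reserve level, assuming a flattened 17 × 4 grid with evenly spaced levels. The function name, the index ordering, and the sign convention (negative power for discharging) are illustrative assumptions and may differ from market_env.py.

```python
N_POWER_LEVELS = 17          # env.n_power_levels
N_RESERVE_LEVELS = 4         # env.n_reserve_levels
P_MAX_MW = 0.25              # battery.p_max_mw
RESERVE_MAX_FRACTION = 0.3   # env.reserve_max_fraction


def decode_action(index):
    """Map a flat action index to (power in MW, reserve fraction).

    Hypothetical helper: assumes the action space is the Cartesian
    product of power levels and reserve levels, flattened row-major.
    """
    n_actions = N_POWER_LEVELS * N_RESERVE_LEVELS
    if not 0 <= index < n_actions:
        raise ValueError(f"action index must be in [0, {n_actions})")
    power_idx, reserve_idx = divmod(index, N_RESERVE_LEVELS)
    # Power levels evenly spaced from -P_MAX_MW to +P_MAX_MW
    power_mw = -P_MAX_MW + 2 * P_MAX_MW * power_idx / (N_POWER_LEVELS - 1)
    # Reserve levels evenly spaced from 0 to RESERVE_MAX_FRACTION
    reserve_fraction = RESERVE_MAX_FRACTION * reserve_idx / (N_RESERVE_LEVELS - 1)
    return power_mw, reserve_fraction


print(decode_action(0))   # lowest power level, no reserve
print(decode_action(32))  # middle power level (idle), no reserve
```

With these settings the grid has 68 actions and the power levels are 0.03125 MW apart.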

Data Format

Arbitrage Mode

| Column | Description | Required |
|---|---|---|
| timestamp | Timestamp in ISO format | Optional |
| price | Power price in EUR/MWh | Yes |
| fr_signal | Frequency regulation signal in [-1, 1] | Optional |
| reserve_price | Reserve compensation in EUR/MW per step | Optional |
| temperature | Temperature in °C | Optional |

Example:

timestamp,price,fr_signal,reserve_price
2024-01-01 00:00:00,45.2,0.12,1.5
2024-01-01 01:00:00,42.8,-0.08,1.5
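
Before training on your own file, it can help to check it against the column rules above. The following Python sketch parses the two example rows with the standard library only. It is not the repo's data_loader.py; the helper name and the choice to default missing optional columns to 0.0 are assumptions made for illustration.

```python
import csv
import io

SAMPLE = """timestamp,price,fr_signal,reserve_price
2024-01-01 00:00:00,45.2,0.12,1.5
2024-01-01 01:00:00,42.8,-0.08,1.5
"""


def read_arbitrage_rows(text):
    """Parse arbitrage CSV text and enforce the documented column rules."""
    reader = csv.DictReader(io.StringIO(text))
    if "price" not in (reader.fieldnames or []):
        raise ValueError("missing required column: price")
    rows = []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        # Assumption: absent or empty optional columns default to 0.0
        fr_signal = float(raw.get("fr_signal") or 0.0)
        if not -1.0 <= fr_signal <= 1.0:
            raise ValueError(f"line {line_no}: fr_signal {fr_signal} outside [-1, 1]")
        rows.append({
            "price": float(raw["price"]),
            "fr_signal": fr_signal,
            "reserve_price": float(raw.get("reserve_price") or 0.0),
        })
    return rows


rows = read_arbitrage_rows(SAMPLE)
print(len(rows), rows[0])
```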

Included in the repo:

  • data/sample_arbitrage.csv for fast 48h smoke tests
  • data/sample_arbitrage_30d.csv for more realistic dispatch and backtest runs

Load Following Mode

Additional columns:

| Column | Description |
|---|---|
| demand | Load demand in MW |
| re_gen | Renewable generation in MW |

Included in the repo:

  • data/sample_load_following.csv
  • data/sample_load_following_30d.csv

Core Methodology

Rainflow-Exact Degradation

Degradation depends on the cycle depth of the SoC trajectory. Using the last three switching points (c₀, c₁, c₂), the repo computes a per-step increment that remains consistent with Rainflow cycle costs:

h_t^d = α_d · exp(β|c_t + b_t - c₂|) - α_d · exp(β|c_t - c₂|)

The repo then calibrates these units to reference cycle life and replacement cost. That makes degradation directly interpretable as economic asset wear in EUR.
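
A short numerical check illustrates why the per-step increment stays consistent with a cycle-level cost. The Python sketch reads c_t as the SoC at step t and b_t as the SoC change applied in that step, which is an assumption about the notation. It shows only the uncalibrated formula, with a hypothetical eight-step charge and a fixed reference point c₂. It does not reproduce the switching-point updates or the EUR calibration in rainflow_sp.py.

```python
import math

ALPHA_D = 0.0045  # degradation.alpha_d in configs/default.yaml
BETA = 1.3        # degradation.beta in configs/default.yaml


def step_increment(c_t, b_t, c_2):
    """Per-step increment h_t^d for a SoC move from c_t to c_t + b_t."""
    before = ALPHA_D * math.exp(BETA * abs(c_t - c_2))
    after = ALPHA_D * math.exp(BETA * abs(c_t + b_t - c_2))
    return after - before


# Hypothetical half-cycle: charge from the switching point 0.2 in eight equal steps
c_2 = 0.2
soc = c_2
total = 0.0
for b_t in [0.05] * 8:
    total += step_increment(soc, b_t, c_2)
    soc += b_t

# The increments telescope to a cost that depends only on the final depth
depth = soc - c_2
closed_form = ALPHA_D * (math.exp(BETA * depth) - 1.0)
print(f"summed: {total:.6f}  closed form: {closed_form:.6f}")
```

As long as c₂ stays fixed, the summed increments equal α_d · (exp(β · depth) − 1), so splitting a move into more or fewer steps does not change its cost.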

Physically Consistent Dispatch Logic

Each action is split into two components:

  1. planned energy position for arbitrage
  2. reserve ratio that preserves headroom in both directions

The realized SoC change then accounts for:

  1. SoC limits
  2. charge and discharge efficiency
  3. reserve activation through the FR signal
  4. safety-layer projection into the feasible action space
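
The four effects above can be sketched as a single SoC update. The Python function below is a simplified illustration under stated assumptions, not the logic in market_env.py: positive power means charging, the reserve commitment shrinks the power band available for arbitrage, a positive FR signal activates reserve in the charging direction, and the safety layer is a plain clip to the power that keeps the SoC inside its limits. The function and parameter names are hypothetical.

```python
def step_soc(soc, p_plan_mw, reserve_frac, fr_signal, *,
             capacity_mwh=1.0, p_max_mw=0.25, eta_c=0.95, eta_d=0.95,
             soc_min=0.1, soc_max=0.9, dt_hours=1.0):
    """Return (new SoC, realized grid power in MW) for one step."""
    # Reserve headroom: committed reserve power is unavailable for arbitrage
    p_reserve = reserve_frac * p_max_mw
    p_arb_limit = p_max_mw - p_reserve
    p_arb = max(-p_arb_limit, min(p_arb_limit, p_plan_mw))

    # Reserve activation through the FR signal in [-1, 1]
    p_total = p_arb + fr_signal * p_reserve

    # Safety layer: largest grid power that keeps the SoC within its limits
    max_charge = (soc_max - soc) * capacity_mwh / (eta_c * dt_hours)
    max_discharge = (soc - soc_min) * capacity_mwh * eta_d / dt_hours
    p_safe = max(-max_discharge, min(max_charge, p_total))

    # Efficiency: charging stores less than drawn, discharging drains more than delivered
    if p_safe >= 0:
        delta_soc = eta_c * p_safe * dt_hours / capacity_mwh
    else:
        delta_soc = p_safe * dt_hours / (eta_d * capacity_mwh)
    return soc + delta_soc, p_safe


# Full-power charge request with 30 % reserve commitment and no activation
print(step_soc(0.5, p_plan_mw=0.25, reserve_frac=0.3, fr_signal=0.0))
# Charge request close to the upper SoC limit gets clipped by the safety layer
print(step_soc(0.88, p_plan_mw=0.25, reserve_frac=0.0, fr_signal=0.0))
```

In the first call the reserve commitment caps arbitrage power at 0.175 MW. In the second call the clip leaves the battery at the upper SoC limit of 0.9 instead of overshooting it.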

Why DQN in This Setup?

  • Discrete actions map naturally to power levels and reserve levels with a safety layer.
  • Time features and observation stacking are already built into the environment.
  • evaluate.py can compare trained DQN policies directly against baselines.
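
For readers unfamiliar with the two feature techniques named above, here is a minimal Python sketch of a cyclical hour-of-day encoding and a fixed-length observation stack. It assumes a sine/cosine encoding and a stack depth of 4 (matching env.stack_k in the config). The class and function names are hypothetical, and the actual layout in features.py may differ.

```python
import math
from collections import deque


def time_features(hour_of_day):
    """Encode the hour as a point on the unit circle so 23:00 sits next to 00:00."""
    angle = 2 * math.pi * (hour_of_day % 24) / 24
    return [math.sin(angle), math.cos(angle)]


class ObservationStack:
    """Keep the last k observations and expose them as one flat vector."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_obs):
        # Fill the stack with copies of the first observation
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(list(first_obs))
        return self.flat()

    def push(self, obs):
        # The deque drops the oldest frame automatically
        self.frames.append(list(obs))
        return self.flat()

    def flat(self):
        return [x for frame in self.frames for x in frame]


stack = ObservationStack(k=4)
stacked = stack.reset([0.5] + time_features(0))   # SoC plus two time features
stacked = stack.push([0.55] + time_features(1))
print(len(stacked))
```

Stacking gives a feed-forward Q-network a short history of recent prices and SoC values without requiring a recurrent architecture.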

Outputs

Training

  • Checkpoints: checkpoints/bess_dqn_*.zip
  • Best model: checkpoints/best_model.zip
  • TensorBoard logs: tensorboard/
  • Config snapshots: checkpoints/config_*.yaml

Start TensorBoard:

tensorboard --logdir tensorboard

Evaluation

  • KPI JSON: evaluation_results/evaluation_*.json
  • Evaluation plot: evaluation_results/evaluation_plot_*.png
  • Cycle plot: evaluation_results/cycles_*.png
  • Agent comparison: evaluation_results/comparison_*.png

Included metrics:

  • Net profit and arbitrage revenue
  • Reserve revenue and reserve compliance
  • Throughput, equivalent full cycles (EFC), and cycles per day
  • Profit per throughput
  • Daily risk metrics such as Sharpe, VaR, and drawdown
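
The daily risk metrics can be defined in several ways. The Python sketch below uses one common set of definitions as an assumption: a non-annualized Sharpe ratio (mean over sample standard deviation of daily profit), a historical 5 % value at risk, and the maximum drawdown of cumulative profit. The definitions in evaluate.py may differ, and the daily profits in the example are made-up illustration values, not repo results.

```python
import statistics


def daily_risk_metrics(daily_profit, var_level=0.05):
    """Compute simple risk metrics from a list of daily profits in EUR."""
    mean = statistics.fmean(daily_profit)
    std = statistics.stdev(daily_profit)  # needs at least two days
    sharpe = mean / std if std > 0 else float("nan")

    # Historical VaR: loss at the var_level quantile of the daily profits
    ordered = sorted(daily_profit)
    var = -ordered[int(var_level * len(ordered))]

    # Maximum drawdown: largest drop of cumulative profit below its running peak
    cumulative = 0.0
    peak = 0.0
    max_drawdown = 0.0
    for profit in daily_profit:
        cumulative += profit
        peak = max(peak, cumulative)
        max_drawdown = max(max_drawdown, peak - cumulative)

    return {"sharpe": sharpe, "var": var, "max_drawdown": max_drawdown}


# Illustrative daily profits in EUR (not taken from the repo)
metrics = daily_risk_metrics([10.0, -4.0, 6.0, -8.0, 12.0])
print(metrics)
```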

API Examples

Use the Environment Directly

from market_env import BessMultiMarketEnv
import numpy as np

price = np.random.uniform(30, 100, size=24 * 7)

env = BessMultiMarketEnv(
    price=price,
    dt_hours=1.0,
    c_min=0.1,
    c_max=0.9,
    c_init=0.5,
    b_max=0.08,
    alpha_d=4.5e-3,
    beta=1.3,
)

obs, info = env.reset()
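for _ in range(100):
    action = env.action_space.sample()
    # Gymnasium returns separate terminated and truncated flags
    obs, reward, done, trunc, info = env.step(action)
    print(f"SoC: {info['soc']:.2f}, Reward: {reward:.2f}")
    # Stop on either flag so a finished episode is never stepped again
    if done or trunc:
        break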

Load Data Programmatically

from data_loader import load_csv_data, generate_synthetic_data

data = load_csv_data(
    path="data/my_data.csv",
    columns={"price": "price", "fr_signal": "fr_signal"},
    task="arbitrage",
)

synthetic = generate_synthetic_data(days=30, seed=42)
print(synthetic)

Limitations

  • The degradation model combines an exponential depth-of-discharge (DoD) stress function with Rainflow cycle counting; its parameters should be calibrated to the actual cell chemistry.
  • The README figures show reproducible baseline runs, not the performance of a bundled pretrained DQN checkpoint.
  • For fair model comparisons, training and evaluation data should be separated in time.

References

  • Kwon & Zhu: Rainflow-exact degradation in MDPs with DQN
  • Comparative DRL studies on BESS dispatch with cyclical time features and observation stacking

License

MIT License
