The workflow_events table uses PostgreSQL's native table partitioning to handle the large volume of event data generated by workflow executions. This document describes the partitioning strategy, maintenance procedures, and best practices.
- Method: Range partitioning by timestamp (`ts` column)
- Granularity: Monthly partitions
- Naming Convention: `workflow_events_y{YYYY}m{MM}` (e.g., `workflow_events_y2025m01`)
The workflow_events table is the primary event sourcing log for MeshHook. It stores:
- Every state change in workflow execution
- All node execution events
- Debugging and replay information
This table will grow rapidly in production environments, making partitioning essential for:
- Query Performance: Partition pruning allows PostgreSQL to scan only relevant partitions
- Maintenance: Easier to archive or drop old data by partition
- Index Efficiency: Smaller indexes per partition improve performance
- Backup/Restore: Partition-level operations are faster
- Storage Management: Better control over data lifecycle
```sql
CREATE TABLE workflow_events (
  id bigserial,
  run_id uuid NOT NULL,
  ts timestamptz NOT NULL DEFAULT now(),
  type text NOT NULL,
  payload jsonb NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (id, ts)
) PARTITION BY RANGE (ts);
```

Key Points:

- The primary key includes both `id` and `ts` (PostgreSQL requires the partition key to be part of any primary key on a partitioned table)
- Foreign key to `workflow_runs(id)` with cascade delete
- Partitioned by the `ts` (timestamp) column
Each partition covers one calendar month:
- Start: First day of month at 00:00:00
- End: First day of next month at 00:00:00 (exclusive)
Example:

```
workflow_events_y2025m01: [2025-01-01, 2025-02-01)
workflow_events_y2025m02: [2025-02-01, 2025-03-01)
```
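These half-open ranges map directly onto PostgreSQL's declarative range-partition bounds. As a sketch, the January 2025 partition could be declared by hand like this (in practice the helper functions described below generate the DDL):

```sql
-- Manual equivalent of what the partition-management functions create
CREATE TABLE workflow_events_y2025m01
  PARTITION OF workflow_events
  FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
```

The upper bound is exclusive, so an event timestamped exactly 2025-02-01 00:00:00 lands in the February partition.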
The migration creates partitions for:
- 2 months in the past
- Current month
- 3 months in the future
This ensures smooth operation during initial deployment.
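To see which months that covers for a given deployment date, here is an illustrative query (assuming, for the example, that the migration runs in March 2025):

```sql
-- Partitions the migration would create for a March 2025 deployment
SELECT to_char(m, '"workflow_events_y"YYYY"m"MM') AS partition_name
FROM generate_series(
       date '2025-03-01' - interval '2 months',   -- 2 months in the past
       date '2025-03-01' + interval '3 months',   -- 3 months in the future
       interval '1 month') AS m;
```

This yields six names, `workflow_events_y2025m01` through `workflow_events_y2025m06`.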
The maintain_workflow_events_partitions() function ensures future partitions exist:
```sql
SELECT maintain_workflow_events_partitions();
```

What it does:
- Checks for existing future partitions
- Creates partitions for the next 3 months if needed
- Automatically creates indexes on new partitions
Scheduling:
- If the `pg_cron` extension is available, runs daily at 2 AM
- Otherwise, must be scheduled externally (cron, systemd timer, etc.)
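If the migration does not register the job itself, scheduling with `pg_cron` is a one-liner; the job name below is arbitrary and the example assumes the extension is already installed in this database:

```sql
-- Run partition maintenance daily at 2 AM via pg_cron
SELECT cron.schedule(
  'maintain-workflow-events',                       -- arbitrary job name
  '0 2 * * *',                                      -- daily at 02:00
  $$SELECT maintain_workflow_events_partitions()$$
);
```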
To create a partition for a specific month:
```sql
SELECT create_workflow_events_partition('2025-06-01'::date);
```

This creates the partition for June 2025 with all necessary indexes.
Each partition automatically receives these indexes:
- `{partition}_run_id_idx`: index on `run_id` for filtering by workflow run
- `{partition}_run_ts_idx`: composite index on `(run_id, ts)` for time-range queries
- `{partition}_type_idx`: index on `type` for filtering by event type
These indexes are created automatically when partitions are created.
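For reference, the generated indexes are equivalent to DDL along these lines (shown for a hypothetical January 2025 partition; the helper functions emit the real statements):

```sql
-- Illustrative per-partition index DDL
CREATE INDEX workflow_events_y2025m01_run_id_idx ON workflow_events_y2025m01 (run_id);
CREATE INDEX workflow_events_y2025m01_run_ts_idx ON workflow_events_y2025m01 (run_id, ts);
CREATE INDEX workflow_events_y2025m01_type_idx   ON workflow_events_y2025m01 (type);
```

On PostgreSQL 11+, an alternative design is to declare each index once on the parent table, after which every new partition inherits it automatically.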
Queries work exactly as before - PostgreSQL handles partition routing automatically:
```sql
-- Get all events for a run
SELECT * FROM workflow_events
WHERE run_id = 'some-uuid'
ORDER BY ts;

-- Get events in a time range
SELECT * FROM workflow_events
WHERE ts >= '2025-01-01' AND ts < '2025-02-01';
```

Use the provided helper function for better performance:
```sql
-- Get events for a run with optional time filtering
SELECT * FROM get_workflow_run_events(
  'run-uuid'::uuid,
  '2025-01-01'::timestamptz,  -- optional start
  '2025-01-31'::timestamptz   -- optional end
);
```

PostgreSQL automatically prunes (skips) irrelevant partitions when:

- the WHERE clause includes the partition key (`ts`)
- a time range is specified
Example:
```sql
-- Only scans the January 2025 partition
SELECT * FROM workflow_events
WHERE ts >= '2025-01-15' AND ts < '2025-01-20';
```

```sql
-- Simple view of all partitions
SELECT * FROM workflow_events_partition_info;
```

Returns:
- Partition name
- Size (human-readable)
- Row count estimate
```sql
-- Detailed partition statistics
SELECT * FROM get_workflow_events_partition_stats();
```

Returns:
- Partition name
- Date range
- Exact row count
- Size in bytes and human-readable format
```sql
-- List all partitions with their boundaries
SELECT
  t.tablename,
  pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_tables t
JOIN pg_class c ON c.relname = t.tablename
WHERE t.schemaname = 'public'
  AND t.tablename LIKE 'workflow_events_y%'
ORDER BY t.tablename;
```

To remove old data, drop entire partitions:
```sql
-- Drop partitions older than 12 months (default)
SELECT drop_old_workflow_events_partitions();

-- Drop partitions older than 6 months
SELECT drop_old_workflow_events_partitions(6);
```

Warning: This permanently deletes data. Ensure backups exist before dropping partitions.
```sql
-- Drop a specific partition
DROP TABLE workflow_events_y2024m01;
```

Before dropping old partitions, consider archiving:
- Export to file:

  ```sql
  COPY workflow_events_y2024m01 TO '/path/to/archive/2024-01.csv' CSV HEADER;
  ```

- Move to archive table:

  ```sql
  CREATE TABLE workflow_events_archive (LIKE workflow_events INCLUDING ALL);
  INSERT INTO workflow_events_archive SELECT * FROM workflow_events_y2024m01;
  DROP TABLE workflow_events_y2024m01;
  ```

Run automatically via pg_cron or external scheduler:
```sql
SELECT maintain_workflow_events_partitions();
```

- Verify new partition was created
- Check partition sizes
- Consider archiving old partitions
```sql
-- Check partition sizes
SELECT * FROM workflow_events_partition_info
ORDER BY partition_name DESC
LIMIT 6;
```

- Review retention policy
- Archive or drop old partitions
- Analyze partition performance
```sql
-- Analyze all partitions
DO $$
DECLARE
  partition_name text;
BEGIN
  FOR partition_name IN
    SELECT tablename FROM pg_tables
    WHERE schemaname = 'public'
      AND tablename LIKE 'workflow_events_y%'
  LOOP
    -- format(%I) quotes the identifier safely
    EXECUTE format('ANALYZE %I', partition_name);
  END LOOP;
END $$;
```

- Always include a time range when possible:
```sql
-- Good: uses partition pruning
SELECT * FROM workflow_events
WHERE run_id = 'uuid' AND ts >= '2025-01-01';

-- Less optimal: scans all partitions
SELECT * FROM workflow_events
WHERE run_id = 'uuid';
```

- Use appropriate indexes:
  - Single-run queries: the `run_id` index
  - Time-range queries: the `(run_id, ts)` composite index
  - Event-type filtering: the `type` index
- Avoid full table scans:
```sql
-- Bad: scans all partitions
SELECT COUNT(*) FROM workflow_events;

-- Better: query a specific partition
SELECT COUNT(*) FROM workflow_events_y2025m01;
```

- Optimal size: 10-100 GB per partition
- Maximum recommended: 200 GB per partition
- If partitions grow too large: Consider daily or weekly partitioning
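Should a move to weekly granularity ever be needed, only the bounds narrow. A sketch (the `_w01` naming suffix here is hypothetical, not part of the current convention):

```sql
-- Hypothetical weekly partition covering the first week of 2025
CREATE TABLE workflow_events_y2025m01_w01
  PARTITION OF workflow_events
  FOR VALUES FROM ('2025-01-01') TO ('2025-01-08');
```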
Monitor partition growth:
```sql
SELECT
  partition_name,
  size_pretty,
  row_count
FROM get_workflow_events_partition_stats()
ORDER BY partition_name DESC
LIMIT 3;
```

Error: `no partition of relation "workflow_events" found for row`
Cause: Trying to insert data outside existing partition ranges
Solution:
```sql
-- Create a partition for the required date
SELECT create_workflow_events_partition('2025-12-01'::date);
```

Symptoms: Queries taking longer than expected
Diagnosis:
```sql
-- Check if partition pruning is working
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM workflow_events
WHERE ts >= '2025-01-01' AND ts < '2025-02-01';
```

With constant bounds like these, pruned partitions simply do not appear in the plan; for run-time pruning (e.g., parameterized queries), look for `Subplans Removed` in the output.
Solutions:
- Ensure the WHERE clause includes the `ts` column
- Verify indexes exist on partitions
- Run ANALYZE on partitions
Symptoms: Inserts failing for current/future dates
Solution:
```sql
-- Manually create future partitions
SELECT maintain_workflow_events_partitions();

-- Or create a specific partition (cast to date to match the function signature)
SELECT create_workflow_events_partition((CURRENT_DATE + interval '1 month')::date);
```

The migration automatically:
- Creates new partitioned table structure
- Migrates existing data (if any)
- Drops old table
- Renames partitioned table
Downtime: Minimal for small datasets; plan a maintenance window for large datasets.
If rollback is needed:
- Export data from partitioned table
- Create non-partitioned table
- Import data
- Update application
Note: Rollback is complex. Test thoroughly before production deployment.
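The first steps above can be sketched in SQL; the table name `workflow_events_flat` is hypothetical, and the destructive swap is left commented out deliberately:

```sql
-- Rollback sketch: rebuild a non-partitioned copy of the data
CREATE TABLE workflow_events_flat (LIKE workflow_events INCLUDING ALL);
INSERT INTO workflow_events_flat SELECT * FROM workflow_events;

-- After validating row counts, swap the tables:
-- DROP TABLE workflow_events;   -- also drops every partition
-- ALTER TABLE workflow_events_flat RENAME TO workflow_events;
```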
- Monitor partition creation: Ensure `maintain_workflow_events_partitions()` runs regularly
- Set retention policy: Define how long to keep event data
- Archive before dropping: Always back up data before deletion
- Include time filters: Always filter by `ts` when possible for better performance
- Regular ANALYZE: Run ANALYZE on partitions after bulk inserts
- Monitor partition sizes: Alert when partitions exceed size thresholds
- Test queries: Verify partition pruning with EXPLAIN ANALYZE
Run the verification script to ensure partitioning is working correctly:
```bash
node scripts/verify-event-partitioning.js
```

This script checks:
- Table is partitioned
- Partitions exist
- Functions are available
- Event insertion works
- Queries execute correctly
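Independently of the script, the partitioning status can be spot-checked directly from the system catalogs:

```sql
-- 'p' means partitioned table; 'r' would mean a plain table
SELECT relkind
FROM pg_class
WHERE relname = 'workflow_events';
```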