feat(s3): add workload profile by query type#86

Merged: NikolayS merged 1 commit into master from feat/s3-workload-profile on Feb 10, 2026

@NikolayS
Owner

Summary

New s3 report: workload composition from pg_stat_statements, grouped by first SQL keyword.

The comment problem (#37)

A simple regexp_replace(query, '^\W*(\w+)...') breaks on queries that begin with a comment, because the first word it finds is inside the comment:

/* pgbouncer health check */ SELECT 1
-- application: myapp
INSERT INTO events ...
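
To see the failure on the two queries above, a minimal sketch follows. It uses only the visible prefix of the pattern (`^\W*(\w+)`) through `regexp_match` rather than the `regexp_replace` call from the description, whose remainder is elided, so treat it as a stand-in and not the original expression. The alias `first_word` is illustrative.

```sql
-- Sketch: a stand-in for the naive extraction, applied to literal queries.
-- Assumes PG10+ (regexp_match) and standard_conforming_strings = on.
select
  q,
  (regexp_match(q, '^\W*(\w+)'))[1] as first_word
from (values
  ('/* pgbouncer health check */ SELECT 1'),
  (E'-- application: myapp\nINSERT INTO events ...')
) as t(q);
-- first_word is 'pgbouncer' and 'application': words from the comments,
-- not the SQL keywords.
```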

The fix strips comments before extracting:

  1. Remove block comments: /\*.*?\*/ (with gs flags for multi-line)
  2. Remove leading line comments: ^\s*(--[^\n]*\n\s*)*
  3. Extract first word: ^\s*(\w+)
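
The three steps can be tried one at a time on a literal string. This is a sketch built from the patterns listed above, not a copy of the report's SQL; the CTE and column names (`sample`, `s1`, `s2`, `query_type`) are made up for illustration.

```sql
-- Sketch: each step as its own CTE so intermediate results are easy to inspect.
with sample(q) as (
  values (E'/* c1 */ -- c2\nDELETE FROM events')
),
s1 as (
  -- Step 1: remove every block comment ('g'); 's' lets . match newlines
  select regexp_replace(q, '/\*.*?\*/', '', 'gs') as t from sample
),
s2 as (
  -- Step 2: remove leading whitespace and any run of leading -- comments
  select regexp_replace(t, '^\s*(--[^\n]*\n\s*)*', '') as t from s1
)
-- Step 3: take the first remaining word and lowercase it
select lower((regexp_match(t, '^\s*(\w+)'))[1]) as query_type
from s2;
-- query_type: delete
```

Selecting from `s1` or `s2` instead of the final query shows what each step leaves behind.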

Tested regex against

| Input | Extracted |
|---|---|
| `SELECT 1` | select |
| `/* comment */ SELECT 1` | select |
| `/* multi\nline */ INSERT ...` | insert |
| `-- comment\nSELECT 1` | select |
| `-- c1\n-- c2\nUPDATE ...` | update |
| `/* c1 */ -- c2\nDELETE ...` | delete |
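
The cases above can be re-run as a small self-check. The sketch below assumes the three patterns from the description are nested as shown; the names `cases`, `expected`, and `got` are illustrative and not part of the PR.

```sql
-- Sketch: returns one row per failing case; an empty result means all pass.
select q, expected, got
from (
  select
    q,
    expected,
    lower((regexp_match(
      regexp_replace(
        regexp_replace(q, '/\*.*?\*/', '', 'gs'),   -- strip block comments
        '^\s*(--[^\n]*\n\s*)*', ''),                -- strip leading line comments
      '^\s*(\w+)'))[1]) as got                      -- first word
  from (values
    ('SELECT 1',                      'select'),
    ('/* comment */ SELECT 1',        'select'),
    (E'/* multi\nline */ INSERT ...', 'insert'),
    (E'-- comment\nSELECT 1',         'select'),
    (E'-- c1\n-- c2\nUPDATE ...',     'update'),
    (E'/* c1 */ -- c2\nDELETE ...',   'delete')
  ) as cases(q, expected)
) as r
where got is distinct from expected;
```

`is distinct from` is used so that a NULL extraction (no word found) also shows up as a failure.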

Output columns (PG13+)

| Column | Description |
|---|---|
| Query Type | First SQL keyword (select, insert, update, etc.) |
| Calls | Total call count |
| Calls % | Percentage of all calls |
| Exec (ms) | Total execution time |
| Exec % | Percentage of total exec time |
| Plan (ms) | Total planning time |
| Avg (ms/call) | Average execution time per call |
| Rows | Total rows affected |
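
For orientation, here is a rough sketch of how these columns could be computed on PG13+. It is not the query from this PR: the column aliases, rounding, and sort order are assumptions, and it expects the pg_stat_statements extension to be installed. Note that `total_plan_time` stays at zero unless `pg_stat_statements.track_planning` is enabled.

```sql
-- Sketch (PG13+): workload profile grouped by extracted query type.
with typed as (
  select
    lower((regexp_match(
      regexp_replace(
        regexp_replace(query, '/\*.*?\*/', '', 'gs'),
        '^\s*(--[^\n]*\n\s*)*', ''),
      '^\s*(\w+)'))[1]) as query_type,
    calls, total_exec_time, total_plan_time, rows
  from pg_stat_statements
)
select
  query_type,
  sum(calls) as calls,
  round(100.0 * sum(calls)
        / nullif(sum(sum(calls)) over (), 0), 2) as calls_pct,
  round(sum(total_exec_time)::numeric, 2) as exec_ms,
  round((100 * sum(total_exec_time)
        / nullif(sum(sum(total_exec_time)) over (), 0))::numeric, 2) as exec_pct,
  round(sum(total_plan_time)::numeric, 2) as plan_ms,
  round((sum(total_exec_time)
        / nullif(sum(calls), 0))::numeric, 3) as avg_ms_per_call,
  sum(rows) as rows
from typed
group by query_type
order by sum(total_exec_time) desc;
```

On servers older than PG13 the same shape would use `total_time` in place of the exec/plan split, as the commit message notes.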

Closes #37

New s3 report groups pg_stat_statements by first SQL keyword
(SELECT, INSERT, UPDATE, DELETE, etc.) to show workload composition.

The tricky part: SQL comments before the keyword. The regex:
1. Strips block comments: /* ... */ (including multi-line)
2. Strips leading line comments: -- ... (including chained)
3. Extracts first remaining word as the query type

Shows per-type: calls, calls%, exec time, exec%, plan time,
avg time per call, and row count.

PG13+ path uses total_exec_time/total_plan_time split.
Older PG path uses total_time.

Closes #37
NikolayS merged commit bd227fe into master on Feb 10, 2026
6 checks passed
NikolayS deleted the feat/s3-workload-profile branch on February 10, 2026 03:05
select
lower((regexp_match(
regexp_replace(
regexp_replace(query, '/\*.*?\*/', '', 'gs'),

Since you only need the leading keyword, you can avoid the global block-comment replace and strip only leading comments (block + line) in one anchored regexp_replace (cheaper and avoids touching comment-like text later in the query). Same change applies in the \else branch.

Suggested change
- regexp_replace(query, '/\*.*?\*/', '', 'gs'),
+ regexp_replace(
+   query,
+   '^\\s*(/\\*.*?\\*/\\s*|--[^\\n]*\\n\\s*)*',
+   ''
+ ),
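
Before adopting the anchored variant, it may be worth comparing it side by side with the merged two-step strip. The sketch below writes the pattern with single backslashes, which is an assumption based on the quoting of the line being replaced; the doubled form in the suggestion would only be right if the pattern passes through another layer of escaping. PostgreSQL decides greedy versus non-greedy matching for the expression as a whole, so inputs with a second comment later in the query are the interesting cases to check. The column aliases are illustrative.

```sql
-- Sketch: compare both stripping strategies on a few literal queries.
select
  q,
  regexp_replace(
    regexp_replace(q, '/\*.*?\*/', '', 'gs'),
    '^\s*(--[^\n]*\n\s*)*', '') as global_then_leading,
  regexp_replace(
    q,
    '^\s*(/\*.*?\*/\s*|--[^\n]*\n\s*)*',
    '') as anchored_only
from (values
  (E'/* c1 */ -- c2\nDELETE ...'),           -- leading block + line comment
  ('/* a */ SELECT 1 /* b */'),              -- comment after the keyword
  ('SELECT ''/* not a comment */'' AS s')    -- comment-like text in a literal
) as t(q);
```

If both columns start with the same keyword for every row, the cheaper anchored form is a safe swap for the purpose of first-word extraction.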

Development

Successfully merging this pull request may close these issues.

pg_stat_statements "1st word" report
