pg_stat_statements: Add rows_scanned and rows_filtered columns #5
Open
Add a new rows_scanned column to pg_stat_statements that tracks the total number of rows scanned by scan nodes (SeqScan, IndexScan, IndexOnlyScan, BitmapHeapScan, etc.) before filter conditions are applied. This metric is collected by walking the plan tree and summing up ntuples + nfiltered1 for all scan nodes. This information is valuable for identifying queries that scan many rows but return few, which often indicates missing indexes or suboptimal query plans. Combined with the existing rows column, users can calculate the filtering efficiency of their queries. The new column appears after rows in the view, so existing queries that select specific columns by name will continue to work. Bump extension version to 1.14.
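The plan-tree walk described in this commit message can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the patch: `MockPlanState`, `MockInstr`, `MockNodeTag`, and `mock_rows_scanned` are hypothetical stand-ins for PostgreSQL's real plan-state and instrumentation structures, and only two scan node types are modeled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PostgreSQL's plan-state and instrumentation
 * structs; the real types carry many more fields. */
typedef enum {
    NODE_SEQ_SCAN,
    NODE_INDEX_SCAN,
    NODE_HASH_JOIN,
    NODE_SORT
} MockNodeTag;

typedef struct MockInstr {
    double ntuples;     /* tuples the node emitted */
    double nfiltered1;  /* tuples removed by the scan qual or join qual */
} MockInstr;

typedef struct MockPlanState {
    MockNodeTag tag;
    MockInstr instr;
    struct MockPlanState *left;   /* outer child, or NULL */
    struct MockPlanState *right;  /* inner child, or NULL */
} MockPlanState;

/* Only scan nodes contribute to rows_scanned in this sketch. */
static int mock_is_scan_node(MockNodeTag tag)
{
    return tag == NODE_SEQ_SCAN || tag == NODE_INDEX_SCAN;
}

/* Recursively sum ntuples + nfiltered1 over every scan node in the tree,
 * i.e. rows emitted plus rows the scan qual rejected. */
double mock_rows_scanned(const MockPlanState *ps)
{
    double total = 0.0;

    if (ps == NULL)
        return 0.0;

    if (mock_is_scan_node(ps->tag))
        total += ps->instr.ntuples + ps->instr.nfiltered1;

    total += mock_rows_scanned(ps->left);
    total += mock_rows_scanned(ps->right);
    return total;
}
```

In this sketch a join node's own counters are skipped, so only rows read by the scans underneath it are counted.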
Add a rows_filtered column to pg_stat_statements to track rows removed by scan/join/other filter conditions. This metric helps identify queries that may benefit from better indexing.

The implementation:
- Enables per-node instrumentation with INSTRUMENT_ALL before ExecutorStart
- Walks the plan tree in ExecutorEnd to sum nfiltered1 (scanqual/joinqual) and nfiltered2 (other quals) from all nodes
- Reads both ntuples and tuplecount to capture complete tuple counts from both completed and current execution cycles
- Includes the new column in the SQL function and view for version 1.15
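As a rough illustration of the accumulation described above, the following self-contained sketch sums both filter counters over every node and combines the two tuple counters. The `MockInstrumentation` and `MockNode` types and both functions are hypothetical; the field names mirror the counters named in this commit message, but nothing here is taken from the patch itself.

```c
#include <stddef.h>

/* Hypothetical subset of per-node instrumentation counters. */
typedef struct MockInstrumentation {
    double tuplecount;  /* tuples emitted so far in the current cycle */
    double ntuples;     /* tuples accumulated from completed cycles */
    double nfiltered1;  /* tuples removed by scanqual or joinqual */
    double nfiltered2;  /* tuples removed by other quals */
} MockInstrumentation;

typedef struct MockNode {
    MockInstrumentation instr;
    struct MockNode *left;   /* outer child, or NULL */
    struct MockNode *right;  /* inner child, or NULL */
} MockNode;

/* Complete tuple count: completed cycles plus the cycle still in progress. */
double mock_total_tuples(const MockInstrumentation *in)
{
    return in->ntuples + in->tuplecount;
}

/* Sum nfiltered1 + nfiltered2 over all nodes, not just scan nodes. */
double mock_rows_filtered(const MockNode *node)
{
    if (node == NULL)
        return 0.0;

    return node->instr.nfiltered1 + node->instr.nfiltered2
         + mock_rows_filtered(node->left)
         + mock_rows_filtered(node->right);
}
```

Reading both counters matters in this model because a node whose current cycle has not been folded into ntuples would otherwise be undercounted.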
Summary
This PR adds two new columns to pg_stat_statements to help identify queries that may benefit from better indexing:
- rows_scanned - Total rows fetched from storage by scan nodes before any filtering
- rows_filtered - Total rows removed by filter conditions (scanqual, joinqual, and other quals)

These metrics are useful for identifying inefficient queries that scan many rows but return few results.
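One way to turn these counters into a single signal is the ratio of rows returned to rows scanned. The helper below is a hypothetical sketch, not part of this PR: `mock_scan_efficiency` and its sentinel value of -1.0 for statements that scanned nothing are assumptions made for illustration.

```c
/* Hypothetical helper: fraction of scanned rows that were returned.
 * A value near 0 suggests the statement reads far more rows than it
 * returns. Returns -1.0 when nothing was scanned, so the ratio is undefined. */
double mock_scan_efficiency(double rows_returned, double rows_scanned)
{
    if (rows_scanned <= 0.0)
        return -1.0;

    return rows_returned / rows_scanned;
}
```

For example, a statement that returned 250 rows after scanning 1000 would score 0.25. The ratio can exceed 1 for statements whose joins produce more rows than the scans read.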
Implementation Details
- Enables per-node instrumentation (INSTRUMENT_ALL) in the ExecutorStart hook
- Walks the plan tree in ExecutorEnd to collect statistics from all nodes
- rows_scanned: sums tuples from scan nodes (SeqScan, IndexScan, etc.) including filtered tuples
- rows_filtered: sums nfiltered1 (scanqual/joinqual) and nfiltered2 (other quals) from all nodes
- Reads both ntuples and tuplecount from instrumentation to capture complete tuple counts

Example Output