$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[March 18, 2024] PostgreSQL

PostgreSQL Statistics: Query Planner

PostgreSQL Statistics: Query Planner

────────────────────────────────────────────────────────

PostgreSQL Statistics: Query Planner

Tip: Run ANALYZE

ANALYZE posts;

Updates table statistics for the query planner.

Gotcha: Autoanalyze

PostgreSQL runs autoanalyze automatically. But it may lag behind bulk inserts.

Tip: Check Statistics

SELECT * FROM pg_stats WHERE tablename = 'posts';

Shows column statistics the planner uses.

Gotcha: Stale Statistics

If the planner chooses a bad plan, stale statistics are often the cause. Run ANALYZE.

Tip: default_statistics_target

default_statistics_target = 100

Higher values = more accurate statistics but slower ANALYZE.

Gotcha: Per-Column Statistics

ALTER TABLE posts ALTER COLUMN title SET STATISTICS 200;

Collect more statistics for specific columns.

Tip: EXPLAIN (ANALYZE, BUFFERS) Is Your Best Friend

For query debugging, always use EXPLAIN (ANALYZE, BUFFERS) instead of plain EXPLAIN. The BUFFERS option shows hit/miss rates for every node, revealing whether your indexes are actually in memory.

Tip: Partial Indexes Are Underutilized

CREATE INDEX ON orders (status) WHERE status = 'pending' creates a tiny index that covers only the rows your query needs. It's faster to scan and cheaper to maintain than a full-column index.

Gotcha: NULL Sorting Is Non-Obvious

By default, NULLs sort AFTER non-null values in ascending order. ORDER BY col DESC puts NULLs FIRST. Use NULLS LAST or NULLS FIRST to be explicit.

Senior Insight

Reading EXPLAIN output is the single most important PostgreSQL debugging skill I've developed. The EXPLAIN (ANALYZE, BUFFERS) variant shows actual execution time and buffer usage, revealing whether indexes are cached in memory. I look for three red flags: (1) sequential scans on large tables, (2) 'rows' vs 'actual rows' mismatch (bad estimates — usually from outdated statistics), and (3) temp files written to disk (work_mem too low). Run ANALYZE manually if statistics are stale.

Source: pganalyze Blog (https://pganalyze.com/blog), PostgreSQL Docs (https://www.postgresql.org/docs/current/), Crunchy Data Blog (https://www.crunchydata.com/blog)

────────────────────────────────────────────────────────
<-- back to posts