$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[May 07, 2026] PostgreSQL

PostgreSQL Full-Text: Custom Dictionaries

PostgreSQL Full-Text: Custom Dictionaries

────────────────────────────────────────────────────────

PostgreSQL Full-Text: Custom Dictionaries

Tip: Custom Dictionary

ALTER TEXT SEARCH DICTIONARY english_simple (
    StopWords = english
);

Control which words are treated as stop words.

Gotcha: Stemming Removes Word Forms

'running' becomes 'run'. If you need exact matches, use 'simple' configuration.

Tip: Synonym Dictionary

CREATE TEXT SEARCH DICTIONARY my_synonyms (
    TEMPLATE = synonym,
    SYNONYMS = my_synonyms_file
);

Map synonyms to canonical terms.

Gotcha: Dictionary File Format

One synonym per line: car automobile

Must be in PostgreSQL's tsearch_data directory.

Tip: Combining Dictionaries

CREATE TEXT SEARCH CONFIGURATION custom (COPY = english);
ALTER TEXT SEARCH CONFIGURATION custom
    ADD MAPPING FOR word WITH my_synonyms, english_stem;

Gotcha: Test Your Configuration

SELECT to_tsvector('custom', 'The car is fast');

Verify tokenization before deploying.

Tip: EXPLAIN (ANALYZE, BUFFERS) Is Your Best Friend

For query debugging, always use EXPLAIN (ANALYZE, BUFFERS) instead of plain EXPLAIN. The BUFFERS option shows hit/miss rates for every node, revealing whether your indexes are actually in memory.

Tip: Partial Indexes Are Underutilized

CREATE INDEX ON orders (status) WHERE status = 'pending' creates a tiny index that covers only the rows your query needs. It's faster to scan and cheaper to maintain than a full-column index.

Gotcha: NULL Sorting Is Non-Obvious

By default, NULLs sort AFTER non-null values in ascending order. ORDER BY col DESC puts NULLs FIRST. Use NULLS LAST or NULLS FIRST to be explicit.

Senior Insight

One PostgreSQL feature I wish more Laravel developers knew about is pg_stat_statements. This extension tracks query execution statistics — total time, frequency, rows returned, and shared buffer usage. It's the single best tool for identifying slow queries and high-frequency queries that could be cached. I install it on every PostgreSQL instance and query it weekly to identify optimization opportunities. It turns performance optimization from guesswork into data-driven engineering.

Source: pganalyze Blog (https://pganalyze.com/blog), PostgreSQL Docs (https://www.postgresql.org/docs/current/), Crunchy Data Blog (https://www.crunchydata.com/blog)

────────────────────────────────────────────────────────
<-- back to posts