$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[September 04, 2024] PostgreSQL

PostgreSQL Bulk Operations: COPY Command

PostgreSQL Bulk Operations: COPY Command

────────────────────────────────────────────────────────

PostgreSQL Bulk Operations: COPY Command

Tip: COPY from CSV

COPY products FROM '/data/products.csv' WITH (FORMAT csv, HEADER true);

Much faster than individual INSERTs.

Gotcha: COPY Needs File Access

The file must be accessible by the PostgreSQL server, not the client. Use \COPY in psql for client-side files.

Tip: COPY to CSV

COPY (SELECT * FROM orders WHERE created_at > '2024-01-01')
    TO '/data/orders_2024.csv' WITH (FORMAT csv, HEADER true);

Gotcha: COPY is Transactional

BEGIN;
COPY products FROM '/data/products.csv' WITH (FORMAT csv);
COMMIT;

If COPY fails, the transaction rolls back.

Tip: pg_copy_from in PHP

pg_copy_from($db, 'products', $rows, "\t", "\\\\N");

Bulk insert from PHP arrays.

Gotcha: COPY Doesn't Fire Triggers

COPY bypasses triggers by default. Use COPY ... WITH (FORCE_NOT_NULL ...) for special handling.

Tip: EXPLAIN (ANALYZE, BUFFERS) Is Your Best Friend

For query debugging, always use EXPLAIN (ANALYZE, BUFFERS) instead of plain EXPLAIN. The BUFFERS option shows hit/miss rates for every node, revealing whether your indexes are actually in memory.

Tip: Partial Indexes Are Underutilized

CREATE INDEX ON orders (status) WHERE status = 'pending' creates a tiny index that covers only the rows your query needs. It's faster to scan and cheaper to maintain than a full-column index.

Gotcha: NULL Sorting Is Non-Obvious

By default, NULLs sort AFTER non-null values in ascending order. ORDER BY col DESC puts NULLs FIRST. Use NULLS LAST or NULLS FIRST to be explicit.

Senior Insight

The COPY command is the fastest way to bulk-load data into PostgreSQL, far faster than INSERT statements. I've loaded 10 million rows in under a minute with COPY, compared to hours with individual INSERTs. In Laravel, you can use DB::connection()->getPdo()->pgsqlCopyFromArray() or stream a CSV file. The gotcha: COPY is all-or-nothing — a single bad row aborts the entire operation. Validate your data before COPY, not after.

Source: pganalyze Blog (https://pganalyze.com/blog), PostgreSQL Docs (https://www.postgresql.org/docs/current/), Crunchy Data Blog (https://www.crunchydata.com/blog)

────────────────────────────────────────────────────────
<-- back to posts