$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[July 03, 2025] ClickHouse

ClickHouse Backup and Restore: Strategies

────────────────────────────────────────────────────────

Tip: clickhouse-backup Tool

clickhouse-backup create daily_backup
clickhouse-backup upload daily_backup

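Third-party tool from Altinity. It takes physical backups: it freezes tables and copies the data parts and metadata rather than dumping rows.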
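For a daily cron job you probably want a dated name instead of overwriting daily_backup. A minimal sketch, assuming only the create and upload commands shown above; the helper names (backup_name, run, daily_backup) and the DRY_RUN switch are made up for illustration:

```shell
#!/bin/sh
# Hypothetical wrapper around the two clickhouse-backup commands above.
# DRY_RUN=1 prints the commands instead of running them.

backup_name() {
    # $1 = prefix, $2 = date as YYYY-MM-DD (defaults to today)
    printf '%s_%s\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

daily_backup() {
    # $1 = optional date override, mostly useful for testing
    name=$(backup_name daily "${1:-}")
    # Upload only if the local backup was created successfully.
    run clickhouse-backup create "$name" &&
        run clickhouse-backup upload "$name"
}
```

Try DRY_RUN=1 first to see what it would run, then call daily_backup from cron once the names look right.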

Gotcha: Physical Backup

cp -r /var/lib/clickhouse/data /backup/

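Stop ClickHouse first, or use FREEZE to get a consistent snapshot. Copying a live data directory can catch parts halfway through a merge. Copy /var/lib/clickhouse/metadata as well, or you end up with data and no table definitions.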

Tip: FREEZE Partitions

ALTER TABLE events FREEZE PARTITION 202401;

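Creates a hard-link snapshot under /var/lib/clickhouse/shadow/. No data copying. It only becomes a backup once you copy that snapshot somewhere else.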
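If hard links are new to you, this shows the idea on plain files, no ClickHouse needed. It's a toy sketch: the directory and file names are made up, and a real part is a directory of many files.

```shell
#!/bin/sh
# Toy demo of a hard-link "snapshot" on ordinary files.
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/data" "$workdir/shadow"

# Stand-in for an immutable data part file.
printf 'column data\n' > "$workdir/data/part.bin"

# "Freeze": a second directory entry pointing at the same inode.
ln "$workdir/data/part.bin" "$workdir/shadow/part.bin"

# The link count (second column of ls -l) now covers both names.
links=$(ls -l "$workdir/data/part.bin" | awk '{print $2}')
echo "link count: $links"

# Removing the original name leaves the snapshot readable.
rm "$workdir/data/part.bin"
snapshot_content=$(cat "$workdir/shadow/part.bin")
echo "snapshot still holds: $snapshot_content"

rm -rf "$workdir"
```

This is why FREEZE is near-instant, and also why old snapshots quietly hold disk space after the live parts are merged away.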

Gotcha: Backup Size

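ClickHouse compresses data heavily. A physical backup copies the already-compressed parts, so it is roughly the size of the data on disk. A logical export (CSV, SQL dump) is uncompressed and can be many times larger than the stored data.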
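To get a feel for the gap between compressed storage and a raw text export, here is a rough sketch. gzip is only a stand-in (ClickHouse uses its own codecs, LZ4 by default), and the sample rows are invented, so treat the ratio as illustrative only.

```shell
#!/bin/sh
# Compare the size of repetitive CSV rows before and after compression.
set -eu

raw_file=$(mktemp)

# 20,000 identical event rows; real data compresses less neatly than this.
awk 'BEGIN { for (i = 0; i < 20000; i++) print "2024-01-15,page_view,US,chrome" }' > "$raw_file"

raw_bytes=$(wc -c < "$raw_file" | tr -d ' ')
gz_bytes=$(gzip -c "$raw_file" | wc -c | tr -d ' ')

echo "raw export:  $raw_bytes bytes"
echo "compressed:  $gz_bytes bytes"

rm -f "$raw_file"
```

Size your backup storage from what the backup actually writes, not from row counts.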

Tip: Restore from Backup

clickhouse-backup download daily_backup
clickhouse-backup restore daily_backup

Gotcha: Schema Changes

If the table schema changed since the backup, restore may fail. Back up the schema separately, so you can see exactly what the table looked like when the backup was taken.
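One way to keep the schema next to the backup is to dump SHOW CREATE TABLE into a file. A sketch, assuming clickhouse-client is on the PATH; the dump_schema helper, the CH_CLIENT override and the file naming are my own invention, not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: save a table's DDL as <database>.<table>.sql.

dump_schema() {
    # $1 = database, $2 = table, $3 = output directory
    mkdir -p "$3"
    # CH_CLIENT lets you swap in another client binary (or a fake for testing).
    ${CH_CLIENT:-clickhouse-client} \
        --query "SHOW CREATE TABLE $1.$2" \
        --format TSVRaw > "$3/$1.$2.sql"
}

# Example (not run here):
#   dump_schema default events /backup/schema
```

Diffing that file against the live table before a restore tells you up front whether the schemas still match.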

Tip: Order of Columns in ORDER BY Matters Massively

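ClickHouse's primary key is defined by ORDER BY (unless you set PRIMARY KEY separately). Put the columns you filter on most first, and among those, lower-cardinality columns generally go before higher-cardinality ones for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.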
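You can see why the order matters with a toy simulation, no ClickHouse required. This is a sketch under loud assumptions: 1,000 made-up rows, "granules" of 50 rows (the real default is 8192), and a simple count of how many granules contain a given user_id, which is the least the engine would have to read.

```shell
#!/bin/sh
# Toy model of sparse-index data skipping with sort and awk.
set -eu

# 1000 rows of "user_id,ts": 10 users x 100 timestamps.
rows=$(awk 'BEGIN { for (t = 0; t < 100; t++) for (u = 0; u < 10; u++) print u "," t }')

granules_touched() {
    # $1 = sort keys, e.g. "-k1,1n -k2,2n" (left unquoted on purpose so it splits)
    printf '%s\n' "$rows" | sort -t, $1 |
        awk -F, '$1 == 3 { seen[int((NR - 1) / 50)] = 1 }
                 END { n = 0; for (g in seen) n++; print n }'
}

# Like ORDER BY (user_id, timestamp): rows for user 3 sit together.
by_user_first=$(granules_touched "-k1,1n -k2,2n")
# Like ORDER BY (timestamp, user_id): rows for user 3 are spread everywhere.
by_ts_first=$(granules_touched "-k2,2n -k1,1n")

echo "user_id first:   $by_user_first of 20 granules hold user 3"
echo "timestamp first: $by_ts_first of 20 granules hold user 3"
```

With user_id leading, a filter on one user touches a couple of granules; with timestamp leading, the same filter touches all of them.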

Tip: Use LowCardinality for Enum-Like Strings

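Strings like status, country, browser benefit from LowCardinality(String). It's stored as a dictionary internally, which can cut storage severalfold (how much depends on the data) and speeds up scans. It works best when the column has up to roughly 10,000 distinct values.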
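The dictionary idea is easy to check on the back of an envelope. A rough sketch, assuming a made-up status column with four values and one byte per row for the dictionary index; this is arithmetic, not ClickHouse's actual on-disk layout:

```shell
#!/bin/sh
# Estimate raw string bytes vs. dictionary + 1-byte index per row.
set -eu

row_count=10000

# Invented status column cycling through four values.
values=$(awk -v n="$row_count" 'BEGIN {
    split("active,inactive,pending,suspended", s, ",")
    for (i = 0; i < n; i++) print s[i % 4 + 1]
}')

raw_bytes=$(printf '%s\n' "$values" | wc -c | tr -d ' ')
dict_size=$(printf '%s\n' "$values" | sort -u | wc -l | tr -d ' ')
dict_bytes=$(printf '%s\n' "$values" | sort -u | wc -c | tr -d ' ')

# One index byte per row plus the dictionary itself.
encoded_bytes=$((row_count + dict_bytes))

echo "distinct values: $dict_size"
echo "raw strings:     $raw_bytes bytes"
echo "dictionary form: $encoded_bytes bytes"
```

The fewer distinct values and the longer the strings, the bigger the win.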

Gotcha: Mutations Are Heavy

ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.

Senior Insight

Backing up ClickHouse is different from traditional databases. The usual approach is clickhouse-backup (open-source tool) or filesystem-level snapshots (ZFS, LVM). I've used both: clickhouse-backup is simpler for daily backups, but filesystem snapshots are faster for restoring entire servers. The key detail: ClickHouse stores data in /var/lib/clickhouse/, and backing up this directory while ClickHouse is running requires freeze-and-copy or a snapshot.

Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)

────────────────────────────────────────────────────────