$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[October 18, 2024] ClickHouse

ClickHouse Async Inserts: Buffering

ClickHouse Async Inserts: Buffering

────────────────────────────────────────────────────────

ClickHouse Async Inserts: Buffering

Tip: Enable Async Inserts

SET async_insert = 1;

Gotcha: Buffer Flush

ClickHouse flushes the buffer when it reaches max_async_insert_bytes or async_insert_busy_timeout.

Tip: Wait for Async

SET wait_for_async_insert = 1;

Waits for the insert to be processed before returning.

Gotcha: Not for All Workloads

Async inserts are great for high-throughput logging. Not ideal for transactional inserts.

Tip: Batch Size

SET max_async_insert_bytes = 10485760;

10MB buffer before flush.

Gotcha: Data Loss on Crash

Unflushed data in the buffer is lost if ClickHouse crashes.

Tip: Order of Columns in ORDER BY Matters Massively

ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.

Tip: Use LowCardinality for Enum-Like Strings

Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.

Gotcha: Mutations Are Heavy

ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.

Senior Insight

Async inserts in ClickHouse (22.4+) buffer incoming data and flush it in batches. I enable this for all HTTP-based data ingestion where latency isn't critical. The async insert mode groups 10,000+ rows into a single insert, dramatically reducing part count and merge overhead. The setting async_insert = 1 with wait_for_async_insert = 0 provides the best throughput. The trade-off: up to 1 second of data staleness (the default flush interval).

Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)

────────────────────────────────────────────────────────
<-- back to posts