ClickHouse Table Engines
ClickHouse Table Engines: Tips & Tricks
ClickHouse Table Engines: Tips & Tricks
Tip: MergeTree is the Default
CREATE TABLE events (
id UInt64,
event String,
created_at DateTime
) ENGINE = MergeTree()
ORDER BY (created_at, event);
Use this for 90% of cases.
Gotcha: ReplacingMergeTree Deduplicates
CREATE TABLE users (
id UInt64,
name String,
version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY id;
Keeps only the row with the highest version per id. Deduplication happens during merges.
Tip: SummingMergeTree for Pre-Aggregation
CREATE TABLE sales (
product_id UInt64,
amount UInt64,
created_at DateTime
) ENGINE = SummingMergeTree()
ORDER BY (product_id, created_at);
Automatically sums amount for rows with the same key.
Gotcha: AggregatingMergeTree
CREATE TABLE daily_stats (
date Date,
views AggregateFunction(sum, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY date;
Stores aggregate functions, not raw values. Requires special insert syntax.
Tip: Log Engines for Small Data
CREATE TABLE logs (message String) ENGINE = StripeLog;
No indexes, no sorting. Good for small lookup tables.
Gotcha: Engine Choice Affects Query Performance
MergeTree: fast reads, good compression. Log engines: fast writes, poor reads.
Tip: ORDER BY Defines Data Layout
ORDER BY (created_at, event_type)
Data is physically sorted by these columns. Queries filtering on the first column are fastest.
Gotcha: PARTITION BY
PARTITION BY toYYYYMM(created_at)
Partitions by month. Makes it easy to drop old data.
Tip: Order of Columns in ORDER BY Matters Massively
ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.
Tip: Use LowCardinality for Enum-Like Strings
Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.
Gotcha: Mutations Are Heavy
ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.
Senior Insight
The MergeTree engine family is ClickHouse's heart, and choosing the right engine variant is critical. MergeTree is the base — use it for most data. SummingMergeTree pre-aggregates numeric columns, ideal for counters. AggregatingMergeTree stores intermediate aggregation states, perfect for real-time rollups. I once had a dashboard that ran a 10-second query on MergeTree and a 50ms query on AggregatingMergeTree with the same data. Choose your engine based on query patterns.
Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)