$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[April 01, 2025] ClickHouse

ClickHouse Time Series: Best Practices

ClickHouse Time Series: Best Practices

────────────────────────────────────────────────────────

ClickHouse Time Series: Best Practices

Tip: Partition by Month

PARTITION BY toYYYYMM(timestamp)

Standard partitioning for time-series data.

Gotcha: Too Granular Partitions

Daily partitions create too many parts. Monthly is usually optimal.

Tip: Time Range Queries

SELECT count() FROM metrics
WHERE timestamp BETWEEN '2024-01-01' AND '2024-01-31';

Uses partition pruning for fast execution.

Gotcha: Date Functions in WHERE

WHERE toStartOfDay(timestamp) = '2024-01-15'

Functions on the column prevent index usage. Use range comparisons instead.

Tip: Downsampling

SELECT toStartOfHour(timestamp) as hour, avg(value)
FROM metrics GROUP BY hour;

Reduces resolution for historical data.

Gotcha: Data Freshness

Recent data is in unmerged parts. Queries spanning recent + old data may have inconsistent performance.

Tip: Order of Columns in ORDER BY Matters Massively

ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.

Tip: Use LowCardinality for Enum-Like Strings

Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.

Gotcha: Mutations Are Heavy

ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.

Senior Insight

Time-series data in ClickHouse follows specific design patterns. The primary key should start with the time column for efficient time-range pruning. I use DateTime64 for millisecond precision and set ttl = '7d' for raw data with toStartOfHour(timestamp) for aggregation. The most impactful optimization: order table columns by access frequency — frequently queried columns first in CREATE TABLE reduces IO for typical queries.

Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)

────────────────────────────────────────────────────────
<-- back to posts