ClickHouse Analytic Queries
ClickHouse Analytic Queries: Tips & Tricks
ClickHouse Analytic Queries: Tips & Tricks
Tip: Time Series Aggregation
SELECT
toStartOfHour(created_at) as hour,
count() as views
FROM page_views
GROUP BY hour
ORDER BY hour;
Groups data by hour for time-series analysis.
Gotcha: uniq() vs uniqExact()
SELECT uniq(user_id) FROM page_views; -- Approximate, fast
SELECT uniqExact(user_id) FROM page_views; -- Exact, slower
uniq() uses HyperLogLog — good enough for most cases.
Tip: Percentiles
SELECT
quantile(0.5)(response_time) as p50,
quantile(0.95)(response_time) as p95,
quantile(0.99)(response_time) as p99
FROM api_logs;
Essential for performance monitoring.
Gotcha: TOP N Queries
SELECT url, count() as views
FROM page_views
WHERE created_at >= today() - 7
GROUP BY url
ORDER BY views DESC
LIMIT 10;
Standard top-N pattern.
Tip: Running Totals
SELECT
date,
views,
sum(views) OVER (ORDER BY date) as running_total
FROM daily_stats;
Window functions work in ClickHouse too.
Gotcha: arrayJoin() for Unnesting
SELECT arrayJoin(tags) as tag, count() FROM posts GROUP BY tag;
Expands arrays into rows.
Tip: Date Truncation Functions
toStartOfDay(dt) -- Midnight of the day
toStartOfWeek(dt) -- Start of the week
toStartOfMonth(dt) -- First of the month
toStartOfHour(dt) -- Start of the hour
Gotcha: GROUP BY with Aliases
ClickHouse allows GROUP BY aliases in some versions, but it's safer to repeat the expression.
Tip: Order of Columns in ORDER BY Matters Massively
ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.
Tip: Use LowCardinality for Enum-Like Strings
Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.
Gotcha: Mutations Are Heavy
ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.
Senior Insight
ClickHouse requires a fundamental shift in thinking for developers coming from OLTP databases. The database is optimized for bulk reads of millions of rows, not single-row lookups. I've seen developers struggle because they try to use ClickHouse like PostgreSQL — frequent small INSERTs, point SELECTs, UPDATEs. These operations work in ClickHouse but perform poorly. The mental model should be: write in batches, read in ranges, and never update. Once you internalize this, ClickHouse becomes beautiful.
Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)