$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[March 27, 2025] ClickHouse

ClickHouse Nested Data Types

ClickHouse Nested Data Types

────────────────────────────────────────────────────────

ClickHouse Nested Data Types

Tip: Tuple

CREATE TABLE events (
    id UInt64,
    location Tuple(Float64, Float64)
) ENGINE = MergeTree ORDER BY id;

Gotcha: Array

CREATE TABLE posts (
    id UInt64,
    tags Array(String)
) ENGINE = MergeTree ORDER BY id;

Tip: Map

CREATE TABLE events (
    id UInt64,
    properties Map(String, String)
) ENGINE = MergeTree ORDER BY id;

Gotcha: Nested Tables

CREATE TABLE events (
    id UInt64,
    nested_data Nested(key String, value String)
) ENGINE = MergeTree ORDER BY id;

Tip: Array Join

SELECT id, arrayJoin(tags) as tag FROM posts;

Gotcha: Map Functions

SELECT properties['key'] FROM events;

Direct key access from Map columns.

Tip: Order of Columns in ORDER BY Matters Massively

ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.

Tip: Use LowCardinality for Enum-Like Strings

Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.

Gotcha: Mutations Are Heavy

ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.

Senior Insight

ClickHouse arrays are first-class data types, unlike PostgreSQL where they're extensions. I store related values like product tags or multi-value attributes directly as Array columns instead of creating separate tables. The ARRAY JOIN clause unnests arrays for analysis. The important optimization: use arrayAll(), arrayExists(), and arrayFilter() instead of ARRAY JOIN when you only need aggregate information about array contents.

Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)

────────────────────────────────────────────────────────
<-- back to posts