$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[October 24, 2024] ClickHouse

ClickHouse Migration from MySQL

ClickHouse Migration from MySQL

────────────────────────────────────────────────────────

ClickHouse Migration from MySQL

Tip: MySQL Engine

CREATE TABLE mysql_events ENGINE = MySQL('host:port', 'database', 'events', 'user', 'password');

Query MySQL directly from ClickHouse.

Gotcha: Schema Differences

MySQL types don't map 1:1 to ClickHouse. Convert during migration.

Tip: INSERT INTO SELECT

INSERT INTO clickhouse_events SELECT * FROM mysql_events;

Gotcha: No Updates in ClickHouse

MySQL supports updates. ClickHouse is append-only. Design accordingly.

Tip: MaterializedMySQL Engine

CREATE DATABASE mydb ENGINE = MaterializedMySQL('host:port', 'mydb', 'user', 'password');

Replicates MySQL tables to ClickHouse.

Gotcha: Primary Key Requirement

ClickHouse tables require ORDER BY. MySQL tables may not have explicit keys.

Tip: Order of Columns in ORDER BY Matters Massively

ClickHouse's primary key is defined by ORDER BY. Put high-cardinality columns first for better data skipping. ORDER BY (timestamp, user_id) is very different from ORDER BY (user_id, timestamp) in query performance.

Tip: Use LowCardinality for Enum-Like Strings

Strings like status, country, browser benefit from LowCardinality(String) — it's stored as a dictionary internally, reducing storage 10x and speeding up scans.

Gotcha: Mutations Are Heavy

ALTER TABLE ... UPDATE and DELETE in ClickHouse create new parts instead of modifying in place. A single mutation on a large table can take hours and block merges. Design for append-only from day one.

Senior Insight

Migrating from MySQL to ClickHouse is a common pattern for companies outgrowing MySQL's analytical capabilities. I've done this migration three times. The process: (1) export MySQL data to CSV, (2) design ClickHouse schema optimized for analytical queries (different from MySQL), (3) import via clickhouse-client --query 'INSERT INTO ... FORMAT CSV', (4) validate row counts and data integrity. The schema redesign step is crucial — you don't copy MySQL schema; you design for ClickHouse's columnar strengths.

Source: ClickHouse Blog (https://clickhouse.com/blog), Altinity Blog (https://altinity.com/blog), Altinity Knowledge Base (https://kb.altinity.com/)

────────────────────────────────────────────────────────
<-- back to posts