---
name: postgres-to-clickhouse-migration
description: MUST USE when migrating an analytical workload from Postgres to ClickHouse. Guides discovery, access-pattern analysis, ClickHouse schema design (engine, ORDER BY, partitioning, materialized views), Postgres-to-ClickHouse type mapping, ClickPipes/PeerDB CDC ingestion, and validation. Prefer official ClickHouse docs and the linked blogs as the source of truth.
license: Apache-2.0
metadata:
  author: ClickHouse Inc
  version: "0.1.0"
---

# Postgres to ClickHouse Migration

End-to-end guidance for moving an analytical (OLAP) workload off Postgres and onto ClickHouse, with a strong focus on data modeling. The biggest wins come from re-modeling the schema for ClickHouse's columnar, sparse-index, merge-based architecture rather than lifting the Postgres schema verbatim.

> **Sources of truth, in priority order:**
> 1. Official ClickHouse docs: [Best Practices](https://clickhouse.com/docs/best-practices), [Migrating from PostgreSQL](https://clickhouse.com/docs/migrations/postgresql)
> 2. Blog: [Postgres to ClickHouse data modeling tips (v2)](https://clickhouse.com/blog/postgres-to-clickhouse-data-modeling-tips-v2)
> 3. Blog: [Postgres to ClickHouse data modeling tips](https://clickhouse.com/blog/postgres-to-clickhouse-data-modeling-tips)
>
> Never invent ClickHouse behavior. If a recommendation is not backed by the docs or the behavior above, say so and verify.

## How to apply this skill

Work through the six phases in order. Each phase has concrete commands to run and decisions to record. Do not jump to writing `CREATE TABLE` before completing Discovery and Access-pattern analysis: ORDER BY is effectively immutable, so a wrong key forces a rebuild.

1. Discovery
2. Access-pattern analysis
3. ClickHouse schema design
4. Type mapping
5. Ingestion and CDC
6. Validation and iteration

A consolidated checklist sits at the end. Use it as the definition of done.

---

## Phase 1: Discovery

Goal: build an accurate inventory of the Postgres schema and, more importantly, of how it is actually queried. ClickHouse modeling is driven by query patterns, not by the source schema, so this phase feeds everything downstream.

### 1.1 Inspect the Postgres schema

For each candidate table capture the column definitions, the primary key, the indexes, and the foreign keys.

```sql
-- Columns and types
SELECT column_name, data_type, is_nullable, character_maximum_length, numeric_precision, numeric_scale
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'events'
ORDER BY ordinal_position;

-- Primary key and unique constraints
SELECT conname, contype, pg_get_constraintdef(oid)
FROM pg_constraint
WHERE conrelid = 'public.events'::regclass;

-- Indexes (these hint at the existing access patterns)
SELECT indexname, indexdef FROM pg_indexes
WHERE schemaname = 'public' AND tablename = 'events';
```

### 1.2 Capture row counts and on-disk size

Volume decides whether a table is worth migrating to ClickHouse at all (large, append-heavy, analytically queried tables are the prime candidates; tiny dimension tables may become dictionaries).

```sql
SELECT relname AS table,
       n_live_tup AS approx_rows,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
```

### 1.3 Measure cardinality per column

Cardinality drives type selection (LowCardinality vs String) and ORDER BY column ordering. Record the distinct count and null fraction for every column you plan to keep.

```sql
SELECT
  count(*)                                      AS rows,
  count(DISTINCT event_type)                    AS event_type_cardinality,
  count(DISTINCT user_id)                        AS user_id_cardinality,
  avg((country IS NULL)::int)                    AS country_null_fraction
FROM events;
```

### 1.4 Find the slow and heavy analytical queries

The queries you must make fast determine the entire schema. Find them, do not guess them.

- In ClickHouse Cloud, use **Query Insights** to review historical query latency, frequency, and resource use after some traffic has run. On the Postgres side, find the equivalent before you migrate.
- In Postgres, enable and read `pg_stat_statements` to rank queries by total and mean time:

```sql
SELECT
  calls,
  round(total_exec_time)              AS total_ms,
  round(mean_exec_time, 2)            AS mean_ms,
  round(100 * total_exec_time / sum(total_exec_time) OVER (), 1) AS pct_of_total,
  query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 25;
```

Record the top 10 to 25 queries by total time. These are the benchmark set you will validate against in Phase 6.

**Discovery output:** a table inventory (rows, size), a per-column cardinality and null-fraction report, and a ranked list of the heavy analytical queries.

---

## Phase 2: Access-pattern analysis

Goal: turn the heavy query list into a concrete set of modeling requirements. For each query in the benchmark set, extract:

- **Filters (WHERE):** which columns, and how selective. The most frequently filtered, most selective columns are candidates for the ORDER BY key.
- **GROUP BY columns:** these often belong in the ORDER BY too, since ClickHouse can read pre-sorted data for aggregation.
- **Sort orders (ORDER BY in the query):** repeated sort columns may justify their position in the table's sorting key.
- **Joins:** which tables join to which, on what keys, and the relative sizes. Small dimension tables are dictionary candidates; frequently joined wide tables are denormalization candidates.
- **Time-range usage:** almost every analytical query filters on a timestamp range. This is the strongest signal for both the ORDER BY (a trailing time column) and the partition key (usually by month).

Summarize as a small matrix: one row per query, columns for filters, group-bys, sorts, joins, and time range. Patterns that repeat across many queries are the ones the schema must serve.

**A key mindset shift:** ClickHouse rewards a few well-chosen sorting columns plus denormalization over Postgres-style normalization plus many B-tree indexes. ClickHouse has no per-query index chosen by a planner; the table is physically sorted once by its ORDER BY.

---

## Phase 3: ClickHouse schema design

Design the destination schema from the Phase 2 requirements. Decisions here, especially ORDER BY, are hard to change later.

### 3.1 Engine selection (MergeTree family)

| Situation | Engine | Why |
|-----------|--------|-----|
| Append-only events, immutable facts | `MergeTree` | The default. ClickHouse performs best with append-only data. |
| Rows updated or deleted in Postgres, replicated via CDC | `ReplacingMergeTree` | Updates arrive as versioned inserts; the engine deduplicates by sorting key during background merges. This is what ClickPipes uses for CDC. |
| Pre-aggregated rollups updated incrementally | `AggregatingMergeTree` | Stores aggregate states (one row per key), merged in the background. Pairs with materialized views. |
| Pure additive counters per key | `SummingMergeTree` | Sums numeric columns sharing a sorting key. A lighter-weight rollup than AggregatingMergeTree. |

Notes that hold across the family:

- `ReplacingMergeTree` deduplicates **asynchronously** during merges, so duplicates can be visible between merges. Resolve them at read time with `FINAL`, or with `argMax(col, _version)` / a `ROW_NUMBER() OVER (PARTITION BY key ORDER BY version DESC)` window, or by hiding `FINAL` behind a view. `FINAL` performance has improved substantially in recent versions.
- For CDC, ClickPipes adds metadata columns such as `_peerdb_version` (the row version) and `_peerdb_is_deleted` (soft-delete flag). Filter deleted rows with a view or a row policy: `CREATE ROW POLICY cdc ON events FOR SELECT USING _peerdb_is_deleted = 0 TO ALL;`.
- Do not reach for `CollapsingMergeTree`/`VersionedCollapsingMergeTree` unless you specifically need sign-based collapsing; `ReplacingMergeTree` covers most upsert cases more simply.

### 3.2 ORDER BY and primary key

This is the single most important decision. In ClickHouse the `ORDER BY` defines how rows are physically sorted on disk and the `PRIMARY KEY` defines the sparse index (it must be a prefix of the `ORDER BY`).

Rules:

- **Lead with the columns most queries filter on**, drawn directly from the Phase 2 matrix (common `WHERE` and `GROUP BY` columns).
- **Order columns low cardinality to high cardinality.** Put the column with the fewest distinct values first. This maximizes compression and lets the sparse index prune effectively. A trailing high-cardinality timestamp is common.
- **Keep the key reasonably short** (a handful of columns). Every column adds to index size and slows merges.
- For CDC, the Postgres primary key is a sound default ordering key. If you extend it, choose columns that **do not change for a given row** (for example `(tenant_id, id)`), and make sure Postgres `REPLICA IDENTITY` includes every ordering-key column, otherwise CDC will replicate nulls for the non-PK columns.

Example: a query set that filters by `event_type` and a time range, grouping by `event_type`, suggests:

```sql
ORDER BY (event_type, toStartOfHour(created_at), user_id)
```

low-cardinality `event_type` first, then time, then the high-cardinality `user_id`.

### 3.3 Partitioning

Partitioning is for data lifecycle management (dropping or moving old data with `DROP PARTITION` / TTL), not for speeding up queries. Query pruning comes mainly from the ORDER BY.

- **Partition by month** for time-series data: `PARTITION BY toYYYYMM(created_at)`. This keeps the partition count bounded (12 per year).
- **Avoid over-partitioning.** High-cardinality partition keys (by user, by day over many years) cause part explosion and "too many parts" errors. Keep total distinct partitions roughly in the 100 to 1,000 range.
- **Consider starting with no partitioning** for smaller tables and adding it only when you have a lifecycle need.

### 3.4 Materialized views for rollups

When the heavy queries are aggregations over a large fact table, precompute them.

- **Incremental materialized views** run the aggregation on each new insert block and write states into an `AggregatingMergeTree` (or sums into a `SummingMergeTree`) target. They are ideal for real-time rollups over massive tables because they never rescan history. They cannot deduplicate at insert time, so on a `ReplacingMergeTree` source you still account for duplicates.
- **Refreshable materialized views** run a full query on a schedule and replace the target table. They are the simplest way to materialize a deduplicated, denormalized, or re-ordered view of CDC data, at the cost of freshness bounded by the refresh interval.

```sql
-- Incremental rollup: hourly counts per event_type
CREATE TABLE events_hourly
(
    event_hour   DateTime,
    event_type   LowCardinality(String),
    events       UInt64,
    unique_users AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
ORDER BY (event_type, event_hour);

CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
    toStartOfHour(created_at) AS event_hour,
    event_type,
    count()                   AS events,
    uniqState(user_id)        AS unique_users
FROM events
GROUP BY event_hour, event_type;

-- Read it back, merging the aggregate states
SELECT event_type, sum(events), uniqMerge(unique_users)
FROM events_hourly
GROUP BY event_type;
```

### 3.5 Joins vs denormalization vs dictionaries

ClickHouse joins have improved a lot; start by running the join query unchanged and measure. If it is too slow:

- **Small dimension tables:** load as a **dictionary** for fast key lookups in queries, instead of joining.
- **Frequently joined wide data:** **denormalize** into the fact table (or into a refreshable materialized view that joins once), trading storage for read speed.
- **When you must join:** filter each side before joining, include join keys in the ORDER BY, and pick the join algorithm to fit (`parallel_hash` for throughput, `partial_merge` for lower memory).

---

## Phase 4: Type mapping (Postgres to ClickHouse)

Map types deliberately. Right-sizing and avoiding `Nullable` materially affect storage and speed because ClickHouse is columnar.

| Postgres | ClickHouse | Notes |
|----------|-----------|-------|
| `smallint` / `int2` | `Int16` | Right-size to the real range; prefer unsigned when no negatives. |
| `integer` / `int4` | `Int32` | |
| `bigint` / `int8` | `Int64` | |
| `serial` / `bigserial` | `Int32` / `Int64` | Sequence semantics do not carry over; the value is just an integer. |
| `numeric` / `decimal(p,s)` | `Decimal(P, S)` | Preserve precision and scale. Use only when exact decimals are required (money). |
| `real` / `float4` | `Float32` | |
| `double precision` / `float8` | `Float64` | |
| `boolean` | `Bool` (alias of `UInt8`) | |
| `text` / `varchar` / `char` | `String` | ClickHouse `String` is variable length; length limits are not enforced. |
| low-cardinality `text`/`varchar` (status, country, type) | `LowCardinality(String)` | Use when distinct values < ~10,000. Big compression and speed win. |
| fixed-length code (e.g. 2-char country) | `FixedString(N)` | Only for truly fixed-length data. |
| `timestamp` / `timestamptz` | `DateTime64(6)` (or `DateTime` if seconds suffice) | Match Postgres microsecond precision with scale 6; use 9 only if you truly need nanoseconds. ClickHouse `DateTime` is second precision. |
| `date` | `Date` (or `Date32` for pre-1970 / far-future) | `Date` covers 1970 to 2149. |
| `uuid` | `UUID` | |
| `json` / `jsonb` | `String` or the native `JSON` type | Use `String` when you only store and rarely query inside it; use the native `JSON` type for dynamic schemas you query into. Prefer extracting hot fields into typed columns. |
| `bytea` | `String` | Binary-safe. |
| `inet` / `cidr` | `IPv4` / `IPv6` (or `String`) | Native IP types compress and compare efficiently. |
| `array` (e.g. `int[]`) | `Array(T)` | Native arrays are first class in ClickHouse. |
| `enum` | `Enum8` / `Enum16` or `LowCardinality(String)` | Enum validates a fixed set; LowCardinality is more flexible. |

Cross-cutting type guidance:

- **Right-size integers.** Do not default everything to `Int64`. HTTP status fits `UInt16`, age fits `UInt8`. Smaller types compress better and stay in cache.
- **Avoid `Nullable` unless NULL is semantically meaningful.** `Nullable` keeps a separate marker column and hurts columnar performance. Prefer a `DEFAULT` (`''` for strings, `0` for numbers). Reserve `Nullable` for cases where NULL genuinely differs from a default (for example `deleted_at`, `parent_id`). With PeerDB-based CDC, `PEERDB_NULLABLE` can auto-detect nullable Postgres columns.
- **Codecs for time-series and sequential columns.** Columns that change slowly or monotonically compress dramatically with delta codecs plus `ZSTD`:
  - Timestamps and monotonically increasing ids: `CODEC(DoubleDelta, ZSTD)`.
  - Slowly varying numeric gauges and counters: `CODEC(Delta, ZSTD)`.
  - General large string/blob columns: `CODEC(ZSTD)` (a higher level such as `ZSTD(3)` trades CPU for ratio).

---

## Phase 5: Ingestion and CDC

You usually need two things: a one-time backfill of existing data, then continuous replication of ongoing changes.

### 5.1 ClickPipes (ClickHouse Cloud, recommended)

ClickPipes for Postgres reads the Postgres write-ahead log via logical decoding, performs the **initial snapshot/backfill**, and then **streams ongoing inserts, updates, and deletes**. It maps tables to `ReplacingMergeTree`, sets the Postgres primary key as the ordering key, picks native type mappings, and adds CDC metadata columns (`_peerdb_version`, `_peerdb_is_deleted`). Updates become versioned inserts and deletes become soft-delete rows; you resolve them at read time (Phase 3.1).

### 5.2 PeerDB (open source / self-managed)

PeerDB is the open-source CDC engine underpinning ClickPipes. Use it when you self-host ClickHouse or need to run CDC outside ClickHouse Cloud. Same logical-decoding model and the same `_peerdb_*` semantics.

### 5.3 One-shot or batch loads

For tables that do not need continuous sync, or for a quick proof of concept, load directly:

- The `postgresql` table function to `INSERT INTO ... SELECT` straight from Postgres.
- Export to Parquet/CSV and ingest via `s3`/`url`/file table functions.

Batch inserts should be large (roughly 10,000 to 100,000 rows per insert) so ClickHouse creates fewer, larger parts. For many small inserts, enable async inserts.

### 5.4 CDC prerequisites checklist

- Postgres `wal_level = logical`.
- A replication slot and publication for the tables in scope.
- `REPLICA IDENTITY` set to include every column used in the ClickHouse ordering key (default is just the primary key; extend it if you customized ORDER BY).

---

## Phase 6: Validation and iteration

Migration is not done when data lands; it is done when the heavy queries are correct and fast.

### 6.1 Verify completeness

```sql
-- Row counts should match (use FINAL on ReplacingMergeTree to discount duplicates)
SELECT count() FROM events FINAL;          -- ClickHouse
-- compare against SELECT count(*) FROM events;  -- Postgres

-- Spot-check aggregates that the business cares about
SELECT event_type, count() FROM events FINAL GROUP BY event_type ORDER BY event_type;
```

For CDC tables remember to filter `_peerdb_is_deleted = 0` (or query through the view/row policy) so soft-deleted rows do not inflate counts.

### 6.2 Validate correctness

Run each benchmark query against both systems and diff the results. Reconcile differences from deduplication timing (`FINAL`), type rounding (`Decimal` vs `Float`), or NULL-vs-default handling before trusting performance numbers.

### 6.3 Benchmark and iterate

- Time each heavy query in ClickHouse. Use `EXPLAIN` and Query Insights to confirm the sparse index is pruning (check rows read vs rows in table).
- If a query scans far more than it returns, the ORDER BY does not serve it. Options, in order of preference: add a secondary skipping index, add a projection, add a materialized view rollup, or, if many queries need a different physical sort, create an alternative ordering (a second table or refreshable MV) and route those queries to it.
- Re-check type choices: are repeated strings `LowCardinality`? Are integers right-sized? Did time-series columns get delta + ZSTD codecs?
- Confirm partitioning is healthy (bounded part count) via `system.parts`.

---

## Example: a Postgres table and its ClickHouse equivalent

**Postgres source:**

```sql
CREATE TABLE events (
    id            bigserial PRIMARY KEY,
    user_id       bigint NOT NULL,
    event_type    varchar(50) NOT NULL,
    country       varchar(64),
    properties    jsonb,
    amount        numeric(12,2),
    created_at    timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX ON events (event_type, created_at);
CREATE INDEX ON events (user_id);
```

The existing indexes plus the heavy-query analysis show filtering by `event_type` and time range, grouping by `event_type` and `country`. Recommended ClickHouse table:

```sql
CREATE TABLE events
(
    id          Int64,
    user_id     Int64,
    event_type  LowCardinality(String),
    country     LowCardinality(String) DEFAULT '',
    properties  String,                 -- or JSON if you query into it
    amount      Decimal(12, 2),
    created_at  DateTime64(6) CODEC(DoubleDelta, ZSTD)
)
ENGINE = MergeTree                       -- ReplacingMergeTree(_peerdb_version) if CDC
PARTITION BY toYYYYMM(created_at)
ORDER BY (event_type, created_at, user_id);
```

What changed and why: `bigserial`/`bigint` to right-sized `Int64`; low-cardinality `varchar` to `LowCardinality(String)`; `country` made non-`Nullable` with a default; `jsonb` to `String` (extract hot fields into typed columns if queried); `numeric(12,2)` to exact `Decimal(12,2)`; `timestamptz` to `DateTime64(6)` with a delta + ZSTD codec; the composite filter index became the `ORDER BY` (low to high cardinality), and the table is partitioned by month for lifecycle, not for query speed.

---

## Migration checklist

**Discovery**
- [ ] Table inventory captured (row counts, on-disk size)
- [ ] Per-column cardinality and null fraction measured
- [ ] Top 10 to 25 heavy analytical queries identified (pg_stat_statements / Query Insights)

**Access patterns**
- [ ] Filters, GROUP BYs, sort orders, joins, and time ranges extracted per heavy query
- [ ] Recurring patterns identified as ORDER BY / partition / rollup candidates

**Schema design**
- [ ] Engine chosen per table (MergeTree vs ReplacingMergeTree vs Aggregating/Summing)
- [ ] ORDER BY chosen: top filters first, low-to-high cardinality, short
- [ ] Partition key bounded (usually monthly); no over-partitioning
- [ ] Rollups planned as incremental or refreshable materialized views where aggregation-heavy
- [ ] Join strategy decided (run as-is, dictionary, or denormalize)

**Types**
- [ ] Every column mapped via the type table
- [ ] Integers right-sized; LowCardinality applied to repeated strings (< ~10K distinct)
- [ ] Nullable used only where NULL is semantic; defaults elsewhere
- [ ] Delta / DoubleDelta + ZSTD codecs on time-series and sequential columns

**Ingestion**
- [ ] Backfill path chosen (ClickPipes snapshot, postgresql table function, Parquet/CSV)
- [ ] Ongoing CDC configured (ClickPipes or PeerDB) if needed
- [ ] CDC prerequisites met (wal_level=logical, REPLICA IDENTITY covers ordering key)
- [ ] Deduplication strategy for ReplacingMergeTree decided (FINAL / argMax / view / row policy)

**Validation**
- [ ] Row counts reconcile (with FINAL and _peerdb_is_deleted filtering)
- [ ] Benchmark query results match Postgres
- [ ] Heavy queries benchmarked; ORDER BY confirmed to prune via EXPLAIN / Query Insights
- [ ] Partition health verified in system.parts
- [ ] Iterated on ORDER BY, codecs, and materialized views where a query still scans too much
