Unifying OLTP and OLAP: HTAP databases, zero-ETL, and best-of-breed architectures

Al Brown
Last updated: Mar 29, 2026

Data engineers and architects face a recurring question: should transactional and analytical workloads run in one database, or in two purpose-built systems connected by a data pipeline? After a decade of HTAP database promises, the industry has answered decisively with composable architectures — dedicated OLTP and OLAP engines connected by real-time CDC replication and zero-ETL integration. This guide covers the full picture.

  • OLTP and OLAP have fundamentally different storage, I/O, and concurrency requirements. No single engine serves both without compromise.
  • Every major platform (ClickHouse, Snowflake, Databricks, AWS, Google, Microsoft) has converged on composed architectures with separate OLTP and OLAP engines.
  • CDC has become easier to set up, and replication latency has dropped from hours to seconds, closing the freshness gap that once justified HTAP.
  • ClickHouse's unified data stack pairs PostgreSQL (NVMe-backed, managed) with ClickHouse for analytics, connected by ClickPipes CDC and pg_clickhouse query federation.
  • Starting with a composed architecture from day one avoids a painful migration when you outgrow a single system.

What does unifying OLTP and OLAP mean? #

What is OLTP and what is OLAP? #

OLTP (Online Transaction Processing, sometimes called an OLTP database) handles the reads and writes that power applications. Point lookups, inserts, updates, and deletes against individual rows keep e-commerce carts, user accounts, and payment systems running. OLTP databases use row-oriented storage, deliver sub-millisecond latency, and sustain thousands of simultaneous transactions.

OLAP (Online Analytical Processing) answers questions across large datasets. Aggregations, scans, and joins over millions or billions of rows feed dashboards, reports, and data science workflows. An OLAP database — also known as a columnar database or analytical database — uses columnar storage, returns complex query results in sub-second to seconds, and optimizes for throughput over concurrency.

The two workload types have fundamentally different performance profiles. OLTP optimizes for many small operations completed quickly. OLAP is the opposite: fewer, larger operations where throughput matters more than latency.

Why were OLTP and OLAP separated in the first place? #

Early relational databases handled both workloads in one system. IBM's System R and the first releases of Oracle ran transactions and reports side by side throughout the 1970s and early 1980s.

That changed as datasets grew. Running a monthly sales report on the same database serving live orders caused resource contention that degraded transaction performance. A single analytical query scanning millions of rows could spike CPU and I/O utilization, pushing payment confirmations from milliseconds to seconds.

The industry responded by separating the workloads entirely. Purpose-built data warehouses from Teradata and Netezza emerged in the late 1980s and 1990s. ETL (Extract, Transform, Load) pipelines moved data from production OLTP databases into these analytical systems on nightly or hourly schedules.

This two-system architecture became the standard pattern for three decades. Nearly every enterprise data platform today still follows it, with an OLTP database feeding an OLAP system through some form of data pipeline.

What is an HTAP database (Hybrid Transactional/Analytical Processing)? #

HTAP is a term Gartner coined in 2014 to describe databases that handle both OLTP and OLAP workloads on the same data, in the same engine. An HTAP database promised to eliminate ETL by running real-time analytics directly on transactional data in a single system.

The concept goes by several names depending on which analyst firm is writing the report. Forrester calls it "Translytical." The 451 Group uses "HOAP" (Hybrid Operational/Analytical Processing). The underlying idea is identical.

Multiple architecture patterns compete under the HTAP database umbrella. SAP HANA maintains a primary row store alongside an in-memory column store. TiDB replicates from a distributed row store (TiKV) to a columnar store (TiFlash). SingleStore uses a universal storage format that attempts to serve both access patterns. Each HTAP database makes different tradeoffs between transaction performance, analytical speed, and data freshness.

What does a unified data stack look like in practice? #

Two competing models have emerged for combining OLTP and OLAP capabilities. The first is the converged model, where a single database engine handles both workloads. This is the HTAP approach described above. The second is the composed model, where purpose-built OLTP and OLAP engines integrate tightly through real-time data movement, query federation, or both.

ClickHouse advocates for the composed model. The unified data stack pairs PostgreSQL for OLTP with ClickHouse for OLAP, connected by ClickPipes or PeerDB for CDC (Change Data Capture) and pg_clickhouse for federated queries back to PostgreSQL. Each engine runs the workload it was purpose-built for.

AWS follows a similar composed pattern with Aurora zero-ETL integration to Redshift. Google's BigQuery Omni and AlloyDB with BigQuery integration take the same approach. All of them keep the engines separate and make data movement automatic.

The converged approach struggles because the architectural requirements of OLTP and OLAP pull in opposite directions. Storage format, I/O patterns, memory allocation, concurrency control, and latency targets all differ between the two workloads.

Why OLTP and OLAP are architecturally different #

Why is row-oriented storage better for transactions? #

Row-oriented storage matches the access pattern of transactional workloads. OLTP operations read, write, or update complete rows: fetching a user record, inserting an order, or updating a payment status. All columns of a given row are needed together.

A row store places all columns of a row contiguously on disk. Reading one row requires a single sequential I/O operation. B-tree indexes provide O(log n) lookups on primary keys, so finding a specific row in a table of 100 million records requires traversing only 3 to 4 tree levels. On NVMe storage, that traversal completes in 0.3 to 0.4 milliseconds.
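
The 3-to-4-level figure falls out of simple arithmetic. Here is a quick sketch; the 500- and 200-entry fanouts are assumed typical values for 8KB index pages with small keys, not PostgreSQL constants:

```python
import math

def btree_levels(num_rows: int, fanout: int) -> int:
    """Smallest height h such that fanout**h >= num_rows."""
    return math.ceil(math.log(num_rows, fanout))

# A page holding ~500 key/pointer pairs covers 100M rows in 3 levels;
# a narrower 200-entry fanout still needs only 4.
print(btree_levels(100_000_000, 500))  # 3
print(btree_levels(100_000_000, 200))  # 4
# At ~100 microseconds per page read from NVMe, that is 0.3-0.4ms per lookup.
```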

Row storage also supports efficient in-place updates. Changing a single column value in a row means rewriting just that row's data page, not reorganizing an entire data structure. This property is essential for OLTP workloads where updates are frequent and must commit quickly.

Why is a columnar database better for analytics? #

Analytical queries scan millions or billions of rows but typically read only a few columns. A columnar database stores data by column rather than by row, and this columnar storage layout is built for exactly this access pattern. A query like SELECT SUM(revenue) FROM sales WHERE date > '2025-01-01' touches 2 columns out of a table that might have 50. In a row store, the database reads all 50 columns for every row. In a column store, it reads only the 2 it needs.
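
A back-of-envelope model makes the difference concrete. The row count and the uniform 8 bytes per value are illustrative assumptions, and the figures ignore compression, which widens the gap further:

```python
# I/O estimate for SELECT SUM(revenue) ... WHERE date > ... on a
# 50-column table of 1 billion rows, assuming ~8 bytes per value.
rows = 1_000_000_000
total_cols, cols_needed, bytes_per_val = 50, 2, 8

row_store_bytes = rows * total_cols * bytes_per_val   # reads all 50 columns
col_store_bytes = rows * cols_needed * bytes_per_val  # reads only the 2 it needs

print(row_store_bytes / 1e9)                # 400.0 GB scanned by the row store
print(col_store_bytes / 1e9)                # 16.0 GB scanned by the column store
print(row_store_bytes // col_store_bytes)   # 25x less data read, before compression
```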

Same-type data stored contiguously compresses far better than mixed-type row data. ClickHouse typically achieves 5x to 10x compression compared to row-oriented databases, because a column of integers or a column of dates contains highly repetitive patterns that compression algorithms exploit effectively.

Vectorized execution adds another advantage. A columnar database processes data in batches of thousands of values at once, using CPU SIMD (Single Instruction, Multiple Data) instructions. Instead of evaluating a filter condition row by row, the engine applies it to a vector of 8,192 values in a single CPU operation. ClickHouse's query engine is built entirely around this principle.
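
Pure Python cannot issue SIMD instructions, but the batching idea can be modeled. This toy shows the shape of block-at-a-time execution — one tight pass over a batch of values instead of a per-row dispatch — using ClickHouse's default block size:

```python
BATCH = 8192  # ClickHouse's default block size for vectorized execution

def filter_vectorized(values, threshold):
    """Toy model of batch-at-a-time execution: the predicate runs as one
    tight pass over each batch, the way SIMD applies one instruction to
    many values, rather than being dispatched once per row."""
    out = []
    for start in range(0, len(values), BATCH):
        batch = values[start:start + BATCH]
        out.extend(v for v in batch if v > threshold)
    return out

print(filter_vectorized(list(range(10_000)), 9_995))  # [9996, 9997, 9998, 9999]
```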

What happens when you run analytics on an OLTP database? #

Analytical queries consume large amounts of CPU, memory, and disk I/O for sustained periods, and these resources are shared with the transactional workload. A complex GROUP BY over 100 million rows can pin CPU at 100% and saturate disk I/O for minutes, causing transaction latency to spike from sub-millisecond to seconds or worse.

Long-running analytical queries also create problems for concurrency control. Under MVCC (Multi-Version Concurrency Control), a query that runs for 30 minutes holds a snapshot open for the entire duration. PostgreSQL must retain all row versions created during that window, increasing table bloat and slowing down vacuum operations that reclaim space.

The common workaround is read replicas. Route analytical queries to a replica and keep the primary clean for transactions. This reduces contention but introduces operational complexity (managing replication lag, provisioning replica capacity) and still runs analytics on row-oriented storage. A query that takes 3 seconds on a columnar engine like ClickHouse might take 10 minutes on a PostgreSQL read replica scanning the same data in row format.

What happens when you run transactions on an OLAP database? #

Reading a single row from a columnar database means fetching data from every column file independently and assembling the result. What completes in one I/O operation on a row store requires dozens of I/O operations on a column store, one per column. Point lookups are inherently expensive.

Updates work differently in columnar systems. ClickHouse supports standard SQL UPDATE statements for modifying individual rows, and ReplacingMergeTree provides an efficient pattern for high-volume upsert workflows by inserting new versions of rows and deduplicating during background merges. The FINAL keyword in SELECT queries ensures immediate consistency at query time regardless of merge state. These capabilities serve analytical and CDC workloads well, but they are not a substitute for the row-level locking and per-transaction ACID semantics that OLTP applications depend on.

Columnar systems optimize for throughput by batching operations. Inserting rows one at a time into ClickHouse or Snowflake is orders of magnitude less efficient than inserting them in batches of thousands. OLTP workloads generate individual inserts and updates continuously, which conflicts directly with the batch-oriented design of columnar engines.
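
A simple cost model shows why batching dominates. The overhead figures below are hypothetical round numbers chosen to illustrate the amortization, not measured ClickHouse or Snowflake latencies:

```python
# Assumed fixed cost per insert statement (part creation, network round
# trip, commit bookkeeping) and variable cost per row.
FIXED_OVERHEAD_S = 0.010   # 10ms per insert, regardless of size
PER_ROW_S = 0.000001       # 1 microsecond of processing per row

def rows_per_second(batch_size: int) -> float:
    time_per_batch = FIXED_OVERHEAD_S + batch_size * PER_ROW_S
    return batch_size / time_per_batch

print(round(rows_per_second(1)))       # ~100 rows/s inserted one at a time
print(round(rows_per_second(10_000)))  # 500,000 rows/s in 10k-row batches
```

The fixed per-statement cost is paid once per batch, so throughput scales by orders of magnitude until the per-row cost dominates.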

| Characteristic | OLTP | OLAP |
| --- | --- | --- |
| Primary operation | Point reads, inserts, updates, deletes | Scans, aggregations, joins |
| Storage format | Row-oriented | Column-oriented |
| Latency target | Sub-millisecond | Sub-second to seconds |
| Concurrency | Thousands of small transactions | Tens to hundreds of large queries |
| Data volume per query | Single row or small range | Millions to billions of rows |
| Index type | B-tree for point lookups | Sparse indexes, data skipping |
| Compression ratio | 1-2x | 5-10x+ |
| Example systems | PostgreSQL, MySQL, Oracle | ClickHouse, Snowflake, BigQuery |

The storage problem: why OLTP needs NVMe and OLAP can use object storage #

Why does OLTP need local NVMe storage? #

OLTP workloads consist of small, random reads and writes scattered unpredictably across the dataset. B-tree index traversals, WAL (Write-Ahead Log) fsyncs, and buffer pool page flushes all generate random I/O patterns that demand low-latency storage.

NVMe SSDs physically attached to the compute node deliver approximately 100 microsecond random read latency and roughly 20 microsecond random write latency. This physical proximity is non-negotiable for OLTP performance.

WAL durability sits on the critical path of every transaction. Before a database can acknowledge a commit to the application, it must fsync the WAL record to durable storage. On local NVMe, this takes microseconds. Every production OLTP database, from PostgreSQL to MySQL to Oracle, is architected around this guarantee of fast, durable local writes.
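
The commit path can be sketched in a few lines. This is a minimal illustration of the fsync-before-acknowledge rule, not PostgreSQL's actual WAL writer:

```python
import os
import tempfile

def commit(wal_file, record: bytes) -> None:
    """Append a WAL record and make it durable before acknowledging."""
    wal_file.write(record)        # append to the write-ahead log
    wal_file.flush()              # drain the userspace buffer to the OS
    os.fsync(wal_file.fileno())   # block until the bytes reach durable media;
                                  # only now may the commit be acknowledged

with tempfile.NamedTemporaryFile() as wal:
    commit(wal, b"BEGIN; UPDATE accounts ...; COMMIT;\n")
# On local NVMe this fsync costs microseconds. Buffered through S3 object
# writes it would cost 10-50ms, on every single commit.
```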

A well-tuned PostgreSQL instance on NVMe serves simple key-value lookups at under 1ms p99 and sustains hundreds of thousands of transactions per second on a single node. That performance depends entirely on the storage layer sitting physically next to the CPU.

What is the latency difference between NVMe and S3 for database I/O? #

The gap between local NVMe and object storage is not a small percentage difference. It spans two to three orders of magnitude for the operations that define OLTP workloads.

| Operation | NVMe SSD (local) | S3 GET (first byte) | S3 PUT |
| --- | --- | --- | --- |
| Random 4KB read | ~0.1ms (100μs) | 20-100ms | N/A |
| Sequential 4KB read | ~0.05ms | 20-100ms | N/A |
| Random 4KB write | ~0.02ms | N/A | 10-50ms |
| WAL fsync | microseconds | N/A (not supported) | 10-50ms per object write |
| B-tree traversal (3-4 levels) | 0.3-0.4ms | 60-400ms | N/A |
| p99 tail latency | under 1ms | 200-500ms | variable |

That is a 200x to 1,000x gap for the I/O patterns OLTP depends on.

S3 is designed for large sequential reads and writes of multi-megabyte objects. OLTP requires small random I/O against 4KB or 8KB pages. These are fundamentally incompatible access patterns.

S3 also does not support append operations. Databases continuously append small records to the WAL, but S3 requires writing complete objects, which means buffering WAL records in memory and flushing them as discrete objects. This introduces both latency and a durability gap during the buffer window.

S3 tail latency compounds the problem. The p99 for S3 GET requests can spike to 200 to 500 milliseconds due to internal retries and throttling. OLTP applications care deeply about tail latency. A payment processing system with a 50ms p99 SLA cannot absorb 500ms storage spikes.
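
The 200x-to-1,000x figure follows directly from the latency numbers above:

```python
# Typical figures: ~100 microsecond NVMe random read vs 20-100ms
# S3 GET first-byte latency, both expressed in microseconds.
nvme_read_us = 100
s3_first_byte_us = (20_000, 100_000)

low = s3_first_byte_us[0] / nvme_read_us    # 200.0
high = s3_first_byte_us[1] / nvme_read_us   # 1000.0
print(f"{low:.0f}x to {high:.0f}x slower")  # 200x to 1000x slower
```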

Why can't caching close the gap for OLTP on object storage? #

Analytical databases like ClickHouse, Snowflake, and Databricks cache frequently accessed data. For read-heavy analytical workloads with good temporal locality (recent data accessed repeatedly), this caching works well. It does not solve the OLTP problem.

Writes still require durable storage. A database cannot acknowledge a transaction commit based solely on a cache write, because a node failure would lose that data. The write path must reach durable storage, and if that storage is S3, the latency floor is 10 to 50 milliseconds per write. No amount of read caching changes the write path.

Cache misses are catastrophic for OLTP latency targets. A single cache miss that falls through to S3 adds 20 to 100 milliseconds, which destroys any sub-millisecond p99 SLA. OLTP write patterns (random updates to scattered rows across the dataset) exhibit poor cache locality. Unlike OLAP workloads where queries concentrate on recent partitions, OLTP touches data unpredictably across the entire working set.
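
Even very high hit rates cannot rescue the tail. A quick model, using an assumed 0.1ms cache hit and 50ms S3 fallthrough (illustrative numbers, not a benchmark):

```python
def mean_latency_ms(hit_rate: float, hit_ms: float = 0.1, miss_ms: float = 50.0) -> float:
    """Expected read latency given a cache hit rate."""
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

for rate in (0.99, 0.999):
    print(f"hit rate {rate:.1%}: mean {mean_latency_ms(rate):.4f} ms")
# Even at 99.9% hits the mean is ~0.15ms, and the tail is worse: one miss
# per thousand requests pins the p99.9 at a full S3 round trip (~50ms),
# so no sub-millisecond p99.9 SLA survives.
```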

Cold starts after scale-down events or failovers mean empty caches and full S3 latency for every operation. In an OLTP context, that translates to minutes of degraded performance affecting live users and transactions.

What does write amplification look like in dual-format storage? #

HTAP systems that maintain both row and columnar representations of the same data pay a write amplification tax. Every transaction writes data to the row store for OLTP and then again (synchronously or asynchronously) to the columnar store for OLAP. This at minimum doubles write I/O.

The doubled I/O adds CPU overhead for format conversion between row and columnar layouts. On write-heavy workloads, the conversion overhead alone can consume 15% to 30% of available CPU cycles.

Synchronization timing forces a tradeoff with no good middle ground. Immediate synchronization keeps the columnar store fresh for analytics but degrades transaction throughput because every commit must wait for the columnar write. Delayed synchronization preserves OLTP performance but makes analytics stale, which undermines the primary selling point of HTAP.

A 2024 VLDB survey on HTAP systems identifies this write amplification and synchronization tension as a fundamental architectural constraint. No production HTAP system has eliminated it. The tradeoff can be managed but not avoided.

Resource contention and isolation: the HTAP database tradeoff #

What is resource contention when OLTP and OLAP share an engine? #

An analytical query scanning billions of rows consumes CPU, memory, disk I/O, and network bandwidth for sustained periods, often minutes at a time. These are finite physical resources.

When both workloads run on the same engine, the database scheduler faces a zero-sum allocation problem. Memory assigned to hash joins for an OLAP query is memory the OLTP buffer pool cannot use. The same goes for CPU: vectorized aggregation starves transaction threads. And a full table scan on disk delays every B-tree index read waiting behind it.

The practical result is OLTP latency spikes during analytical workloads. A dashboard refresh that triggers a complex aggregation across a 500 million row table can cause payment transactions to time out. For applications with strict latency SLAs, this is unacceptable.

No scheduling algorithm can give a CPU core to two workloads at the same instant. A core executing an OLAP filter is unavailable for OLTP transaction processing at that moment.

How do HTAP databases balance workload isolation and data freshness? #

The common approaches each trade away a different part of the HTAP promise.

Resource partitioning dedicates specific CPU cores and memory regions to each workload type. SAP HANA uses this approach. It reduces contention but wastes resources when one workload is idle. If the OLTP partition gets 8 cores and OLAP gets 8 cores, a burst of analytical queries cannot borrow from idle OLTP capacity. Peak performance for each workload is capped at half of what dedicated hardware could provide.

Replica-based isolation routes analytics to a separate copy of the data. TiDB's TiFlash columnar store receives data through Raft replication from the TiKV row store. This genuinely isolates the workloads, but it introduces replication lag. If the system is replicating data to a separate store anyway, the architectural difference between TiFlash and replicating to a standalone ClickHouse instance narrows considerably.

Scheduling-based isolation prioritizes OLTP queries and throttles OLAP when resources are scarce. This protects transaction latency but gives analytical queries unpredictable, often poor performance. An analyst waiting 45 seconds for a query that would complete in 2 seconds on a dedicated OLAP engine will not consider the system "unified" in any meaningful sense.

Jack Vanlightly's analysis of HTAP storage architectures covers these tradeoffs in detail.

Can a single database engine truly serve both workloads without compromise? #

No. A decade of HTAP development and production deployments has produced consistent evidence that both workloads cannot run at full capability on shared infrastructure simultaneously.

Every successful deployment that claims unified OLTP and OLAP either accepts degraded performance on one workload (usually OLAP), introduces physical separation through replication (which negates the single-engine premise), or restricts the scale of one workload to keep the other healthy. These constraints reflect the physical reality that a single set of hardware resources cannot simultaneously optimize for two opposing access patterns.

The market has validated this conclusion through acquisitions. Snowflake acquired Crunchy Data (a PostgreSQL company) rather than building OLTP into its columnar engine. Databricks acquired Neon (a serverless PostgreSQL company) rather than adding transactional capabilities to its Spark-based platform. Microsoft deprecated Azure Synapse Link for Cosmos DB in favor of Fabric Mirroring, moving from a tightly coupled integration to a decoupled replication architecture.

Composed architectures with dedicated engines for each workload, connected by real-time CDC replication pipelines, outperform converged HTAP database engines. The remaining challenge is making the integration between those engines simple and automatic enough that operators experience it as a single system.

How the major platforms approach OLTP #

The analytical platform vendors and the hyperscalers took different routes. Snowflake and Databricks each acquired a PostgreSQL company in mid-2025. AWS, Google, and Microsoft built bridges between their existing OLTP and OLAP services. A few vendors attempted a converged engine.

Why did both Snowflake and Databricks acquire PostgreSQL companies? #

Both companies independently concluded that extending a columnar analytical engine into transactional territory does not work. Within weeks of each other in mid-2025, Databricks acquired Neon (May) and Snowflake acquired Crunchy Data (June), each buying a PostgreSQL-native company rather than continuing to build OLTP capabilities into their analytical engines.

Snowflake had already tried the converged approach. Snowflake Unistore hybrid tables reached general availability in November 2024, adding a row-based storage layer alongside Snowflake's columnar engine. Throughput throttled to roughly 1,000 operations per second. Active data capped at 500 GB. It lacked materialized views, cloning, streams, and cross-region replication. Nobody would trust this for production transactional workloads.

Rather than doubling down on Snowflake Unistore, both Snowflake and Databricks did the pragmatic thing. They acquired battle-tested PostgreSQL providers with years of production hardening. These two analytical data platforms both concluded that the HTAP approach of extending OLAP into OLTP doesn't deliver, and opted for the composed approach of adding a separate, purpose-built OLTP engine.

What is Databricks Lakebase and what is the S3-backed Postgres tradeoff? #

Databricks Lakebase, the company's managed online transaction processing service generally available on AWS in early 2026, is built on Neon's serverless PostgreSQL technology. Its architecture separates compute and storage with S3 as the "ultimate source of truth." The system comprises stateless Postgres compute nodes, Safekeepers that replicate the write-ahead log, Pageservers that serve data pages from object storage, and a control plane that orchestrates everything.

Object-storage-backed PostgreSQL introduces latency that traditional OLTP workloads don't tolerate well. ClickHouse's analysis of managed Postgres architectures noted that "object storage works well for analytics, but its higher latency and coarse-grained I/O make it challenging to support transactional patterns." The Neon team's own benchmarks confirm that their disaggregated Postgres requires a sophisticated page server caching tier to achieve acceptable latency, and still cannot match vanilla PostgreSQL on local NVMe for latency-sensitive workloads.

How do the major platforms compare? #

| Platform | OLTP Engine | OLAP Engine | Integration Method | Storage Layer | Key Limitation |
| --- | --- | --- | --- | --- | --- |
| Snowflake | Snowflake Postgres (Crunchy) | Snowflake columnar | pg_lake / Iceberg zero-ETL | Mixed: object storage and disks | Postgres uses local disks but not NVMe |
| Databricks | Lakebase (Neon) | Spark / Photon / Delta Lake | pg_mooncake / Delta sync | Object storage (both) | S3-backed Postgres means high latency |
| AWS | Aurora PostgreSQL | Redshift | Zero-ETL | EBS (Aurora) / S3 (Redshift) | AWS-only; both systems have struggled to maintain modern performance |
| Google | AlloyDB | BigQuery | Federated queries | Persistent disks / GCS | Google Cloud-only; columnar acceleration is in-memory only |
| Microsoft | Cosmos DB | Fabric / Synapse | Fabric Mirroring | Cosmos storage / OneLake | Azure-only; requires adoption of Fabric |
| ClickHouse | Managed Postgres | ClickHouse | ClickPipes CDC + pg_clickhouse | NVMe (Postgres) / object storage plus SSD hot cache (ClickHouse) | Uses NVMe for Postgres, not object storage, to preserve peak OLTP performance |

How cloud providers bridge OLTP and OLAP #

The hyperscalers took a different path than the analytical data platforms. Instead of acquiring PostgreSQL companies, AWS, Google, and Microsoft each built integration layers between their existing transactional and analytical database services. Their approaches vary in ambition, from simple replication bridges to transparent dual-format engines.

How does Amazon Aurora zero-ETL work with Redshift? #

Aurora zero-ETL replicates data from Aurora PostgreSQL or Aurora MySQL to Amazon Redshift within seconds of transactional writes. Generally available for Aurora PostgreSQL since 2025, it supports up to five integrations from the same Aurora cluster.

This is architecturally the purest composed model among the cloud providers. Aurora remains a dedicated OLTP system backed by EBS storage with row-oriented data. Redshift remains a dedicated OLAP system backed by S3 with columnar storage. A managed sync layer handles CDC, schema mapping, and conversion from Aurora's row-oriented format to Redshift's columnar format automatically, although schema changes on the Aurora side require manual reconfiguration of the integration.

The limitation is vendor lock-in. Both Aurora and Redshift are AWS-only services, and zero-ETL does not work with Postgres instances outside of Aurora. AWS also added zero-ETL from Aurora to SageMaker Lakehouse in October 2025, syncing data in Apache Iceberg format for machine learning workloads.

How does Google AlloyDB accelerate analytics on PostgreSQL? #

AlloyDB is a PostgreSQL-compatible managed database with a transparent in-memory columnar engine. The columnar engine is bounded by available RAM, which caps the volume of data that benefits from columnar acceleration. It also runs exclusively on Google Cloud.

Within those limits, AlloyDB is the most HTAP-like approach among the cloud providers. The columnar engine operates transparently alongside row-based transactional storage with no schema changes and no special query syntax. Google claims up to 100x faster analytical queries compared to standard PostgreSQL. For datasets that fit in the columnar cache and teams already committed to Google Cloud, it reduces the need for a separate analytical system. For larger datasets or multi-cloud requirements, a dedicated OLAP engine remains the better path.

How did Microsoft replace Synapse Link with Fabric Mirroring? #

Azure Synapse Link for Cosmos DB was originally positioned as cloud-native HTAP, maintaining a fully isolated columnar analytical store alongside Cosmos DB's transactional row store within a single system boundary. Microsoft deprecated it for new projects.

The replacement is Azure Cosmos DB Mirroring for Fabric, which replicates data into Fabric OneLake with zero performance impact on transactional workloads and no Request Unit consumption. Data lands in OneLake in Delta Parquet format, ready for analytical queries through Fabric's compute engines.

Microsoft moved from a converged model to a decoupled model with continuous replication. Fabric Mirroring is Azure-only and tightly coupled to the Microsoft ecosystem, which limits its appeal for multi-cloud teams.

Has the HTAP database lived up to the promise? #

Every vendor discussed in this article started from the same premise: organizations need real-time analytics on transactional data. The question was never whether the need exists, but how to satisfy it. After a decade of attempts, the answer is composable architectures, not the converged single-engine systems analysts originally predicted.

What did Gartner predict for the HTAP database and what actually happened? #

In 2014, Gartner coined the term HTAP and predicted it would "foster opportunities for dramatic business innovation" by enabling real-time analytics directly on transactional data stores. The vision was a single database engine that eliminated the latency and complexity of moving data between systems.

The term has been rebranded since: AWS calls its approach "zero-ETL integration," and other vendors use similar euphemisms. The rebranding signals that the original HTAP framing (one engine for both) failed to match what the industry actually built. In practice, every "zero-ETL" implementation is still two separate engines with a managed sync layer between them.

The dominant architecture in 2026 remains separate OLTP and OLAP systems connected by data pipelines. Integration latency has dropped from hours to seconds. Managed services have reduced operational burden. But the fundamental separation persists. No single-engine HTAP database has captured significant market share against the combination of a dedicated transactional database and a dedicated columnar database for analytics.

Why has the industry moved toward composable architectures instead of converged systems? #

Zhou Sun, co-founder of Mooncake Labs (acquired by Databricks), published "HTAP is Dead" in May 2025. Sun argued that practical challenges of resource contention, system complexity, and hardware constraints make dedicated systems more viable than converged ones. The resource contention problems outlined in the earlier sections of this article are the primary reason.

As Sun put it, "It's still HTAP; but through composition instead of consolidation of databases." The need for unified analytics on transactional data has not diminished. The architectural approach has shifted from one engine doing everything to connected engines doing what each does best.

Snowflake and Databricks acquired PostgreSQL companies instead of improving their built-in OLTP features. AWS built zero-ETL bridges between Aurora and Redshift. Microsoft deprecated Synapse Link for Fabric Mirroring. Google's AlloyDB, the closest thing to production HTAP among the hyperscalers, remains bounded by in-memory columnar capacity. Every major platform arrived at the same conclusion independently.

What does "composable HTAP" mean? #

Composable HTAP achieves what the original HTAP vision promised, specifically real-time analytics on transactional data with simplified operations, through composition rather than consolidation. Separate specialized engines handle OLTP and OLAP independently, connected by real-time data pipelines that keep them synchronized.

The canonical architecture is an OLTP system (typically PostgreSQL) feeding change data capture into a streaming layer, which delivers changes to an OLAP engine (ClickHouse) within seconds. Query federation layers like the pg_clickhouse foreign data wrapper let applications query both systems through a single PostgreSQL connection, routing analytical queries to the columnar engine transparently.

What made this viable in 2026 when it was impractical in 2014? CDC latency dropped from hours to sub-second. Managed services like ClickPipes, Aurora zero-ETL, and Fabric Mirroring eliminated most of the operational overhead of maintaining replication pipelines. Query federation matured to the point where applications don't need to know which engine answers which query.

The remaining hard problem is integration complexity at the edges: schema evolution across two systems, failure recovery and resync after outages, and consistency guarantees during the replication window. The platforms that solve these problems well (with managed CDC, failover-safe replication slots, and automatic schema handling) make composed architectures viable for teams that previously lacked the engineering bandwidth to maintain two systems.

The best-of-breed alternative: dedicated OLTP + dedicated OLAP #

What is the best-of-breed approach to unifying OLTP and OLAP? #

The best-of-breed approach pairs two engines that are each purpose-built for their workload and solves the integration problem between them. PostgreSQL handles transactions as the OLTP database. A columnar database like ClickHouse handles analytics as the real-time analytics database. A Postgres CDC pipeline keeps them in sync.

This architecture has existed for years. What changed is that the integration gaps that made it painful have closed. Modern change data capture delivers single-digit-second latency. Foreign data wrappers push analytical queries transparently from one engine to the other. Managed cloud services handle the operational overhead of running two systems.

What is ClickHouse's unified data stack? #

ClickHouse's unified data stack is PostgreSQL for OLTP paired with ClickHouse for OLAP, connected by three integration components:

  1. ClickPipes with PeerDB for real-time CDC from Postgres to ClickHouse
  2. pg_clickhouse extension for transparent analytical query pushdown from Postgres to ClickHouse
  3. Managed Postgres in ClickHouse Cloud for a single-vendor operational experience

Applications talk to Postgres for transactional operations. CDC continuously replicates changes to ClickHouse. Analytical queries either go directly to ClickHouse or route through pg_clickhouse from the Postgres connection.

Both databases are open source. The integration components (PeerDB, pg_clickhouse) are also open source. A reference architecture is available at github.com/ClickHouse/postgres-clickhouse-stack.

How does managed Postgres in ClickHouse Cloud work? #

ClickHouse Cloud's managed PostgreSQL service reached GA in January 2026.

The defining architectural choice is NVMe-backed storage, not EBS or object storage. This delivers up to 10x faster performance for disk-bound workloads compared to EBS-backed Postgres providers. Cross-AZ high availability supports up to two synchronous standbys with quorum-based replication, automatic backups with PITR via WAL-G, and dedicated isolated VPCs per cluster with PrivateLink support.

More than 50 configurations are available on AWS, ranging from 2 vCPUs/8 GB RAM up to 96 vCPUs/768 GB RAM with 60 TB of storage.

Failover-safe replication slots preserve CDC replication slots across HA failovers and scaling operations. Other Postgres providers lose replication slots during failover, which triggers costly full resyncs of the entire dataset.

Why does ClickHouse use NVMe storage for its managed Postgres instead of S3? #

OLTP workloads need local, low-latency, durable storage. NVMe delivers roughly 100-microsecond random reads. S3 delivers 20-100 ms. That is a 200x to 1,000x gap for the I/O patterns that matter most to transactional databases.

Databricks' Lakebase and Snowflake Postgres run on object storage architectures. ClickHouse's position is explicit: object-storage-backed Postgres is not suited for OLTP where consistent, predictable latency matters.

PostgreSQL was designed for local storage. Its buffer pool, WAL, and checkpoint mechanisms all assume fast, durable, locally-attached disks. Running Postgres on S3 requires additional infrastructure layers (page servers, safekeepers) to bridge the gap between Postgres's expectations and object storage's capabilities. These layers add latency, operational complexity, and failure modes that don't exist when storage is physically attached to the compute node. ClickHouse itself uses tiered storage with both NVMe and object storage, placing hot data on NVMe for fast queries and cold data on S3 for cost efficiency.

Change data capture as the integration layer #

What is Change Data Capture and how does it bridge OLTP and OLAP? #

Change data capture (CDC) captures row-level changes from an OLTP database and streams them to a downstream system in near-real-time. Inserts, updates, and deletes are all captured. CDC has become the standard mechanism for bridging transactional and analytical systems.

For Postgres CDC, the typical approach uses the logical replication protocol. The database publishes WAL (write-ahead log) entries for each change. A CDC consumer reads these entries and writes them to the target system.

This bridges OLTP and OLAP by replicating transactional data into a real-time analytics database continuously. Production isn't impacted by heavy SELECT queries from analytical users. Batch ETL delays disappear. The latency floor depends on the CDC tool and configuration, but modern tools achieve 10-second latency for most workloads.
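As a sketch of what the Postgres side of such a pipeline involves (the table, publication, and slot names here are illustrative, and managed services like ClickPipes typically create the publication and slot for you):

```sql
-- Logical replication must be enabled on the source (requires a restart)
ALTER SYSTEM SET wal_level = 'logical';

-- Publish changes for the tables the CDC consumer should replicate
CREATE PUBLICATION cdc_pub FOR TABLE orders, events;

-- The consumer reads from a logical replication slot, which tracks
-- how far into the WAL it has consumed
SELECT pg_create_logical_replication_slot('cdc_slot', 'pgoutput');
```

The replication slot is what lets the consumer resume from exactly where it left off after a restart, rather than re-reading or missing changes.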

How does Postgres CDC with ClickPipes and PeerDB replicate to ClickHouse? #

ClickPipes is ClickHouse Cloud's managed ingestion service. For Postgres CDC, it uses PeerDB, which ClickHouse acquired in 2024.

PeerDB connects to Postgres using the logical replication protocol and continuously streams changes to ClickHouse. Target tables use ReplacingMergeTree with two metadata columns: _peerdb_version (incrementing version for deduplication) and _peerdb_is_deleted (soft-delete flag for DELETE operations).
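A hypothetical target table following this pattern might look like the sketch below; the actual DDL is generated by PeerDB and may differ in details, but the version column and soft-delete flag work as described above:

```sql
-- Illustrative shape of a PeerDB CDC target table in ClickHouse
CREATE TABLE orders
(
    id UInt64,
    status String,
    updated_at DateTime64(6),
    _peerdb_version Int64,      -- incremented per change; used for deduplication
    _peerdb_is_deleted Int8     -- set to 1 when the source row was DELETEd
)
ENGINE = ReplacingMergeTree(_peerdb_version)
ORDER BY id;

-- FINAL deduplicates at query time, keeping the latest version of each row
SELECT count() FROM orders FINAL WHERE _peerdb_is_deleted = 0;
```

ReplacingMergeTree collapses duplicate rows by sort key during background merges, keeping the row with the highest `_peerdb_version`; `FINAL` applies the same collapsing at query time so reads are consistent regardless of merge state.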

PeerDB parallelizes the initial snapshot using CTID-based partitioning. In benchmarks, 1 TB migrated in approximately 2 hours with 8 threads, compared to 17 hours with pg_dump/pg_restore or 8.5 hours with native logical replication.

Schema evolution is handled automatically for ADD COLUMN and DROP COLUMN changes. Fine-grained table and column exclusion supports PII filtering and bandwidth optimization. Connectivity options include AWS PrivateLink and SSH tunneling.

At scale, the connector handles hundreds of production workloads pushing over 100 TB per month. Customer Ashby reported that "reports that previously took minutes now complete in under a second" after moving analytical queries from Postgres read replicas to ClickHouse via CDC. Syntage migrated a 30 TB Aurora database using the same pipeline.

What replication latency can you expect from modern CDC? #

ClickPipes Postgres CDC currently has a replication latency floor of about 10 seconds.

For the managed Postgres service, ClickHouse is building "CDC v2" targeting sub-second replication. Tight integration between managed Postgres and ClickHouse enables optimizations that aren't possible with external Postgres instances.

10-second latency is functionally indistinguishable from real-time for most analytical workloads. No human refreshes a dashboard faster than every 10 seconds. Where sub-second freshness genuinely matters (real-time fraud detection, live operational dashboards, AI feature serving), CDC v2 targets those use cases directly.

How does CDC compare to zero-ETL integration? #

"Zero-ETL" is the marketing term AWS (Aurora to Redshift), Snowflake, and others use for managed replication from a transactional source to an analytical target. Architecturally, zero-ETL is CDC: changes are captured from a source database and replicated to an analytical system.

ClickPipes CDC occupies a middle ground: managed and integrated like zero-ETL, but compatible with any Postgres source. Users bring their own Postgres from AWS RDS, Aurora, Supabase, Neon, or self-hosted instances.

| Feature | CDC (ClickPipes/PeerDB) | Zero-ETL (AWS) | Zero-ETL (Snowflake pg_lake) |
| --- | --- | --- | --- |
| Source | Any PostgreSQL | Aurora only | Snowflake Postgres only |
| Target | ClickHouse | Redshift | Snowflake |
| Typical latency | ~10 seconds | "Within seconds" | Near-real-time |
| Initial load | Parallel CTID-based (~2 h/TB) | Managed | Managed |
| Schema evolution | ADD/DROP COLUMN auto | Limited | Via Iceberg schema |
| Cloud lock-in | None (open source) | AWS | Snowflake |
| Self-hosted option | Yes (PeerDB open source) | No | No |

Querying across OLTP and OLAP without rewriting applications #

How does pg_clickhouse push analytical queries from Postgres to ClickHouse? #

pg_clickhouse is a PostgreSQL Foreign Data Wrapper that creates foreign tables pointing to ClickHouse tables. Applications query Postgres normally. When a query touches a foreign table and involves analytical operations, pg_clickhouse rewrites the query and pushes execution to ClickHouse, returning only the results.

pg_clickhouse delegates execution entirely to ClickHouse. Postgres acts only as a routing and result layer. Unlike extensions that embed columnar engines inside Postgres (pg_duckdb, pg_analytics), pg_clickhouse consumes no Postgres CPU or memory for the analytical computation.

Analytical queries don't compete with OLTP for resources inside the Postgres process. The workload isolation that HTAP systems struggle to achieve comes naturally because the workloads run on separate engines. pg_clickhouse comes pre-installed in ClickHouse Cloud's managed Postgres service.

What queries can pg_clickhouse push down today? #

pg_clickhouse pushes down aggregate functions (COUNT, SUM, AVG, COUNT DISTINCT, MIN, MAX), GROUP BY, ORDER BY, HAVING, and WHERE clauses including LIKE and regex patterns. Date operations like EXTRACT, DATE_TRUNC, and date comparisons are supported. CASE WHEN expressions, FILTERs, JOINs, and SEMI-JOINs all push down.

14 of 22 TPC-H queries fully push down to ClickHouse, delivering 60x+ improvement over standard Postgres. Subquery and CTE pushdown is in development.

On ClickBench, pg_clickhouse is the fastest Postgres analytics extension, outperforming pg_duckdb and pg_analytics while running the computation entirely outside Postgres. Per-query overhead is small. It consists of query rewriting, one network round-trip to ClickHouse, and result conversion to Postgres format.

-- Create a foreign table pointing to ClickHouse
CREATE FOREIGN TABLE analytics_events (
    event_id UInt64,
    user_id UInt64,
    event_type TEXT,
    timestamp TIMESTAMP,
    properties JSONB
) SERVER clickhouse_server
OPTIONS (table_name 'events');

-- Query from your Postgres application — pushes down to ClickHouse
SELECT
    date_trunc('hour', timestamp) AS hour,
    event_type,
    count(*) AS event_count
FROM analytics_events
WHERE timestamp > now() - interval '24 hours'
GROUP BY hour, event_type
ORDER BY hour DESC;

How does this compare to embedding an analytics engine inside Postgres? #

Extensions like pg_duckdb and pg_analytics embed a columnar query engine inside the Postgres process. They share CPU, memory, and I/O with the OLTP workload running in the same process. This is the same resource contention problem that plagues HTAP databases, just at a smaller scale.

A heavy analytical query running inside Postgres can cause OLTP latency spikes. pg_clickhouse avoids this entirely. The analytical computation happens on ClickHouse's cluster, which scales independently. Postgres handles only the transactional workload it was designed for.

pg_clickhouse does require a running ClickHouse instance. Embedded extensions are simpler operationally for small-scale use cases. For production workloads where both OLTP stability and analytical performance matter, the separated approach wins.

How to evaluate whether you need an HTAP database or a best-of-breed stack #

How fresh does your analytical data need to be? #

Freshness is often cited as the reason to choose an HTAP database over separate systems, but modern Postgres CDC closes the gap. With sub-10-second CDC replication, separate engines achieve the same freshness that HTAP promised, without the architectural compromises.

What are the operational complexity differences? #

Managing one system that does both OLTP and OLAP sounds simpler. In practice, it gives you a complex system where two very different workloads compete for resources, each with their own performance characteristics and failure modes. Because the workloads are colocated, those complexities compound. An analytical query that causes OLTP latency spikes is harder to diagnose and fix when both workloads share the same engine, buffer pool, and I/O subsystem.

With separate systems, each workload is isolated. You can profile, understand, and optimize OLTP and OLAP independently. Each system is built for its workload and keeps its respective complexities scoped. While you do manage two deployments, the operational experience ends up being less complex than managing one system that tries to handle both.

Managed services reduce the overhead further. ClickHouse Cloud with managed Postgres, ClickPipes CDC, and pg_clickhouse means the integration is handled as a service. For most teams, the two-system approach with managed services is operationally simpler than a single HTAP system at any non-trivial scale.

When is a converged HTAP system the right choice? #

At small data volumes (under 100 GB) with light analytical queries (simple aggregations, not multi-way JOINs over billions of rows), a single database like PostgreSQL can handle both workloads without noticeable performance degradation. Prototypes and early-stage products benefit from this simplicity.

But if you expect to scale, it makes more sense to start with an architecture that grows with you. HTAP systems don't scale well because the resource contention between OLTP and OLAP gets worse as data volumes and query concurrency increase. Combining Postgres for OLTP with ClickHouse for OLAP has become a common pattern, and managed services have simplified the setup to the point where it is appropriate from day one. Starting with a composed architecture avoids the painful migration later when you outgrow HTAP.

How do you migrate from a monolithic database to a split OLTP/OLAP architecture? #

Migration is incremental and reversible. Follow these steps:

  1. Set up CDC from your existing Postgres (or MySQL) to ClickHouse using ClickPipes. This runs alongside your current setup with no impact on the source database.
  2. Validate that replicated data in ClickHouse matches the source by running comparison queries on both systems.
  3. Identify your heaviest analytical queries, the ones causing read replica lag or production database load. Migrate these to ClickHouse first.
  4. Install pg_clickhouse if you want to route queries transparently from the Postgres connection.
  5. Monitor CDC lag, compare query results between systems, and measure the performance improvement on migrated queries.
  6. Gradually shift remaining analytical workloads to ClickHouse. Keep Postgres focused on OLTP.
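For the validation steps, a simple starting point is comparing row counts over a window that excludes the replication lag. The table and column names below are illustrative, and the ClickHouse side assumes the PeerDB target layout with its version and soft-delete metadata columns:

```sql
-- On Postgres (the source of truth):
SELECT count(*) FROM orders
WHERE created_at < now() - interval '1 minute';

-- On ClickHouse: FINAL deduplicates versions, and soft-deleted
-- rows are excluded so counts are comparable
SELECT count() FROM orders FINAL
WHERE _peerdb_is_deleted = 0
  AND created_at < now() - INTERVAL 1 MINUTE;
```

The one-minute cutoff keeps in-flight changes out of the comparison; any remaining discrepancy points at a real replication issue rather than lag.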

CDC replication can run indefinitely alongside the existing setup while the team validates each migrated query. No cutover date needed.

| Requirement | Key metric | Recommendation |
| --- | --- | --- |
| Sub-second analytical freshness on transactional data | Replication lag tolerance | CDC with sub-second target |
| Dashboard/report freshness under 1 minute | Replication lag tolerance | Standard CDC (10-second latency is plenty) |
| Hourly or daily analytics | Batch window tolerance | Batch ETL or CDC (either works) |
| OLTP p99 latency under 5 ms | Transaction latency SLA | Dedicated Postgres on NVMe; do not run analytics on the same engine |
| Analytical queries scanning 1B+ rows | Query throughput at scale | Dedicated columnar OLAP (ClickHouse); row-oriented engines will struggle |
| Expecting to scale beyond 100 GB | Growth trajectory | Start with composed architecture; avoids painful migration later |
| Multi-TB dataset, growing fast | Storage and compute scaling | Composed architecture: scale OLTP and OLAP independently |

Best practices for a unified OLTP and OLAP architecture #

  1. Keep your engines separate. Run OLTP on a database designed for transactions (PostgreSQL) and OLAP on a database designed for analytics (ClickHouse). Don't force one to do the other's job.

  2. Start CDC on day one. Even if you're not running analytical queries yet, setting up CDC early means your analytical database has full history when you need it. This means you can avoid a larger historical migration later.

  3. Match storage to your latency SLA. Object-storage-backed Postgres trades latency for elasticity. If your application has p99 latency SLAs, use NVMe-backed Postgres for the OLTP layer.

  4. Use ClickHouse's ReplacingMergeTree for CDC target tables. ClickHouse's ReplacingMergeTree engine is purpose-built for CDC workflows, handling insert/update/delete semantics natively. The FINAL keyword provides query-time deduplication with minimal overhead, giving you consistent reads regardless of background merge state.

  5. Size your Postgres for OLTP only. Don't overprovision Postgres to handle analytical queries. Move analytics to ClickHouse and right-size Postgres for the transactional workload it actually serves.

  6. Use pg_clickhouse for transparent query routing. Applications that already query Postgres for analytics can switch to ClickHouse without code changes. The FDW pushes analytical queries to ClickHouse while keeping transactional queries in Postgres.

  7. Monitor CDC lag as a first-class metric. Treat replication lag between Postgres and ClickHouse like you treat API latency. Alert on it. Dashboard it. Growing lag means analytical data is going stale.

  8. Test with production-scale data. The performance gap between dedicated OLTP and OLAP engines grows with data volume; what works on 10 GB may not work on 10 TB. Validate at realistic scale before committing to an architecture.

  9. Plan for schema evolution. Your Postgres schema will change. ClickPipes handles ADD and DROP COLUMN automatically. For more complex changes (column renames, type changes), plan a resync strategy.
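As one concrete way to apply practice 7, assuming Postgres is the CDC source, the lag a consumer has accumulated can be read straight from the system catalog; this query is a sketch suitable for feeding a dashboard or alert:

```sql
-- WAL the CDC consumer has not yet confirmed, per logical slot
SELECT
    slot_name,
    active,
    pg_size_pretty(
        pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
    ) AS replication_lag
FROM pg_replication_slots
WHERE slot_type = 'logical';
```

A steadily growing value here means the consumer is falling behind (stale analytical data) and, left unchecked, can cause WAL to accumulate on the source, so it is worth alerting on both the lag and inactive slots.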
