Modern OLAP databases like ClickHouse now handle star schemas and complex cross-table joins at massive scale. That gives data engineering teams a real choice: flatten data into one big table when the workload calls for it, or keep a normalized fact and dimension model when flexibility, storage efficiency, or update patterns matter more. Both are valid, and the right answer depends on your use case.
This guide gives you a practical framework for evaluating modern join capabilities, performance limits, and operational tradeoffs.
The old rule that real-time analytics requires aggressive denormalization? It's obsolete in 2026. Vectorized query execution, memory-efficient join algorithms, and automatic query optimization have largely eliminated this constraint. You no longer have to sacrifice schema flexibility for query speed.
TL;DR #
- Modern real-time OLAP databases can run star schemas with fast cross-table joins in 2026. You don't always need "one big table."
- ClickHouse is the best fit when you need sub-second joins + high concurrency on normalized fact/dimension models.
- Evaluate engines on (1) memory-efficient join algorithms, (2) automatic join optimization + statistics-based reordering, (3) runtime filters/predicate pushdown, (4) concurrency isolation, (5) real-time updates/upserts for dimensions, (6) ingestion vs query isolation.
- ClickHouse handles both single-table aggregations and multi-join analytical queries. Apache Druid/Pinot excel at single-table, high-QPS aggregations but are more limited when multi-table joins are required.
- Denormalize only for edge cases like single-digit ms latency, very large dimensions, or expensive distributed shuffle joins you can't avoid.
Why denormalization is no longer required for real-time analytics #
For over a decade, denormalized wide tables were the default path to sub-second dashboard performance, especially when managing tail latency (p99). Older analytical databases struggled with multiple joins under high concurrency, so flattening data into a single table was often the most reliable way to hit strict latency SLAs.
The 'one big table' approach is still a strong fit for many workloads, particularly when ingestion patterns are append-only and dimensions rarely change. At petabyte scale, though, it comes with tradeoffs around storage, update flexibility, and schema evolution that are worth weighing against a normalized alternative.
Denormalization comes with three tradeoffs to weigh against its performance benefits.
First, repeating dimension values like user demographics or product categories across fact rows increases storage footprint, which can meaningfully affect cloud bills at billion-row scale.
Second, dimension updates become more expensive. Changing a single attribute may require rewriting historical partitions, which can introduce variance into streaming pipelines.
Third, ad-hoc multi-dimensional analysis is harder to support without maintaining additional pre-joined tables for new query patterns.
None of these are dealbreakers, and many teams accept them in exchange for the predictability of single-table scans. The right choice depends on your workload, update patterns, and operational preferences.
Modern real-time analytics databases have expanded what's possible on the join side. Today's engines implement advanced join reordering, bloom filter pushdowns, and parallel execution strategies, dynamically pruning data before scanning.
The result is that normalized fact and dimension tables, star schemas, and snowflake schemas are now viable options for sub-second customer-facing dashboards, alongside denormalized wide tables. You have a real choice between logical data modeling and pre-joined wide tables, and the best fit depends on your workload.
How to evaluate real-time analytics databases for star schema joins #
If you're evaluating real-time OLAP databases broadly, our complete evaluation framework covers all five criteria in depth: latency and concurrency, ingestion throughput, SQL flexibility, cost efficiency, and operational complexity.
For star schema workloads specifically, three requirements matter most: join execution, dimension mutability, and query-ingest isolation. Here's how different architecture classes perform.
Requirement 1: Which join algorithms prevent out-of-memory failures? #
To avoid catastrophic query failures, a database needs modern algorithms like grace hash joins (which safely spill to disk when memory runs out) and parallel hash joins. Relying solely on basic broadcast joins will eventually cause out-of-memory errors on large dimension tables; it's just a matter of time.
Modern real-time OLAP (ClickHouse): Built for relational modeling at speed. These systems offer vectorized execution and multiple join algorithms including hash joins, sort-merge joins, grace hash joins, and more, with the query planner selecting the best strategy based on table sizes and available memory. This makes them well-suited for complex star schema queries.
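As a concrete illustration, ClickHouse exposes the algorithm choice through the `join_algorithm` setting. The query below is a sketch: the `sales` and `products` tables are hypothetical names, not part of any benchmark in this guide.

```sql
-- Hypothetical star-schema query; 'sales' and 'products' are illustrative names.
-- grace_hash partitions the build side and spills to disk instead of failing
-- with an out-of-memory error when the dimension exceeds available RAM.
SELECT p.category, sum(s.amount) AS revenue
FROM sales AS s
INNER JOIN products AS p ON s.product_id = p.product_id
GROUP BY p.category
SETTINGS join_algorithm = 'grace_hash';
```

Leaving `join_algorithm` at its default lets the planner pick a strategy based on table sizes; forcing `grace_hash` is mainly useful when a join is known to exceed memory.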
Append-only event stores (Apache Pinot, Apache Druid): Fast for single-table aggregations, but their architectures are limited. They primarily support lookup and broadcast joins, which restricts them to simple dimension table enrichments. Complex multi-table analytical queries are either unsupported or operationally constrained.
Requirement 2: Does it support updates and upserts for slowly changing dimensions? #
Managing slowly changing dimensions within a star schema requires lightweight updates and upserts. You can't afford to rewrite large historical data partitions for every change.
Modern real-time OLAP (ClickHouse): These databases natively support row-level updates and deletes. In ClickHouse, both operations use the same patch parts mechanism: a small mask or patch file applied immediately for instant consistency, then materialized during background merges. For managing slowly changing dimensions, the ReplacingMergeTree engine handles deduplication by version during merges, keeping only the latest row for each key, which makes it well suited to managing mutating entities in a normalized schema.
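For example, a slowly changing dimension can be modeled with ReplacingMergeTree. The table below is a hypothetical sketch; the column names are illustrative:

```sql
-- Hypothetical product dimension; the version column decides which row wins.
CREATE TABLE dim_products
(
    product_id UInt64,
    name       String,
    price      Decimal(10, 2),
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)  -- latest updated_at survives merges
ORDER BY product_id;

-- FINAL applies merge-time deduplication semantics at query time,
-- guaranteeing one row per product_id even before background merges run.
SELECT * FROM dim_products FINAL WHERE product_id = 42;
```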
Append-only event stores (Apache Pinot, Apache Druid): Neither engine supports SQL standard UPDATE or DELETE statements. Their segment-based, append-only architecture makes real-time mutability operationally complex and inefficient. Simple updates typically require re-ingesting and rebuilding entire data segments.
Requirement 3: Can it isolate ingestion from join-heavy queries? #
A real-time system needs to stream thousands of events per second into large fact tables while simultaneously serving sub-second slice-and-dice queries, without resource contention between the two.
Different systems take different paths to isolation. Apache Druid and Pinot bake it into the architecture with dedicated broker, router, and server roles, and Pinot adds workload-based resource isolation with named CPU and memory budgets. This works, but operators have to provision and manage multiple specialized components from day one.
ClickHouse handles isolation through resource management and workload scheduling, letting teams specialize compute when it makes sense rather than mandating it upfront. ClickHouse Cloud goes further by separating storage and compute via SharedMergeTree, then layering specialized compute services on top: dedicated read-write services for ingestion, independently scaling read-only services for query traffic, and isolated compute pools that share the same underlying data. The result is near-perfect isolation for mixed workloads without managing discrete node tiers or duplicating data across clusters.
Summary: ClickHouse vs. Druid/Pinot for star schema joins #
| Core requirement | ClickHouse | Apache Druid / Pinot |
|---|---|---|
| Memory-efficient join algorithms | Excellent (Grace Hash, Parallel Hash, Shuffle) | Limited (Primarily lookup/broadcast joins) |
| Real-time data mutability | Excellent (Row-level updates and deletes via patch parts) | Limited (Segment-based, append-only model) |
| Ingestion vs. query isolation | Excellent (Resource management, workload scheduling, plus dedicated read-only and read-write compute services in Cloud) | Good (Dedicated broker/router nodes; Pinot supports workload-based resource isolation) |
For a full comparison across all evaluation dimensions, including concurrency, cost efficiency, and operational complexity, see our complete real-time OLAP evaluation guide.
How ClickHouse executes fast star schema joins #
Before 2024, industry discussions often categorized ClickHouse as a single-table engine that struggled with complex schemas. That was historically grounded, but recent releases (25.9 and beyond) have fundamentally changed ClickHouse's join execution capabilities. Benchmarks running 17 join-heavy queries across fact and dimension tables showed ClickHouse Cloud was faster and cheaper than Snowflake and Databricks at every scale, from 721 million to 7.2 billion rows, with zero tuning.
The numbers back this up. Automatic global join reordering, introduced in version 25.9, uses a greedy optimization algorithm with column statistics to determine optimal join order across multi-table queries. On a six-table TPC-H query (scale factor 100), enabling join reordering with column statistics improved execution from 3,903 seconds to 2.7 seconds, a 1,450x speedup with 25x less memory. As of version 25.10, column statistics can be created automatically, making this optimization available without manual setup.
These performance gains come from a series of optimizations across recent releases.
Global join reordering (25.9) evaluates table cardinality estimates to determine optimal build and probe sides for multi-table queries. Runtime bloom filters (25.10) generate a filter on the build side and apply it as a pre-filter on the probe side, skipping irrelevant rows before the join executes, delivering a 2.1x speedup and 7x reduction in memory consumption on TPC-H benchmarks. Lazy columns replication (25.10) avoids unnecessary copying of large string values during joins, achieving over 20x faster execution on self-join workloads.
ClickHouse supports multiple join algorithms including parallel hash join, grace hash join, full sorting merge join, and more, with the query planner selecting the best strategy based on table sizes and available memory.
To see how these optimizations perform on real join workloads, explore the coffeeshop benchmark.
Beyond standard relational joins, ClickHouse has eliminated the performance tax on semi-structured data. The native JSON type shreds JSON into dynamic sub-columns transparently with no upfront schema definition required. This matters because real-world data varies: some workloads have semi-structured data with a known set of keys (e.g., application event payloads), while others involve completely dynamic JSON with thousands of unpredictable keys (e.g., user-generated metadata or IoT telemetry). ClickHouse handles both cases efficiently using advanced serialization techniques that store JSON paths as compressed columnar subcolumns.
The result: no parsing bottleneck at query time, even when joining tables that contain complex JSON documents.
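A minimal sketch of the native JSON type follows; the table and paths are hypothetical, and depending on your ClickHouse version the JSON type may need to be enabled with a setting first:

```sql
CREATE TABLE events
(
    ts      DateTime,
    payload JSON  -- shredded into compressed columnar sub-columns on insert
)
ENGINE = MergeTree
ORDER BY ts;

-- Sub-columns are addressed with dot notation; no JSON parsing at read time.
SELECT payload.user.country AS country, count()
FROM events
GROUP BY country;
```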
The infrastructure architecture ensures these complex joins run without impacting ingestion. ClickHouse Cloud separates storage and compute using the SharedMergeTree engine, so distributed queries scale on stateless compute nodes while background merges run independently. For ops teams, this means user-facing queries stay responsive during heavy ingestion. For developers, the same query patterns work identically whether running locally or on a production cluster.
When to normalize vs. denormalize for real-time analytics workloads #
Different analytical workloads need different data modeling strategies. Here's when to use a normalized star schema versus falling back to traditional denormalization.
Recipe: Star schema for customer-facing dashboards #
Decision logic: This workload demands exploratory, ad-hoc multi-dimensional slice-and-dice capabilities from many concurrent external users. A star schema works well here because it gives analysts and customers flexibility to ask any question without being constrained by a rigid, pre-joined wide table. Storing dimensions separately simplifies upstream streaming ingestion pipelines and optimizes storage costs by preventing data duplication.
Recipe: Keep dimension tables small and ensure they replicate across all compute nodes for fast broadcast joins. Use materialized views to pre-aggregate the heaviest fact table patterns, but perform final analytical joins against normalized dimension tables at query time for flexibility. Index your dimension tables properly to maximize runtime filter performance.
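Assuming hypothetical `sales` fact and `dim_products` dimension tables, the pre-aggregation pattern described above might be sketched like this:

```sql
-- Rolls up the heaviest fact pattern at insert time; partial sums per
-- insert block are combined during background merges.
CREATE MATERIALIZED VIEW daily_product_sales
ENGINE = SummingMergeTree
ORDER BY (product_id, day)
AS SELECT
    product_id,
    toDate(ts)  AS day,
    sum(amount) AS revenue
FROM sales
GROUP BY product_id, day;

-- The final analytical join against the normalized dimension stays at
-- query time, preserving flexibility for new slice-and-dice patterns.
SELECT d.category, sum(m.revenue) AS revenue
FROM daily_product_sales AS m
INNER JOIN dim_products AS d ON m.product_id = d.product_id
GROUP BY d.category;
```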
Recipe: Star schema for e-commerce analytics with slowly changing dimensions #
Decision logic: Modern retail analytics requires joining real-time, high-volume clickstream logs with frequently updating dimension tables: fluctuating product inventory, dynamic pricing, evolving user profiles. These are classic Slowly Changing Dimensions. A star schema handles constant updates efficiently without re-processing billions of historical event rows, which would immediately cripple pipeline performance.
Recipe: Use the ReplacingMergeTree engine for dimension tables that change frequently (users, products, pricing). ReplacingMergeTree handles deduplication by version or timestamp during background merges, keeping only the latest row for each key. For individual row corrections (e.g., fixing a single product price), use Lightweight Updates, which apply instantly via patch parts with minimal overhead. At query time, join these dimensions against append-only fact tables using ClickHouse's optimized join algorithms. Your dashboard reflects the latest data without pipeline lag.
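For the single-row correction case, a lightweight update might look like the sketch below. The table name is hypothetical, the syntax follows recent ClickHouse releases, and depending on version the feature may need to be enabled with a setting:

```sql
-- Writes a small patch part containing only the changed value; applied to
-- reads immediately, materialized during merges. No partition rewrite.
UPDATE dim_products SET price = 19.99 WHERE product_id = 42;
```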
Recipe: Hybrid schema for observability and telemetry (minimal joins) #
Decision logic: Observability workloads prioritize raw ingestion speed over complex relational mapping. The primary goal is ingesting millions of log lines or trace spans per second while enabling simple, fast filtering on a single table. Cross-table joins are rare and typically isolated to low-cardinality metadata enrichment, like mapping an IP address to a geographic location. The sheer volume and velocity of machine-generated data makes query-time joins for every single query impractical.
Recipe: Rely primarily on a denormalized wide-table schema or use the native JSON type to ingest raw, unstructured logs without strict schemas. For necessary metadata enrichment, skip traditional SQL joins. Use ClickHouse's optimized dictionary functions for point lookups instead. They're faster than hash joins and consume a fraction of the memory.
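A sketch of the dictionary pattern, assuming a hypothetical `regions` source table and a `logs` fact table with a numeric `region_id` column:

```sql
-- In-memory dictionary refreshed from a source table every 5-10 minutes.
CREATE DICTIONARY region_names
(
    region_id UInt64,
    region    String
)
PRIMARY KEY region_id
SOURCE(CLICKHOUSE(TABLE 'regions'))
LAYOUT(HASHED())
LIFETIME(MIN 300 MAX 600);

-- dictGet is a per-row point lookup into the in-memory structure,
-- avoiding the build/probe cost of a hash join entirely.
SELECT dictGet('region_names', 'region', region_id) AS region, count()
FROM logs
GROUP BY region;
```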
When a single wide table is still the right choice #
Modern engines have made star schemas viable for real-time workloads, but denormalization remains the right call for many use cases. Flatten data into a single table when:
- Ultra-low latency requirements: When sub-second rendering is too slow and your application demands single-digit millisecond response times, typical of real-time bidding or algorithmic ad tech.
- Static, unchanging event streams: If your event data is written once and never logically updated, the primary downside of denormalization (update anomalies) becomes irrelevant.
- Large dimension tables: When your dimension tables grow so large they rival the fact table in size, forcing a distributed network shuffle join that becomes prohibitively expensive.
In these edge cases, pre-joining data during ETL guarantees the database only performs rapid sequential scans, trading increased storage costs for maximum query throughput.
Operational considerations: Cost, concurrency, and updates at scale #
Beyond initial query benchmarks, the true test of a real-time analytics database lies in day-two operations. Evaluating these systems requires a rigorous look at compute costs, concurrency thresholds, and ecosystem integration.
Cost considerations: storage, compression, and scan efficiency #
The cost argument for star schemas is often understated. In a denormalized wide table, you repeat dimension values like product names, user demographics, and region strings across every fact row. At billion-row scale, that redundancy adds meaningful storage overhead, especially in cloud environments where storage costs scale linearly.
A normalized star schema eliminates this duplication by design: each dimension value is stored once. ClickHouse amplifies the savings further because columnar storage groups values of the same type together, allowing compression algorithms like LZ4 and ZSTD to exploit patterns that are invisible in row-oriented layouts. The result: 10x to 20x compression is typical on the fact table alone, and dimension tables compress even more aggressively because they're smaller and more repetitive.
Compute costs follow from this too, though the relationship depends on how your engine bills you. Some cloud warehouses charge directly per byte scanned, so a star schema join that reads a 2 GB fact table plus a 50 MB dimension is cheaper than scanning a 20 GB denormalized table with repeated dimension strings. Other engines bill on compute time, where scanning less data still helps because smaller scans generally mean faster queries and less compute consumed.
But raw scan efficiency matters just as much as data volume. ClickHouse's vectorized execution and columnar compression let it scan large datasets at a fraction of the cost of traditional cloud warehouses. In practice, ClickHouse can often scan 20 GB faster and cheaper than a slower engine scans 2 GB. The real savings come from combining a normalized schema with an engine that scans efficiently in the first place.
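You can verify compression ratios on your own data rather than taking the figures above on faith, using the `system.parts` table:

```sql
-- Compressed vs uncompressed footprint per table, with the on-disk ratio.
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes))   AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 1) AS ratio
FROM system.parts
WHERE active
GROUP BY table
ORDER BY sum(data_compressed_bytes) DESC;
```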
Handling dimension updates at scale #
In a star schema, fact tables are append-only: events, transactions, clicks. They don't change. But dimension tables do: a product price gets corrected, a user upgrades their plan, a store changes its region mapping. These are classic slowly changing dimensions, and they're the operational pain point of star schemas.
ClickHouse handles this split cleanly. For fact tables, use standard MergeTree with append-only inserts and no mutation overhead. For dimension tables, you have two complementary tools. ReplacingMergeTree handles deduplication by version during background merges, keeping only the latest row for each key, making it ideal for dimensions that update in bulk (e.g., a nightly product catalog refresh). For individual row corrections (e.g., fixing a single product price mid-day), Lightweight Updates apply instantly via patch parts: ClickHouse writes a compact patch containing only the changed values, applied immediately for instant consistency, then materialized during background merges. No table locks, no partition rewrites.
This means your fact table stays fast and untouched while your dimension tables stay current, which is exactly the operational model a star schema requires.
Managing star schema transformations in production #
The operational overhead of a star schema (defining dimension tables, orchestrating fact-dimension relationships, and handling incremental loads) is only manageable with proper tooling.
dbt has become the standard here. The ClickHouse dbt adapter supports incremental materializations and full-refresh strategies, letting you model fact and dimension tables with different update patterns: append-only inserts for facts, full refresh or merge for dimensions. You define your dimension and fact models as SQL, version them in git, and dbt handles the orchestration.
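A hypothetical incremental fact model for the ClickHouse dbt adapter might be sketched as follows; the model name, source, and config values are illustrative:

```sql
-- models/fct_sales.sql: append-only incremental load for the fact table.
{{ config(
    materialized = 'incremental',
    engine = 'MergeTree()',
    order_by = 'ts'
) }}

SELECT event_id, product_id, amount, ts
FROM {{ source('events', 'raw_sales') }}
{% if is_incremental() %}
  -- Only rows newer than what's already in the target table.
  WHERE ts > (SELECT max(ts) FROM {{ this }})
{% endif %}
```

Dimension models with different update patterns can instead use `materialized = 'table'` for a full refresh on each run.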
For a full list of ClickHouse's ecosystem integrations, including visualization, ingestion, orchestration, and language clients, see our complete evaluation guide.
Conclusion: Choosing a real-time analytics database for star schema joins #
The real-time data engineering landscape has shifted. Logical relational modeling and real-time performance are no longer mutually exclusive, which means star schemas, snowflake schemas, and denormalized wide tables are all viable strategies in 2026. The right choice depends on your workload, update patterns, latency targets, and operational preferences.
Successfully evaluating a database architecture in 2026 means looking past outdated benchmarks. Data architects need to test actual memory-efficient join algorithms, verify optimizer intelligence, and stress-test concurrency limits under intense ingestion.
If you're building customer-facing analytics applications, or if you're struggling to migrate off a slow, expensive cloud data warehouse, ClickHouse Cloud provides the query execution speed and schema flexibility required to power sub-second multi-dimensional analysis without the traditional operational burden. Many teams start by adding ClickHouse as a real-time serving layer alongside their existing data warehouse, and some later choose to consolidate entirely.
Don't take these performance claims on faith. Spin up a free ClickHouse Cloud trial, load as much of your own data as possible, and run your actual queries at realistic scale. That's the only way to know if it fits your workload. For evidence of how ClickHouse performs on standardized join workloads, explore the fully open and reproducible coffeeshop benchmark.