What is a real-time data warehouse?
Traditional on-prem data warehouse
- Data volumes were small
- Warehouses were operationally complex
Traditional cloud data warehouse
- Performance and concurrency became limiting factors at scale
- Retrofitting these warehouses for analytics or real-time workloads became prohibitively costly
Real-time data warehouse
- Simplified and cost-effective
- Unified resource for querying streaming and historical data
Why use a real-time data warehouse?
Traditional cloud data warehouse
- Performance: High query latency and concurrency limitations are commonplace
- Hardware efficiency: Can create data bloat and inefficient usage of system resources
- Scale: Analytics queries scale inadequately as data volumes increase
- Complexity: Can lead to growing operational complexity
- Cost: Costly for many workloads
Real-time data warehouse
- Performance: Engineered to handle highly concurrent workloads that back user-facing applications
- Hardware efficiency: Optimized to manage petabytes of data, with best-in-class compression ratios for the most efficient storage usage
- Scale: Delivers unparalleled performance for analytical workloads at scale
- Complexity: Simplified developer experience that's easy to manage and scale
- Cost: Maximizes cost-effectiveness
Impact of the real-time data warehouse
With Snowflake, we were using the standard plan, small compute, which cost nearly six times more than ClickHouse Cloud. We got several seconds query time and no materialized views.
With ClickHouse Cloud's production instance, we are getting sub-second query time along with materialized views. The decision to switch was a no-brainer for us.
Snowflake
We were on Redshift for about a year and a half, but found the operational overhead and performance wasn't getting it done. Moving over to ClickHouse we were basically able to cut that (Redshift) bill in half. That 30 second query now takes under a second, and every page loads just faster.
Redshift
It [BigQuery] discourages data usage. Instead of encouraging analysts to query the database in any and all ways they can imagine you’ll end up worrying about needing to limit them and come up with processes for controlling the volume of data being used. We simply don’t want the hassle of trying to figure out in advance of how many BQ slots to purchase - what a headache!
BigQuery