For the last decade, the open-source community has treated observability as a search problem. We dumped logs into inverted indexes (like Elasticsearch or Lucene) because our primary workflow was grep: finding a single error line in a haystack of text.
Modern cloud-native environments have shifted usage patterns. This is why platforms like Datadog moved beyond simple indexing to support complex analytical queries. We no longer just search. We analyze. We need to compute p99 latency across millions of requests, group traces by high-cardinality customer IDs, and correlate metrics with logs to find the root cause of a complex failure.
Observability is a data analytics problem. We should treat it like one.
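To make that concrete, here is a minimal sketch of the kind of analytical query this shift implies: a p99 latency computed over raw spans and grouped by a high-cardinality customer ID. The table and column names (otel_traces, Duration, SpanAttributes) assume the OpenTelemetry ClickHouse exporter's default schema, and the customer.id attribute is illustrative; adjust for your own deployment.

```sql
-- p99 latency per customer over the last hour, computed directly on raw spans
SELECT
    SpanAttributes['customer.id'] AS customer_id,     -- hypothetical span attribute
    quantile(0.99)(Duration)      AS p99_latency_ns,
    count()                       AS requests
FROM otel_traces
WHERE Timestamp > now() - INTERVAL 1 HOUR
GROUP BY customer_id
ORDER BY p99_latency_ns DESC
LIMIT 20;
```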
(To understand this shift, read our deep dive on why legacy observability architectures fail.)
This guide focuses on the technical implementation of ClickStack, the ClickHouse Observability Stack.
By treating observability as a data analytics problem, ClickStack uses ClickHouse, a high-performance, unified columnar database, to store logs, metrics, and traces in a single engine. You regain control of your data, simplify your stack, and reduce costs by an order of magnitude.
Looking for the business case? Check out our Observability TCO Analysis to calculate your savings and see how shifting to ClickHouse impacts your bottom line.
Key takeaways #
- Unify data: Move from fragmented stacks (ELK/LGTM) to a unified ClickHouse columnar store.
- Stop sampling: Reduce costs through compression (ZSTD) and tiered storage (S3), not by deleting data.
- Pre-aggregate and batch: Use materialized views for instant dashboards and asynchronous inserts to remove buffering infrastructure.
Why legacy observability architectures (ELK & SaaS) fail at scale #
High observability bills and slow dashboards aren't just pricing problems. They signal architectural failures. Whether you buy a SaaS product or self-host open source, you likely hit one of two fundamental bottlenecks.
1. The SaaS problem: High ingestion costs and data tax #
Commercial vendors (e.g., New Relic) run architectures that are expensive to scale. Their pricing models aren't just business decisions. They defend aging architectures.
- The ingestion trap: Vendors with high margins on ingestion force you to filter data before you know its value. You pay to index data you might never query, and then you pay again to query and interact with it.
- The "tax on curiosity": Further pricing skus, mean every dashboard refresh comes with a price. Engineers start asking, "Can I afford to run this query?" rather than "What caused this failure?"
2. The open source problem: Limitations of ELK and LGTM stacks #
Self-hosting seems like the escape hatch, but traditional open-source stacks introduce their own complexity taxes:
- ELK (Elasticsearch, Logstash, Kibana): ELK excels at text search via inverted indexes but struggles with analytical aggregations. It relies on Doc Values, which often compress inefficiently, and JVM memory limits constrain the stack and force early horizontal scaling. This leads to timeouts and instability under heavy analytical loads.
- LGTM (Loki, Grafana, Tempo, Mimir): While optimized for cost, this stack forces you to manage three distinct database backends: one for logs, one for metrics, and one for traces. This "Tri-Database" complexity creates data silos, making correlation difficult without complex query joins in the UI.
What is ClickStack? The ClickHouse-native open-source observability stack (not just “observability that uses ClickHouse”) #
ClickStack is ClickHouse's open source observability stack, combining open standards with high-performance storage. ClickStack unifies the three pillars of observability into one columnar engine:
- The ingestion layer (OpenTelemetry): Standardized, vendor-neutral collection of logs, metrics, and traces.
- The storage engine (ClickHouse): A unified columnar database that handles all three data types, allowing SQL-based correlation and massive compression.
- The visualization layer (HyperDX): A purpose-built UI that queries ClickHouse directly, providing the "glass pane" experience engineers expect.
This architectural unity makes ClickStack cheaper and simpler.
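To show how the ingestion and storage layers connect, here is a minimal sketch of an OpenTelemetry Collector pipeline that receives OTLP and writes logs and traces to ClickHouse. It assumes the clickhouse exporter from opentelemetry-collector-contrib; the endpoint, database, and table names are placeholders.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  clickhouse:
    endpoint: tcp://clickhouse:9000   # placeholder host
    database: otel
    logs_table_name: otel_logs
    traces_table_name: otel_traces

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [clickhouse]
    traces:
      receivers: [otlp]
      exporters: [clickhouse]
```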
| Feature | ELK stack (search index) | LGTM stack (fragmented) | ClickStack (unified columnar) |
|---|---|---|---|
| Core architecture | Inverted index. Great for search, poor for aggregations. | Fragmented. Different backends for Logs (Loki), Metrics (Mimir), Traces (Tempo). | Unified columnar. One engine for all three signals. |
| Operational load | High. JVM tuning, index management, and heavy memory usage. | High. Managing three distinct distributed systems and their interactions. | Medium. Single binary/service to manage and scale. |
| Correlation | Difficult. Requires application-side logic or complex UI joins. | Siloed. Data lives in different systems. Correlation happens at the visualization layer. | Native. Correlate logs, metrics, and traces using standard SQL JOINs. |
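To illustrate the "native correlation" row above, here is a sketch of a cross-signal query that joins logs to traces on a shared trace ID. Table and column names (otel_logs, otel_traces, TraceId, SeverityText, Body, Duration) assume the OpenTelemetry exporter's default schema; treat them as assumptions for your own tables.

```sql
-- Find the slowest spans behind recent error logs, in a single query
SELECT
    l.TraceId,
    l.ServiceName,
    l.Body     AS error_message,
    t.SpanName,
    t.Duration AS span_duration_ns
FROM otel_logs AS l
INNER JOIN otel_traces AS t ON t.TraceId = l.TraceId
WHERE l.SeverityText = 'ERROR'
  AND l.Timestamp > now() - INTERVAL 15 MINUTE
ORDER BY t.Duration DESC
LIMIT 50;
```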
ClickHouse-backed vs ClickHouse-native (why the difference matters) #
Many "open source observability + ClickHouse" solutions use ClickHouse as a storage layer inside their own architecture. ClickStack takes a different approach: it's the ClickHouse-led stack.
The architecture is straightforward. OpenTelemetry handles ingestion. ClickHouse serves as the unified store. HyperDX provides the UI. These components ship as a cohesive observability stack, not an observability app that happens to run on ClickHouse.
Case studies: How Anthropic, Didi, OpenAI, Shopee, and Tesla use ClickHouse for observability at scale #
The strategies in the playbook below aren't theoretical. They mirror the exact architectural patterns used by the world's largest engineering organizations to address the cardinality and cost problems inherent in search-based observability.
| Company | Challenge | Solution | Outcome |
|---|---|---|---|
| Tesla | Monitoring global infrastructure at quadrillion-scale. | Built a unified platform ingesting tens of millions of rows/sec on ClickHouse. | Achieved real-time "slice and dice" capabilities on massive datasets. |
| Anthropic | Scaling observability for Claude's explosive growth. | Migrated to ClickHouse for a cost-effective database that maintains query speed. | Stabilized the database under pressure and controlled costs. |
| OpenAI | Petabyte-scale daily data generation requires high performance. | Used ClickHouse for real-time query capabilities. | Reduced query times from minutes to milliseconds. |
| Didi | High costs and slow queries with Elasticsearch for logging. | Migrated from Elasticsearch to ClickHouse. | Achieved 30% cost reduction and 4x query speed improvement. |
| Shopee | Distributed tracing with billions of spans. | Adopted ClickHouse for high-performance tracing. | Searches specific trace IDs from 30B+ rows in seconds. |
Why sampling and cardinality limits fail as cost optimization strategies #
Before we get to the technical playbook, we need to address the "standard advice" for cost reduction. If you search for "how to reduce observability costs," you'll find a list of survival tactics designed to hide the inefficiencies of legacy architectures.
You've heard them before:
- "Aggressive sampling": Throw away 90% of your traces (Head-based sampling).
- "Cardinality limits": Never tag metrics with User IDs or Container IDs.
- "Log diet": Turn off INFO logs in production and hope you can reproduce the error locally.
These aren't optimizations. They're amputations.
In a modern, complex distributed system, these strategies actively hinder your debugging. You trade visibility for solvency.
- Sampling guarantees you'll miss the one-in-a-million race condition that takes down production.
- Cardinality caps prevent you from answering the most critical business question: "Which specific customers experience latency?"
- Deleting logs saves storage bytes but leaves you flying blind during an outage.
These constraints exist because inverted indexes (ELK) and row-based stores (Postgres/legacy SaaS) bloat data size and choke on high cardinality.
The ClickHouse approach is different. Columnar storage compresses data by 10-20x and processes high-cardinality aggregations efficiently. You don't need to start by deleting data. You can afford to keep the signal.
Don't optimize by blinding yourself. Optimize the engine first. Then explore other opportunities if needed.
5 steps to reduce observability costs with ClickHouse #
Migrating to ClickHouse is step one. Tuning ClickHouse for observability workloads is step two. These technical steps use ClickHouse features to minimize storage footprint and maximize query throughput.
| Step | Optimization | Impact | Feature |
|---|---|---|---|
| 1 | Optimize codecs | ~50% immediate storage reduction. | CODEC(ZSTD) and type-specific codecs, e.g., Delta. |
| 2 | Tiered storage | Retain compliance data on S3 for pennies. | TTL / Storage Policies |
| 3 | Primary key tuning | Balance high compression with fast query speeds. | ORDER BY design and optimize_row_order |
| 4 | Pre-aggregation | Instant dashboard loading. | Materialized Views |
| 5 | Server-side buffering | Eliminate Kafka or Gateway infrastructure costs. | Asynchronous Inserts |
Step 1: Chain specialized codecs (Delta) with ZSTD to reduce storage by 50% #
Observability data carries structure beyond text. Timestamps increase monotonically, and counters often increment steadily. Generic compression algorithms miss these patterns.
ClickHouse lets you chain codecs. Apply a specialized codec to transform the data first, then apply a general-purpose compressor.
The Delta codec stores differences between values rather than the values themselves. Storing the difference between timestamps (e.g., +10ms) requires fewer bits than storing full 64-bit timestamps. Wrapping Delta in ZSTD significantly improves compression ratios.
- Timestamps/Counters: Use Delta or DoubleDelta followed by ZSTD.
- Log Messages: Use ZSTD (high compression).
CREATE TABLE optimized_logs (
    -- Delta stores the difference, ZSTD compresses the small result
    timestamp DateTime64(3) CODEC(Delta, ZSTD(1)),

    -- ZSTD compresses repetitive text effectively
    message String CODEC(ZSTD(1)),

    -- Low cardinality columns compress well with default settings
    severity Enum('INFO', 'WARN', 'ERROR')
)
ENGINE = MergeTree()
ORDER BY timestamp;
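To verify what the codecs actually buy you, ClickHouse exposes per-column compressed and uncompressed sizes in system.columns. A quick sanity check (swap in your own table name):

```sql
-- Compare on-disk (compressed) vs. logical (uncompressed) size per column
SELECT
    name AS column,
    formatReadableSize(data_compressed_bytes)   AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS ratio
FROM system.columns
WHERE database = currentDatabase() AND table = 'optimized_logs'
ORDER BY data_compressed_bytes DESC;
```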
Step 2: Implement tiered storage (S3/object storage integration) #
In observability, data value decays rapidly. Logs from 10 minutes ago are critical. Logs from 3 months ago serve compliance.
ClickHouse separates storage from compute. Configure a storage policy that keeps recent "hot" data on NVMe SSDs for fast debugging, and automatically moves data older than 7 days to object storage (S3/GCS). This enables massive retention periods without the cost of block storage.
<policies>
<tiered_observability>
<volumes>
<hot>
<disk>default_ssd</disk>
</hot>
<cold>
<disk>s3_bucket</disk>
</cold>
</volumes>
</tiered_observability>
</policies>
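The policy above refers to two disks by name. In a self-managed deployment those disks are declared alongside the policy under storage_configuration; here is a minimal sketch, assuming an S3 bucket (the endpoint and credentials are placeholders).

```xml
<storage_configuration>
    <disks>
        <default_ssd>
            <path>/var/lib/clickhouse/hot/</path>
        </default_ssd>
        <s3_bucket>
            <type>s3</type>
            <endpoint>https://my-bucket.s3.amazonaws.com/clickhouse/</endpoint>
            <access_key_id>PLACEHOLDER</access_key_id>
            <secret_access_key>PLACEHOLDER</secret_access_key>
        </s3_bucket>
    </disks>
    <!-- the <policies> block shown above nests here as well -->
</storage_configuration>
```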
ALTER TABLE your_logs_table MODIFY TTL
    timestamp + INTERVAL 7 DAY TO VOLUME 'cold',
    timestamp + INTERVAL 365 DAY DELETE;
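The TTL moves to the cold volume only apply to tables that actually use this storage policy. A minimal sketch of attaching it, assuming the same placeholder table name (note that changing storage_policy on an existing table requires the new policy to contain all disks the table already uses):

```sql
ALTER TABLE your_logs_table
    MODIFY SETTING storage_policy = 'tiered_observability';
```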
Step 3: Balance compression and query speed with Primary Keys #
Logs typically arrive in chronological order. Ordering strictly by time creates "entropy" in your storage, interleaving messages from Service A, Service B, and Service C. This randomness hurts compression because algorithms like ZSTD thrive on repetition.
To fix this, balance your Primary Key between compression and query performance.
- Optimize for time: ORDER BY (Timestamp) ensures fast "last 10 minutes" queries but interleaves the data, yielding average compression.
- Optimize for compression: ORDER BY (Service, Timestamp) groups all logs from the same service together. This significantly improves compression (often 20-40%), but forces the database to scan more data for global time-range queries.
- Compromise: To balance strong compression with efficient time-based ordering, reduce timestamp granularity using interval functions in the primary key. For example, ORDER BY (toStartOfMinute(Timestamp), Service) groups more rows per service into each time bucket, improving compression while still supporting fast queries over the last N minutes. For higher-ingest workloads, coarser intervals such as toStartOfTenMinutes(Timestamp) further increase row grouping and compression efficiency (see the sketch after this list).
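A minimal sketch of that compromise key; the table name logs_bucketed is illustrative, and the columns mirror the example table created later in this step.

```sql
-- Compromise: bucket time to the minute, then group by service within each bucket
CREATE TABLE logs_bucketed (
    timestamp DateTime,
    service_name LowCardinality(String),
    body String
) ENGINE = MergeTree()
ORDER BY (toStartOfMinute(timestamp), service_name);
```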
Choose a Primary Key that matches your most common filter (e.g., Service). ClickHouse enforces this physical order when data parts are written during INSERTs. Background merges preserve this ordering when combining parts.
For further optimization, enable the optimize_row_order=1 table setting to heuristically reorder similar rows within the ORDER BY ranges of newly inserted parts, maximizing compression.
-- 1. Create a table with a Primary Key optimized for compression (Service + Time)
CREATE TABLE logs (
    timestamp DateTime,
    service_name LowCardinality(String),
    body String
) ENGINE = MergeTree()
ORDER BY (service_name, timestamp);

-- 2. (Optional) Enable row reordering to boost compression further
ALTER TABLE logs MODIFY SETTING optimize_row_order = 1;
Step 4: Use materialized views for pre-aggregation #
In traditional systems, dashboards run expensive aggregate queries over raw data every time you refresh the page.
ClickHouse lets you pre-compute these answers using materialized views. As data arrives, ClickHouse calculates the aggregate (e.g., error counts per hour) and stores the result in a separate, smaller table.
Your dashboard then queries this pre-computed table, returning results in milliseconds with near-zero CPU load. With ClickStack, it's even better; you don't need to rewrite dashboard widgets to target these tables. HyperDX detects when a pre-aggregated view matches your query and routes the request to the optimized table.
-- 1. Create the destination table for aggregated data
CREATE TABLE error_counts_per_service (
    Timestamp DateTime,
    ServiceName String,
    Count UInt64
) ENGINE = SummingMergeTree ORDER BY (Timestamp, ServiceName);

-- 2. Create the view that populates it automatically
-- HyperDX can now leverage this view for "counts by servicename" charts and general total log volume over time
CREATE MATERIALIZED VIEW error_counts_per_service_mv TO error_counts_per_service AS
SELECT
    toStartOfMinute(Timestamp) AS Timestamp,
    ServiceName,
    count() AS Count
FROM logs_source
GROUP BY Timestamp, ServiceName;
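One detail worth knowing when querying the rollup directly: SummingMergeTree collapses duplicate keys during background merges, which are eventual, so ad-hoc queries should still aggregate at read time. A sketch:

```sql
-- Aggregate at query time; rows may not be collapsed yet by background merges
SELECT
    Timestamp,
    ServiceName,
    sum(Count) AS Count
FROM error_counts_per_service
WHERE Timestamp > now() - INTERVAL 1 DAY
GROUP BY Timestamp, ServiceName
ORDER BY Timestamp;
```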
Step 5: Simplify architecture by removing external queues (Kafka) #
A hidden cost in observability is the "buffering layer." Traditional databases choke on thousands of small, concurrent insert requests, so engineers often deploy heavyweight middleware like Apache Kafka or a fleet of OpenTelemetry Gateway collectors solely to batch data before it reaches the database.
This middleware tax adds up. You pay for infrastructure, management overhead, and latency.
ClickHouse Asynchronous Inserts eliminate this layer. With this setting, ClickHouse accepts small, unbatched requests directly from edge collectors and buffers them in memory (server-side) before writing to disk.
How to configure:
Tune buffer behavior to balance latency vs. throughput using async_insert_busy_timeout_ms (wait time) and async_insert_max_data_size (buffer size).
# ClickHouse Asynchronous Inserts Configuration
exporters:
clickhouse:
endpoint: tcp://clickhouse:9000?async_insert=1&wait_for_async_insert=0
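If you prefer to keep connection strings clean, the same settings can be applied server-side as user-profile defaults. A minimal sketch; the profile name and buffer values are illustrative, not recommendations.

```xml
<clickhouse>
    <profiles>
        <default>
            <async_insert>1</async_insert>
            <wait_for_async_insert>0</wait_for_async_insert>
            <!-- flush the in-memory buffer after ~1s or 10 MiB, whichever comes first -->
            <async_insert_busy_timeout_ms>1000</async_insert_busy_timeout_ms>
            <async_insert_max_data_size>10485760</async_insert_max_data_size>
        </default>
    </profiles>
</clickhouse>
```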
Migration guide: How to implement a dual-write strategy #
Replacing an observability tool rarely works as a "rip and replace" operation. We recommend a phased approach that de-risks the migration while proving the cost savings.
Phase 1: The side-by-side PoC (30 days) #
Don't turn off your current vendor yet.
- Deploy ClickStack: Use ClickHouse Cloud to spin up the ClickStack UI (powered by HyperDX). This gives you a familiar visualization layer immediately.
- Dual-write configuration: Configure your OpenTelemetry Collector to send telemetry to both your existing vendor and your new ClickHouse endpoint (see the sketch after this list).
- Compare: For one month, use ClickStack to debug non-critical incidents. Compare the query speed for high-cardinality questions against your legacy tool.
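A minimal sketch of the dual-write Collector configuration from step 2, assuming the clickhouse exporter from opentelemetry-collector-contrib and the standard otlphttp exporter for the legacy vendor; the vendor endpoint, header name, and environment variable are placeholders.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  clickhouse:
    endpoint: tcp://clickhouse:9000?async_insert=1&wait_for_async_insert=0
  otlphttp/legacy_vendor:
    endpoint: https://otlp.example-vendor.com   # placeholder
    headers:
      api-key: ${env:LEGACY_VENDOR_API_KEY}     # placeholder

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [clickhouse, otlphttp/legacy_vendor]
    traces:
      receivers: [otlp]
      exporters: [clickhouse, otlphttp/legacy_vendor]
```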
Phase 2: The shift #
Once you've established confidence:
- Migrate P0 dashboards: Recreate your critical operational views in ClickStack.
- Redirect new services: Default all new microservices to ClickStack as their primary observability platform.
- Slash retention: Reduce your legacy vendor's retention to 3 days. This immediately drops your legacy bill by ~80% while keeping a safety net for critical incidents.
- Cut the cord: Once the team feels comfortable with ClickStack's SQL-based power (or the UI abstraction), disable the export to the legacy vendor.
Conclusion #
Observability is a data problem. By moving from a search-based architecture to a columnar architecture with ClickHouse, you don't just save money. You adopt a system designed for the scale of modern telemetry.
For teams that want the power of ClickHouse without building the UI from scratch, ClickStack provides the best of both worlds: the raw engine power of ClickHouse, standard OpenTelemetry ingestion, and a purpose-built visualization layer.