| Feature | ClickHouse | Splunk |
|---|---|---|
| Unified architecture with single engine for all workloads | Single columnar engine (ClickHouse) for logs, metrics, traces, and replays | Multiple backends (Enterprise, Cloud, Observability Cloud) with separate data stores |
| Single binary deployment | One binary, homogeneous cluster | Multiple component types (forwarders, indexers, search heads) |
| Isolation of inserts and queries | Complete separation via compute-compute separation | Shared resources on indexers; ingest and search contend for CPU and I/O |
| Columnar storage engine | Fully columnar, vectorized execution | Row/event-based index buckets |
| Schema on write | Supported, with an efficient columnar layout for semi-structured data | Not supported; schema defined at query time only |
| Open source | MIT / Apache 2.0 licensed | Proprietary, closed source |
| SQL support | Standard SQL for analytics and joins (see the example after this table) | |
| Separation of storage and compute | Fully decoupled; object storage for retention, elastic compute for queries | Intermediate: SmartStore uses object storage for long-term retention, but hot buckets still require local disks |
| Inverted index support (true full-text or JSON path) | Optional secondary inverted indexes for text search | Intermediate: proprietary event index; not a true full-text inverted index |
| Vertical scalability (multi-core parallelism within node) | Native vectorized parallelism; scales vertically | Intermediate: limited; vertical scaling is possible but constrained by the indexer thread model |
| Natural language search | Supported via the HyperDX interface | Intermediate: basic keyword search; SPL required for complex queries |
| Schema on read | | |
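For the SQL support row above, here is a minimal sketch of the kind of standard SQL that runs directly over observability data. The `otel_logs` table and its columns (`Timestamp`, `ServiceName`, `SeverityText`) are illustrative placeholders for a typical OpenTelemetry-style log schema, not a specific deployment.

```sql
-- Illustrative aggregation over a hypothetical OTel-style logs table.
SELECT
    ServiceName,
    SeverityText,
    count() AS events
FROM otel_logs
WHERE Timestamp >= now() - INTERVAL 1 HOUR
GROUP BY ServiceName, SeverityText
ORDER BY events DESC
LIMIT 20;
```

Because storage is columnar, only the referenced columns are read, which is what makes ad-hoc aggregations like this practical over large time ranges.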
| Feature | ClickHouse | Splunk |
|---|---|---|
| Horizontal scaling | Scales elastically across nodes with distributed queries | Scales via additional indexers; the recommended approach |
| Deployment model | Self-hosted or ClickHouse Cloud | On-prem or cloud offerings |
| Read and write isolation | Decoupled storage and compute allow independent scaling of ingest and query nodes | Ingest and search share indexer resources; partial separation only via searchable vs. non-searchable replicas |
| Dynamic scaling to handle bursts | Compute scaled dynamically in ClickHouse Cloud | No native support; scaling is manual |
| Feature | ClickHouse | Splunk |
|---|---|---|
| Pre-aggregations | Materialized views execute incrementally on inserts; no loss of fidelity, and all functions are supported (see the sketch after this table) | Intermediate: supports Data Model Acceleration and Report Acceleration; updates are not incremental per event, so there is a delay before new data appears in summaries |
| Query execution model | Fully parallelized, vectorized execution across CPU cores and cluster nodes | Intermediate: MapReduce-style search pipeline; the search head distributes subsearches to indexers ("map"), which process events and return partial results for final aggregation on the search head ("reduce"); parallelism exists per search pipeline but is not vectorized |
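A minimal sketch of the incremental pre-aggregation described above: a materialized view that rolls error counts up per minute as rows are inserted. The table and column names (`otel_logs`, `error_counts_per_minute`, `SeverityText`) are hypothetical.

```sql
-- Rollup target; rows with the same key are summed at merge time.
CREATE TABLE error_counts_per_minute
(
    minute      DateTime,
    ServiceName LowCardinality(String),
    errors      UInt64
)
ENGINE = SummingMergeTree
ORDER BY (ServiceName, minute);

-- The materialized view runs on every insert into otel_logs,
-- so the rollup is updated incrementally rather than by a scheduled job.
CREATE MATERIALIZED VIEW error_counts_mv TO error_counts_per_minute AS
SELECT
    toStartOfMinute(Timestamp) AS minute,
    ServiceName,
    count() AS errors
FROM otel_logs
WHERE SeverityText = 'ERROR'
GROUP BY minute, ServiceName;
```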
| Feature | ClickHouse | Splunk |
|---|---|---|
| Insert throughput per node | ~1 TB/day per core, uncompressed | Intermediate: ~300 GB/day per 12 vCPU indexer |
| Elastic scaling | Scale query compute up and down dynamically, independent of storage, in ClickHouse Cloud | Intermediate: scale by adding or removing indexers; no native separation of storage from compute for hot data |
| Vertical scaling | Efficient use of large multi-core nodes due to vectorization | Intermediate: supported but requires tuning and can be limited by I/O and pipeline configuration |
| Storage and compute separation | Fully decoupled in ClickHouse Cloud; object storage for long retention with intelligent caching | Intermediate: SmartStore uses object storage, but hot buckets still rely on local disks |
| Data skipping and secondary indexes | Sparse primary index plus data skipping indexes; optional inverted (text) indexes (see the sketch after this table) | Intermediate: proprietary event index with metadata lookups |
| Horizontal scaling | Native distributed queries over many shards | Add indexers to distribute ingest and search |
| Latency guidance | < 1 s for aggregations due to columnar execution and data skipping | Multi-minute queries are common |
| Data skipping and pruning | Built-in skipping indexes plus optional inverted indexes reduce scan volumes dramatically | Must scan relevant time buckets in full; no native range pruning or skipping indexes |
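To make the data skipping rows concrete, here is a sketch of secondary skipping indexes added to a hypothetical logs table. The table name and columns (`StatusCode`, `TraceId`, `Body`) are illustrative; the index types shown (minmax, Bloom filter, token Bloom filter) are standard ClickHouse skip index types.

```sql
-- Skip granules whose StatusCode range cannot match the query predicate.
ALTER TABLE otel_logs ADD INDEX idx_status StatusCode TYPE minmax GRANULARITY 4;

-- Probabilistic index for point lookups on high-cardinality values such as trace IDs.
ALTER TABLE otel_logs ADD INDEX idx_trace TraceId TYPE bloom_filter(0.01) GRANULARITY 4;

-- Token-based Bloom filter over the raw message body for token searches.
ALTER TABLE otel_logs ADD INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 4;

-- Build the new index for data that already exists in the table.
ALTER TABLE otel_logs MATERIALIZE INDEX idx_status;
```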
| Feature | ClickHouse | Splunk |
|---|---|---|
| Insert throughput | Extremely high insert rates: typically tens of MB/sec per core (≈1 TB/day per core, uncompressed) | Intermediate: ~300 GB/day per 12 vCPU indexer under search load; throughput limited by contention between indexing and search |
| High-concurrency search at scale | Designed for many concurrent analytical queries over large time ranges | Intermediate: concurrency is sensitive to indexer load |
| Parallel execution model | Fully parallelized across cores and nodes | Intermediate: MapReduce-style search distributes work across indexers but lacks vectorization and fine-grained parallelism |
| Pre-aggregation behavior | Materialized views and projections allow real-time aggregation without pre-computation delays | Intermediate: Data Model and Report Acceleration rely on scheduled jobs; results are delayed until summarization completes |
| Compression efficiency for OpenTelemetry data | | Intermediate |
| Join and aggregation performance | Optimized for real-time analytical joins and aggregations, with full JOINs supported (see the example after this table) | Intermediate: joins and sub-searches in SPL are slow |
| Tiered performance | Uniformly high performance across hot, warm, and cold tiers due to intelligent caching | Intermediate: hot buckets perform best; warm/cold buckets stored in SmartStore add network latency |
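As a sketch of the join behavior referenced above, the query below joins a hypothetical traces table against a small dimension table and aggregates the result; all table and column names are illustrative.

```sql
-- Join span data to a service-ownership dimension table and aggregate.
SELECT
    t.ServiceName,
    s.team_owner,
    count() AS spans,
    avg(t.Duration) AS avg_duration_ns
FROM otel_traces AS t
INNER JOIN service_catalog AS s ON s.service_name = t.ServiceName
WHERE t.Timestamp >= now() - INTERVAL 1 DAY
GROUP BY t.ServiceName, s.team_owner
ORDER BY spans DESC;
```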
| Feature | ClickHouse | Splunk |
|---|---|---|
| Standard query language | Full SQL with hundreds of analytical and statistical functions | Proprietary SPL; limited interoperability with SQL-based tools |
| External data querying ("query in place") | Query external data directly using table engines and functions (e.g., s3, url, hdfs, mysql, postgresql); see the sketch after this table | Must ingest and index before search; no native query-in-place capability |
| Support for open table formats | Reads Parquet, Iceberg, and other open formats natively from S3, HDFS, and local storage | No native support for Parquet, Iceberg, or ORC as queryable sources |
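A sketch of query in place: reading Parquet files and an Iceberg table directly from object storage using the s3 and iceberg table functions available in recent ClickHouse versions. The bucket paths and column names are placeholders.

```sql
-- Query Parquet files on S3 without ingesting them first.
SELECT ServiceName, count() AS events
FROM s3('https://example-bucket.s3.amazonaws.com/logs/*.parquet', 'Parquet')
GROUP BY ServiceName
ORDER BY events DESC;

-- Iceberg tables can be read in place in the same way.
SELECT count()
FROM iceberg('https://example-bucket.s3.amazonaws.com/warehouse/logs_iceberg');
```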
| Feature | ClickHouse | Splunk |
|---|---|---|
| Third-party catalog integration | Integrates with catalogs such as Unity, Nessie, and AWS Glue for open table formats | |
| External database connectivity | Native table engines for PostgreSQL, MySQL, MongoDB, and ODBC/JDBC sources (see the sketch after this table) | Intermediate: limited to the Splunk DB Connect app (separate plugin); slower, ETL-style integration |
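For the external database connectivity row, a minimal sketch using the postgresql table function; the host, database, table, and credentials are placeholders.

```sql
-- Read a PostgreSQL table in place, without an ETL step.
SELECT customer_id, plan
FROM postgresql('pg-host:5432', 'crm_db', 'customers', 'pg_user', 'pg_password')
LIMIT 10;
```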
| Feature | ClickHouse | Splunk |
|---|---|---|
| Batch and streaming ingestion | Supports both: streaming via Kafka/HTTP/OTel, batch via S3, Parquet, and bulk inserts | Intermediate: primarily streaming via forwarders, HEC, or Kafka Connect; lacks native batch ingestion |
| Interoperability with analytics & BI tools | MySQL wire protocol plus JDBC/ODBC drivers allow direct connection from BI and AI platforms | Intermediate: limited integration via Splunk SDKs or the REST API; not natively accessible via SQL clients |
| OpenTelemetry support | OpenTelemetry-native; accepts OTel traces, logs, and metrics directly | Strong support via the Splunk distribution of the OTel Collector and Splunk Observability Cloud integrations |
| Kafka ingestion | Native Kafka table engine, and ClickPipes in ClickHouse Cloud, for high-throughput streaming ingest (see the sketch after this table) | Supported via Splunk Connect for Kafka (Kafka Connect sink) feeding HEC; external connector required |
| Support for open file formats (CSV, JSON, Parquet) | Reads and writes natively without conversion; both schema-on-write and schema-on-read supported | |
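Finally, a sketch of Kafka ingestion with the native Kafka table engine: a Kafka-backed queue table plus a materialized view that writes rows into a MergeTree table as they arrive. Broker address, topic, and column names are placeholders.

```sql
-- Kafka-backed queue table; consumes messages from the topic.
CREATE TABLE kafka_logs_queue
(
    timestamp DateTime64(3),
    service   String,
    message   String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'otel-logs',
         kafka_group_name  = 'clickhouse-logs',
         kafka_format      = 'JSONEachRow';

-- Durable destination table.
CREATE TABLE logs
(
    timestamp DateTime64(3),
    service   LowCardinality(String),
    message   String
)
ENGINE = MergeTree
ORDER BY (service, timestamp);

-- Moves rows from the Kafka queue into the MergeTree table as they are consumed.
CREATE MATERIALIZED VIEW kafka_logs_mv TO logs AS
SELECT timestamp, service, message
FROM kafka_logs_queue;
```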