Date: 2016-01-01

Migrating data to ClickStack from Elastic

When migrating from Elastic to ClickStack for observability use cases, we recommend a parallel operation approach rather than attempting to migrate historical data. This strategy offers several advantages:

Minimal risk: By running both systems concurrently, you maintain access to existing data and dashboards while validating ClickStack and familiarizing your users with the new system.

Natural data expiration: Most observability data has a limited retention period (typically 30 days or less), allowing for a natural transition as data expires from Elastic.

Simplified migration: No need for complex data transfer tools or processes to move historical data between systems.

Migrating data: We demonstrate an approach for migrating essential data from Elasticsearch to ClickHouse in the section "Migrating data". This approach should not be used for larger datasets, as it is rarely performant: throughput is limited by how efficiently Elasticsearch can export data, and only JSON format is supported.

Configure Dual Ingestion

Set up your data collection pipeline to send data to both Elastic and ClickStack simultaneously.

How this is achieved depends on your current agents for collection - see "Migrating Agents".

Adjust Retention Periods

Configure Elastic's TTL settings to match your desired retention period. Set up the ClickStack TTL to maintain data for the same duration.
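As a minimal sketch of keeping the two retention settings in step, the helpers below derive both from a single number of days. The first returns a request body in the shape accepted by Elasticsearch's ILM policy API (`PUT _ilm/policy/<name>`); the second renders a ClickHouse `ALTER TABLE ... MODIFY TTL` statement. The function names, the table name `otel_logs`, and the column name `Timestamp` are assumptions for illustration, so substitute the names used in your deployment.

```python
def elastic_ilm_policy(retention_days: int) -> dict:
    """Build an ILM policy body that deletes indices older than retention_days."""
    return {
        "policy": {
            "phases": {
                "delete": {
                    # min_age is measured from rollover (or index creation),
                    # so effective retention can be slightly longer.
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                }
            }
        }
    }


def clickhouse_ttl_statement(table: str, time_column: str, retention_days: int) -> str:
    """Render a statement that expires rows after retention_days."""
    return (
        f"ALTER TABLE {table} "
        f"MODIFY TTL toDateTime({time_column}) + INTERVAL {retention_days} DAY"
    )


if __name__ == "__main__":
    days = 30
    print(elastic_ilm_policy(days))
    # Table and column names below are assumed, not prescribed.
    print(clickhouse_ttl_statement("otel_logs", "Timestamp", days))
```

Driving both systems from one value makes it harder for the retention windows to drift apart during the parallel-operation period.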

Validate and Compare

Run queries against both systems to ensure data consistency

Compare query performance and results

Migrate dashboards and alerts to ClickStack. This is currently a manual process.

Verify that all critical dashboards and alerts work as expected in ClickStack
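One way to check data consistency is to compare event counts per time bucket from each system. The sketch below assumes you have already run an equivalent aggregation on both sides (for example, hourly counts) and loaded the results into dictionaries keyed by bucket label. The function name and the 1% default tolerance are illustrative assumptions, not ClickStack features.

```python
from typing import Dict, List, Tuple


def compare_bucket_counts(
    elastic: Dict[str, int],
    clickhouse: Dict[str, int],
    tolerance: float = 0.01,
) -> List[Tuple[str, int, int]]:
    """Return (bucket, elastic_count, clickhouse_count) for buckets whose
    counts differ by more than `tolerance`, relative to the larger count."""
    mismatches = []
    for bucket in sorted(set(elastic) | set(clickhouse)):
        e = elastic.get(bucket, 0)      # a missing bucket counts as zero
        c = clickhouse.get(bucket, 0)
        baseline = max(e, c)
        if baseline and abs(e - c) / baseline > tolerance:
            mismatches.append((bucket, e, c))
    return mismatches


if __name__ == "__main__":
    elastic_counts = {"2025-01-01T00": 1000, "2025-01-01T01": 1200}
    clickhouse_counts = {"2025-01-01T00": 995, "2025-01-01T01": 900}
    for bucket, e, c in compare_bucket_counts(elastic_counts, clickhouse_counts):
        print(f"{bucket}: elastic={e} clickhouse={c}")
```

A small tolerance allows for in-flight data and differing flush intervals between the two pipelines; buckets that exceed it are worth investigating before you move dashboards and alerts.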

Gradual Transition

As data naturally expires from Elastic, users will increasingly rely on ClickStack

Once confidence in ClickStack is established, you can begin redirecting queries and dashboards

For organizations requiring longer retention periods:

Continue running both systems in parallel until all data has expired from Elastic

ClickStack's tiered storage capabilities can help manage long-term data efficiently

Consider using materialized views to maintain aggregated or filtered historical data while allowing raw data to expire.

The migration timeline will depend on your data retention requirements:

30-day retention: Migration can be completed within a month.

Longer retention: Continue parallel operation until data expires from Elastic.

Historical data: If absolutely necessary, consider using Migrating data to import specific historical data.
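The timeline arithmetic is simple: data written before dual ingestion began exists only in Elastic, and the newest of it expires one retention period after dual ingestion starts. A small sketch, with a hypothetical function name:

```python
from datetime import date, timedelta


def earliest_elastic_decommission(dual_ingest_start: date, retention_days: int) -> date:
    """Earliest date on which every record still inside the retention
    window is also present in ClickStack."""
    return dual_ingest_start + timedelta(days=retention_days)


if __name__ == "__main__":
    print(earliest_elastic_decommission(date(2025, 1, 1), 30))
```

This gives a lower bound only. Validation, dashboard migration, and user onboarding may extend the period of parallel operation.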

When migrating from Elastic to ClickStack, your indexing and storage settings will need to be adapted to fit ClickHouse's architecture. While Elasticsearch relies on horizontal scaling and sharding for performance and fault tolerance and thus has multiple shards by default, ClickHouse is optimized for vertical scaling and typically performs best with fewer shards.

We recommend starting with a single shard and scaling vertically. This configuration is suitable for most observability workloads and simplifies both management and query performance tuning.

ClickHouse Cloud: Uses a single-shard, multi-replica architecture by default. Storage and compute scale independently, making it ideal for observability use cases with unpredictable ingest patterns and read-heavy workloads.

ClickHouse OSS: In self-managed deployments, we recommend:

Starting with a single shard

Scaling vertically with additional CPU and RAM

Using tiered storage to extend local disk with S3-compatible object storage

Using ReplicatedMergeTree if high availability is required

For fault tolerance, 1 replica of your shard is typically sufficient in Observability workloads.

Sharding may be necessary if:

Your ingest rate exceeds the capacity of a single node (typically >500K rows/sec)

You need tenant isolation or regional data separation

Your total dataset is too large for a single server, even with object storage

If you do need to shard, refer to Horizontal scaling for guidance on shard keys and distributed table setup.
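The three criteria above can be written down as a simple checklist. This is only a sketch of the decision rule as stated in this section: the 500K rows/sec figure comes from the text, while the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Approximate single-node ingest ceiling quoted in this guide.
SINGLE_NODE_ROWS_PER_SEC = 500_000


@dataclass
class Workload:
    peak_rows_per_sec: int
    needs_tenant_or_region_isolation: bool
    fits_single_server_with_object_storage: bool


def sharding_reasons(w: Workload) -> List[str]:
    """Return the reasons, if any, that this workload may need sharding."""
    reasons = []
    if w.peak_rows_per_sec > SINGLE_NODE_ROWS_PER_SEC:
        reasons.append("ingest rate exceeds single-node capacity")
    if w.needs_tenant_or_region_isolation:
        reasons.append("tenant isolation or regional separation required")
    if not w.fits_single_server_with_object_storage:
        reasons.append("dataset too large for a single server")
    return reasons


if __name__ == "__main__":
    typical = Workload(120_000, False, True)
    print(sharding_reasons(typical) or "single shard is likely sufficient")
```

An empty result suggests staying with the recommended single shard and scaling vertically.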

ClickHouse uses TTL clauses on MergeTree tables to manage data expiration. TTL policies can:

Automatically delete expired data

Move older data to cold object storage

Retain only recent, frequently queried logs on fast disk

We recommend aligning your ClickHouse TTL configuration with your existing Elastic retention policies to maintain a consistent data lifecycle during the migration. For examples, see ClickStack production TTL setup.
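To illustrate how one TTL clause can both move and delete data, the sketch below renders a statement that relocates rows to a cold volume after a short period and deletes them at the end of the retention window. It assumes a storage policy that already defines a volume named `cold`; the helper name, the table name `otel_logs`, and the column name `Timestamp` are likewise assumptions.

```python
def tiered_ttl_statement(
    table: str,
    time_column: str,
    hot_days: int,
    retention_days: int,
    cold_volume: str = "cold",
) -> str:
    """Render a TTL that moves rows to `cold_volume` after hot_days
    and deletes them after retention_days."""
    if not 0 < hot_days < retention_days:
        raise ValueError("hot_days must be positive and less than retention_days")
    ts = f"toDateTime({time_column})"
    return (
        f"ALTER TABLE {table} MODIFY TTL "
        f"{ts} + INTERVAL {hot_days} DAY TO VOLUME '{cold_volume}', "
        f"{ts} + INTERVAL {retention_days} DAY DELETE"
    )


if __name__ == "__main__":
    # Keep 7 days on fast disk, 30 days in total (example values).
    print(tiered_ttl_statement("otel_logs", "Timestamp", 7, 30))
```

With these example values, recent logs stay on fast disk for a week, then move to object storage until they expire at 30 days.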

While we recommend parallel operation for most observability data, there are specific cases where direct data migration from Elasticsearch to ClickHouse may be necessary:

Small lookup tables used for data enrichment (e.g., user mappings, service catalogs)

Business data stored in Elasticsearch that needs to be correlated with observability data. ClickHouse's SQL capabilities and Business Intelligence integrations make this data easier to maintain and query than Elasticsearch's more limited query options allow.

Configuration data that needs to be preserved across the migration

This approach is only viable for datasets under 10 million rows, as Elasticsearch's export capabilities are limited to JSON over HTTP and don't scale well for larger datasets.
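To show what "JSON over HTTP" means in practice, the sketch below takes one page of an Elasticsearch `_search` response (already parsed into a dictionary) and converts each document's `_source` into a newline-delimited JSON row, the shape ClickHouse reads as `JSONEachRow`. The function name and the sample documents are hypothetical. Fetching the pages and inserting the rows are left out.

```python
import json
from typing import Iterator


def hits_to_json_each_row(search_response: dict) -> Iterator[str]:
    """Yield one JSON line per document in a parsed _search response page."""
    for hit in search_response.get("hits", {}).get("hits", []):
        # _source holds the original document body.
        yield json.dumps(hit.get("_source", {}))


if __name__ == "__main__":
    # Made-up sample page of a small lookup index.
    page = {
        "hits": {
            "hits": [
                {"_id": "1", "_source": {"service": "checkout", "team": "payments"}},
                {"_id": "2", "_source": {"service": "search", "team": "discovery"}},
            ]
        }
    }
    print("\n".join(hits_to_json_each_row(page)))
```

Every row is serialized, transferred, and parsed as text, which is why this path suits small lookup and configuration datasets rather than bulk observability data.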

The following steps allow the migration of a single Elasticsearch index to ClickHouse.