On April 26th, 2023, Rokt hosted a ClickHouse community meetup at their office in Manhattan. We had the pleasure of hearing from Brooke McKim, co-founder and CTO of Vantage, who shared their experience of transitioning from Redshift and Postgres to ClickHouse. In their talk they took us through Vantage's journey, discussing the challenges with their previous architecture, the decision to switch to ClickHouse, and the benefits they've seen since making the move.
Vantage is a cloud cost optimization platform enabling teams to manage and optimize their cloud costs across ten cloud infrastructure and service providers, such as AWS, Azure, Google Cloud, Datadog, New Relic, Snowflake, Databricks, Fastly, MongoDB Atlas, and Kubernetes. Their Autopilot managed service optimizes AWS bills by automatically buying and selling financial commitments, resulting in up to 72% savings.
The Vantage user interface (UI) offers a comprehensive view of all relevant data, with features for filtering, aggregation, and reporting. A standout feature is the ability to drill down into specific dimensions. For instance, if you're looking at the cost for Amazon's S3 service, you can click into that category and break it down further. Even with larger accounts, which can have billions of records spanning six to twelve months, you can delve into specific resources, providing an in-depth perspective on cost data at scale.
The Vantage UI provides a detailed view of cloud costs with unique drill-down capabilities for in-depth data analysis.
Challenges with Redshift and Postgres
Vantage initially started with Postgres, which seemed adequate at first. However, McKim explained they quickly encountered performance issues as they began to scale. “We were in this growth phase and it seemed like every week or month we were getting a customer that was an order of magnitude larger than the one we had the previous. A lot of that time was spent keeping up with the performance issues. Eventually we just hit a wall with Postgres and had to figure out a different solution.”
Postgres uses a process called 'vacuum' to remove deleted records from disk space. However, with Vantage's frequent data deletions and insertions, this vacuum process was constantly running, using up resources and slowing performance. The system struggled to keep indexes in memory due to constant changes, leading to slow response times. This led Vantage to seek alternatives that offered similar functionality but were more efficient.
The team at Vantage sought a solution similar to Postgres for a quick transition with minimal operational overhead. They decided to choose Redshift for its familiarity, as it's based on an older version of Postgres. However, Redshift presented its own challenges. Although it could handle larger volumes of data, it struggled with Vantage's constant deletes and updates and required data to always be inserted in sort order for optimal performance. The team attempted to mitigate these issues by running vacuum sort operations frequently and partitioning tables. They also created a hybrid architecture with Redshift as their data warehouse and Postgres as a cache. However, this resulted in a limited user experience and the need to move billions of records daily between Redshift and Postgres. Conversely, ClickHouse’s ReplicatedMergeTree table engine allows row updates to be handled efficiently and transparently.
To be able to present cost data to users, Vantage was generating tens of thousands of reports nightly, leading to a large amount of data being deleted and reinserted into Postgres. This resulted in high cloud costs, with Postgres costs surpassing those of Redshift due to the provision of IOPS and the numerous write operations. They could have scaled this further, but reached a cost limit they were unwilling to exceed. This resulted in numerous "roll up" tables with various data variations. McKim mentioned, "We're inserting data into Redshift. It's unsorted. We have to go back and run this job, which ends up being very expensive, especially as your table gets very large."
Discovering ClickHouse Cloud as an Alternative
As Vantage started exploring alternatives, they met with the ClickHouse team at the AWS re:Invent conference. They were attracted to ClickHouse Cloud for its ease of use, low operational overhead, and comprehensive documentation.
“ClickHouse has been great to work with. I think there's a bit of a learning curve when you first approach ClickHouse, but once you understand some of the intricacies of how it differs from traditional relational databases, and how the different engines work, it's actually pretty easy to move over to,” explained McKim.
The team embarked on a 60-day transition process, which mostly involved ensuring the quality and correctness of the data. McKim mentioned, "The documentation is just far more comprehensive and technically deeper than anything you'll get from AWS on Redshift, which has just been super helpful because you can just understand how the system works."
Transitioning to ClickHouse and Tackling Data Ingestion Challenges
With ClickHouse, Vantage no longer needed to delete and reinsert data for their nightly report generation. Instead, they began versioning imports using the ReplacingMergeTree engine and added an import version column that increments with each data import. This approach allowed them to reduce their operational overhead and cut costs while still maintaining high performance. McKim said, "ClickHouse is just taking the existing data in the table and doing a merge automatically with any new data that is inserted, which is great as it saves us having to delete and update rows."
In contrast to Redshift, ClickHouse automatically sorts and merges data, which saves both time and resources. They also appreciate ClickHouse's comprehensive documentation, no downtime updates, and data storage in S3 which is cost-effective. They contrast this with their experiences with Redshift, which has forced updates incuring downtime, potential service interruptions, and seems less suited for real-time analytics.
Since making the switch to ClickHouse Cloud, Vantage has experienced significant cost savings and performance improvements. They have doubled the resources on their ClickHouse setup at a similar price to their Redshift setup. Additionally, the performance they see from ClickHouse is comparable to a well-indexed, pre-aggregated table in RDS Postgres, which is also cheaper than their previous Postgres setup.
Looking Forward to Future Opportunities with ClickHouse
Vantage is excited about the potential for reducing operational overhead and is enthusiastic about the rate of innovation they see from the ClickHouse team, as they continue to improve and expand their product. They also look forward to exploring new use cases and taking advantage of ClickHouse's capabilities to further enhance their platform and better serve their customers.
Vantage's journey from Redshift and Postgres to ClickHouse demonstrates the value of finding a database solution that better aligns with a company's specific needs. With ClickHouse Cloud, Vantage has found a more cost-effective, high-performance solution that has enabled them to grow and improve their platform.