Can I Use ClickHouse As a Time-Series Database?
Note: Please see the blog Working with Time series data in ClickHouse for additional examples of using ClickHouse for time series analysis.
ClickHouse is a generic data storage solution for OLAP workloads, whereas many database management systems are built specifically for time series. Nevertheless, ClickHouse’s focus on query execution speed allows it to outperform specialized systems in many cases. Plenty of independent benchmarks already cover this topic, so we’re not going to conduct one here. Instead, let’s focus on the ClickHouse features that matter most for time-series workloads.
First of all, there are specialized codecs that make typical time-series data more compact. These are either common algorithms like Gorilla or codecs specific to ClickHouse.
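To see why such codecs help, here is a minimal sketch of delta-of-delta encoding, one of the ideas time-series codecs commonly rely on for timestamps. It is a conceptual illustration only, not ClickHouse's implementation: real codecs also bit-pack the small numbers this produces. The function names and sample timestamps are made up for the example.

```python
def delta_of_delta(values):
    """Return the first value, the first delta, then the second-order deltas."""
    if len(values) < 2:
        return list(values)
    deltas = [b - a for a, b in zip(values, values[1:])]
    second_order = [b - a for a, b in zip(deltas, deltas[1:])]
    return [values[0], deltas[0]] + second_order


def restore(encoded):
    """Invert delta_of_delta to recover the original sequence."""
    if len(encoded) < 2:
        return list(encoded)
    values = [encoded[0], encoded[0] + encoded[1]]
    delta = encoded[1]
    for dd in encoded[2:]:
        delta += dd
        values.append(values[-1] + delta)
    return values


# Hypothetical timestamps sampled roughly every 10 seconds, one of them late.
timestamps = [1700000000, 1700000010, 1700000020, 1700000031, 1700000040]
encoded = delta_of_delta(timestamps)
print(encoded)  # [1700000000, 10, 0, 1, -2]
```

After the first two entries, regularly sampled timestamps turn into values near zero, which need far fewer bits than the raw 32-bit or 64-bit integers.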
Second, time-series queries often hit only recent data, such as the last day or week. It makes sense to use servers that have both fast NVMe/SSD drives and high-capacity HDD drives. The ClickHouse TTL feature lets you keep fresh, hot data on fast drives and gradually move it to slower drives as it ages. Rollup or removal of even older data is also possible if your requirements demand it.
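As a rough sketch of what such a setup can look like, the helper below assembles a `CREATE TABLE` statement with one TTL rule that moves data to a slower volume and one that deletes it. The table name, column names, the volume name `cold` and the storage policy name `hot_cold` are all assumptions made for this example; the policy and its volumes would have to be defined in your server's storage configuration first.

```python
def tiered_ttl_ddl(table: str, move_after_days: int, delete_after_days: int) -> str:
    """Build a CREATE TABLE statement with a move rule and a delete rule."""
    if not 0 < move_after_days < delete_after_days:
        raise ValueError("expected 0 < move_after_days < delete_after_days")
    return "\n".join([
        f"CREATE TABLE {table}",
        "(",
        "    ts DateTime,",
        "    sensor_id UInt32,",
        "    value Float64",
        ")",
        "ENGINE = MergeTree",
        "ORDER BY (sensor_id, ts)",
        # A part moves to the slower volume once all of its rows are past this age.
        f"TTL ts + INTERVAL {move_after_days} DAY TO VOLUME 'cold',",
        # Rows past the retention window are removed.
        f"    ts + INTERVAL {delete_after_days} DAY DELETE",
        # Assumed policy name; it must exist in the server's storage configuration.
        "SETTINGS storage_policy = 'hot_cold'",
    ])


ddl = tiered_ttl_ddl("sensor_readings", move_after_days=7, delete_after_days=365)
print(ddl)
```

The thresholds of 7 and 365 days are placeholders; pick values that match how far back your queries usually reach.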
Even though it goes against the ClickHouse philosophy of storing and processing raw data, you can use materialized views to meet even tighter latency or cost requirements.
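The idea behind such a rollup can be sketched in a few lines. This is a conceptual model only, not ClickHouse syntax or its API: each inserted batch is folded into per-hour partial states (here a sum and a count), so a later query reads a handful of pre-aggregated rows instead of every raw point. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def rollup_hourly(rows, state=None):
    """Fold (unix_ts, value) rows into per-hour [sum, count] partial states."""
    if state is None:
        state = defaultdict(lambda: [0.0, 0])
    for ts, value in rows:
        hour = ts - ts % 3600  # truncate the timestamp to the start of its hour
        state[hour][0] += value
        state[hour][1] += 1
    return state


# Two hypothetical insert batches arriving one after the other.
batch1 = [(7200, 1.0), (7260, 3.0)]
batch2 = [(7320, 5.0), (10800, 4.0)]

state = rollup_hourly(batch1)
state = rollup_hourly(batch2, state)

# Averages are finalized only at query time from the merged partial states.
averages = {hour: total / count for hour, (total, count) in state.items()}
print(averages)  # {7200: 3.0, 10800: 4.0}
```

Keeping a sum and a count rather than a finished average is what makes batches mergeable: partial states from separate inserts combine correctly, whereas averaging averages would not.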