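Introduction
The deployment examples in this section are based on the advice that the ClickHouse Support and Services team gives to ClickHouse users. They are working examples: we recommend that you try them as they are and then adjust them to suit your needs. You may find an example here that fits your requirements exactly.
We also provide "recipes" for a number of different topologies in the example repo. If the examples in this section do not fully meet your needs, we recommend taking a look at those.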
Terminology
Replica
A copy of data. ClickHouse always has at least one copy of your data, so the minimum number of replicas is one. This is an important detail: you may not be used to counting the original copy of your data as a replica, but that is the term used in ClickHouse code and documentation. Adding a second replica of your data provides fault tolerance.
Shard
A subset of data. ClickHouse always has at least one shard for your data, so if you do not split the data across multiple servers, your data is stored in one shard. Sharding data across multiple servers can be used to divide the load if you exceed the capacity of a single server. The destination server is determined by the sharding key, which is defined when you create the distributed table. The sharding key can be random or the output of a hash function. The deployment examples involving sharding use rand() as the sharding key, and provide further information on when and how to choose a different sharding key.
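To illustrate how a sharding key maps a row to a destination shard, here is a minimal Python sketch. It is not ClickHouse code. It assumes the commonly documented rule that the shard is chosen from the remainder of the sharding key value divided by the total shard weight. The names pick_shard, random_key, and hashed_key are hypothetical, and zlib.crc32 is only a stand-in for whatever hash function you would use in a real sharding expression.

```python
import random
import zlib


def pick_shard(sharding_value: int, weights: list[int]) -> int:
    """Return the index of the shard that should receive a row.

    Assumption: the shard is selected by taking the sharding key value
    modulo the total weight and finding the weight interval that the
    remainder falls into.
    """
    total = sum(weights)
    remainder = sharding_value % total
    upper = 0
    for index, weight in enumerate(weights):
        upper += weight
        if remainder < upper:
            return index
    raise AssertionError("unreachable: remainder is always below the total weight")


def random_key() -> int:
    """Stand-in for a random sharding key: spreads rows evenly across shards."""
    return random.getrandbits(32)


def hashed_key(user_id: str) -> int:
    """Stand-in for a hash-based sharding key: the same id always maps to the same shard."""
    return zlib.crc32(user_id.encode("utf-8"))


if __name__ == "__main__":
    weights = [1, 1, 1]  # three shards with equal weight (illustrative)
    print("random key ->", pick_shard(random_key(), weights))
    print("hashed key ->", pick_shard(hashed_key("user-42"), weights))
```

With a random key, two rows for the same user may land on different shards, which balances load but scatters related rows. With a hash of a column, all rows sharing that column value land on the same shard.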
Distributed coordination
ClickHouse Keeper provides the coordination system for data replication and for the execution of distributed DDL queries. ClickHouse Keeper is compatible with Apache ZooKeeper.