Using Kafka Connect
For users who prefer to manage the Kafka-ClickHouse integration outside ClickHouse, Kafka Connect provides an alternative. At the time of writing, there are several approaches to using Kafka Connect, depending on the direction of data transfer, each with its own limitations. The following is not a comprehensive tutorial on Kafka Connect; refer to the Confluent documentation for advanced configurations. This section assumes the following prerequisites:
- Download and install the Confluent Platform from [https://www.confluent.io/installation](https://www.confluent.io/installation). The main Confluent package includes Kafka Connect; the version tested here is v7.0.1.
- Java is required for the Confluent Platform. Refer to the Confluent documentation for the currently supported Java versions.
- Ensure you have a ClickHouse instance available.
- A Kafka instance. Confluent Cloud is the easiest option; otherwise, set up a self-managed instance using the Confluent package above. Setting up Kafka is beyond the scope of these docs.
- A test dataset. A small GitHub JSON-based dataset with an insertion script is provided for convenience here. The script automatically applies a Kafka schema to the data to ensure it is compatible with the JDBC connector.
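
To illustrate what "applying a Kafka schema" to a JSON record can mean, the following is a minimal sketch of one common approach: wrapping each record in the `schema`/`payload` envelope that Kafka Connect's `JsonConverter` reads when `schemas.enable=true`. This is an assumption for illustration only; the provided insertion script may attach the schema differently (for example, through a schema registry). The helper names, the schema name `github_event`, and the sample record are hypothetical.

```python
import json


def connect_type(value):
    """Map a Python value to a Kafka Connect primitive schema type."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def with_connect_schema(record, name="github_event"):
    """Wrap a flat dict in a schema/payload envelope (hypothetical helper)."""
    fields = [
        {"field": key, "type": connect_type(value), "optional": False}
        for key, value in record.items()
    ]
    return {
        "schema": {
            "type": "struct",
            "name": name,  # assumed schema name, not taken from the dataset
            "optional": False,
            "fields": fields,
        },
        "payload": record,
    }


# Hypothetical record; the real dataset's fields may differ.
event = {"id": 1, "actor_login": "octocat", "event_type": "PushEvent"}
message = json.dumps(with_connect_schema(event))
```

The resulting `message` string is what a producer would send as the Kafka message value. Connectors that need typed columns, such as the JDBC connector, depend on this kind of schema information because plain JSON carries no field types.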