Options for integrating ClickHouse Cloud with Apache Kafka include:
- Kafka Connect - Kafka Connect is a free, open-source component of Apache Kafka® that works as a centralized data hub for simple data integration between Kafka and other data systems. This is the option we cover here in detail.
- Vector - Vector is a vendor-agnostic data pipeline. Because it can read from Kafka and send events to ClickHouse, it is a robust integration option.
Confluent Platform
Confluent Platform is a full-scale data streaming platform that enables you to easily access, store, and manage data as continuous, real-time streams.
Confluent is available both on-premises and as a cloud service.
Integration with ClickHouse Cloud is supported only with the on-premises offering and is enabled by using the JdbcSinkConnector.
How to configure a local single-node installation of Confluent
Install Confluent - we recommend the Quick Start for Confluent Platform
Install the ClickHouse-JDBC driver - Download the latest version of the ClickHouse-JDBC driver into the Confluent directory.
wget https://repo1.maven.org/maven2/com/clickhouse/clickhouse-jdbc/0.3.2-patch11/clickhouse-jdbc-0.3.2-patch11.jar
Start the Confluent instance.
Prepare configuration in the UI
Create a database and a table in ClickHouse
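As a minimal sketch of this step, the Python snippet below builds a request for the ClickHouse HTTPS interface that creates a table. The column names and types are hypothetical, the helper names are ours, and the table name `test_v1` assumes the connector keeps its default behaviour of writing to a table named after the topic. Adjust all of these to your own data.

```python
import urllib.request

# Hypothetical schema for illustration only; match the columns to your messages.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS default.test_v1
(
    id UInt64,
    message String
)
ENGINE = MergeTree
ORDER BY id
"""


def build_query_request(host, user, password, sql):
    """Build (but do not send) a POST request for the ClickHouse HTTPS interface."""
    return urllib.request.Request(
        url=f"https://{host}:8443/",
        data=sql.strip().encode("utf-8"),
        headers={"X-ClickHouse-User": user, "X-ClickHouse-Key": password},
        method="POST",
    )


def run_query(host, user, password, sql):
    """Send the statement and return the response body as text."""
    request = build_query_request(host, user, password, sql)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `run_query("<host>", "default", "<password>", CREATE_TABLE_SQL)` with your own endpoint and credentials would execute the statement; nothing is sent until you call it.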
Create a Kafka topic that our JdbcSinkConnector can pull messages from. Call it
ClickHouse Cloud
Create a new JdbcSinkConnector with the following configuration, supplying your own endpoint, username and password:
{
  "name": "JdbcSinkConnectorConnector_0",
  "config": {
    "name": "JdbcSinkConnectorConnector_0",
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "topics": "test_v1",
    "connection.url": "jdbc:clickhouse://<host>:8443/default?ssl=true",
    "connection.user": "default",
    "connection.password": "<password>",
    "dialect.name": "GenericDatabaseDialect",
    "auto.create": "false"
  }
}
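If you prefer to generate this configuration rather than edit the placeholders by hand, a small helper along the following lines may be useful. It is only a sketch: the function name `render_connector_config` and its parameters are our own, and the property values simply mirror the configuration above.

```python
import json


def render_connector_config(host, user, password, topic="test_v1",
                            name="JdbcSinkConnectorConnector_0"):
    """Return the connector configuration as a JSON string with placeholders filled in."""
    config = {
        "name": name,
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "topics": topic,
        # ClickHouse Cloud endpoint over TLS, using the `default` database
        "connection.url": f"jdbc:clickhouse://{host}:8443/default?ssl=true",
        "connection.user": user,
        "connection.password": password,
        "dialect.name": "GenericDatabaseDialect",
        "auto.create": "false",
    }
    return json.dumps({"name": name, "config": config}, indent=2)
```

The returned string can be saved to a file and uploaded or pasted into the connector form. Keep in mind that it contains the password in plain text.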
Press Launch and data will start to flow!
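To confirm that the connector is actually running, you can query the Kafka Connect REST API for its status. The sketch below assumes the Connect worker listens on `http://localhost:8083` (the usual default for a local installation, but check yours); the helper names are illustrative.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083"  # assumed address of the Kafka Connect REST API


def fetch_status(name, connect_url=CONNECT_URL):
    """Fetch the status document for a connector from GET /connectors/{name}/status."""
    with urllib.request.urlopen(f"{connect_url}/connectors/{name}/status") as response:
        return json.loads(response.read().decode("utf-8"))


def failed_states(status):
    """Return (component, state) pairs for every part of the connector that is not RUNNING."""
    problems = []
    connector_state = status["connector"]["state"]
    if connector_state != "RUNNING":
        problems.append(("connector", connector_state))
    for task in status.get("tasks", []):
        if task["state"] != "RUNNING":
            problems.append((f"task {task['id']}", task["state"]))
    return problems
```

For example, `failed_states(fetch_status("JdbcSinkConnectorConnector_0"))` returns an empty list when the connector and all of its tasks report `RUNNING`. A failed task usually carries a `trace` field in the status document that explains the error.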