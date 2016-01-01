Integrating Kafka with ClickHouse Cloud

Prerequisite: you have familiarized yourself with the ClickPipes intro.

Access the SQL Console for your ClickHouse Cloud service. Select the Data Sources button on the left-side menu and click "Set up a ClickPipe".

Select your data source.

Fill out the form by providing your ClickPipe with a name, a description (optional), your credentials, and other connection details.

Note: ClickPipes does not currently support loading custom CA certificates.

Configure the schema registry. A valid schema is required for Avro streams and optional for JSON. This schema will be used to parse AvroConfluent messages or validate JSON messages on the selected topic. Avro messages that cannot be parsed or JSON messages that fail validation will generate an error. Note that ClickPipes will automatically retrieve an updated or different schema from the registry if indicated by the schema ID embedded in the message. There are two ways to format the URL path to retrieve the correct schema:

- the path `/schemas/ids/[ID]` to the schema document by the numeric schema ID. A complete URL using a schema ID would be `https://registry.example.com/schemas/ids/1000`

- the path `/subjects/[subject_name]` to the schema document by subject name. Optionally, a specific version can be referenced by appending `/versions/[version]` to the URL (otherwise ClickPipes will retrieve the latest version). A complete URL using a schema subject would be `https://registry.example.com/subjects/events` or `https://registry.example.com/subjects/events/versions/4`
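The two URL formats can be sketched as a small helper. This is only an illustration: the function name `schema_url` and its parameters are hypothetical, and the registry host is the placeholder used in the examples above.

```python
from typing import Optional, Union
from urllib.parse import quote


def schema_url(registry_base: str, schema: Union[int, str],
               version: Optional[int] = None) -> str:
    """Build a schema registry URL by numeric ID or by subject name.

    `schema_url` is a hypothetical helper, not part of ClickPipes.
    """
    base = registry_base.rstrip("/")
    if isinstance(schema, int):
        # Numeric schema ID: /schemas/ids/[ID]
        return f"{base}/schemas/ids/{schema}"
    # Subject name: /subjects/[subject_name]
    url = f"{base}/subjects/{quote(schema, safe='')}"
    if version is not None:
        # Optional pinned version: /versions/[version]
        url += f"/versions/{version}"
    return url


print(schema_url("https://registry.example.com", 1000))
print(schema_url("https://registry.example.com", "events"))
print(schema_url("https://registry.example.com", "events", version=4))
```

Leaving `version` unset corresponds to the case where the latest version of the subject is retrieved.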

Select your topic and the UI will display a sample document from the topic.

In the next step, you can select whether you want to ingest data into a new ClickHouse table or reuse an existing one. Follow the instructions on the screen to modify your table name, schema, and settings. You can see a real-time preview of your changes in the sample table at the top. You can also customize the advanced settings using the controls provided.

Alternatively, you can decide to ingest your data into an existing ClickHouse table. In that case, the UI will allow you to map fields from the source to the ClickHouse fields in the selected destination table.

Finally, you can configure permissions for the internal ClickPipes user.

Permissions: ClickPipes will create a dedicated user for writing data into a destination table. You can select a role for this internal user, using either a custom role or one of the predefined roles:

- `Full access`: full access to the cluster. This can be useful if you use a Materialized View or Dictionary with the destination table.

- `Only destination table`: `INSERT` permissions on the destination table only.



By clicking on "Complete Setup", the system will register your ClickPipe, and you'll be able to see it listed in the summary table. The summary table provides controls to display sample data from the source or the destination table in ClickHouse, as well as controls to remove the ClickPipe and display a summary of the ingest job. Congratulations! You have successfully set up your first ClickPipe. If this is a streaming ClickPipe, it will run continuously, ingesting data in real time from your remote data source.

Supported data sources:

| Name | Type | Status | Description |
|------|------|--------|-------------|
| Apache Kafka | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Apache Kafka into ClickHouse Cloud. |
| Confluent Cloud | Streaming | Stable | Unlock the combined power of Confluent and ClickHouse Cloud through our direct integration. |
| Redpanda | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Redpanda into ClickHouse Cloud. |
| AWS MSK | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from AWS MSK into ClickHouse Cloud. |
| Azure Event Hubs | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Azure Event Hubs into ClickHouse Cloud. |
| Upstash | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from Upstash into ClickHouse Cloud. |
| WarpStream | Streaming | Stable | Configure ClickPipes and start ingesting streaming data from WarpStream into ClickHouse Cloud. |

More connectors will be added to ClickPipes. You can find out more by contacting us.

The supported formats are JSON and AvroConfluent.

The following ClickHouse types are currently supported for JSON payloads:

Base numeric types: Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64, Float32, and Float64

Boolean

String

FixedString

Date, Date32

DateTime, DateTime64

Enum8/Enum16

LowCardinality(String)

Map with keys and values using any of the above types (including Nullables)

Tuple and Array with elements using any of the above types (including Nullables, one level depth only)

JSON/Object('json') (experimental)

Note: Nullable versions of the above are also supported, with these exceptions:

- Nullable Enums are not supported

- LowCardinality(Nullable(String)) is not supported

ClickPipes supports all Avro Primitive and Complex types, and all Avro Logical types except time-millis, time-micros, local-timestamp-millis, local-timestamp-micros, and duration. Avro record types are converted to Tuple, array types to Array, and map types to Map (string keys only). In general the conversions listed here are available. We recommend using exact type matching for Avro numeric types, as ClickPipes does not check for overflow or precision loss on type conversion.

ClickPipes dynamically retrieves and applies the Avro schema from the configured Schema Registry using the schema ID embedded in each message/event. Schema updates are detected and processed automatically.
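For reference, messages serialized for a Confluent-compatible schema registry carry the schema ID in a 5-byte header: a zero "magic" byte followed by the ID as a 4-byte big-endian integer, then the Avro payload. The sketch below shows how that ID can be read from a raw message value. It illustrates the wire format only and is not ClickPipes code; `split_confluent_message` is a hypothetical name.

```python
import struct
from typing import Tuple


def split_confluent_message(raw: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_payload) from a Confluent-framed message value."""
    if len(raw) < 5:
        raise ValueError("message too short for the 5-byte Confluent header")
    # '>' = big-endian without padding, 'b' = 1-byte magic, 'I' = 4-byte schema ID
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != 0:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, raw[5:]


# Build a framed message by hand for demonstration purposes.
framed = b"\x00" + (1000).to_bytes(4, "big") + b"<avro bytes>"
schema_id, payload = split_confluent_message(framed)
print(schema_id, payload)
```

Because every message carries its own schema ID, a consumer can notice a schema change as soon as a message with a new ID arrives.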

At this time ClickPipes is only compatible with schema registries that use the Confluent Schema Registry API. In addition to Confluent Kafka and Cloud, this includes the Redpanda, AWS MSK, and Upstash schema registries. ClickPipes is not currently compatible with the AWS Glue Schema Registry or the Azure Schema Registry (coming soon).

The following rules are applied to the mapping between the retrieved Avro schema and the ClickHouse destination table:

If the Avro schema contains a field that is not included in the ClickHouse destination mapping, that field is ignored.

If the Avro schema is missing a field defined in the ClickHouse destination mapping, the ClickHouse column will be populated with a "zero" value, such as 0 or an empty string. Note that DEFAULT expressions are not currently evaluated for ClickPipes inserts (this is a temporary limitation pending updates to the ClickHouse server default processing).

If the Avro schema field and the ClickHouse column are incompatible, inserts of that row/message will fail, and the failure will be recorded in the ClickPipes errors table. Note that several implicit conversions are supported (such as between numeric types), but not all (for example, an Avro record field cannot be inserted into an Int32 ClickHouse column).
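The three mapping rules can be sketched as a toy function. Everything here is an assumption made for illustration: the names `map_record`, `ZERO_VALUES`, and `ACCEPTED`, the handful of column types covered, and the use of `TypeError` to stand in for a row being written to the errors table. It does not reflect how ClickPipes is implemented.

```python
from typing import Any, Dict, Tuple

# Hypothetical "zero" values for a few ClickHouse column types.
ZERO_VALUES: Dict[str, Any] = {"Int32": 0, "Int64": 0, "Float64": 0.0, "String": ""}

# Hypothetical Python types treated as compatible with each column type.
ACCEPTED: Dict[str, Tuple[type, ...]] = {
    "Int32": (int,),
    "Int64": (int,),
    "Float64": (int, float),  # numeric-to-numeric conversion is allowed
    "String": (str,),
}


def map_record(record: Dict[str, Any], columns: Dict[str, str]) -> Dict[str, Any]:
    """Map a decoded message onto the destination columns."""
    row: Dict[str, Any] = {}
    for name, ch_type in columns.items():
        if name not in record:
            # Missing field: populate the column with a zero value.
            row[name] = ZERO_VALUES[ch_type]
            continue
        value = record[name]
        if isinstance(value, bool) or not isinstance(value, ACCEPTED[ch_type]):
            # Incompatible field: the whole row fails.
            raise TypeError(f"field {name!r} is incompatible with {ch_type}")
        row[name] = value
    # Fields present in the record but absent from `columns` are never copied.
    return row


columns = {"id": "Int32", "name": "String"}
print(map_record({"id": 7, "extra": "ignored"}, columns))
```

In this sketch a nested record such as `{"id": {"nested": 1}}` raises `TypeError`, mirroring the example of an Avro record field that cannot be inserted into an Int32 column.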

The following virtual columns are supported for Kafka-compatible streaming data sources. When creating a new destination table, virtual columns can be added by using the Add Column button.

| Name | Description | Recommended Data Type |
|------|-------------|-----------------------|
| _key | Kafka Message Key | String |
| _timestamp | Kafka Timestamp (Millisecond precision) | DateTime64(3) |
| _partition | Kafka Partition | Int32 |
| _offset | Kafka Offset | Int64 |
| _topic | Kafka Topic | String |
| _header_keys | Parallel array of keys in the record Headers | Array(String) |
| _header_values | Parallel array of values in the record Headers | Array(String) |
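To make the shape of these columns concrete, the sketch below derives them from the parts of a Kafka record. It is an illustration under assumptions: `virtual_columns` is a hypothetical function, keys and header values are assumed to be UTF-8 text, and a missing key is represented as `None`.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List, Optional, Tuple


def virtual_columns(topic: str, partition: int, offset: int, timestamp_ms: int,
                    key: Optional[bytes],
                    headers: List[Tuple[str, bytes]]) -> Dict[str, Any]:
    """Derive the virtual column values from one Kafka record."""
    # Integer arithmetic keeps the millisecond part exact.
    ts = (datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
          + timedelta(milliseconds=timestamp_ms % 1000))
    return {
        "_key": key.decode("utf-8") if key is not None else None,
        "_timestamp": ts,
        "_partition": partition,
        "_offset": offset,
        "_topic": topic,
        # Parallel arrays: position i in both lists describes the same header.
        "_header_keys": [k for k, _ in headers],
        "_header_values": [v.decode("utf-8") for _, v in headers],
    }


cols = virtual_columns("events", 2, 42, 1700000000123, b"user-1",
                       [("trace", b"abc"), ("source", b"web")])
print(cols["_header_keys"], cols["_header_values"])
```

The header arrays are "parallel" in the sense that a header's key and value share the same index, so they can be zipped back together when querying.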

Limitation: DEFAULT is not supported.

ClickPipes for Kafka provides at-least-once delivery semantics (one of the most commonly used approaches). We'd love to hear your feedback on delivery semantics via our contact form. If you need exactly-once semantics, we recommend using our official clickhouse-kafka-connect sink.
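At-least-once delivery means a message may occasionally be written more than once. If the destination table includes the `_topic`, `_partition`, and `_offset` virtual columns, that triple identifies a Kafka message, so repeated rows can be recognized downstream. The sketch below shows the idea on rows already loaded into Python; `drop_redelivered` is a hypothetical helper, not something ClickPipes provides.

```python
from typing import Any, Dict, Iterable, Iterator, Set, Tuple


def drop_redelivered(rows: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield each row once, keyed by its Kafka position."""
    seen: Set[Tuple[str, int, int]] = set()
    for row in rows:
        position = (row["_topic"], row["_partition"], row["_offset"])
        if position in seen:
            continue  # same message delivered again
        seen.add(position)
        yield row


rows = [
    {"_topic": "events", "_partition": 0, "_offset": 10, "value": "a"},
    {"_topic": "events", "_partition": 0, "_offset": 11, "value": "b"},
    {"_topic": "events", "_partition": 0, "_offset": 10, "value": "a"},  # redelivery
]
print([r["value"] for r in drop_redelivered(rows)])
```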

For Apache Kafka protocol data sources, ClickPipes supports SASL/PLAIN authentication with TLS encryption, as well as SASL/SCRAM-SHA-256 and SASL/SCRAM-SHA-512. Depending on the streaming source (Redpanda, MSK, etc.), ClickPipes will enable all or a subset of these auth mechanisms based on compatibility. If your auth needs differ, please give us feedback.

AWS MSK authentication currently only supports SASL/SCRAM-SHA-512 authentication. IAM authentication is coming soon.
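Before entering credentials in the ClickPipes form, it can help to check them with a Kafka client of your own. The sketch below builds a property dictionary using librdkafka-style configuration keys for the mechanisms listed above. It is an assumption-laden illustration: `client_properties` is a hypothetical helper, TLS (`SASL_SSL`) is assumed for every mechanism, and the broker address and credentials are placeholders.

```python
from typing import Dict

# Mechanisms named in this section.
SUPPORTED_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"}


def client_properties(bootstrap_servers: str, mechanism: str,
                      username: str, password: str) -> Dict[str, str]:
    """Return librdkafka-style client properties for SASL over TLS."""
    if mechanism not in SUPPORTED_MECHANISMS:
        raise ValueError(f"unsupported SASL mechanism: {mechanism}")
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",  # assumption: SASL over TLS
        "sasl.mechanism": mechanism,
        "sasl.username": username,
        "sasl.password": password,
    }


# Placeholder values; for AWS MSK the mechanism would be SCRAM-SHA-512.
props = client_properties("broker.example.com:9096", "SCRAM-SHA-512",
                          "my-user", "my-password")
print(props["security.protocol"], props["sasl.mechanism"])
```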