
CapnProto

Description

CapnProto is a binary message format similar to Protocol Buffers and Thrift, and unlike JSON or MessagePack. CapnProto messages are strictly typed and not self-describing, so they require an external schema description. The schema is applied on the fly and cached for each query. See also Format Schema.

Data Types Matching

The table below shows supported data types and how they match ClickHouse data types in INSERT and SELECT queries.

| CapnProto data type (INSERT) | ClickHouse data type | CapnProto data type (SELECT) |
|---|---|---|
| UINT8, BOOL | UInt8 | UINT8 |
| INT8 | Int8 | INT8 |
| UINT16 | UInt16, Date | UINT16 |
| INT16 | Int16 | INT16 |
| UINT32 | UInt32, DateTime | UINT32 |
| INT32 | Int32, Decimal32 | INT32 |
| UINT64 | UInt64 | UINT64 |
| INT64 | Int64, DateTime64, Decimal64 | INT64 |
| FLOAT32 | Float32 | FLOAT32 |
| FLOAT64 | Float64 | FLOAT64 |
| TEXT, DATA | String, FixedString | TEXT, DATA |
| union(T, Void), union(Void, T) | Nullable(T) | union(T, Void), union(Void, T) |
| ENUM | Enum(8/16) | ENUM |
| LIST | Array | LIST |
| STRUCT | Tuple | STRUCT |
| UINT32 | IPv4 | UINT32 |
| DATA | IPv6 | DATA |
| DATA | Int128/UInt128/Int256/UInt256 | DATA |
| DATA | Decimal128/Decimal256 | DATA |
| STRUCT(entries LIST(STRUCT(key Key, value Value))) | Map | STRUCT(entries LIST(STRUCT(key Key, value Value))) |

  • Integer types can be converted into each other during input/output.
  • To work with Enum in the CapnProto format, use the format_capn_proto_enum_comparising_mode setting.
  • Arrays can be nested and can take a Nullable type as an argument. Tuple and Map types can also be nested.
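
As a rough illustration of the scalar rows in the table, the Python sketch below renders a Cap'n Proto struct from a list of ClickHouse columns. The helper name capnp_struct and the lookup table are hypothetical and cover only a handful of types. This is not how ClickHouse generates schemas internally, and String is mapped to Text here although Data would match as well.

```python
# Hypothetical lookup: a few scalar ClickHouse types and the Cap'n Proto
# schema-language type that matches them according to the table above.
CLICKHOUSE_TO_CAPNP = {
    "UInt8": "UInt8",
    "Int8": "Int8",
    "UInt16": "UInt16",
    "Date": "UInt16",
    "Int16": "Int16",
    "UInt32": "UInt32",
    "DateTime": "UInt32",
    "IPv4": "UInt32",
    "Int32": "Int32",
    "Decimal32": "Int32",
    "UInt64": "UInt64",
    "Int64": "Int64",
    "DateTime64": "Int64",
    "Decimal64": "Int64",
    "Float32": "Float32",
    "Float64": "Float64",
    "String": "Text",       # Data is also accepted for String
    "FixedString": "Data",
    "IPv6": "Data",
}


def capnp_struct(name, columns):
    """Render a struct with one field per (column name, ClickHouse type) pair.

    Nullable, Array, Tuple, Map and Enum are not handled in this sketch.
    """
    lines = [f"struct {name} {{"]
    for ordinal, (column, ch_type) in enumerate(columns):
        # Drop type parameters such as the scale in DateTime64(3).
        base_type = ch_type.split("(", 1)[0]
        lines.append(f"  {column} @{ordinal} :{CLICKHOUSE_TO_CAPNP[base_type]};")
    lines.append("}")
    return "\n".join(lines)


schema_text = capnp_struct("Message", [("SearchPhrase", "String"), ("c", "UInt64")])
print(schema_text)
```

Fields are numbered in column order because every Cap'n Proto field needs an explicit ordinal (@0, @1, ...).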

Example Usage

Inserting and Selecting Data

You can insert CapnProto data from a file into a ClickHouse table with the following command:

$ cat capnproto_messages.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_schema = 'schema:Message' FORMAT CapnProto"

Where schema.capnp looks like this:

struct Message {
  SearchPhrase @0 :Text;
  c @1 :UInt64;
}

You can select data from a ClickHouse table and save it to a file in the CapnProto format with the following command:

$ clickhouse-client --query "SELECT * FROM test.hits FORMAT CapnProto SETTINGS format_schema = 'schema:Message'" > capnproto_messages.bin
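
The two commands above can also be driven from a script. The following Python sketch assumes clickhouse-client is on the PATH and that the schema file and table from the examples exist. The function names (insert_query, select_query, export_table, import_file) are hypothetical helpers, not part of any ClickHouse API.

```python
import subprocess


def schema_setting(schema_file, message_type):
    # format_schema has the form '<schema file>:<message type>'.
    return f"format_schema = '{schema_file}:{message_type}'"


def insert_query(table, schema_file, message_type):
    return (
        f"INSERT INTO {table} SETTINGS {schema_setting(schema_file, message_type)} "
        "FORMAT CapnProto"
    )


def select_query(table, schema_file, message_type):
    return (
        f"SELECT * FROM {table} FORMAT CapnProto "
        f"SETTINGS {schema_setting(schema_file, message_type)}"
    )


def export_table(table, path, schema_file="schema", message_type="Message"):
    # Write the binary result of the SELECT straight into the file.
    query = select_query(table, schema_file, message_type)
    with open(path, "wb") as out:
        subprocess.run(["clickhouse-client", "--query", query], stdout=out, check=True)


def import_file(table, path, schema_file="schema", message_type="Message"):
    # Feed the binary messages to clickhouse-client on standard input.
    query = insert_query(table, schema_file, message_type)
    with open(path, "rb") as messages:
        subprocess.run(["clickhouse-client", "--query", query], stdin=messages, check=True)


# Only the query strings are built here; nothing is sent to a server.
print(insert_query("test.hits", "schema", "Message"))
print(select_query("test.hits", "schema", "Message"))
```

Opening the files in binary mode matters because CapnProto output is not text. With check=True, a non-zero exit from clickhouse-client raises an exception instead of passing silently.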

Using Autogenerated Schema

If you don't have an external CapnProto schema for your data, you can still read and write data in the CapnProto format using an autogenerated schema. For example:

SELECT * FROM test.hits FORMAT CapnProto SETTINGS format_capn_proto_use_autogenerated_schema=1

In this case, ClickHouse autogenerates a CapnProto schema from the table structure using the structureToCapnProtoSchema function and uses this schema to serialize data in the CapnProto format.

You can also read a CapnProto file using an autogenerated schema (in this case, the file must have been created using the same schema):

$ cat hits.bin | clickhouse-client --query "INSERT INTO test.hits SETTINGS format_capn_proto_use_autogenerated_schema=1 FORMAT CapnProto"

The setting format_capn_proto_use_autogenerated_schema is enabled by default and applies if format_schema is not set.
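
The precedence described above can be summarized in a few lines of Python. This is a sketch of the rule only: the function schema_source is hypothetical, and the error raised when neither option is available is an assumption about how such a case would be reported, not ClickHouse's actual message.

```python
def schema_source(format_schema="", use_autogenerated_schema=True):
    """Return which schema would be used for a CapnProto query.

    use_autogenerated_schema mirrors format_capn_proto_use_autogenerated_schema,
    which is enabled by default.
    """
    if format_schema:
        # An explicit format_schema always takes precedence.
        return "external"
    if use_autogenerated_schema:
        # Falls back to a schema derived from the table structure.
        return "autogenerated"
    # Assumption: with neither option available the query cannot proceed.
    raise ValueError("set format_schema or enable the autogenerated schema")


print(schema_source("schema:Message"))
print(schema_source())
```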

You can also save the autogenerated schema to a file during input/output using the output_format_schema setting. For example:

SELECT * FROM test.hits FORMAT CapnProto SETTINGS format_capn_proto_use_autogenerated_schema=1, output_format_schema='path/to/schema/schema.capnp'

In this case, the autogenerated CapnProto schema will be saved to the file path/to/schema/schema.capnp.

Format Settings