Data types binary encoding specification
This specification describes the binary format used to encode and decode ClickHouse data types. The format is used in Dynamic column binary serialization and can be used in the input/output formats RowBinaryWithNamesAndTypes and Native under the corresponding settings.
The table below describes how each data type is represented in binary format. Each data type encoding consists of 1 byte that indicates the type, followed by optional additional information.
var_uint in the binary encoding means that the size is encoded using Variable-Length Quantity compression.
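As an illustration of var_uint, here is a minimal sketch of an encoder and decoder. It assumes the LEB128-style layout (7 payload bits per byte, least significant group first, high bit set while more bytes follow); the text above only names the scheme, so treat the byte order as an assumption. The function names are hypothetical.

```python
from typing import Tuple


def encode_var_uint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("var_uint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_var_uint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position) for the var_uint starting at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

Under this assumption, values below 128 occupy a single byte, and 300 is written as the two bytes 0xAC 0x02.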
| ClickHouse data type | Binary encoding |
|---|---|
| Nothing | 0x00 |
| UInt8 | 0x01 |
| UInt16 | 0x02 |
| UInt32 | 0x03 |
| UInt64 | 0x04 |
| UInt128 | 0x05 |
| UInt256 | 0x06 |
| Int8 | 0x07 |
| Int16 | 0x08 |
| Int32 | 0x09 |
| Int64 | 0x0A |
| Int128 | 0x0B |
| Int256 | 0x0C |
| Float32 | 0x0D |
| Float64 | 0x0E |
| Date | 0x0F |
| Date32 | 0x10 |
| DateTime | 0x11 |
| DateTime(time_zone) | 0x12<var_uint_time_zone_name_size><time_zone_name_data> |
| DateTime64(P) | 0x13<uint8_precision> |
| DateTime64(P, time_zone) | 0x14<uint8_precision><var_uint_time_zone_name_size><time_zone_name_data> |
| String | 0x15 |
| FixedString(N) | 0x16<var_uint_size> |
| Enum8 | 0x17<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><int8_value_1>...<var_uint_name_size_N><name_data_N><int8_value_N> |
| Enum16 | 0x18<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><int16_little_endian_value_1>...<var_uint_name_size_N><name_data_N><int16_little_endian_value_N> |
| Decimal32(P, S) | 0x19<uint8_precision><uint8_scale> |
| Decimal64(P, S) | 0x1A<uint8_precision><uint8_scale> |
| Decimal128(P, S) | 0x1B<uint8_precision><uint8_scale> |
| Decimal256(P, S) | 0x1C<uint8_precision><uint8_scale> |
| UUID | 0x1D |
| Array(T) | 0x1E<nested_type_encoding> |
| Tuple(T1, ..., TN) | 0x1F<var_uint_number_of_elements><nested_type_encoding_1>...<nested_type_encoding_N> |
| Tuple(name1 T1, ..., nameN TN) | 0x20<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><nested_type_encoding_1>...<var_uint_name_size_N><name_data_N><nested_type_encoding_N> |
| Set | 0x21 |
| Interval | 0x22<interval_kind> (see interval kind binary encoding) |
| Nullable(T) | 0x23<nested_type_encoding> |
| Function | 0x24<var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N><return_type_encoding> |
| AggregateFunction(function_name(param_1, ..., param_N), arg_T1, ..., arg_TN) | 0x25<var_uint_version><var_uint_function_name_size><function_name_data><var_uint_number_of_parameters><param_1>...<param_N><var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N> (see aggregate function parameter binary encoding) |
| LowCardinality(T) | 0x26<nested_type_encoding> |
| Map(K, V) | 0x27<key_type_encoding><value_type_encoding> |
| IPv4 | 0x28 |
| IPv6 | 0x29 |
| Variant(T1, ..., TN) | 0x2A<var_uint_number_of_variants><variant_type_encoding_1>...<variant_type_encoding_N> |
| Dynamic(max_types=N) | 0x2B<uint8_max_types> |
| Custom type (Ring, Polygon, etc) | 0x2C<var_uint_type_name_size><type_name_data> |
| Bool | 0x2D |
| SimpleAggregateFunction(function_name(param_1, ..., param_N), arg_T1, ..., arg_TN) | 0x2E<var_uint_function_name_size><function_name_data><var_uint_number_of_parameters><param_1>...<param_N><var_uint_number_of_arguments><argument_type_encoding_1>...<argument_type_encoding_N> (see aggregate function parameter binary encoding) |
| Nested(name1 T1, ..., nameN TN) | 0x2F<var_uint_number_of_elements><var_uint_name_size_1><name_data_1><nested_type_encoding_1>...<var_uint_name_size_N><name_data_N><nested_type_encoding_N> |
| JSON(max_dynamic_paths=N, max_dynamic_types=M, path Type, SKIP skip_path, SKIP REGEXP skip_path_regexp) | 0x30<uint8_serialization_version><var_int_max_dynamic_paths><uint8_max_dynamic_types><var_uint_number_of_typed_paths><var_uint_path_name_size_1><path_name_data_1><encoded_type_1>...<var_uint_number_of_skip_paths><var_uint_skip_path_size_1><skip_path_data_1>...<var_uint_number_of_skip_path_regexps><var_uint_skip_path_regexp_size_1><skip_path_data_regexp_1>... |
For the JSON type, the byte uint8_serialization_version indicates the version of the serialization. Currently the version is always 0, but it may change in the future if new arguments are introduced for the JSON type.
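To make the recursive structure of these encodings concrete, the following sketch decodes a small subset of the type bytes listed above into a readable type name. It is not a full implementation: only the tags in the code are handled, names such as decode_type and read_var_uint are hypothetical, and var_uint is assumed to use the LEB128-style layout (7 bits per byte, low-order group first).

```python
# Type tags without extra payload (subset of the table above).
SIMPLE_TYPES = {
    0x00: "Nothing", 0x01: "UInt8", 0x02: "UInt16", 0x03: "UInt32",
    0x04: "UInt64", 0x07: "Int8", 0x08: "Int16", 0x09: "Int32",
    0x0A: "Int64", 0x0D: "Float32", 0x0E: "Float64", 0x0F: "Date",
    0x11: "DateTime", 0x15: "String", 0x1D: "UUID", 0x2D: "Bool",
}
# Type tags followed by exactly one nested type encoding.
WRAPPERS = {0x1E: "Array", 0x23: "Nullable", 0x26: "LowCardinality"}


def read_var_uint(buf, pos):
    """Assumed LEB128-style var_uint; returns (value, next_position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_string(buf, pos):
    """Read <var_uint_size><data> and return (text, next_position)."""
    size, pos = read_var_uint(buf, pos)
    return buf[pos:pos + size].decode("utf-8"), pos + size


def decode_type(buf, pos=0):
    """Decode one type encoding; returns (type_name, next_position)."""
    tag = buf[pos]
    pos += 1
    if tag in SIMPLE_TYPES:
        return SIMPLE_TYPES[tag], pos
    if tag in WRAPPERS:
        inner, pos = decode_type(buf, pos)
        return f"{WRAPPERS[tag]}({inner})", pos
    if tag == 0x12:  # DateTime(time_zone)
        tz, pos = read_string(buf, pos)
        return f"DateTime('{tz}')", pos
    if tag == 0x13:  # DateTime64(P)
        return f"DateTime64({buf[pos]})", pos + 1
    if tag == 0x14:  # DateTime64(P, time_zone)
        precision = buf[pos]
        tz, pos = read_string(buf, pos + 1)
        return f"DateTime64({precision}, '{tz}')", pos
    if tag == 0x16:  # FixedString(N)
        size, pos = read_var_uint(buf, pos)
        return f"FixedString({size})", pos
    if tag in (0x1F, 0x20):  # unnamed / named Tuple
        count, pos = read_var_uint(buf, pos)
        parts = []
        for _ in range(count):
            name = None
            if tag == 0x20:
                name, pos = read_string(buf, pos)
            inner, pos = decode_type(buf, pos)
            parts.append(f"{name} {inner}" if name is not None else inner)
        return f"Tuple({', '.join(parts)})", pos
    if tag == 0x27:  # Map(K, V)
        key, pos = decode_type(buf, pos)
        value, pos = decode_type(buf, pos)
        return f"Map({key}, {value})", pos
    raise NotImplementedError(f"type tag 0x{tag:02X} is not covered by this sketch")
```

For example, the three bytes 0x1E 0x23 0x01 decode to Array(Nullable(UInt8)), and 0x14 0x03 0x03 followed by the ASCII bytes of UTC decode to DateTime64(3, 'UTC').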
Interval kind binary encoding
The table below describes how different interval kinds of the Interval data type are encoded.
| Interval kind | Binary encoding |
|---|---|
| Nanosecond | 0x00 |
| Microsecond | 0x01 |
| Millisecond | 0x02 |
| Second | 0x03 |
| Minute | 0x04 |
| Hour | 0x05 |
| Day | 0x06 |
| Week | 0x07 |
| Month | 0x08 |
| Quarter | 0x09 |
| Year | 0x0A |
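A short sketch of how an Interval type could be encoded and decoded with the 0x22 type byte followed by the kind byte. It assumes the kind codes run consecutively from 0x00 in the order Nanosecond through Year, and the helper names are hypothetical.

```python
# Assumption: kind codes are consecutive, starting at 0x00, in this order.
INTERVAL_KINDS = [
    "Nanosecond", "Microsecond", "Millisecond", "Second", "Minute",
    "Hour", "Day", "Week", "Month", "Quarter", "Year",
]
INTERVAL_TAG = 0x22  # type byte for Interval


def encode_interval(kind: str) -> bytes:
    """Return the type byte followed by the one-byte interval kind."""
    return bytes([INTERVAL_TAG, INTERVAL_KINDS.index(kind)])


def decode_interval(buf: bytes) -> str:
    """Return the interval kind name for a two-byte Interval encoding."""
    if len(buf) != 2 or buf[0] != INTERVAL_TAG:
        raise ValueError("not an Interval type encoding")
    return INTERVAL_KINDS[buf[1]]
```

With this sketch, an interval of kind Second is the two bytes 0x22 0x03.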
Aggregate function parameter binary encoding
The table below describes how parameters of AggregateFunction and SimpleAggregateFunction are encoded.
The encoding of a parameter consists of 1 byte indicating the type of the parameter, followed by the value itself.
| Parameter type | Binary encoding |
|---|---|
| Null | 0x00 |
| UInt64 | 0x01<var_uint_value> |
| Int64 | 0x02<var_int_value> |
| UInt128 | 0x03<uint128_little_endian_value> |
| Int128 | 0x04<int128_little_endian_value> |
| UInt256 | 0x05<uint256_little_endian_value> |
| Int256 | 0x06<int256_little_endian_value> |
| Float64 | 0x07<float64_little_endian_value> |
| Decimal32 | 0x08<var_uint_scale><int32_little_endian_value> |
| Decimal64 | 0x09<var_uint_scale><int64_little_endian_value> |
| Decimal128 | 0x0A<var_uint_scale><int128_little_endian_value> |
| Decimal256 | 0x0B<var_uint_scale><int256_little_endian_value> |
| String | 0x0C<var_uint_size><data> |
| Array | 0x0D<var_uint_size><value_encoding_1>...<value_encoding_N> |
| Tuple | 0x0E<var_uint_size><value_encoding_1>...<value_encoding_N> |
| Map | 0x0F<var_uint_size><key_encoding_1><value_encoding_1>...<key_encoding_N><value_encoding_N> |
| IPv4 | 0x10<uint32_little_endian_value> |
| IPv6 | 0x11<uint128_little_endian_value> |
| UUID | 0x12<uuid_value> |
| Bool | 0x13<bool_value> |
| Object | 0x14<var_uint_size><var_uint_key_size_1><key_data_1><value_encoding_1>...<var_uint_key_size_N><key_data_N><value_encoding_N> |
| AggregateFunctionState | 0x15<var_uint_name_size><name_data><var_uint_data_size><data> |
| Negative infinity | 0xFE |
| Positive infinity | 0xFF |
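The parameter encodings can be illustrated with a small sketch that maps a few Python values onto the Null, UInt64, Float64, String, Array and Tuple rows. This is a hedged example rather than a complete encoder: encode_param is a hypothetical name, var_uint is assumed to be LEB128-style, non-negative Python integers are assumed to fit in UInt64, and types whose value layout is not spelled out above (for example var_int and bool_value) are deliberately left out.

```python
import struct


def encode_var_uint(value: int) -> bytes:
    """Assumed LEB128-style var_uint: 7 bits per byte, low-order group first."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_param(value) -> bytes:
    """Encode one aggregate function parameter: type byte, then the value."""
    if value is None:
        return b"\x00"                                    # Null
    if isinstance(value, bool):
        # bool is checked before int because bool is a subclass of int.
        raise NotImplementedError("Bool is omitted from this sketch")
    if isinstance(value, int):
        if value < 0:
            raise NotImplementedError("Int64 (var_int) is omitted from this sketch")
        return b"\x01" + encode_var_uint(value)           # UInt64
    if isinstance(value, float):
        return b"\x07" + struct.pack("<d", value)         # Float64, little endian
    if isinstance(value, str):
        data = value.encode("utf-8")
        return b"\x0C" + encode_var_uint(len(data)) + data  # String
    if isinstance(value, (list, tuple)):
        tag = b"\x0D" if isinstance(value, list) else b"\x0E"  # Array / Tuple
        body = b"".join(encode_param(item) for item in value)
        return tag + encode_var_uint(len(value)) + body
    raise NotImplementedError(f"unsupported parameter type: {type(value).__name__}")
```

For instance, the parameter 0.5 of a function such as quantile(0.5) would be written as the byte 0x07 followed by the eight little-endian bytes of the Float64 value, and the list [1, "a"] as 0x0D 0x02 followed by the encodings of its two elements.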