See Data Types for general reference.
Numeric types encoding matches memory layout of little endian cpus like AMD64 or ARM64.
This allows to implement very efficient encoding and decoding.
String of Int and UInt of 8, 16, 32, 64, 128 or 256 bits, in little endian.
Float32 and Float64 in IEEE 754 binary representation.
Just an array of String, i.e. (len, value).
An array of N-byte sequences.
IPv4 is alias of
UInt32 numeric type and represented as UInt32.
IPv6 is alias of
FixedString(16) and represented as binary directly.
Tuple is just an array of columns. For example, Tuple(String, UInt8) is just two columns encoded continuously.
Map(K, V) consists of three columns:
Offsets ColUInt64, Keys K, Values V.
Rows count in
Values column is last value from
Array(T) consists of two columns:
Offsets ColUInt64, Data T.
Rows count in
Data is last value from
Nullable(T) consists of
Nulls ColUInt8, Values T with same rows count.
// Nulls is nullable "mask" on Values column.
// For example, to encode [null, "", "hello", null, "world"]
// Values: ["", "", "hello", "", "world"] (len: 5)
// Nulls: [ 1, 0, 0, 1, 0] (len: 5)
FixedString(16), UUID value represented as binary.
Int16, but each integer is mapped to some
LowCardinality(T) consists of
Index T, Keys K,
K is one of (UInt8, UInt16, UInt32, UInt64) depending on size of
// Index (i.e. dictionary) column contains unique values, Keys column contains
// sequence of indexes in Index column that represent actual values.
// For example, ["Eko", "Eko", "Amadela", "Amadela", "Amadela", "Amadela"] can
// be encoded as:
// Index: ["Eko", "Amadela"] (String)
// Keys: [0, 0, 1, 1, 1, 1] (UInt8)
// The CardinalityKey is chosen depending on Index size, i.e. maximum value
// of chosen type should be able to represent any index of Index element.
0 is false and
1 is true.