
Column Types

See Data Types for general reference.

Numeric types


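The encoding of numeric types matches the in-memory layout of little-endian CPUs such as AMD64 or ARM64.

This makes very efficient encoding and decoding possible, since on such CPUs values can be copied to or from the wire without per-value byte swapping.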


A sequence of Int or UInt values of 8, 16, 32, 64, 128 or 256 bits, in little endian.
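As an illustration of the little-endian layout, here is a minimal sketch that encodes and decodes a UInt32 column with the Go standard library. The function names are hypothetical and are not part of any client API.

```go
package main

import "encoding/binary"

// encodeUInt32Column appends every value as 4 little-endian bytes.
// Hypothetical helper, for illustration only.
func encodeUInt32Column(dst []byte, values []uint32) []byte {
	for _, v := range values {
		dst = binary.LittleEndian.AppendUint32(dst, v)
	}
	return dst
}

// decodeUInt32Column reads consecutive 4-byte little-endian values.
// Trailing bytes that do not form a full value are ignored.
func decodeUInt32Column(src []byte) []uint32 {
	out := make([]uint32, 0, len(src)/4)
	for ; len(src) >= 4; src = src[4:] {
		out = append(out, binary.LittleEndian.Uint32(src))
	}
	return out
}
```

For example, the value 0x01020304 is written as the bytes 04 03 02 01.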


Float32 and Float64 in IEEE 754 binary representation.
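A Float64 column can be sketched the same way: each value is converted to its IEEE 754 bit pattern and written as 8 little-endian bytes. The helper names below are hypothetical.

```go
package main

import (
	"encoding/binary"
	"math"
)

// encodeFloat64Column appends the IEEE 754 bits of every value
// as 8 little-endian bytes. Hypothetical helper.
func encodeFloat64Column(dst []byte, values []float64) []byte {
	for _, v := range values {
		dst = binary.LittleEndian.AppendUint64(dst, math.Float64bits(v))
	}
	return dst
}

// decodeFloat64Column is the inverse of encodeFloat64Column.
func decodeFloat64Column(src []byte) []float64 {
	out := make([]float64, 0, len(src)/8)
	for ; len(src) >= 8; src = src[8:] {
		out = append(out, math.Float64frombits(binary.LittleEndian.Uint64(src)))
	}
	return out
}
```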


A String column is just an array of strings, each encoded as (len, value).
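The (len, value) layout can be sketched as follows. This sketch assumes that len is an unsigned variable-length integer (LEB128), which is what Go's `binary.AppendUvarint` produces; the text above does not spell out the length encoding, and the function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// encodeStringColumn writes each string as (len, value).
// Assumption: len is an unsigned varint (LEB128).
func encodeStringColumn(dst []byte, values []string) []byte {
	for _, s := range values {
		dst = binary.AppendUvarint(dst, uint64(len(s)))
		dst = append(dst, s...)
	}
	return dst
}

// decodeStringColumn reads exactly rows strings from src.
func decodeStringColumn(src []byte, rows int) ([]string, error) {
	out := make([]string, 0, rows)
	for i := 0; i < rows; i++ {
		n, size := binary.Uvarint(src)
		if size <= 0 || uint64(len(src)-size) < n {
			return nil, errors.New("truncated String column")
		}
		src = src[size:]
		out = append(out, string(src[:n]))
		src = src[n:]
	}
	return out, nil
}
```

Note that the row count is not part of the column data in this sketch, so the decoder has to receive it separately.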


A FixedString(N) column is an array of N-byte sequences.
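Because every row has the same width, row i of such a column starts at byte i*N. The sketch below shows this; it assumes that shorter values are padded with zero bytes, and the function names are hypothetical.

```go
package main

import "fmt"

// encodeFixedStringColumn writes each value as exactly n bytes.
// Assumption: values shorter than n are padded with zero bytes.
func encodeFixedStringColumn(dst []byte, n int, values [][]byte) ([]byte, error) {
	for i, v := range values {
		if len(v) > n {
			return nil, fmt.Errorf("row %d: %d bytes do not fit into %d", i, len(v), n)
		}
		dst = append(dst, v...)
		dst = append(dst, make([]byte, n-len(v))...)
	}
	return dst, nil
}

// fixedStringRow returns row i without scanning the preceding rows.
func fixedStringRow(data []byte, n, i int) []byte {
	return data[i*n : (i+1)*n]
}
```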


IPv4 is an alias of the UInt32 numeric type and is represented as UInt32.

IPv6 is an alias of FixedString(16) and is represented directly as 16 binary bytes.
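A minimal sketch of both representations follows. It assumes that the UInt32 value of an IPv4 address is its four octets read as a big-endian number (so 1.2.3.4 becomes 16909060), which is then written in little endian like any other UInt32, and that IPv6 bytes are written in network order. The function names are hypothetical.

```go
package main

import "encoding/binary"

// ipv4ToUInt32 maps the four octets of an address to a number.
// Assumption: octets are read as a big-endian integer.
func ipv4ToUInt32(addr [4]byte) uint32 {
	return binary.BigEndian.Uint32(addr[:])
}

// encodeIPv4Column writes each address as a little-endian UInt32.
func encodeIPv4Column(dst []byte, addrs [][4]byte) []byte {
	for _, a := range addrs {
		dst = binary.LittleEndian.AppendUint32(dst, ipv4ToUInt32(a))
	}
	return dst
}

// encodeIPv6Column writes each address as 16 raw bytes,
// exactly like a FixedString(16) row.
func encodeIPv6Column(dst []byte, addrs [][16]byte) []byte {
	for _, a := range addrs {
		dst = append(dst, a[:]...)
	}
	return dst
}
```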


Tuple is just an array of columns. For example, Tuple(String, UInt8) is just two columns encoded contiguously.
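To make the column-wise layout concrete, the sketch below encodes rows of a Tuple(String, UInt8): first the whole String column, then the whole UInt8 column. The `row` type and function name are hypothetical, and the String length is assumed to be an unsigned varint.

```go
package main

import "encoding/binary"

// row is a hypothetical in-memory form of one Tuple(String, UInt8) value.
type row struct {
	Name  string
	Level uint8
}

// encodeTupleColumn writes all first elements, then all second elements.
// Values of one row are therefore not adjacent in the output.
func encodeTupleColumn(dst []byte, rows []row) []byte {
	// First column: String, each value as (len, value).
	// Assumption: len is an unsigned varint.
	for _, r := range rows {
		dst = binary.AppendUvarint(dst, uint64(len(r.Name)))
		dst = append(dst, r.Name...)
	}
	// Second column: UInt8, one byte per row.
	for _, r := range rows {
		dst = append(dst, r.Level)
	}
	return dst
}
```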


Map(K, V) consists of three columns: Offsets ColUInt64, Keys K, Values V.

The row count of the Keys and Values columns is the last value in Offsets.
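The three-column layout can be sketched for a Map(String, UInt32) as below. The sketch assumes that Offsets holds cumulative end positions, so that row i covers the range from Offsets[i-1] (or 0 for the first row) up to Offsets[i]. The type and method names are hypothetical.

```go
package main

// entry is a hypothetical key-value pair of one map row.
type entry struct {
	Key   string
	Value uint32
}

// mapColumn is a hypothetical in-memory form of Map(String, UInt32).
type mapColumn struct {
	Offsets []uint64 // cumulative end position of every row (assumption)
	Keys    []string
	Values  []uint32
}

// Append adds one map as a new row.
func (c *mapColumn) Append(row []entry) {
	for _, e := range row {
		c.Keys = append(c.Keys, e.Key)
		c.Values = append(c.Values, e.Value)
	}
	c.Offsets = append(c.Offsets, uint64(len(c.Keys)))
}

// Lookup searches for key in row i.
func (c *mapColumn) Lookup(i int, key string) (uint32, bool) {
	var start uint64
	if i > 0 {
		start = c.Offsets[i-1]
	}
	for j := start; j < c.Offsets[i]; j++ {
		if c.Keys[j] == key {
			return c.Values[j], true
		}
	}
	return 0, false
}
```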


Array(T) consists of two columns: Offsets ColUInt64, Data T.

The row count of the Data column is the last value in Offsets.
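For instance, an Array(String) column can be sketched as a flat Data slice plus Offsets. As an assumption, Offsets stores the cumulative end position of each row, which is consistent with the last offset being the total row count of Data. The type and method names are hypothetical.

```go
package main

// arrayColumn is a hypothetical in-memory form of Array(String).
type arrayColumn struct {
	Offsets []uint64 // cumulative end position of every row (assumption)
	Data    []string // all elements of all rows, flattened
}

// Append adds one array as a new row.
func (c *arrayColumn) Append(row []string) {
	c.Data = append(c.Data, row...)
	c.Offsets = append(c.Offsets, uint64(len(c.Data)))
}

// Row returns the elements of row i as a sub-slice of Data.
func (c *arrayColumn) Row(i int) []string {
	var start uint64
	if i > 0 {
		start = c.Offsets[i-1]
	}
	return c.Data[start:c.Offsets[i]]
}
```

With the rows ["a", "b"], [] and ["c"], Offsets becomes [2, 2, 3] and Data becomes ["a", "b", "c"].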


Nullable(T) consists of two columns with the same row count: Nulls ColUInt8, Values T.

// Nulls is a null "mask" on the Values column: 1 marks a null row, 0 a present value.
// For example, to encode [null, "", "hello", null, "world"]
// Values: ["", "", "hello", "", "world"] (len: 5)
// Nulls: [ 1, 0, 0, 1, 0] (len: 5)
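The example above can be reproduced with a short sketch for Nullable(String), where a nil pointer stands for null. The function names are hypothetical.

```go
package main

// encodeNullable splits rows into a Nulls mask and a Values column.
// A null row gets mask 1 and a zero value ("") as placeholder.
func encodeNullable(rows []*string) (nulls []uint8, values []string) {
	for _, r := range rows {
		if r == nil {
			nulls = append(nulls, 1)
			values = append(values, "")
			continue
		}
		nulls = append(nulls, 0)
		values = append(values, *r)
	}
	return nulls, values
}

// decodeNullable applies the mask to Values; masked rows become nil.
func decodeNullable(nulls []uint8, values []string) []*string {
	out := make([]*string, len(values))
	for i := range values {
		if nulls[i] == 0 {
			v := values[i]
			out[i] = &v
		}
	}
	return out
}
```

The mask is what distinguishes the empty string in the second row from the null in the first row; both are stored as "" in Values.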


UUID is an alias of FixedString(16); the UUID value is represented as binary.


Enum8 and Enum16 are aliases of Int8 and Int16 respectively, where each integer is mapped to a String value.
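On the wire only the integers are present, so turning them into names needs the mapping from the column type. A minimal sketch, with a hypothetical function name and a mapping passed in by the caller:

```go
package main

import "fmt"

// decodeEnum8Column maps raw Int8 values to their names.
// names is the integer-to-string mapping of the enum; in this
// sketch the caller provides it.
func decodeEnum8Column(raw []int8, names map[int8]string) ([]string, error) {
	out := make([]string, len(raw))
	for i, v := range raw {
		name, ok := names[v]
		if !ok {
			return nil, fmt.Errorf("row %d: value %d is not defined in the enum", i, v)
		}
		out[i] = name
	}
	return out, nil
}
```

For a hypothetical enum with 1 mapped to "ok" and 2 mapped to "error", the raw column [1, 2, 1] decodes to ["ok", "error", "ok"].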

Low Cardinality

LowCardinality(T) consists of two columns: Index T, Keys K, where K is one of (UInt8, UInt16, UInt32, UInt64) depending on the size of Index.

// Index (i.e. dictionary) column contains unique values, Keys column contains
// sequence of indexes in Index column that represent actual values.
// For example, ["Eko", "Eko", "Amadela", "Amadela", "Amadela", "Amadela"] can
// be encoded as:
// Index: ["Eko", "Amadela"] (String)
// Keys: [0, 0, 1, 1, 1, 1] (UInt8)
// The CardinalityKey is chosen depending on the Index size, i.e. the maximum value
// of the chosen type must be able to represent the index of any Index element.
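The dictionary encoding described above can be sketched for LowCardinality(String) as follows. The function names are hypothetical; keys are kept as uint64 in memory for simplicity, while `keyType` reports the narrowest type that can hold every index.

```go
package main

// encodeLowCardinality builds the Index (unique values in order of
// first appearance) and the Keys (position of each value in Index).
func encodeLowCardinality(values []string) (index []string, keys []uint64) {
	pos := make(map[string]uint64)
	for _, v := range values {
		k, ok := pos[v]
		if !ok {
			k = uint64(len(index))
			pos[v] = k
			index = append(index, v)
		}
		keys = append(keys, k)
	}
	return index, keys
}

// keyType returns the narrowest key type whose maximum value can
// represent the largest index, i.e. indexSize-1.
func keyType(indexSize uint64) string {
	switch {
	case indexSize <= 1<<8:
		return "UInt8"
	case indexSize <= 1<<16:
		return "UInt16"
	case indexSize <= 1<<32:
		return "UInt32"
	default:
		return "UInt64"
	}
}
```

An Index with 256 elements still fits UInt8 keys, because the largest index is 255; the 257th unique value forces UInt16.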


Bool is an alias of UInt8, where 0 is false and 1 is true.