此类型允许在其中存储任意类型的值,而无需预先了解或枚举所有可能的类型。
要声明一个 Dynamic 类型的列,请使用以下语法:
<column_name> Dynamic(max_types=N)
其中 N 是一个可选参数,取值范围为 0 到 254,用于指示在一个单独存储的数据块内 (例如 MergeTree 表的单个数据分区片段中) ,类型为 Dynamic 的列中可以作为独立子列存储的不同数据类型的数量上限。如果超过此限制,所有具有新类型的值将以二进制形式一起存储在一个特殊的共享数据结构中。max_types 的默认值为 32。
创建 Dynamic 类型
在表的列定义中使用 Dynamic 类型:
CREATE TABLE test (d Dynamic) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT d, dynamicType(d) FROM test;
┌─d─────────────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ 42 │ Int64 │
│ Hello, World! │ String │
│ [1,2,3] │ Array(Int64) │
└───────────────┴────────────────┘
在普通列上使用 CAST:
SELECT 'Hello, World!'::Dynamic AS d, dynamicType(d);
┌─d─────────────┬─dynamicType(d)─┐
│ Hello, World! │ String │
└───────────────┴────────────────┘
对 Variant 列使用 CAST:
SET use_variant_as_common_type = 1;
SELECT multiIf((number % 3) = 0, number, (number % 3) = 1, range(number + 1), NULL)::Dynamic AS d, dynamicType(d) FROM numbers(3)
┌─d─────┬─dynamicType(d)─┐
│ 0 │ UInt64 │
│ [0,1] │ Array(UInt64) │
│ ᴺᵁᴸᴸ │ None │
└───────┴────────────────┘
以子列形式读取 Dynamic 中的嵌套类型
Dynamic 类型支持通过类型名作为子列名,从 Dynamic 列中读取单个嵌套类型。
因此,如果存在一列 d Dynamic,可以使用语法 d.T 读取任意有效类型 T 的子列,
当 T 可以被 Nullable 包裹时,该子列的类型为 Nullable(T),否则为 T。该子列的大小将与原始 Dynamic 列相同,
并且在原始 Dynamic 列中不包含类型 T 的所有行里,该子列将包含 NULL 值 (或者在 T 不能被 Nullable 包裹时为空值) 。
也可以使用函数 dynamicElement(dynamic_column, type_name) 来读取 Dynamic 的子列。
示例:
CREATE TABLE test (d Dynamic) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT d, dynamicType(d), d.String, d.Int64, d.`Array(Int64)`, d.Date, d.`Array(String)` FROM test;
┌─d─────────────┬─dynamicType(d)─┬─d.String──────┬─d.Int64─┬─d.Array(Int64)─┬─d.Date─┬─d.Array(String)─┐
│ ᴺᵁᴸᴸ │ None │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │ ᴺᵁᴸᴸ │ [] │
│ 42 │ Int64 │ ᴺᵁᴸᴸ │ 42 │ [] │ ᴺᵁᴸᴸ │ [] │
│ Hello, World! │ String │ Hello, World! │ ᴺᵁᴸᴸ │ [] │ ᴺᵁᴸᴸ │ [] │
│ [1,2,3] │ Array(Int64) │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │ ᴺᵁᴸᴸ │ [] │
└───────────────┴────────────────┴───────────────┴─────────┴────────────────┴────────┴─────────────────┘
SELECT toTypeName(d.String), toTypeName(d.Int64), toTypeName(d.`Array(Int64)`), toTypeName(d.Date), toTypeName(d.`Array(String)`) FROM test LIMIT 1;
┌─toTypeName(d.String)─┬─toTypeName(d.Int64)─┬─toTypeName(d.Array(Int64))─┬─toTypeName(d.Date)─┬─toTypeName(d.Array(String))─┐
│ Nullable(String) │ Nullable(Int64) │ Array(Int64) │ Nullable(Date) │ Array(String) │
└──────────────────────┴─────────────────────┴────────────────────────────┴────────────────────┴─────────────────────────────┘
SELECT d, dynamicType(d), dynamicElement(d, 'String'), dynamicElement(d, 'Int64'), dynamicElement(d, 'Array(Int64)'), dynamicElement(d, 'Date'), dynamicElement(d, 'Array(String)') FROM test;```
┌─d─────────────┬─dynamicType(d)─┬─dynamicElement(d, 'String')─┬─dynamicElement(d, 'Int64')─┬─dynamicElement(d, 'Array(Int64)')─┬─dynamicElement(d, 'Date')─┬─dynamicElement(d, 'Array(String)')─┐
│ ᴺᵁᴸᴸ │ None │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │ ᴺᵁᴸᴸ │ [] │
│ 42 │ Int64 │ ᴺᵁᴸᴸ │ 42 │ [] │ ᴺᵁᴸᴸ │ [] │
│ Hello, World! │ String │ Hello, World! │ ᴺᵁᴸᴸ │ [] │ ᴺᵁᴸᴸ │ [] │
│ [1,2,3] │ Array(Int64) │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │ ᴺᵁᴸᴸ │ [] │
└───────────────┴────────────────┴─────────────────────────────┴────────────────────────────┴───────────────────────────────────┴───────────────────────────┴────────────────────────────────────┘
To know what variant is stored in each row function dynamicType(dynamic_column) can be used. It returns String with value type name for each row (or 'None' if row is NULL).
Example:
CREATE TABLE test (d Dynamic) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('Hello, World!'), ([1, 2, 3]);
SELECT dynamicType(d) FROM test;
┌─dynamicType(d)─┐
│ None │
│ Int64 │
│ String │
│ Array(Int64) │
└────────────────┘
Conversion between Dynamic 列 and other columns
There are 4 possible conversions that can be performed with Dynamic 列.
Converting an ordinary column to a Dynamic 列
SELECT 'Hello, World!'::Dynamic AS d, dynamicType(d);
┌─d─────────────┬─dynamicType(d)─┐
│ Hello, World! │ String │
└───────────────┴────────────────┘
通过解析将 String 列转换为 Dynamic 列
要从 String 列中解析出 Dynamic 类型的值,您可以启用设置 cast_string_to_dynamic_use_inference:
SET cast_string_to_dynamic_use_inference = 1;
SELECT CAST(materialize(map('key1', '42', 'key2', 'true', 'key3', '2020-01-01')), 'Map(String, Dynamic)') as map_of_dynamic, mapApply((k, v) -> (k, dynamicType(v)), map_of_dynamic) as map_of_dynamic_types;
┌─map_of_dynamic──────────────────────────────┬─map_of_dynamic_types─────────────────────────┐
│ {'key1':42,'key2':true,'key3':'2020-01-01'} │ {'key1':'Int64','key2':'Bool','key3':'Date'} │
└─────────────────────────────────────────────┴──────────────────────────────────────────────┘
Converting a Dynamic 列 to an ordinary column
It is possible to convert a Dynamic 列 to an ordinary column. In this case all nested types will be converted to a destination type:
CREATE TABLE test (d Dynamic) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('42.42'), (true), ('e10');
SELECT d::Nullable(Float64) FROM test;
┌─CAST(d, 'Nullable(Float64)')─┐
│ ᴺᵁᴸᴸ │
│ 42 │
│ 42.42 │
│ 1 │
│ 0 │
└──────────────────────────────┘
将 Variant 列转换为 Dynamic 列
CREATE TABLE test (v Variant(UInt64, String, Array(UInt64))) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), ('String'), ([1, 2, 3]);
SELECT v::Dynamic AS d, dynamicType(d) FROM test;
┌─d───────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ 42 │ UInt64 │
│ String │ String │
│ [1,2,3] │ Array(UInt64) │
└─────────┴────────────────┘
将 Dynamic(max_types=N) 列转换为另一个 Dynamic(max_types=K)
如果 K >= N,则在转换过程中数据不会发生变化:
CREATE TABLE test (d Dynamic(max_types=3)) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), (43), ('42.42'), (true);
SELECT d::Dynamic(max_types=5) as d2, dynamicType(d2) FROM test;
┌─d─────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ 42 │ Int64 │
│ 43 │ Int64 │
│ 42.42 │ String │
│ true │ Bool │
└───────┴────────────────┘
如果 K < N,则类型最少见的值会被插入到一个特殊的单一子列中,但仍可访问:
CREATE TABLE test (d Dynamic(max_types=4)) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), (43), ('42.42'), (true), ([1, 2, 3]);
SELECT d, dynamicType(d), d::Dynamic(max_types=2) as d2, dynamicType(d2), isDynamicElementInSharedData(d2) FROM test;
┌─d───────┬─dynamicType(d)─┬─d2──────┬─dynamicType(d2)─┬─isDynamicElementInSharedData(d2)─┐
│ ᴺᵁᴸᴸ │ None │ ᴺᵁᴸᴸ │ None │ false │
│ 42 │ Int64 │ 42 │ Int64 │ false │
│ 43 │ Int64 │ 43 │ Int64 │ false │
│ 42.42 │ String │ 42.42 │ String │ false │
│ true │ Bool │ true │ Bool │ true │
│ [1,2,3] │ Array(Int64) │ [1,2,3] │ Array(Int64) │ true │
└─────────┴────────────────┴─────────┴─────────────────┴──────────────────────────────────┘
Functions isDynamicElementInSharedData returns true for rows that are stored in a special shared data structure inside Dynamic and as we can see, resulting column contains only 2 types that are not stored in shared data structure.
If K=0, all types will be inserted into single special subcolumn:
CREATE TABLE test (d Dynamic(max_types=4)) ENGINE = Memory;
INSERT INTO test VALUES (NULL), (42), (43), ('42.42'), (true), ([1, 2, 3]);
SELECT d, dynamicType(d), d::Dynamic(max_types=0) as d2, dynamicType(d2), isDynamicElementInSharedData(d2) FROM test;
┌─d───────┬─dynamicType(d)─┬─d2──────┬─dynamicType(d2)─┬─isDynamicElementInSharedData(d2)─┐
│ ᴺᵁᴸᴸ │ None │ ᴺᵁᴸᴸ │ None │ false │
│ 42 │ Int64 │ 42 │ Int64 │ true │
│ 43 │ Int64 │ 43 │ Int64 │ true │
│ 42.42 │ String │ 42.42 │ String │ true │
│ true │ Bool │ true │ Bool │ true │
│ [1,2,3] │ Array(Int64) │ [1,2,3] │ Array(Int64) │ true │
└─────────┴────────────────┴─────────┴─────────────────┴──────────────────────────────────┘
Reading Dynamic 类型 from the data
All text formats (TSV, CSV, CustomSeparated, Values, JSONEachRow, etc) supports reading Dynamic 类型. During data parsing ClickHouse tries to infer the type of each value and use it during insertion to Dynamic column.
Example:
SELECT
d,
dynamicType(d),
dynamicElement(d, 'String') AS str,
dynamicElement(d, 'Int64') AS num,
dynamicElement(d, 'Float64') AS float,
dynamicElement(d, 'Date') AS date,
dynamicElement(d, 'Array(Int64)') AS arr
FROM format(JSONEachRow, 'd Dynamic', $$
{"d" : "Hello, World!"},
{"d" : 42},
{"d" : 42.42},
{"d" : "2020-01-01"},
{"d" : [1, 2, 3]}
$$)
┌─d─────────────┬─dynamicType(d)─┬─str───────────┬──num─┬─float─┬───────date─┬─arr─────┐
│ Hello, World! │ String │ Hello, World! │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42 │ Int64 │ ᴺᵁᴸᴸ │ 42 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [] │
│ 42.42 │ Float64 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ 42.42 │ ᴺᵁᴸᴸ │ [] │
│ 2020-01-01 │ Date │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ 2020-01-01 │ [] │
│ [1,2,3] │ Array(Int64) │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ [1,2,3] │
└───────────────┴────────────────┴───────────────┴──────┴───────┴────────────┴─────────┘
Using Dynamic type in functions
Most of the functions support arguments with type Dynamic. In this case the function is executed separately on each internal data type stored inside Dynamic column.
When the result type of the function depends on the arguments types, the result of such function executed with Dynamic arguments will be Dynamic. When the result type of the function doesn't depend on the arguments types - the result will be Nullable(T) where T the usual result type of this function.
示例:
CREATE TABLE test (d Dynamic) ENGINE=Memory;
INSERT INTO test VALUES (NULL), (1::Int8), (2::Int16), (3::Int32), (4::Int64);
SELECT d, dynamicType(d) FROM test;
┌─d────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ 1 │ Int8 │
│ 2 │ Int16 │
│ 3 │ Int32 │
│ 4 │ Int64 │
└──────┴────────────────┘
SELECT d, d + 1 AS res, toTypeName(res), dynamicType(res) FROM test;
┌─d────┬─res──┬─toTypeName(res)─┬─dynamicType(res)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Dynamic │ None │
│ 1 │ 2 │ Dynamic │ Int16 │
│ 2 │ 3 │ Dynamic │ Int32 │
│ 3 │ 4 │ Dynamic │ Int64 │
│ 4 │ 5 │ Dynamic │ Int64 │
└──────┴──────┴─────────────────┴──────────────────┘
SELECT d, d + d AS res, toTypeName(res), dynamicType(res) FROM test;
┌─d────┬─res──┬─toTypeName(res)─┬─dynamicType(res)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Dynamic │ None │
│ 1 │ 2 │ Dynamic │ Int16 │
│ 2 │ 4 │ Dynamic │ Int32 │
│ 3 │ 6 │ Dynamic │ Int64 │
│ 4 │ 8 │ Dynamic │ Int64 │
└──────┴──────┴─────────────────┴──────────────────┘
SELECT d, d < 3 AS res, toTypeName(res) FROM test;
┌─d────┬──res─┬─toTypeName(res)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Nullable(UInt8) │
│ 1 │ 1 │ Nullable(UInt8) │
│ 2 │ 1 │ Nullable(UInt8) │
│ 3 │ 0 │ Nullable(UInt8) │
│ 4 │ 0 │ Nullable(UInt8) │
└──────┴──────┴─────────────────┘
SELECT d, exp2(d) AS res, toTypeName(res) FROM test;
┌─d────┬──res─┬─toTypeName(res)───┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Nullable(Float64) │
│ 1 │ 2 │ Nullable(Float64) │
│ 2 │ 4 │ Nullable(Float64) │
│ 3 │ 8 │ Nullable(Float64) │
│ 4 │ 16 │ Nullable(Float64) │
└──────┴──────┴───────────────────┘
TRUNCATE TABLE test;
INSERT INTO test VALUES (NULL), ('str_1'), ('str_2');
SELECT d, dynamicType(d) FROM test;
┌─d─────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ str_1 │ String │
│ str_2 │ String │
└───────┴────────────────┘
SELECT d, upper(d) AS res, toTypeName(res) FROM test;
┌─d─────┬─res───┬─toTypeName(res)──┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Nullable(String) │
│ str_1 │ STR_1 │ Nullable(String) │
│ str_2 │ STR_2 │ Nullable(String) │
└───────┴───────┴──────────────────┘
SELECT d, extract(d, '([0-3])') AS res, toTypeName(res) FROM test;
┌─d─────┬─res──┬─toTypeName(res)──┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Nullable(String) │
│ str_1 │ 1 │ Nullable(String) │
│ str_2 │ 2 │ Nullable(String) │
└───────┴──────┴──────────────────┘
TRUNCATE TABLE test;
INSERT INTO test VALUES (NULL), ([1, 2]), ([3, 4]);
SELECT d, dynamicType(d) FROM test;
┌─d─────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ [1,2] │ Array(Int64) │
│ [3,4] │ Array(Int64) │
└───────┴────────────────┘
SELECT d, d[1] AS res, toTypeName(res), dynamicType(res) FROM test;
┌─d─────┬─res──┬─toTypeName(res)─┬─dynamicType(res)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Dynamic │ None │
│ [1,2] │ 1 │ Dynamic │ Int64 │
│ [3,4] │ 3 │ Dynamic │ Int64 │
└───────┴──────┴─────────────────┴──────────────────┘
If function cannot be executed on some type inside Dynamic column, the exception will be thrown:
INSERT INTO test VALUES (42), (43), ('str_1');
SELECT d, dynamicType(d) FROM test;
┌─d─────┬─dynamicType(d)─┐
│ 42 │ Int64 │
│ 43 │ Int64 │
│ str_1 │ String │
└───────┴────────────────┘
┌─d─────┬─dynamicType(d)─┐
│ ᴺᵁᴸᴸ │ None │
│ [1,2] │ Array(Int64) │
│ [3,4] │ Array(Int64) │
└───────┴────────────────┘
SELECT d, d + 1 AS res, toTypeName(res), dynamicType(d) FROM test;
Received exception:
Code: 43. DB::Exception: Illegal types Array(Int64) and UInt8 of arguments of function plus: while executing 'FUNCTION plus(__table1.d : 3, 1_UInt8 :: 1) -> plus(__table1.d, 1_UInt8) Dynamic : 0'. (ILLEGAL_TYPE_OF_ARGUMENT)
我们可以过滤掉不需要的类型:
SELECT d, d + 1 AS res, toTypeName(res), dynamicType(res) FROM test WHERE dynamicType(d) NOT IN ('String', 'Array(Int64)', 'None')
┌─d──┬─res─┬─toTypeName(res)─┬─dynamicType(res)─┐
│ 42 │ 43 │ Dynamic │ Int64 │
│ 43 │ 44 │ Dynamic │ Int64 │
└────┴─────┴─────────────────┴──────────────────┘
或者以子列形式提取所需类型:
SELECT d, d.Int64 + 1 AS res, toTypeName(res) FROM test;
┌─d─────┬──res─┬─toTypeName(res)─┐
│ 42 │ 43 │ Nullable(Int64) │
│ 43 │ 44 │ Nullable(Int64) │
│ str_1 │ ᴺᵁᴸᴸ │ Nullable(Int64) │
└───────┴──────┴─────────────────┘
┌─d─────┬──res─┬─toTypeName(res)─┐
│ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │ Nullable(Int64) │
│ [1,2] │ ᴺᵁᴸᴸ │ Nullable(Int64) │
│ [3,4] │ ᴺᵁᴸᴸ │ Nullable(Int64) │
└───────┴──────┴─────────────────┘
类型不匹配时的行为
设置 dynamic_throw_on_type_mismatch 用于控制:当函数应用于 Dynamic 列,而某一行实际存储的类型与该函数不兼容时,应如何处理:
true (默认) — 遇到第一个不兼容的行时抛出异常 (ILLEGAL_TYPE_OF_ARGUMENT) 。
false — 对不兼容的行返回 NULL,同时保留兼容行的结果。
示例:
CREATE TABLE test (d Dynamic) ENGINE = Memory;
INSERT INTO test VALUES ('world'), (123), (456);
-- Default (throw on mismatch): length() does not accept integers, so the query throws.
SELECT length(d) FROM test; -- throws ILLEGAL_TYPE_OF_ARGUMENT
-- With throw disabled: incompatible rows return NULL.
SET dynamic_throw_on_type_mismatch = false;
SELECT d, length(d) FROM test ORDER BY d::String NULLS LAST;
┌─d─────┬─length(d)─┐
│ world │ 5 │
│ 123 │ ᴺᵁᴸᴸ │
│ 456 │ ᴺᵁᴸᴸ │
└───────┴───────────┘
在 ORDER BY 和 GROUP BY 中使用 Dynamic 类型
在 ORDER BY 和 GROUP BY 中,Dynamic 类型的值的比较方式与 Variant 类型的值类似:
对于 Dynamic 类型中底层类型为 T1 的值 d1 和底层类型为 T2 的值 d2,运算符 < 的结果定义如下:
- 如果
T1 = T2 = T,结果为 d1.T < d2.T (将比较底层值) 。
- 如果
T1 != T2,结果为 T1 < T2 (将比较类型名称) 。
默认情况下,Dynamic 类型不允许用作 GROUP BY/ORDER BY 键;如果要使用它,请考虑其特殊的比较规则,并启用 allow_suspicious_types_in_group_by/allow_suspicious_types_in_order_by 设置。
示例:
CREATE TABLE test (d Dynamic) ENGINE=Memory;
INSERT INTO test VALUES (42), (43), ('abc'), ('abd'), ([1, 2, 3]), ([]), (NULL);
SELECT d, dynamicType(d) FROM test;
┌─d───────┬─dynamicType(d)─┐
│ 42 │ Int64 │
│ 43 │ Int64 │
│ abc │ String │
│ abd │ String │
│ [1,2,3] │ Array(Int64) │
│ [] │ Array(Int64) │
│ ᴺᵁᴸᴸ │ None │
└─────────┴────────────────┘
SELECT d, dynamicType(d) FROM test ORDER BY d SETTINGS allow_suspicious_types_in_order_by=1;
┌─d───────┬─dynamicType(d)─┐
│ [] │ Array(Int64) │
│ [1,2,3] │ Array(Int64) │
│ 42 │ Int64 │
│ 43 │ Int64 │
│ abc │ String │
│ abd │ String │
│ ᴺᵁᴸᴸ │ None │
└─────────┴────────────────┘
Note: values of dynamic types with different numeric types are considered as different values and not compared between each other, their type names are compared instead.
Example:
CREATE TABLE test (d Dynamic) ENGINE=Memory;
INSERT INTO test VALUES (1::UInt32), (1::Int64), (100::UInt32), (100::Int64);
SELECT d, dynamicType(d) FROM test ORDER BY d SETTINGS allow_suspicious_types_in_order_by=1;
┌─v───┬─dynamicType(v)─┐
│ 1 │ Int64 │
│ 100 │ Int64 │
│ 1 │ UInt32 │
│ 100 │ UInt32 │
└─────┴────────────────┘
SELECT d, dynamicType(d) FROM test GROUP BY d SETTINGS allow_suspicious_types_in_group_by=1;
┌─d───┬─dynamicType(d)─┐
│ 1 │ Int64 │
│ 100 │ UInt32 │
│ 1 │ UInt32 │
│ 100 │ Int64 │
└─────┴────────────────┘
**注意:**由于 Dynamic 类型的函数具有特殊行为,上述比较规则不会在执行 </>/= 等比较函数时应用。
Reaching the limit in number of different data types stored inside Dynamic
Dynamic data type can store only limited number of different data types as separate 子列. By default, this limit is 32, but you can change it in type declaration using syntax Dynamic(max_types=N) where N is between 0 and 254 (due to implementation details, it's impossible to have more than 254 different data types that can be stored as separate 子列 inside Dynamic).
When the limit is reached, all new data types inserted to Dynamic column will be inserted into a single 共享数据结构 that stores values with different data types in binary form.
Let's see what happens when the limit is reached in different scenarios.
Reaching the limit during data parsing
在从数据中解析 Dynamic 值时,当当前数据块达到限制后,所有新值都将被插入到共享数据结构中:
SELECT d, dynamicType(d), isDynamicElementInSharedData(d) FROM format(JSONEachRow, 'd Dynamic(max_types=3)', '
{"d" : 42}
{"d" : [1, 2, 3]}
{"d" : "Hello, World!"}
{"d" : "2020-01-01"}
{"d" : ["str1", "str2", "str3"]}
{"d" : {"a" : 1, "b" : [1, 2, 3]}}
')
┌─d──────────────────────┬─dynamicType(d)─────────────────┬─isDynamicElementInSharedData(d)─┐
│ 42 │ Int64 │ false │
│ [1,2,3] │ Array(Int64) │ false │
│ Hello, World! │ String │ false │
│ 2020-01-01 │ Date │ true │
│ ['str1','str2','str3'] │ Array(String) │ true │
│ (1,[1,2,3]) │ Tuple(a Int64, b Array(Int64)) │ true │
└────────────────────────┴────────────────────────────────┴─────────────────────────────────┘
As we can see, after inserting 3 different data types Int64, Array(Int64) and String all new types were inserted into special shared data structure.
During merges of 数据分区片段 in MergeTree table engines
During merge of several 数据分区片段 in MergeTree table the Dynamic column in the resulting 数据分区片段 can reach the limit of different data types that can be stored in separate subcolumns inside and won't be able to store all types as subcolumns from source parts.
In this case ClickHouse chooses what types will remain as separate subcolumns after merge and what types will be inserted into shared data structure. In most cases ClickHouse tries to keep the most frequent types and store the rarest types in shared data structure, but it depends on the implementation.
Let's see an example of such merge. First, let's create a table with Dynamic column, set the limit of different data types to 3 and insert values with 5 different types:
CREATE TABLE test (id UInt64, d Dynamic(max_types=3)) ENGINE=MergeTree ORDER BY id;
SYSTEM STOP MERGES test;
INSERT INTO test SELECT number, number FROM numbers(5);
INSERT INTO test SELECT number, range(number) FROM numbers(4);
INSERT INTO test SELECT number, toDate(number) FROM numbers(3);
INSERT INTO test SELECT number, map(number, number) FROM numbers(2);
INSERT INTO test SELECT number, 'str_' || toString(number) FROM numbers(1);
Each insert will create a separate 数据分区片段 with Dynamic column containing single type:
SELECT count(), dynamicType(d), isDynamicElementInSharedData(d), _part FROM test GROUP BY _part, dynamicType(d), isDynamicElementInSharedData(d) ORDER BY _part, count();
┌─count()─┬─dynamicType(d)──────┬─isDynamicElementInSharedData(d)─┬─_part─────┐
│ 5 │ UInt64 │ false │ all_1_1_0 │
│ 4 │ Array(UInt64) │ false │ all_2_2_0 │
│ 3 │ Date │ false │ all_3_3_0 │
│ 2 │ Map(UInt64, UInt64) │ false │ all_4_4_0 │
│ 1 │ String │ false │ all_5_5_0 │
└─────────┴─────────────────────┴─────────────────────────────────┴───────────┘
现在,我们把所有 parts 合并成一个,看看会发生什么:
SYSTEM START MERGES test;
OPTIMIZE TABLE test FINAL;
SELECT count(), dynamicType(d), isDynamicElementInSharedData(d), _part FROM test GROUP BY _part, dynamicType(d), isDynamicElementInSharedData(d) ORDER BY _part, count() desc;
┌─count()─┬─dynamicType(d)──────┬─isDynamicElementInSharedData(d)─┬─_part─────┐
│ 5 │ UInt64 │ false │ all_1_5_2 │
│ 4 │ Array(UInt64) │ false │ all_1_5_2 │
│ 3 │ Date │ false │ all_1_5_2 │
│ 2 │ Map(UInt64, UInt64) │ true │ all_1_5_2 │
│ 1 │ String │ true │ all_1_5_2 │
└─────────┴─────────────────────┴─────────────────────────────────┴───────────┘
正如我们所见,ClickHouse 将最常见的类型 UInt64 和 Array(UInt64) 保留为子列,并将所有其他类型都写入共享数据中。
All JSONExtract* functions support Dynamic 类型:
SELECT JSONExtract('{"a" : [1, 2, 3]}', 'a', 'Dynamic') AS dynamic, dynamicType(dynamic) AS dynamic_type;
┌─dynamic─┬─dynamic_type───────────┐
│ [1,2,3] │ Array(Nullable(Int64)) │
└─────────┴────────────────────────┘
SELECT JSONExtract('{"obj" : {"a" : 42, "b" : "Hello", "c" : [1,2,3]}}', 'obj', 'Map(String, Dynamic)') AS map_of_dynamics, mapApply((k, v) -> (k, dynamicType(v)), map_of_dynamics) AS map_of_dynamic_types
┌─map_of_dynamics──────────────────┬─map_of_dynamic_types────────────────────────────────────┐
│ {'a':42,'b':'Hello','c':[1,2,3]} │ {'a':'Int64','b':'String','c':'Array(Nullable(Int64))'} │
└──────────────────────────────────┴─────────────────────────────────────────────────────────┘
SELECT JSONExtractKeysAndValues('{"a" : 42, "b" : "Hello", "c" : [1,2,3]}', 'Dynamic') AS dynamics, arrayMap(x -> (x.1, dynamicType(x.2)), dynamics) AS dynamic_types```
┌─dynamics───────────────────────────────┬─dynamic_types─────────────────────────────────────────────────┐
│ [('a',42),('b','Hello'),('c',[1,2,3])] │ [('a','Int64'),('b','String'),('c','Array(Nullable(Int64))')] │
└────────────────────────────────────────┴───────────────────────────────────────────────────────────────┘
在 RowBinary 格式中,Dynamic 类型的值以如下格式序列化:
<binary_encoded_data_type><value_in_binary_format_according_to_the_data_type>