BSONEachRow
Input | Output | Alias |
---|---|---|
✔ | ✔ | |
Description
The BSONEachRow format parses data as a sequence of Binary JSON (BSON) documents without any separator between them.
Each row is formatted as a single document and each column is formatted as a single BSON document field with the column name as a key.
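The layout described above can be illustrated with a minimal Python sketch. It is not ClickHouse's encoder: the helper names (encode_element, encode_row, encode_rows, split_documents) are hypothetical, only a few value types are handled, and the BSON type is picked from the Python value rather than from a column type. Strings are written as \x02 string here, which is the output form used only when output_format_bson_string_as_string is enabled. The byte layout follows the public BSON specification, where every document starts with its own little-endian int32 length; that length prefix is what makes it possible to read documents back to back without a separator.

```python
import struct


def encode_element(name, value):
    """Encode one field as a BSON element: type byte, key cstring, value."""
    key = name.encode("utf-8") + b"\x00"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        if -2**31 <= value < 2**31:
            return b"\x10" + key + struct.pack("<i", value)  # int32
        return b"\x12" + key + struct.pack("<q", value)      # int64
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)      # double
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # string: int32 byte length (including the trailing NUL), then bytes
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def encode_row(row):
    """Encode one row (dict of column name -> value) as one BSON document."""
    body = b"".join(encode_element(k, v) for k, v in row.items())
    # Total size covers the 4-byte length itself and the final NUL terminator.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


def encode_rows(rows):
    """Concatenate the documents with no separator between them."""
    return b"".join(encode_row(r) for r in rows)


def split_documents(data):
    """Split a byte stream into documents using each length prefix."""
    docs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated length prefix")
        (size,) = struct.unpack_from("<i", data, offset)
        if size < 5 or offset + size > len(data):
            raise ValueError("invalid document size")
        docs.append(data[offset:offset + size])
        offset += size
    return docs
```

Calling encode_rows on a list of dicts yields one byte string, and split_documents recovers the individual documents from it by reading each length prefix in turn.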
Data Types Matching
For output it uses the following correspondence between ClickHouse types and BSON types:
ClickHouse type | BSON Type |
---|---|
Bool | \x08 boolean |
Int8/UInt8/Enum8 | \x10 int32 |
Int16/UInt16/Enum16 | \x10 int32 |
Int32 | \x10 int32 |
UInt32 | \x12 int64 |
Int64/UInt64 | \x12 int64 |
Float32/Float64 | \x01 double |
Date/Date32 | \x10 int32 |
DateTime | \x12 int64 |
DateTime64 | \x09 datetime |
Decimal32 | \x10 int32 |
Decimal64 | \x12 int64 |
Decimal128 | \x05 binary, \x00 binary subtype, size = 16 |
Decimal256 | \x05 binary, \x00 binary subtype, size = 32 |
Int128/UInt128 | \x05 binary, \x00 binary subtype, size = 16 |
Int256/UInt256 | \x05 binary, \x00 binary subtype, size = 32 |
String/FixedString | \x05 binary, \x00 binary subtype; or \x02 string if the setting output_format_bson_string_as_string is enabled |
UUID | \x05 binary, \x04 uuid subtype, size = 16 |
Array | \x04 array |
Tuple | \x04 array |
Named Tuple | \x03 document |
Map | \x03 document |
IPv4 | \x10 int32 |
IPv6 | \x05 binary, \x00 binary subtype |
For input it uses the following correspondence between BSON types and ClickHouse types:
BSON Type | ClickHouse Type |
---|---|
\x01 double | Float32/Float64 |
\x02 string | String/FixedString |
\x03 document | Map/Named Tuple |
\x04 array | Array/Tuple |
\x05 binary, \x00 binary subtype | String/FixedString/IPv6 |
\x05 binary, \x02 old binary subtype | String/FixedString |
\x05 binary, \x03 old uuid subtype | UUID |
\x05 binary, \x04 uuid subtype | UUID |
\x07 ObjectId | String/FixedString |
\x08 boolean | Bool |
\x09 datetime | DateTime64 |
\x0A null value | NULL |
\x0D JavaScript code | String/FixedString |
\x0E symbol | String/FixedString |
\x10 int32 | Int32/UInt32/Decimal32/IPv4/Enum8/Enum16 |
\x12 int64 | Int64/UInt64/Decimal64/DateTime64 |
Other BSON types are not supported. Additionally, it performs conversion between different integer types.
For example, it is possible to insert a BSON int32 value into ClickHouse as UInt8.
Big integers and decimals such as Int128/UInt128/Int256/UInt256/Decimal128/Decimal256 can be parsed from a BSON Binary value with the \x00 binary subtype.
In this case, the format will validate that the size of the binary data equals the size of the expected value.
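A minimal Python sketch of the size check described above, for a big integer carried in a BSON Binary value. The function names are hypothetical, and the little-endian byte order is an assumption made for illustration; it is not stated in this document beyond the format's lack of Big-Endian support. The sketch treats the payload as a plain integer and does not model decimal scale.

```python
def big_int_to_binary(value, size, signed=True):
    """Pack an integer into `size` bytes (16 for 128-bit, 32 for 256-bit).

    Assumption: little-endian byte order.
    """
    return value.to_bytes(size, byteorder="little", signed=signed)


def parse_big_int(payload, expected_size, signed=True):
    """Parse a binary payload, rejecting it if the size does not match."""
    if len(payload) != expected_size:
        raise ValueError(
            f"binary size {len(payload)} does not match "
            f"expected size {expected_size}"
        )
    return int.from_bytes(payload, byteorder="little", signed=signed)
```

With these helpers, a 16-byte payload parses as a 128-bit value, while passing the same payload with an expected size of 32 raises ValueError, mirroring the validation step.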
This format does not work properly on Big-Endian platforms.
Example Usage
Format Settings
Setting | Description | Default |
---|---|---|
output_format_bson_string_as_string | Use BSON String type instead of Binary for String columns. | false |
input_format_bson_skip_fields_with_unsupported_types_in_schema_inference | Allow skipping columns with unsupported types during schema inference for the BSONEachRow format. | false |