Arrow
| Input | Output | Alias |
|-------|--------|-------|
| ✔     | ✔      |       |
Description
Apache Arrow comes with two built-in columnar storage formats. ClickHouse supports read and write operations for these formats.
Arrow is Apache Arrow's "file mode" format. It is designed for in-memory random access.
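As a rough illustration of what "file mode" means on disk: the Arrow IPC file layout begins and ends with the magic bytes ARROW1, and the trailing footer is what enables random access. The sketch below is only a heuristic check under that assumption; the function name `looks_like_arrow_file` is hypothetical and not part of ClickHouse or Arrow.

```python
import os

ARROW_MAGIC = b"ARROW1"  # magic bytes that open and close an Arrow IPC file


def looks_like_arrow_file(path):
    """Heuristic: return True if the file starts and ends with the Arrow magic.

    This does not validate the schema or the footer; it only inspects the
    first and last six bytes of the file.
    """
    if os.path.getsize(path) < 2 * len(ARROW_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(ARROW_MAGIC))
        f.seek(-len(ARROW_MAGIC), os.SEEK_END)
        tail = f.read(len(ARROW_MAGIC))
    return head == ARROW_MAGIC and tail == ARROW_MAGIC
```

A payload written in Arrow's stream mode would normally fail this check, which can help explain a parse error when the wrong format name is used.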
Data Types Matching
The table below shows the supported data types and how they correspond to ClickHouse data types in INSERT and SELECT queries.
| Arrow data type (INSERT) | ClickHouse data type | Arrow data type (SELECT) |
|--------------------------|----------------------|--------------------------|
| BOOL | Bool | BOOL |
| UINT8, BOOL | UInt8 | UINT8 |
| INT8 | Int8/Enum8 | INT8 |
| UINT16 | UInt16 | UINT16 |
| INT16 | Int16/Enum16 | INT16 |
| UINT32 | UInt32 | UINT32 |
| INT32 | Int32 | INT32 |
| UINT64 | UInt64 | UINT64 |
| INT64 | Int64 | INT64 |
| FLOAT, HALF_FLOAT | Float32 | FLOAT32 |
| DOUBLE | Float64 | FLOAT64 |
| DATE32 | Date32 | UINT16 |
| DATE64 | DateTime | UINT32 |
| TIMESTAMP, TIME32, TIME64 | DateTime64 | UINT32 |
| STRING, BINARY | String | BINARY |
| STRING, BINARY, FIXED_SIZE_BINARY | FixedString | FIXED_SIZE_BINARY |
| DECIMAL | Decimal | DECIMAL |
| DECIMAL256 | Decimal256 | DECIMAL256 |
| LIST | Array | LIST |
| STRUCT | Tuple | STRUCT |
| MAP | Map | MAP |
| UINT32 | IPv4 | UINT32 |
| FIXED_SIZE_BINARY, BINARY | IPv6 | FIXED_SIZE_BINARY |
| FIXED_SIZE_BINARY, BINARY | Int128/UInt128/Int256/UInt256 | FIXED_SIZE_BINARY |
Arrays can be nested and can have a value of the Nullable type as an argument. Tuple and Map types can also be nested.
The DICTIONARY type is supported for INSERT queries. For SELECT queries, the output_format_arrow_low_cardinality_as_dictionary setting lets you output the LowCardinality type as a DICTIONARY type.
Unsupported Arrow data types: FIXED_SIZE_BINARY, JSON, UUID, ENUM.
The data types of ClickHouse table columns do not have to match the corresponding Arrow data fields. When inserting data, ClickHouse interprets data types according to the table above and then casts the data to the data type set for the ClickHouse table column.
Example Usage
Inserting Data
You can insert Arrow data from a file into a ClickHouse table by piping the file to clickhouse-client with an INSERT INTO ... FORMAT Arrow query.
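A minimal Python sketch of such an insert, assuming `clickhouse-client` is on the PATH and can reach the server with its default connection options. The table name `some_table`, the file name `filename.arrow`, and both helper functions are hypothetical placeholders, not names from ClickHouse.

```python
import subprocess
from typing import List


def build_insert_command(table: str) -> List[str]:
    """Build the clickhouse-client argument list for an Arrow INSERT.

    The table name is interpolated as-is, so pass only trusted identifiers.
    """
    return ["clickhouse-client", "--query", f"INSERT INTO {table} FORMAT Arrow"]


def insert_arrow_file(path: str, table: str) -> None:
    """Stream the Arrow file at `path` to clickhouse-client's stdin."""
    with open(path, "rb") as arrow_file:
        # check=True raises CalledProcessError if the client exits non-zero.
        subprocess.run(build_insert_command(table), stdin=arrow_file, check=True)
```

Calling `insert_arrow_file("filename.arrow", "some_table")` would be roughly equivalent to piping the file into the client from a shell.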
Selecting Data
You can select data from a ClickHouse table and save it to a file in the Arrow format by running a SELECT ... FORMAT Arrow query with clickhouse-client and redirecting its output to the file.
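A comparable sketch for the export direction, again assuming `clickhouse-client` is on the PATH with working default connection options. `some_table`, `filename.arrow`, and the helper names are hypothetical placeholders.

```python
import subprocess
from typing import List


def build_select_command(table: str) -> List[str]:
    """Build the clickhouse-client argument list for an Arrow SELECT.

    The table name is interpolated as-is, so pass only trusted identifiers.
    """
    return ["clickhouse-client", "--query", f"SELECT * FROM {table} FORMAT Arrow"]


def export_table_to_arrow(table: str, path: str) -> None:
    """Write the query result, which is binary Arrow data, to `path`."""
    with open(path, "wb") as out:
        # check=True raises CalledProcessError if the client exits non-zero.
        subprocess.run(build_select_command(table), stdout=out, check=True)
```

Calling `export_table_to_arrow("some_table", "filename.arrow")` would correspond to redirecting the client's standard output to a file from a shell.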
Format Settings
| Setting | Description | Default |
|---------|-------------|---------|
| input_format_arrow_allow_missing_columns | Allow missing columns when reading Arrow input formats. | 1 |
| input_format_arrow_case_insensitive_column_matching | Ignore case when matching Arrow columns with ClickHouse columns. | 0 |
| input_format_arrow_import_nested | Obsolete setting, does nothing. | 0 |
| input_format_arrow_skip_columns_with_unsupported_types_in_schema_inference | Skip columns with unsupported types during schema inference for the Arrow format. | 0 |
| output_format_arrow_compression_method | Compression method for the Arrow output format. Supported codecs: lz4_frame, zstd, none (uncompressed). | lz4_frame |
| output_format_arrow_fixed_string_as_fixed_byte_array | Use the Arrow FIXED_SIZE_BINARY type instead of Binary for FixedString columns. | 1 |
| output_format_arrow_low_cardinality_as_dictionary | Output the LowCardinality type as the Arrow Dictionary type. | 0 |
| output_format_arrow_string_as_string | Use the Arrow String type instead of Binary for String columns. | 1 |
| output_format_arrow_use_64_bit_indexes_for_dictionary | Always use 64-bit integers for dictionary indexes in the Arrow format. | 0 |
| output_format_arrow_use_signed_indexes_for_dictionary | Use signed integers for dictionary indexes in the Arrow format. | 1 |
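One way to apply these settings is to pass them to `clickhouse-client` as `--name=value` command-line options. The sketch below only builds such an argument list for an Arrow export; the table name `some_table` and the helper `arrow_export_command` are hypothetical, and the setting values are chosen purely for illustration.

```python
from typing import Dict, List, Union


def arrow_export_command(table: str, settings: Dict[str, Union[str, int]]) -> List[str]:
    """Build a clickhouse-client command that exports `table` as Arrow.

    Each entry in `settings` becomes a `--name=value` option, in dict order.
    """
    flags = [f"--{name}={value}" for name, value in settings.items()]
    query = f"SELECT * FROM {table} FORMAT Arrow"
    return ["clickhouse-client", *flags, "--query", query]


# Illustrative values: zstd compression and dictionary-encoded LowCardinality.
command = arrow_export_command(
    "some_table",
    {
        "output_format_arrow_compression_method": "zstd",
        "output_format_arrow_low_cardinality_as_dictionary": 1,
    },
)
```

The resulting list can be handed to `subprocess.run` with standard output redirected to a file.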