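Formats for Input and Output Data

ClickHouse supports most of the known text and binary data formats. This makes it easy to integrate ClickHouse into almost any working data pipeline and take advantage of its strengths.

Input formats

Input formats are used for:

Parsing data provided to INSERT statements
Performing SELECT queries from file-backed tables such as File, URL, or HDFS
Reading dictionaries

Choosing the right input format is crucial for efficient data ingestion in ClickHouse. With over 70 supported formats, selecting the most performant option can significantly affect insert speed, CPU and memory usage, and overall system efficiency. To help navigate these choices, we benchmarked ingestion performance across formats, which revealed the following key takeaways:

The Native format is the most efficient input format, offering the best compression, the lowest resource usage, and minimal server-side processing overhead.
Compression is essential: LZ4 reduces data size at minimal CPU cost, while ZSTD offers a higher compression ratio at the expense of additional CPU usage.
Pre-sorting has a moderate impact, because ClickHouse already sorts data efficiently.
Batching significantly improves efficiency: larger batches reduce insert overhead and improve throughput.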
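
As an illustration of the batching takeaway, the following minimal sketch groups rows into newline-delimited JSON payloads (one JSON object per line, as JSONEachRow expects), so that each insert carries many rows instead of one. The helper name `json_each_row_batches` and the sample rows are hypothetical, and JSONEachRow is used here only because it is easy to read; it is not a claim about which format performs best.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def json_each_row_batches(rows: Iterable[dict], batch_size: int) -> Iterator[str]:
    """Yield newline-delimited JSON payloads, one payload per batch of rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        # Take up to batch_size rows from the input stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        # One JSON object per line, each line terminated by "\n".
        yield "".join(json.dumps(row) + "\n" for row in batch)


# Hypothetical sample data: five rows split into batches of two.
rows = [{"id": i, "name": f"user{i}"} for i in range(5)]
payloads = list(json_each_row_batches(rows, batch_size=2))
```

Each payload could then be sent as the body of a single INSERT. In practice the batch size would be far larger than two.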

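For a deep dive into the results and best practices, read the full benchmark analysis. For the full test results, explore the FastFormats online dashboard.

Output formats

Formats supported for output are used for:

Arranging the results of a SELECT query
Performing INSERT operations into file-backed tables

Formats overview

The supported formats are:

Format	Input	Output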
TabSeparated	✔	✔
TabSeparatedRaw	✔	✔
TabSeparatedWithNames	✔	✔
TabSeparatedWithNamesAndTypes	✔	✔
TabSeparatedRawWithNames	✔	✔
TabSeparatedRawWithNamesAndTypes	✔	✔
Template	✔	✔
TemplateIgnoreSpaces	✔	✗
CSV	✔	✔
CSVWithNames	✔	✔
CSVWithNamesAndTypes	✔	✔
CustomSeparated	✔	✔
CustomSeparatedWithNames	✔	✔
CustomSeparatedWithNamesAndTypes	✔	✔
SQLInsert	✗	✔
Values	✔	✔
Vertical	✗	✔
JSON	✔	✔
JSONAsString	✔	✗
JSONAsObject	✔	✗
JSONStrings	✔	✔
JSONColumns	✔	✔
JSONColumnsWithMetadata	✔	✔
JSONCompact	✔	✔
JSONCompactStrings	✗	✔
JSONCompactColumns	✔	✔
JSONEachRow	✔	✔
PrettyJSONEachRow	✗	✔
JSONEachRowWithProgress	✗	✔
JSONStringsEachRow	✔	✔
JSONStringsEachRowWithProgress	✗	✔
JSONCompactEachRow	✔	✔
JSONCompactEachRowWithNames	✔	✔
JSONCompactEachRowWithNamesAndTypes	✔	✔
JSONCompactEachRowWithProgress	✗	✔
JSONCompactStringsEachRow	✔	✔
JSONCompactStringsEachRowWithNames	✔	✔
JSONCompactStringsEachRowWithNamesAndTypes	✔	✔
JSONCompactStringsEachRowWithProgress	✗	✔
JSONObjectEachRow	✔	✔
BSONEachRow	✔	✔
TSKV	✔	✔
Pretty	✗	✔
PrettyNoEscapes	✗	✔
PrettyMonoBlock	✗	✔
PrettyNoEscapesMonoBlock	✗	✔
PrettyCompact	✗	✔
PrettyCompactNoEscapes	✗	✔
PrettyCompactMonoBlock	✗	✔
PrettyCompactNoEscapesMonoBlock	✗	✔
PrettySpace	✗	✔
PrettySpaceNoEscapes	✗	✔
PrettySpaceMonoBlock	✗	✔
PrettySpaceNoEscapesMonoBlock	✗	✔
Prometheus	✗	✔
Protobuf	✔	✔
ProtobufSingle	✔	✔
ProtobufList	✔	✔
Avro	✔	✔
AvroConfluent	✔	✗
Parquet	✔	✔
ParquetMetadata	✔	✗
Arrow	✔	✔
ArrowStream	✔	✔
ORC	✔	✔
One	✔	✗
Npy	✔	✔
RowBinary	✔	✔
RowBinaryWithNames	✔	✔
RowBinaryWithNamesAndTypes	✔	✔
RowBinaryWithDefaults	✔	✗
Native	✔	✔
Null	✗	✔
XML	✗	✔
CapnProto	✔	✔
LineAsString	✔	✔
Regexp	✔	✗
RawBLOB	✔	✔
MsgPack	✔	✔
MySQLDump	✔	✗
DWARF	✔	✗
Markdown	✗	✔
Form	✔	✗

You can control some format processing parameters through ClickHouse settings. For more information, read the Settings section.

Template

See Template

TemplateIgnoreSpaces

See TemplateIgnoreSpaces

TSKV

See TSKV

CSV

See CSV

CSVWithNames

See CSVWithNames

CSVWithNamesAndTypes

See CSVWithNamesAndTypes

CustomSeparated

See CustomSeparated

CustomSeparatedWithNames

See CustomSeparatedWithNames

CustomSeparatedWithNamesAndTypes

See CustomSeparatedWithNamesAndTypes

SQLInsert

See SQLInsert

JSON

See JSON

JSONStrings

See JSONStrings

JSONColumns

See JSONColumns

JSONColumnsWithMetadata

See JSONColumnsWithMetadata

JSONAsString

See JSONAsString

JSONAsObject

See JSONAsObject

JSONCompact

See JSONCompact

JSONCompactStrings

See JSONCompactStrings

JSONCompactColumns

See JSONCompactColumns

JSONEachRow

See JSONEachRow

PrettyJSONEachRow

See PrettyJSONEachRow

JSONStringsEachRow

See JSONStringsEachRow

JSONCompactEachRow

See JSONCompactEachRow

JSONCompactStringsEachRow

See JSONCompactStringsEachRow

JSONEachRowWithProgress

See JSONEachRowWithProgress

JSONStringsEachRowWithProgress

See JSONStringsEachRowWithProgress

JSONCompactEachRowWithNames

See JSONCompactEachRowWithNames

JSONCompactEachRowWithNamesAndTypes

See JSONCompactEachRowWithNamesAndTypes

JSONCompactEachRowWithProgress

Similar to JSONEachRowWithProgress, but outputs row events in a compact form, as in the JSONCompactEachRow format.

JSONCompactStringsEachRowWithNames

See JSONCompactStringsEachRowWithNames

JSONCompactStringsEachRowWithNamesAndTypes

See JSONCompactStringsEachRowWithNamesAndTypes

JSONObjectEachRow

See JSONObjectEachRow

JSON format settings

See JSON format settings

BSONEachRow

See BSONEachRow

Native

See Native

Null

See Null

Pretty

See Pretty

PrettyNoEscapes

See PrettyNoEscapes

PrettyMonoBlock

See PrettyMonoBlock

PrettyNoEscapesMonoBlock

See PrettyNoEscapesMonoBlock

PrettyCompact

See PrettyCompact

PrettyCompactNoEscapes

See PrettyCompactNoEscapes

PrettyCompactMonoBlock

See PrettyCompactMonoBlock

PrettyCompactNoEscapesMonoBlock

See PrettyCompactNoEscapesMonoBlock

PrettySpace

See PrettySpace

PrettySpaceNoEscapes

See PrettySpaceNoEscapes

PrettySpaceMonoBlock

See PrettySpaceMonoBlock

PrettySpaceNoEscapesMonoBlock

See PrettySpaceNoEscapesMonoBlock

RowBinary

See RowBinary

RowBinaryWithNames

See RowBinaryWithNames

RowBinaryWithNamesAndTypes

See RowBinaryWithNamesAndTypes

RowBinaryWithDefaults

See RowBinaryWithDefaults

Values

See Values

Vertical

See Vertical

XML

See XML

CapnProto

See CapnProto

Prometheus

See Prometheus

Protobuf

See Protobuf

ProtobufSingle

See ProtobufSingle

ProtobufList

See ProtobufList

Avro

See Avro

AvroConfluent

See AvroConfluent

Parquet

See Parquet

ParquetMetadata

See ParquetMetadata

Arrow

See Arrow

ArrowStream

See ArrowStream

ORC

See ORC

One

See One

Npy

See Npy

LineAsString

See LineAsString

Regexp

See Regexp

RawBLOB

See RawBLOB

Markdown

See Markdown

MsgPack

See MsgPack

MySQLDump

See MySQLDump

DWARF

See DWARF

Form

See Form

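Format schema

The name of the file containing the format schema is set with the format_schema setting. This setting is required when using the Cap'n Proto or Protobuf formats. The format schema is a combination of a file name and the name of a message type in that file, separated by a colon, for example schemafile.proto:MessageType. If the file has the standard extension for the format (for example, .proto for Protobuf), the extension can be omitted, in which case the format schema looks like schemafile:MessageType.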
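
To make the `file:MessageType` convention concrete, here is a minimal sketch of how such a value could be split and how an omitted extension could be filled in. It is not ClickHouse's implementation: the function `parse_format_schema`, the `FormatSchema` type, and the `DEFAULT_EXTENSIONS` table are hypothetical, and the `.capnp` entry is an assumption (the text above only names `.proto`).

```python
import os
from typing import NamedTuple


class FormatSchema(NamedTuple):
    schema_file: str
    message_type: str


# Assumed standard extensions per format; only ".proto" is stated in the text.
DEFAULT_EXTENSIONS = {"Protobuf": ".proto", "CapnProto": ".capnp"}


def parse_format_schema(value: str, format_name: str) -> FormatSchema:
    """Split 'file:MessageType' and append the default extension if omitted."""
    # Split on the last colon so the message type is always the final part.
    file_part, sep, message_type = value.rpartition(":")
    if not sep or not file_part or not message_type:
        raise ValueError(f"expected 'file:MessageType', got {value!r}")
    # If the file name has no extension, use the format's standard one.
    if not os.path.splitext(file_part)[1]:
        file_part += DEFAULT_EXTENSIONS[format_name]
    return FormatSchema(file_part, message_type)


explicit = parse_format_schema("schemafile.proto:MessageType", "Protobuf")
implicit = parse_format_schema("schemafile:MessageType", "Protobuf")
```

Under these assumptions both calls resolve to the same schema file and message type.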

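If you input or output data through the client in interactive mode, the file name specified in the format schema can contain an absolute path or a path relative to the client's current directory. If you use the client in batch mode, the path to the schema must be relative, for security reasons.

If you input or output data through the HTTP interface, the file name specified in the format schema must be located in the directory specified by format_schema_path in the server configuration.

Skipping errors

Some formats, such as CSV, TabSeparated, TSKV, JSONEachRow, Template, CustomSeparated, and Protobuf, can skip a broken row when a parsing error occurs and continue parsing from the beginning of the next row. See the input_format_allow_errors_num and input_format_allow_errors_ratio settings. Limitations:

In case of a parsing error, JSONEachRow skips all data until the next newline (or EOF), so rows must be delimited by \n for errors to be counted correctly.
Template and CustomSeparated use the delimiter after the last column and the delimiter between rows to find the beginning of the next row, so skipping errors works only if at least one of them is not empty.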
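
The JSONEachRow limitation can be illustrated with a simplified model of error skipping. This sketch is an assumption-laden approximation, not the server's logic: the function `parse_json_each_row` is hypothetical, its `allow_errors_num` parameter only borrows the name of the setting, and the ratio-based setting is ignored entirely. It shows why newline-delimited rows matter: a broken row is discarded up to the next `\n`, and parsing resumes on the following line.

```python
import json
from typing import List, Tuple


def parse_json_each_row(data: str, allow_errors_num: int = 0) -> Tuple[List[dict], int]:
    """Parse one JSON object per line, skipping up to allow_errors_num bad lines."""
    rows: List[dict] = []
    errors = 0
    # Rows are assumed to be separated by "\n"; a bad row is skipped as a whole line.
    for line in data.split("\n"):
        if not line.strip():
            continue  # ignore empty lines, including the one after a trailing "\n"
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError as exc:
            errors += 1
            if errors > allow_errors_num:
                raise ValueError(f"too many parsing errors ({errors})") from exc
    return rows, errors


# Hypothetical input: the second row is malformed.
sample = '{"id": 1}\n{"id": broken\n{"id": 3}\n'
rows, errors = parse_json_each_row(sample, allow_errors_num=1)
```

With `allow_errors_num=1` the malformed middle row is dropped and the other two are kept; with the default of 0 the same input raises an error.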
