Insert Local Files
You can use clickhouse-client to stream local files into your ClickHouse service. This lets you preprocess the data with the many powerful and convenient ClickHouse functions as it is inserted. Let's look at an example...
- Suppose we have a TSV file named comments.tsv that contains some Hacker News comments, and the header row contains column names. You need to specify an input format when you insert the data, which in our case is TabSeparatedWithNames:
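For illustration, a tiny file of this shape could be created as follows. The column names and both data rows are assumptions made up for this sketch, not the real Hacker News dataset; only author and comment are named in the text.

```shell
# Hypothetical sample data: a header row with column names,
# followed by tab-separated rows (array values use the [a,b] text form).
printf 'id\ttype\tauthor\ttimestamp\tcomment\tchildren\n' > comments.tsv
printf '1\tcomment\tAlice\t2021-01-01 10:00:00\tGreat post about ClickHouse\t[2,3]\n' >> comments.tsv
printf '2\tcomment\tBOB\t2021-01-01 10:05:00\tThanks for sharing\t[]\n' >> comments.tsv

# Show the header row that TabSeparatedWithNames reads as column names.
head -n 1 comments.tsv
```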
- Let's create the table for our Hacker News data:
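A minimal sketch of such a table definition, written to a file so it can be run with clickhouse-client. The column list, types, and sorting key are assumptions chosen to match the sample columns used in this walkthrough; the file name create_hackernews.sql is hypothetical.

```shell
# Write an assumed table definition to a file; adjust the columns to your data.
cat > create_hackernews.sql <<'SQL'
CREATE TABLE hackernews
(
    id UInt32,
    type String,
    author String,
    timestamp DateTime,
    comment String,
    children Array(UInt32),
    tokens Array(String)   -- filled in at insert time from the comment text
)
ENGINE = MergeTree
ORDER BY timestamp;
SQL

# Run it against your service (connection flags depend on your setup):
# clickhouse-client --queries-file create_hackernews.sql
```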
- We want to lowercase the author column, which is easily done with the lower function. We also want to split the comment string into tokens and store the result in the tokens column, which can be done using the extractAll function. You do all of this in one clickhouse-client command. Notice how the comments.tsv file is redirected into clickhouse-client using the < operator:
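The command described above might look like the following sketch. The structure string passed to input and the regular expression '\\w+' are assumptions matching the sample columns; connection flags such as --host, --secure, and --password are left out because they depend on your service. The helper name insert_comments is hypothetical.

```shell
# Build the query with a quoted heredoc so the shell leaves '\\w+' untouched.
QUERY=$(cat <<'SQL'
INSERT INTO hackernews
SELECT
    id,
    type,
    lower(author),
    timestamp,
    comment,
    children,
    extractAll(comment, '\\w+') AS tokens
FROM input('id UInt32, type String, author String, timestamp DateTime, comment String, children Array(UInt32)')
FORMAT TabSeparatedWithNames
SQL
)

insert_comments() {
    # $1: path to the TSV file; the < operator feeds it to clickhouse-client on stdin.
    clickhouse-client --query "$QUERY" < "$1"
}

# Usage, once clickhouse-client can reach your service:
# insert_comments comments.tsv
```

Keeping the query in a variable avoids nested shell quoting around the regular expression.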
Note
The input function is useful here because it lets us transform the data as it is being inserted into the hackernews table. The argument to input is the structure of the incoming raw data (its column names and types), and you will see this in many of the other table functions, where you specify a schema for the incoming data.
- That's it! The data is up in ClickHouse:
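One way to check the inserted rows is a short SELECT, sketched here. The chosen columns, the LIMIT, and the helper name show_sample are assumptions for illustration; connection flags are again omitted.

```shell
show_sample() {
    # Print a few rows to confirm that author is lowercased and tokens is populated.
    clickhouse-client --query "SELECT author, tokens FROM hackernews LIMIT 5 FORMAT Vertical"
}

# Usage, once clickhouse-client can reach your service:
# show_sample
```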
The result is:
- Another option is to use a tool like cat to stream the file to clickhouse-client. For example, the following command has the same result as using the < operator:
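A sketch of the cat variant, assuming the same INSERT ... FROM input(...) FORMAT TabSeparatedWithNames query as before is passed in as a string. The helper name insert_comments_via_cat is hypothetical, and connection flags are omitted.

```shell
insert_comments_via_cat() {
    # $1: path to the TSV file
    # $2: the INSERT ... FORMAT TabSeparatedWithNames query text
    # cat writes the file to the pipe; clickhouse-client reads it from stdin.
    cat "$1" | clickhouse-client --query "$2"
}

# Usage, once clickhouse-client can reach your service:
# insert_comments_via_cat comments.tsv "$QUERY"
```

In both variants clickhouse-client reads the file contents from standard input, so the choice between cat and < is a matter of preference.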
Visit the docs page on clickhouse-client for details on how to install clickhouse-client on your local operating system.