To read a pipe-delimited file, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the file with CustomSeparatedWithNames and set the delimiter to a pipe:
clickhouse local -q "
SELECT * FROM file('orders.psv', 'CustomSeparatedWithNames') LIMIT 10
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
2026-01-01 1 GB widget 5 1
2026-01-02 2 US gadget 6.01 2
2026-01-03 3 DE gizmo 7.02 3
2026-01-04 4 FR doohickey 8.03 4
2026-01-05 5 IN widget 9.04 5
Two settings tell it the file is pipe-delimited: format_custom_field_delimiter='|' splits fields on the pipe, and format_custom_escaping_rule='CSV' unwraps any quoted cells. The file is read in place with no import step; the header row supplies the column names and the data supplies the types.
CustomSeparatedWithNames reads the first row as column names; ClickHouse infers each type from the data. Check what it found with DESCRIBE:
clickhouse local -q "
DESCRIBE file('orders.psv', 'CustomSeparatedWithNames')
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
order_date Nullable(Date)
order_id Nullable(Int64)
country Nullable(String)
product Nullable(String)
revenue Nullable(Float64)
quantity Nullable(Int64)
The header gave the names; the data gave the types. Columns come back Nullable because inference allows for empty values. No CREATE TABLE required.
A pipe-delimited file needs two settings, not one, and the second is easy to forget. format_custom_field_delimiter='|' splits fields on the pipe. format_custom_escaping_rule='CSV' tells ClickHouse how a single field is encoded: CSV means a field may be wrapped in double quotes that should be stripped, exactly like a normal CSV cell.
Drop the escaping rule and the default (Escaped, which uses backslash escaping) takes over. It does not unwrap the quotes around each field, and the inference collapses the whole line into a single String column named after the quoted header:
clickhouse local -q "
DESCRIBE file('orders.psv', 'CustomSeparatedWithNames')
SETTINGS format_custom_field_delimiter='|'"
"order_date"|"order_id"|"country"|"product"|"revenue"|"quantity" Nullable(String)
That one column is the symptom to recognize. If a pipe-delimited read returns a single fat String column, or every value carries literal " characters, set format_custom_escaping_rule='CSV' and the columns split cleanly. If your file has no quoting at all, Escaped and CSV behave the same on plain values; CSV is the safe default because it handles both.
Once the columns are typed, the full SQL surface works on the file: WHERE, GROUP BY, aggregates, joins, window functions, with no load step in between.
clickhouse local -q "
SELECT
country,
count() AS orders,
round(sum(revenue), 2) AS revenue,
round(avg(quantity), 2) AS avg_qty
FROM file('orders.psv', 'CustomSeparatedWithNames')
GROUP BY country
ORDER BY revenue DESC
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
US 4 60.4 3.5
GB 4 56.36 2.5
AU 3 48.33 2
IN 3 45.3 2.67
FR 3 42.27 3.33
DE 3 39.24 4
Typing the two SETTINGS on every query gets old. Set them once with SET and query normally for the rest of the session:
clickhouse local -q "
SET format_custom_field_delimiter='|', format_custom_escaping_rule='CSV';
SELECT count() FROM file('orders.psv', 'CustomSeparatedWithNames');"
The same CustomSeparated settings handle any single-character delimiter, the same SQL runs unchanged across dozens of other formats and remote sources, and it scales to a ClickHouse server or ClickHouse Cloud when the data outgrows your machine.
If the file has no header, use CustomSeparated (without WithNames) and supply the schema yourself. The same two delimiter settings still apply:
clickhouse local -q "
SELECT * FROM file('orders_nohdr.psv', 'CustomSeparated',
'order_date Date, order_id UInt32, country String, product String, revenue Float64, quantity UInt8')
ORDER BY revenue DESC LIMIT 3
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
2026-01-20 20 US doohickey 24.19 5
2026-01-19 19 GB gizmo 23.18 4
2026-01-18 18 AU gadget 22.17 3
You don't unzip anything first. clickhouse local detects the .gz suffix and decompresses on the fly, so a .psv.gz is queried exactly like a .psv:
clickhouse local -q "
SELECT country, count() FROM file('orders.psv.gz', 'CustomSeparatedWithNames')
GROUP BY country ORDER BY country
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
AU 3
DE 3
FR 3
GB 4
IN 3
US 4
The same applies to .psv.zst, .psv.xz, and other supported codecs. The compression is inferred from the file name. See querying compressed files for the full list.
The text parsing you avoided importing was the slow part. On a 3,000,000-row, ~126 MB pipe-delimited file (orders_large.psv), a full GROUP BY country with sum and avg over every row runs in:
clickhouse local -q "
SELECT country, count() AS orders, round(sum(revenue),2) AS revenue, round(avg(quantity),3) AS avg_qty
FROM file('orders_large.psv', 'CustomSeparatedWithNames')
GROUP BY country ORDER BY revenue DESC
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
~0.60 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM; clickhouse local 26.6.1.117). That includes parsing the delimited text from scratch on every run; there is no cached table. Query the same data repeatedly and you should convert it once to a typed columnar format.
Pipe-delimited text is slow to scan again and again because every query re-parses it. If you'll query the data more than once, write it to Parquet once and read that instead:
clickhouse local -q "
SELECT * FROM file('orders.psv', 'CustomSeparatedWithNames')
INTO OUTFILE 'orders.parquet' TRUNCATE FORMAT Parquet
SETTINGS format_custom_field_delimiter='|', format_custom_escaping_rule='CSV'"
The result is a real Parquet file carrying the columns and types ClickHouse inferred. See how to query a Parquet file and convert CSV to Parquet for the typing and compression options (the workflow is identical from a pipe-delimited source).
The query you ran on a local .psv is the same SQL you would run on a ClickHouse server or in ClickHouse Cloud. You swap file('orders.psv', 'CustomSeparatedWithNames') for a table name and the SELECT ... GROUP BY stays put. You prototype against a file on your machine and ship the identical logic to production.
The complete, runnable example lives in the ClickHouse examples repo. It has generate.sh (builds the demo .psv, a headerless copy, and the ~126 MB perf file), run.sh (every command above), and expected_output.txt:
github.com/ClickHouse/examples → local-analytics/clickhouse-local-pipe-delimited
git clone https://github.com/ClickHouse/examples
cd examples/local-analytics/clickhouse-local-pipe-delimited
./generate.sh && ./run.sh