To convert Parquet to TSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then read the Parquet file and write it out as TSV:
clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events.tsv' TRUNCATE FORMAT TSV"
2026-01-01 1 GB click 5 [0,0,0]
2026-01-02 2 US view 6.01 [1,1,1]
2026-01-03 3 DE signup 7.02 [2,2,2]
2026-01-04 4 FR purchase 8.03 [0,3,3]
2026-01-05 5 IN click 9.04 [1,4,4]
The schema is read straight from the Parquet footer, so you declare no columns. The file is converted locally, with no upload or import step, and because the read and write both stream, this works on files larger than RAM.
Parquet is self-describing: every column carries its type in the file. clickhouse-local reads that footer and uses it directly, so there is no schema guessing on the way in. Check what it sees with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.parquet')"
event_date Date32
event_id UInt64
country String
action String
value Float64
tags Array(UInt8)
Those types are written into the TSV as text. A Date32 becomes 2026-01-01, a Float64 becomes 6.01, and the Array(UInt8) column becomes [0,0,0] in a single tab-separated field. The tabs separate columns; values themselves never contain a raw tab (any tab inside a string is escaped as \t), so the column boundaries stay unambiguous.
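A consumer that reads this TSV without ClickHouse has to reverse the escaping after splitting on tabs. The following is a minimal Python sketch, assuming the backslash sequences described in the ClickHouse TabSeparated format documentation. The helper names unescape_field and split_row are hypothetical and not part of any library.

```python
# Backslash escapes assumed from ClickHouse's TabSeparated format documentation.
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f",
           "0": "\0", "'": "'", "\\": "\\"}


def unescape_field(field: str) -> str:
    """Turn escaped sequences such as \\t back into the characters they stand for."""
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Unknown sequences (for example the NULL marker) are kept as-is.
            out.append(ESCAPES.get(nxt, ch + nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def split_row(line: str) -> list[str]:
    """Split one TSV line on raw tabs first, then unescape each field."""
    return [unescape_field(f) for f in line.rstrip("\n").split("\t")]


row = split_row("1\tGB\tleft\\tright\n")
print(row)
```

Splitting before unescaping is the point of the design: because a raw tab never appears inside a value, the split is safe, and any tab restored by unescaping stays inside its own field.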
Keep the column names with TSVWithNames
Plain TSV writes data rows only. If you want a header row of column names, use TSVWithNames:
clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events_named.tsv' TRUNCATE FORMAT TSVWithNames"
event_date event_id country action value tags
2026-01-01 1 GB click 5 [0,0,0]
2026-01-02 2 US view 6.01 [1,1,1]
This is the version most tools expect, and it lets you read the file back without re-declaring the columns.
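As an illustration of what the header row buys a downstream tool, here is a small Python sketch that reads the TSVWithNames output with the standard csv module. The sample text is copied from the output above; a real script would open events_named.tsv instead of an in-memory string. Every value arrives as a string, so type conversion is left to the caller.

```python
import csv
import io

# Sample copied from the TSVWithNames output above; a real script would use
# open('events_named.tsv', newline='') instead of an in-memory string.
sample = (
    "event_date\tevent_id\tcountry\taction\tvalue\ttags\n"
    "2026-01-01\t1\tGB\tclick\t5\t[0,0,0]\n"
    "2026-01-02\t2\tUS\tview\t6.01\t[1,1,1]\n"
)

# QUOTE_NONE stops the csv module from treating quote characters specially.
reader = csv.DictReader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = list(reader)

print(reader.fieldnames)
print(rows[1]["country"], rows[1]["value"])
```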
The nested-column gotcha
This is where a Parquet to TSV conversion differs from a flat-to-flat one. Parquet supports nested types: arrays, maps, and structs. TSV is a flat, one-value-per-cell text format with no nested grammar of its own. ClickHouse handles this by serialising the nested value into a single field using its own array literal syntax, which is why tags showed up as [0,0,0].
That representation is not lossy for arrays. Read the TSV back and the array parses straight into an Array again.
clickhouse local -q "DESCRIBE file('events_named.tsv', 'TSVWithNames')"
clickhouse local -q "SELECT tags, length(tags) AS n FROM file('events_named.tsv', 'TSVWithNames') LIMIT 3"
event_date Nullable(Date)
event_id Nullable(Int64)
country Nullable(String)
action Nullable(String)
value Nullable(Float64)
tags Array(Nullable(Int64))
[0,0,0] 3
[1,1,1] 3
[2,2,2] 3
The tags column came back as Array(Nullable(Int64)) and length() still reads 3. The types shifted on the way back in, all as expected from text: columns and array elements are now Nullable (TSV inference allows empty cells), integers are inferred as Int64 (UInt8 and UInt64 both land there because TSV carries no width or sign hint), and Date32 is inferred as Date. The values are identical.
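Outside ClickHouse, the same bracketed field can often be parsed with a few lines of code. This Python sketch uses ast.literal_eval and assumes the array holds only numbers or simple single-quoted strings; it does not handle NULL elements or other nested types. The function name parse_array_field is hypothetical.

```python
import ast


def parse_array_field(cell: str) -> list:
    """Parse an array literal such as [0,0,0] into a Python list."""
    # literal_eval only evaluates literals, so it does not run arbitrary code.
    value = ast.literal_eval(cell)
    if not isinstance(value, list):
        raise ValueError(f"not an array literal: {cell!r}")
    return value


tags = parse_array_field("[0,0,0]")
print(tags, len(tags))
```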
If the consumer of your TSV cannot parse a [...] field, flatten the nesting in SQL before you export. arrayJoin turns one row with an array into one row per element, giving a fully flat TSV:
clickhouse local -q "SELECT event_id, country, arrayJoin(tags) AS tag FROM file('events.parquet') ORDER BY event_id LIMIT 5 FORMAT TSV"
1 GB 0
1 GB 0
1 GB 0
2 US 1
2 US 1
Now every value sits in its own cell with no brackets in sight. This is the part an upload-required online converter can't do: you decide the exact shape of the output before it lands.
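To make the reshaping concrete, here is a plain-Python sketch of the one-row-per-element expansion that arrayJoin performs. It is an illustration of the idea, not ClickHouse's implementation; the rows are taken from the sample output above and the function name array_join is hypothetical.

```python
from itertools import islice

# Rows shaped like (event_id, country, tags), taken from the sample output above.
events = [
    (1, "GB", [0, 0, 0]),
    (2, "US", [1, 1, 1]),
    (3, "DE", [2, 2, 2]),
]


def array_join(rows):
    """Yield one (event_id, country, tag) row per array element."""
    for event_id, country, tags in rows:
        for tag in tags:
            yield event_id, country, tag


flat = list(islice(array_join(events), 5))  # mirrors LIMIT 5
for event_id, country, tag in flat:
    print(event_id, country, tag, sep="\t")
```

Like arrayJoin, this sketch emits no row at all for an event whose array is empty, which is worth checking for before you rely on the flattened row count.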
- Filter and reshape on the way out. The conversion is a normal SELECT, so add a WHERE, pick columns, rename them, or compute new ones. You convert and trim in one pass instead of exporting everything and cleaning up later.
- NULL representation. A SQL NULL is written as \N in TSV by default. Set format_tsv_null_representation if a downstream tool expects an empty cell or a literal string instead.
- Date and number formats. Dates serialise as YYYY-MM-DD and floats use a . decimal point. If you need a different layout, format the value in the SELECT (for example formatDateTime(event_date, '%d/%m/%Y'), which already returns a String).
- Compress in the same command. Give the output an extension like events.tsv.gz and ClickHouse compresses on the fly. The codec is inferred from the file name.
- Variants. TSVWithNamesAndTypes adds a second header row carrying the ClickHouse type of each column, which makes a later read fully self-describing.
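Two of the points above, the \N marker and gzip output, matter most on the consuming side. The Python sketch below writes a tiny compressed TSV to a temporary directory and reads it back, mapping \N to None. The file contents are made up for illustration, and it assumes the default NULL representation has not been changed.

```python
import csv
import gzip
import os
import tempfile

NULL_MARKER = r"\N"  # default TSV text for SQL NULL, per the note above

# Write a tiny gzip-compressed TSV so the sketch is self-contained.
# The file name and its contents are made up for illustration.
path = os.path.join(tempfile.mkdtemp(), "events.tsv.gz")
with gzip.open(path, "wt", newline="") as f:
    f.write("1\tGB\t5\n2\t\\N\t6.01\n")

# Read it back as text, mapping the NULL marker to None.
with gzip.open(path, "rt", newline="") as f:
    reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [[None if cell == NULL_MARKER else cell for cell in row] for row in reader]

print(rows)
```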
On a 3,000,000-row Parquet file (events_large.parquet, ~26 MB on disk), the full conversion to a ~123 MB TSV runs in:
clickhouse local -q "SELECT * FROM file('events_large.parquet') INTO OUTFILE 'events_large.tsv' TRUNCATE FORMAT TSV"
~0.19 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The number may wobble slightly under concurrent load. The work is bounded by reading the columnar Parquet and writing text, both streamed, so it scales roughly linearly with row count and never needs the whole file in memory.
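If you want to reproduce a best-of-three measurement on your own machine, a small timing helper is enough. This is a sketch and not the harness used for the figure above; best_of is a hypothetical helper and the workload here is a stand-in.

```python
import time


def best_of(fn, repeats=3):
    """Run fn `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload; to time the real conversion, pass a function that runs
# the clickhouse local command shown above, for example via subprocess.run.
elapsed = best_of(lambda: sum(range(100_000)))
print(f"best of 3: {elapsed:.4f} s")
```

Taking the minimum rather than the mean filters out runs slowed by unrelated background activity. The first run also warms the OS page cache for the later ones.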
To go the other way, see convert TSV to Parquet: the same file() to INTO OUTFILE pattern with the formats swapped.
The same conversion runs in-process with chDB, the embedded ClickHouse engine for Python. Same SQL, written to a file from a Python script:
import chdb

# Convert Parquet -> TSV (with a header row)
chdb.query(
    "SELECT * FROM file('events.parquet') "
    "INTO OUTFILE 'events_chdb.tsv' TRUNCATE FORMAT TSVWithNames"
)

# Read it back into a pandas DataFrame to confirm the round-trip
df = chdb.query(
    "SELECT event_id, country, tags FROM file('events_chdb.tsv', 'TSVWithNames') "
    "ORDER BY event_id LIMIT 3",
    "DataFrame",
)
print(df)
event_id country tags
0 1 GB [0, 0, 0]
1 2 US [1, 1, 1]
2 3 DE [2, 2, 2]
The tags array comes back as a real Python list inside the DataFrame. chDB is a fair alternative to reading Parquet with pandas or pyarrow and then writing TSV: you stay in one engine and one query.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample Parquet (including the 3M-row file used for the timing above), run.sh with every command on this page, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-parquet-to-tsv
The same SQL scales unchanged from this file to a ClickHouse server to ClickHouse Cloud when the data outgrows your laptop. Related: convert Parquet to CSV, query a TSV file with SQL, what is a TSV file, and how to query a Parquet file.