How to convert Parquet to CSV

To convert Parquet to CSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

curl https://clickhouse.com/cli | sh   # install clickhousectl
clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then write the Parquet file out as CSV:

clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events.csv' FORMAT CSVWithNames"

"event_date","event_id","country","action","amount","attrs"
"2026-01-01",1,"GB","click",5,"{'os':'mac','plan':'free'}"
"2026-01-02",2,"US","view",6.01,"{'os':'win','plan':'pro'}"
"2026-01-03",3,"DE","signup",7.02,"{'os':'linux','plan':'free'}"

The schema is read from the Parquet footer, so there is nothing to declare. CSVWithNames writes a header row from those column names, and the typed Parquet values become text in the CSV, all in place with no upload or import step.

Schema comes from the Parquet footer

Parquet stores its own schema in the file footer, and ClickHouse reads it from there, so there's nothing to declare. Check what it found with DESCRIBE:

clickhouse local -q "DESCRIBE file('events.parquet')"

event_date	Date32
event_id	UInt64
country	String
action	String
amount	Float64
attrs	Map(String, String)

Those types are what get written to the CSV as text. A Date32 becomes 2026-01-01, a Float64 becomes 6.01. CSV itself carries no types, so the column types live in the Parquet file and are lost the moment you write text. That's expected for CSV; if you need the types preserved, keep the Parquet or convert to a typed format instead.
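
To make the type loss concrete, here is a minimal Python sketch of what a downstream reader has to do. It uses only the standard library and the sample rows shown earlier, inlined in place of events.csv. The CONVERTERS mapping and the restore_types helper are illustrative names, not part of ClickHouse or chDB.

```python
import csv
import io
from datetime import date

# The contents of events.csv from above, inlined so the sketch is self-contained.
CSV_TEXT = '''"event_date","event_id","country","action","amount","attrs"
"2026-01-01",1,"GB","click",5,"{'os':'mac','plan':'free'}"
"2026-01-02",2,"US","view",6.01,"{'os':'win','plan':'pro'}"
"2026-01-03",3,"DE","signup",7.02,"{'os':'linux','plan':'free'}"
'''

# Hypothetical mapping from column name to parser. CSV has nowhere to store
# this, so the reader must supply it (here it mirrors the DESCRIBE output).
CONVERTERS = {
    "event_date": date.fromisoformat,  # Date32
    "event_id": int,                   # UInt64
    "amount": float,                   # Float64
}

def restore_types(row):
    """Return a copy of row with the known columns parsed back from text."""
    return {name: CONVERTERS.get(name, str)(value) for name, value in row.items()}

raw_rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
typed_rows = [restore_types(row) for row in raw_rows]

print(raw_rows[1]["amount"], type(raw_rows[1]["amount"]).__name__)      # 6.01 str
print(typed_rows[1]["amount"], type(typed_rows[1]["amount"]).__name__)  # 6.01 float
```

Every value comes back from the CSV as a string; the types only return because the reader re-declares them by hand.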

A quick round-trip check confirms every row made it across:

clickhouse local -q "SELECT count() FROM file('events.csv')"
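
Where ClickHouse is not available on the machine that receives the CSV, a similar check can be approximated in Python. The sketch below counts records with the standard csv module rather than counting physical lines, because a quoted field may contain a newline. count_csv_rows is an illustrative helper, the inline sample is made up for the demonstration, and the header handling assumes a single header row as CSVWithNames writes.

```python
import csv
import io

def count_csv_rows(lines, has_header=True):
    """Count data records in CSV input, not physical lines."""
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # skip the header row written by CSVWithNames
    return sum(1 for _ in reader)

# Made-up sample: the second record has a newline inside a quoted field,
# so there are 4 physical lines after the header but only 3 records.
sample = io.StringIO(
    '"event_id","action"\n'
    '1,"click"\n'
    '2,"multi\nline"\n'
    '3,"signup"\n'
)
print(count_csv_rows(sample))  # 3

# For a real file, open it with newline="" as the csv module docs recommend:
# with open("events.csv", newline="") as f:
#     print(count_csv_rows(f))
```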

OPTIONS: header, delimiter, nested columns

This is where a scriptable converter beats an upload-and-download web tool. You control exactly what the CSV looks like.

CSVWithNames writes the header row shown above. If you want a headerless CSV, use FORMAT CSV instead:

clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events_noheader.csv' FORMAT CSV"

CSV does not have to mean comma. Set format_csv_delimiter to write a semicolon-separated file, which is what many European locales and Excel installs expect:

clickhouse local -q "
SELECT event_date, country, amount FROM file('events.parquet')
INTO OUTFILE 'events_semi.csv' FORMAT CSVWithNames
SETTINGS format_csv_delimiter=';'"

"event_date";"country";"amount"
"2026-01-01";"GB";5
"2026-01-02";"US";6.01
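
Whatever reads the file next has to be told about the delimiter too. As a small illustration, Python's standard csv module reads the semicolon-separated output above when given delimiter=";". The text is inlined here in place of events_semi.csv.

```python
import csv
import io

# The semicolon-separated output shown above, inlined for a self-contained sketch.
SEMI_TEXT = '''"event_date";"country";"amount"
"2026-01-01";"GB";5
"2026-01-02";"US";6.01
'''

# Passing the delimiter splits each line into the three expected columns.
rows = list(csv.DictReader(io.StringIO(SEMI_TEXT), delimiter=";"))
for row in rows:
    print(row["event_date"], row["country"], row["amount"])

# With the default comma delimiter the header line is not split at all.
wrong = next(csv.reader(io.StringIO(SEMI_TEXT)))
print(len(wrong))
```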

Nested columns are the one thing to watch when going Parquet to CSV. Parquet supports nested types (here attrs is a Map(String, String)); CSV is flat. A nested column is serialized into a single text cell:

1	{'os':'mac','plan':'free'}
2	{'os':'win','plan':'pro'}

That string is readable but awkward to parse downstream. Usually you want the nested fields as their own CSV columns. Pull them out in the SELECT and the conversion stays one command:

clickhouse local -q "
SELECT event_date, country, attrs['os'] AS os, attrs['plan'] AS plan
FROM file('events.parquet')
INTO OUTFILE 'events_flat.csv' FORMAT CSVWithNames"

"event_date","country","os","plan"
"2026-01-01","GB","mac","free"
"2026-01-02","US","win","pro"
"2026-01-03","DE","linux","free"

Now the map is two flat columns. Because the input is exposed as a full SQL table, you can also filter, rename, reorder, or aggregate during the conversion instead of dumping the file verbatim.
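
If a CSV has already been written with the map serialized into one cell, the reader has to parse that cell itself. For the simple string-to-string map in this sample, the cell text happens to be a valid Python dict literal, so ast.literal_eval can read it. Treat this as a convenience for this sample rather than a general parser for ClickHouse's text output; parse_map_cell is an illustrative name.

```python
import ast

def parse_map_cell(cell):
    """Parse a cell like {'os':'mac','plan':'free'} into a dict.

    Assumes keys and values are simple quoted strings; other value types
    or unusual escaping may not survive this shortcut.
    """
    value = ast.literal_eval(cell)
    if not isinstance(value, dict):
        raise ValueError(f"expected a map, got {type(value).__name__}")
    return value

attrs = parse_map_cell("{'os':'mac','plan':'free'}")
print(attrs["os"], attrs["plan"])  # mac free
```

Flattening in the SELECT avoids this step entirely, which is why it is usually the better place to do it.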

Reverse direction?

Going the other way, CSV back to Parquet, is the same pattern with the formats swapped. See convert CSV to Parquet for the column-typing and compression options worth setting when you write Parquet.

How fast is it?

On a 3,000,000-row events_large.parquet, this command writes the full table out to a ~211 MB CSV:

clickhouse local -q "SELECT * FROM file('events_large.parquet') INTO OUTFILE 'events_large.csv' FORMAT CSVWithNames"

It completes in ~0.21 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The work is decoding the columnar Parquet and serializing every value to text; it streams, so memory stays flat whether the file is 200 MB or 200 GB. Concurrent load on the machine can nudge the number; the point is that the conversion is I/O-bound and not a bottleneck.
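
One way to take a best-of-three measurement like this on your own machine is a small harness such as the sketch below. It is an assumption about method, not the script used for the number above: best_of and convert are illustrative names, the timing includes process startup, and TRUNCATE is added so repeated runs can overwrite the output file. The measurement only runs when the clickhouse binary and the input file are present.

```python
import os
import shutil
import subprocess
import time

def best_of(run, repeats=3):
    """Call run() `repeats` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return min(timings)

def convert():
    # Same query as above, with TRUNCATE so each run may overwrite the CSV.
    subprocess.run(
        ["clickhouse", "local", "-q",
         "SELECT * FROM file('events_large.parquet') "
         "INTO OUTFILE 'events_large.csv' TRUNCATE FORMAT CSVWithNames"],
        check=True,
    )

# Skip quietly when ClickHouse or the sample file is not available.
if shutil.which("clickhouse") and os.path.exists("events_large.parquet"):
    print(f"best of 3: {best_of(convert):.2f} s")
```

Taking the minimum rather than the mean filters out runs that were slowed by other activity on the machine.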

clickhouse-local runs the same SQL unchanged across dozens of formats and remote sources, and the same query scales from a file on your laptop to a ClickHouse server or ClickHouse Cloud when the data outgrows one machine, with no rewrite.

chDB: the same conversion from Python

chDB is ClickHouse as an in-process Python library. The conversion is the identical SQL, no server and no subprocess:

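import chdb

# Write the Parquet file out as CSV; TRUNCATE overwrites an existing output file.
chdb.query("""
SELECT * FROM file('data/events.parquet')
INTO OUTFILE 'data/events_chdb.csv' TRUNCATE FORMAT CSVWithNames
""")

# Count the rows in the CSV that was just written.
print(str(chdb.query("SELECT count() FROM file('data/events_chdb.csv')", "CSV")).rstrip())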

If you read Parquet into pandas already, chDB writes the CSV without the DataFrame round-trip, and the same flatten and delimiter options apply.
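
As a sketch of how the flatten and delimiter options combine in chDB, the snippet below builds one query from the clauses used earlier on this page and runs it only when chDB and the sample file are available. flatten_query and the output path data/events_flat_semi.csv are hypothetical names chosen for the example, and the paths are interpolated without escaping, so this is for trusted literal paths only.

```python
import os

def flatten_query(src, dst, delimiter=";"):
    """Build the flatten-and-delimit conversion as a SQL string (hypothetical helper)."""
    return f"""
SELECT event_date, country, attrs['os'] AS os, attrs['plan'] AS plan
FROM file('{src}')
INTO OUTFILE '{dst}' TRUNCATE FORMAT CSVWithNames
SETTINGS format_csv_delimiter='{delimiter}'
"""

sql = flatten_query("data/events.parquet", "data/events_flat_semi.csv")
print(sql)

try:
    import chdb  # third-party package; the sketch still prints the SQL without it
except ImportError:
    chdb = None

# Run the conversion only when chDB and the sample file are both present.
if chdb is not None and os.path.exists("data/events.parquet"):
    chdb.query(sql)
```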

Run it yourself

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample Parquet files (including the 3M-row file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-parquet-to-csv

Related: how to query a Parquet file, what is a Parquet file, and the reverse conversion CSV to Parquet.
