To convert JSON to CSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the JSON file and write the result INTO OUTFILE as CSV:
clickhouse local -q "SELECT * FROM file('events.jsonl', JSONEachRow) INTO OUTFILE 'events.csv' FORMAT CSVWithNames"
"event_id","event_type","ts","user","amounts"
1,"login","2026-06-01 09:00:00","GB","free","[1.5,0.5]"
2,"purchase","2026-06-01 09:00:37","US","pro","[2.5,1.5]"
The JSON is read in place with no import step: ClickHouse infers the schema from the data and streams each row out as CSV. The one thing to watch is nesting. CSV is flat, so a nested user object expands inline and an amounts array serialises as a string. The sections below show how to flatten them into named columns.
The sample is newline-delimited JSON (one object per line, the JSONEachRow format). Each event carries a nested user object and an amounts array:
{"event_id":1,"event_type":"login","ts":"2026-06-01 09:00:00","user":{"country":"GB","plan":"free"},"amounts":[1.5,0.5]}
{"event_id":2,"event_type":"purchase","ts":"2026-06-01 09:00:37","user":{"country":"US","plan":"pro"},"amounts":[2.5,1.5]}
clickhouse-local infers the schema from the data. Run DESCRIBE to see what it found:
clickhouse local -q "DESCRIBE file('events.jsonl', JSONEachRow)"
event_id Nullable(Int64)
event_type Nullable(String)
ts Nullable(DateTime)
user Tuple(country Nullable(String), plan Nullable(String))
amounts Array(Nullable(Float64))
user came back as a Tuple (a nested record) and amounts as an Array. Those two columns are what CSV can't represent directly.
If you convert with SELECT *, the nested user object is expanded inline. The header still lists five columns, but each data row now carries six fields, because country and plan were spread out with no names of their own:
clickhouse local -q "SELECT * FROM file('events.jsonl', JSONEachRow) INTO OUTFILE 'naive.csv' FORMAT CSVWithNames"
"event_id","event_type","ts","user","amounts"
1,"login","2026-06-01 09:00:00","GB","free","[1.5,0.5]"
2,"purchase","2026-06-01 09:00:37","US","pro","[2.5,1.5]"
Five header names, six values per row. Anything that reads this CSV by counting columns against the header will misalign. This is the flattening problem online converters either choke on or silently mangle.
The fix is to project the nested fields into named top-level columns yourself. Address the tuple members by path (user.country), pick array elements by index (amounts[1]), and serialize the rest of an array into one string column if you want to keep it:
clickhouse local -q "
SELECT
event_id,
event_type,
ts,
user.country AS user_country,
user.plan AS user_plan,
amounts[1] AS amount_primary,
arrayStringConcat(arrayMap(x -> toString(x), amounts), ';') AS amounts_list
FROM file('events.jsonl', JSONEachRow)
INTO OUTFILE 'events.csv' FORMAT CSVWithNames"
"event_id","event_type","ts","user_country","user_plan","amount_primary","amounts_list"
1,"login","2026-06-01 09:00:00","GB","free",1.5,"1.5;0.5"
2,"purchase","2026-06-01 09:00:37","US","pro",2.5,"2.5;1.5"
3,"logout","2026-06-01 09:01:14","DE","team",3.5,"3.5;2.5"
4,"signup","2026-06-01 09:01:51","FR","free",4.5,"4.5;3.5"
Every column is now flat and named. Read the CSV back and the schema round-trips cleanly:
clickhouse local -q "DESCRIBE file('events.csv', CSVWithNames)"
event_id Nullable(Int64)
event_type Nullable(String)
ts Nullable(DateTime)
user_country Nullable(String)
user_plan Nullable(String)
amount_primary Nullable(Float64)
amounts_list Nullable(String)
This is the lossy step to be deliberate about: an array of two values became a primary scalar plus a delimited string. That is a choice, not a default. Decide per column whether you want one element, a delimited list, or a JSON string (toJSONString(amounts)), because CSV will not keep the structure for you.
These are the levers you don't get from a drag-and-drop converter site:
- Header or no header.
CSVWithNames writes the header row shown above. Use CSV to omit it.
- Pick or rename columns. The
SELECT list is your column spec. Drop fields, reorder them, cast types, or compute new ones in the same query.
- Delimiter. CSV uses commas; for tab- or pipe-separated output switch the format (
TSV) or set format_csv_delimiter.
- Keep nesting as JSON strings. If you'd rather not flatten, wrap each nested field with
toJSONString(...) so the structure survives as text in a single cell.
- Top-level JSON arrays. If your file is one big
[ {...}, {...} ] array instead of one object per line, read it with the JSONEachRow format too — it handles both shapes.
Converting a 1,000,000-row, ~125 MB newline-delimited JSON file (events_large.jsonl) to CSV, with the full flattening projection above, completes in:
clickhouse local -q "
SELECT event_id, event_type, ts, user.country AS user_country, user.plan AS user_plan,
amounts[1] AS amount_primary,
arrayStringConcat(arrayMap(x -> toString(x), amounts), ';') AS amounts_list
FROM file('events_large.jsonl', JSONEachRow)
INTO OUTFILE 'events_large.csv' FORMAT CSVWithNames"
~0.42 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses every JSON object from scratch and writes 1,000,000 CSV rows; there's no cached table in between. A concurrent build on the same machine can nudge that number, so treat it as a ballpark, not a benchmark.
If you live in Python, chDB is the same engine in-process. Run the identical SQL — including INTO OUTFILE — with chdb.query, and skip pandas entirely for the conversion:
import chdb
chdb.query("""
SELECT
event_id,
event_type,
ts,
user.country AS user_country,
user.plan AS user_plan,
amounts[1] AS amount_primary,
arrayStringConcat(arrayMap(x -> toString(x), amounts), ';') AS amounts_list
FROM file('events.jsonl', JSONEachRow)
INTO OUTFILE 'events_chdb.csv' TRUNCATE
FORMAT CSVWithNames
""")
"event_id","event_type","ts","user_country","user_plan","amount_primary","amounts_list"
1,"login","2026-06-01 09:00:00","GB","free",1.5,"1.5;0.5"
2,"purchase","2026-06-01 09:00:37","US","pro",2.5,"2.5;1.5"
The flattening logic lives in the SQL, so it's identical on the command line and in Python. For loading JSON into a DataFrame instead of a file, see how to read a JSON file in Python with chDB and how to flatten nested JSON in Python.
Going the other way is just as direct. See how to convert CSV to JSON, or keep the data columnar instead with convert JSON to Parquet — Parquet keeps the nested structure and the types, so it's the better target if you'll query the data again rather than hand it to a tool that needs flat CSV. To explore the JSON in place before converting, see how to run SQL on a JSON file.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSON (including the ~125 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-json-to-csv
The same SQL that converted this file scales unchanged to a ClickHouse server or ClickHouse Cloud when the data outgrows your laptop, with no rewrite.