To convert a JSONL file to CSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then flatten the JSONL into CSV columns and write the result:
clickhouse local -q "
SELECT
event_id,
ts,
action,
geo.country AS geo_country,
geo.city AS geo_city,
arrayStringConcat(tags, '|') AS tags
FROM file('events.jsonl')
INTO OUTFILE 'events.csv' TRUNCATE FORMAT CSVWithNames"
"event_id","ts","action","geo_country","geo_city","tags"
1,"2026-06-01 00:00:00","login","GB","London",""
2,"2026-06-01 00:02:17","click","US","New York","mobile"
3,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"
4,"2026-06-01 00:06:51","logout","FR","Paris",""
The file is read in place with no import step: ClickHouse infers the schema from the JSON objects, flattens the nested geo object into two columns and the tags array into a pipe-joined cell, and writes the CSV directly to disk.
JSONL allows nested objects and arrays inside each line. CSV is flat: every row is a list of scalar cells. So a naive SELECT * has nowhere to put the structure. Take this input, where each event has a nested geo object and a tags array:
{"event_id":1,"ts":"2026-06-01 00:00:00","action":"login","geo":{"country":"GB","city":"London"},"tags":[]}
{"event_id":2,"ts":"2026-06-01 00:02:17","action":"click","geo":{"country":"US","city":"New York"},"tags":["mobile"]}
Check what clickhouse-local infers with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.jsonl')"
event_id Nullable(Int64)
ts Nullable(DateTime)
action Nullable(String)
geo Tuple(city Nullable(String), country Nullable(String))
tags Array(Nullable(String))
geo becomes a two-field Tuple and tags becomes an Array. If you convert with SELECT *, ClickHouse expands the tuple into separate cells but the header still has a single geo column, so the columns and the header no longer line up:
"event_id","ts","action","geo","tags"
1,"2026-06-01 00:00:00","login","London","GB","[]"
2,"2026-06-01 00:02:17","click","New York","US","['mobile']"
Five header names, six data values. That CSV is misaligned and will confuse whatever reads it next. Flatten the nested fields explicitly instead.
Flatten nested fields into columns
Name each nested field as its own column. Reach into the object with dot access (geo.country), and turn the array into one scalar cell with arrayStringConcat, picking a separator that won't clash with your data:
clickhouse local -q "
SELECT
event_id,
ts,
action,
geo.country AS geo_country,
geo.city AS geo_city,
arrayStringConcat(tags, '|') AS tags
FROM file('events.jsonl')
INTO OUTFILE 'events.csv' TRUNCATE FORMAT CSVWithNames"
Now the header and the data agree, the nested object is two clean columns, and the array is a single pipe-joined cell:
"event_id","ts","action","geo_country","geo_city","tags"
1,"2026-06-01 00:00:00","login","GB","London",""
2,"2026-06-01 00:02:17","click","US","New York","mobile"
3,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"
4,"2026-06-01 00:06:51","logout","FR","Paris",""
TRUNCATE overwrites the output file if it already exists, so the command is safe to re-run. Read the CSV straight back to confirm the round-trip:
clickhouse local -q "SELECT geo_country, count() AS events FROM file('events.csv') GROUP BY geo_country ORDER BY geo_country"
The conversion is one line, but a few details decide whether the CSV is correct downstream:
- Flatten, don't drop. Dot access (
geo.country) reaches one level; for deeper nesting, chain it (a.b.c) or use JSONExtract on a raw JSON string. This is the one lossy step in the conversion: a tree becomes flat columns, and you choose how.
- Pick an array separator deliberately.
arrayStringConcat(tags, '|') keeps the array in a single cell. A comma separator would collide with CSV delimiters; pipe or semicolon is safer. Or explode the array into one row per element with ARRAY JOIN.
- Header or no header.
CSVWithNames writes the header row. Plain CSV omits it.
- Types are carried, then stringified. CSV has no schema, so
Int64 and DateTime are written as text. The values are exact; only the type metadata is lost, which is inherent to CSV, not to the tool.
- Compress on the way out. Add a codec suffix to the output name and it is written compressed:
INTO OUTFILE 'events.csv.gz' produces gzip with no extra flag.
That list is the difference between this and a drag-and-drop converter site: you control flattening, separators, headers, and compression, the file never leaves your machine, and the whole thing is one scriptable command. Online converters also cap upload size; here the input streams from disk, so a file larger than RAM is no different from a small one. clickhouse-local runs the same SQL unchanged across dozens of formats, against a ClickHouse server, and against ClickHouse Cloud when the data outgrows your laptop.
Reverse direction? See how to convert CSV to JSONL.
If you would rather stay in Python, chDB is the same ClickHouse engine in-process. Run the identical SELECT ... INTO OUTFILE, and you get the byte-identical CSV:
import chdb
chdb.query("""
SELECT
event_id,
ts,
action,
geo.country AS geo_country,
geo.city AS geo_city,
arrayStringConcat(tags, '|') AS tags
FROM file('events.jsonl')
INTO OUTFILE 'events_chdb.csv' TRUNCATE FORMAT CSVWithNames
""")
"event_id","ts","action","geo_country","geo_city","tags"
1,"2026-06-01 00:00:00","login","GB","London",""
2,"2026-06-01 00:02:17","click","US","New York","mobile"
3,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"
There is no pandas round-trip and no json.loads loop to write yourself; the flattening happens in the SQL. See how to read a JSONL file in Python with chDB for querying the same data into a DataFrame.
On a 1,200,000-row, ~146 MB events_large.jsonl, parsing every JSON line, flattening the nested geo object, joining the tags array, and writing the full CSV out completes in ~0.39 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That is the whole conversion, parse to written file, not a cached table. Concurrent load on the machine can move that number a little; the point is the conversion is not the slow part of your pipeline.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSONL (including the ~146 MB file used for the timing), run.sh with every command on this page, a run.py / run.ipynb chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-jsonl-to-csv