How to convert JSONL to CSV

To convert a JSONL file to CSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

1curl https://clickhouse.com/cli | sh   # install clickhousectl
2clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then flatten the JSONL into CSV columns and write the result:

1clickhouse local -q "
2SELECT
3  event_id,
4  ts,
5  action,
6  geo.country AS geo_country,
7  geo.city    AS geo_city,
8  arrayStringConcat(tags, '|') AS tags
9FROM file('events.jsonl')
10INTO OUTFILE 'events.csv' TRUNCATE FORMAT CSVWithNames"

1"event_id","ts","action","geo_country","geo_city","tags"
21,"2026-06-01 00:00:00","login","GB","London",""
32,"2026-06-01 00:02:17","click","US","New York","mobile"
43,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"
54,"2026-06-01 00:06:51","logout","FR","Paris",""

The file is read in place with no import step: ClickHouse infers the schema from the JSON objects, flattens the nested geo object into two columns and the tags array into a pipe-joined cell, and writes the CSV directly to disk.

The nested-JSON gotcha

JSONL allows nested objects and arrays inside each line. CSV is flat: every row is a list of scalar cells. So a naive SELECT * has nowhere to put the structure. Take this input, where each event has a nested geo object and a tags array:

1{"event_id":1,"ts":"2026-06-01 00:00:00","action":"login","geo":{"country":"GB","city":"London"},"tags":[]}
2{"event_id":2,"ts":"2026-06-01 00:02:17","action":"click","geo":{"country":"US","city":"New York"},"tags":["mobile"]}

Check what clickhouse-local infers with DESCRIBE:

1clickhouse local -q "DESCRIBE file('events.jsonl')"

1event_id	Nullable(Int64)
2ts	Nullable(DateTime)
3action	Nullable(String)
4geo	Tuple(city Nullable(String), country Nullable(String))
5tags	Array(Nullable(String))

geo becomes a two-field Tuple and tags becomes an Array. If you convert with SELECT *, ClickHouse expands the tuple into separate cells but the header still has a single geo column, so the columns and the header no longer line up:

1"event_id","ts","action","geo","tags"
21,"2026-06-01 00:00:00","login","London","GB","[]"
32,"2026-06-01 00:02:17","click","New York","US","['mobile']"

Five header names, six data values. That CSV is misaligned and will confuse whatever reads it next. Flatten the nested fields explicitly instead.

Flatten nested fields into columns

Name each nested field as its own column. Reach into the object with dot access (geo.country), and turn the array into one scalar cell with arrayStringConcat, picking a separator that won't clash with your data:

1clickhouse local -q "
2SELECT
3  event_id,
4  ts,
5  action,
6  geo.country AS geo_country,
7  geo.city    AS geo_city,
8  arrayStringConcat(tags, '|') AS tags
9FROM file('events.jsonl')
10INTO OUTFILE 'events.csv' TRUNCATE FORMAT CSVWithNames"

Now the header and the data agree, the nested object is two clean columns, and the array is a single pipe-joined cell:

1"event_id","ts","action","geo_country","geo_city","tags"
21,"2026-06-01 00:00:00","login","GB","London",""
32,"2026-06-01 00:02:17","click","US","New York","mobile"
43,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"
54,"2026-06-01 00:06:51","logout","FR","Paris",""

TRUNCATE overwrites the output file if it already exists, so the command is safe to re-run. Read the CSV straight back to confirm the round-trip:

1clickhouse local -q "SELECT geo_country, count() AS events FROM file('events.csv') GROUP BY geo_country ORDER BY geo_country"

1DE	3
2FR	3
3GB	3
4US	3

Options worth knowing

The conversion is one line, but a few details decide whether the CSV is correct downstream:

Flatten, don't drop. Dot access (geo.country) reaches one level; for deeper nesting, chain it (a.b.c) or use JSONExtract on a raw JSON string. This is the one lossy step in the conversion: a tree becomes flat columns, and you choose how.
Pick an array separator deliberately. arrayStringConcat(tags, '|') keeps the array in a single cell. A comma separator would collide with CSV delimiters; pipe or semicolon is safer. Or explode the array into one row per element with ARRAY JOIN.
Header or no header. CSVWithNames writes the header row. Plain CSV omits it.
Types are carried, then stringified. CSV has no schema, so Int64 and DateTime are written as text. The values are exact; only the type metadata is lost, which is inherent to CSV, not to the tool.
Compress on the way out. Add a codec suffix to the output name and it is written compressed: INTO OUTFILE 'events.csv.gz' produces gzip with no extra flag.

That list is the difference between this and a drag-and-drop converter site: you control flattening, separators, headers, and compression, the file never leaves your machine, and the whole thing is one scriptable command. Online converters also cap upload size; here the input streams from disk, so a file larger than RAM is no different from a small one. clickhouse-local runs the same SQL unchanged across dozens of formats, against a ClickHouse server, and against ClickHouse Cloud when the data outgrows your laptop.

Reverse direction? See how to convert CSV to JSONL.

chDB: the same conversion in Python

If you would rather stay in Python, chDB is the same ClickHouse engine in-process. Run the identical SELECT ... INTO OUTFILE, and you get the byte-identical CSV:

1import chdb
2
3chdb.query("""
4SELECT
5  event_id,
6  ts,
7  action,
8  geo.country AS geo_country,
9  geo.city    AS geo_city,
10  arrayStringConcat(tags, '|') AS tags
11FROM file('events.jsonl')
12INTO OUTFILE 'events_chdb.csv' TRUNCATE FORMAT CSVWithNames
13""")

1"event_id","ts","action","geo_country","geo_city","tags"
21,"2026-06-01 00:00:00","login","GB","London",""
32,"2026-06-01 00:02:17","click","US","New York","mobile"
43,"2026-06-01 00:04:34","purchase","DE","Berlin","mobile|web"

There is no pandas round-trip and no json.loads loop to write yourself; the flattening happens in the SQL. See how to read a JSONL file in Python with chDB for querying the same data into a DataFrame.

How fast is it?

On a 1,200,000-row, ~146 MB events_large.jsonl, parsing every JSON line, flattening the nested geo object, joining the tags array, and writing the full CSV out completes in ~0.39 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That is the whole conversion, parse to written file, not a cached table. Concurrent load on the machine can move that number a little; the point is the conversion is not the slow part of your pipeline.

Run it yourself

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSONL (including the ~146 MB file used for the timing), run.sh with every command on this page, a run.py / run.ipynb chDB version, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-jsonl-to-csv

The nested-JSON gotcha

Flatten nested fields into columns

Options worth knowing

chDB: the same conversion in Python

How fast is it?

Run it yourself

Subscribe to our newsletter

More like this

How to engineer cost-efficient open source observability with ClickHouse (ClickStack) - 2026 technical playbook

Build a dashboard in Python with ClickHouse and Streamlit

Instrumenting OpenAI with OpenTelemetry (OTel)

Real-time data visualization

How to convert JSONL to CSV

The nested-JSON gotcha #

Flatten nested fields into columns #

Options worth knowing #

chDB: the same conversion in Python #

How fast is it? #

Run it yourself #

Subscribe to our newsletter

More like this

How to engineer cost-efficient open source observability with ClickHouse (ClickStack) - 2026 technical playbook

Build a dashboard in Python with ClickHouse and Streamlit

Instrumenting OpenAI with OpenTelemetry (OTel)

Real-time data visualization

The nested-JSON gotcha

Flatten nested fields into columns

Options worth knowing

chDB: the same conversion in Python

How fast is it?

Run it yourself