How to convert JSONL to JSON

Al Brown
Last updated: Jun 15, 2026

To convert a JSONL file into a single JSON array, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

1curl https://clickhouse.com/cli | sh   # install clickhousectl
2clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then convert the file with output_format_json_array_of_rows:

1clickhouse local -q "SELECT * FROM file('events.jsonl') INTO OUTFILE 'events.json' FORMAT JSONEachRow SETTINGS output_format_json_array_of_rows = 1"
1[
2{"event_id":1,"event":"login","country":"GB","amount":1},
3{"event_id":2,"event":"click","country":"US","amount":2.01},
4{"event_id":3,"event":"purchase","country":"DE","amount":3.02},
5{"event_id":4,"event":"logout","country":"FR","amount":4.03},
6{"event_id":5,"event":"login","country":"GB","amount":5.04}
7]

The JSONL is read in place with no import step: ClickHouse infers the schema from the file extension and data, wraps the rows into a single array, and writes events.json. The rest of this page covers the one decision that actually matters: which JSON shape you want.

JSONL vs JSON: what you are really converting #

JSONL (also called NDJSON) is one self-contained JSON object per line. It streams well and appends cheaply, which is why logs and exports use it. A single JSON document is one value, usually an array of objects, that a parser reads in one go. The conversion is structural: take N lines and wrap them into one array.

clickhouse-local reads the JSONL with no hints. The .jsonl extension maps to the JSONEachRow format, and the column types come from the data:

1clickhouse local -q "DESCRIBE file('events.jsonl')"
1event_id	Nullable(Int64)
2event	Nullable(String)
3country	Nullable(String)
4amount	Nullable(Float64)

Those types carry into the output. amount stays a JSON number, not a quoted string; event_id stays an integer. A naive line-joining script that just inserts commas and brackets keeps whatever the source typed, but the moment you need to reshape, filter, or coerce a column, having a typed table in between is what makes it one command instead of a parser.

Two valid JSON shapes (pick the right one) #

There are two things people mean by "a JSON file", and ClickHouse writes both.

A pure array of objects. This is what most tools and APIs expect. Use FORMAT JSONEachRow with output_format_json_array_of_rows = 1 (the one-liner above). The output is exactly [ {...}, {...} ] and nothing else.

The ClickHouse JSON envelope. FORMAT JSON wraps the rows in a document that also carries the column schema and query statistics:

1clickhouse local -q "SELECT * FROM file('events.jsonl') INTO OUTFILE 'events_envelope.json' FORMAT JSON"
1{
2	"meta":
3	[
4		{
5			"name": "event_id",
6			"type": "Nullable(Int64)"
7		},
8		...
9	],
10
11	"data":
12	[
13		{
14			"event_id": 1,
15			"event": "login",
16			"country": "GB",
17			"amount": 1
18		},
19		...
20	],
21
22	"rows": 5,
23
24	"statistics":
25	{
26		"elapsed": 0.002144042,
27		"rows_read": 5,
28		"bytes_read": 301
29	}
30}

The rows live under data, with meta describing the schema and statistics describing the run. This is handy when the consumer wants the types alongside the data. If you just want the array, it is the wrong format; reach for JSONEachRow with the array setting instead. Knowing which one you need is the difference between a file that loads and one that needs another reshape step.

Options an upload converter can't give you #

Online JSONL-to-JSON converters work for a 2 KB paste. The reasons to do it locally:

  • No upload. The data never leaves your machine. That matters for anything with PII or under a data-handling policy.
  • Correct types, not stringified everything. Numbers stay numbers, dates stay dates, because the conversion goes through a typed table rather than naive text munging.
  • Bigger than RAM. clickhouse-local streams. A multi-gigabyte JSONL file converts without loading the whole thing into memory.
  • Scriptable and repeatable. It is one command in a cron job, a Makefile, or a CI step.
  • Reshape on the way through. Because there is a SELECT in the middle, you can project columns, filter rows, rename fields, or cast types in the same pass: SELECT event_id, upper(country) AS country FROM file('events.jsonl') ....

Read it back to check #

JSONEachRow reads a top-level array directly, so you can verify the output round-trips:

1clickhouse local -q "SELECT count() AS rows, round(sum(amount), 2) AS total FROM file('events.json', 'JSONEachRow')"
15	15.1

Five rows in, five rows out, numbers intact.

How fast is it? #

On a 195 MB, 3,000,000-row events.jsonl, converting to a single JSON array runs in ~0.60 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses every line and re-serializes the array from scratch each run; there is no cached table. The output array is 198 MB.

1clickhouse local -q "SELECT * FROM file('events_large.jsonl') INTO OUTFILE 'events_large.json' FORMAT JSONEachRow SETTINGS output_format_json_array_of_rows = 1"

Do it in Python with chDB #

If you live in Python, chDB is the same engine in-process, with the identical SQL and no server. Pass the same SELECT ... INTO OUTFILE ... FORMAT and ClickHouse writes the file:

1import chdb
2
3chdb.query("""
4SELECT * FROM file('events.jsonl')
5INTO OUTFILE 'events_array.json' TRUNCATE
6FORMAT JSONEachRow SETTINGS output_format_json_array_of_rows = 1
7""")
1[
2{"event_id":1,"event":"login","country":"GB","amount":1},
3{"event_id":2,"event":"click","country":"US","amount":2.01},
4{"event_id":3,"event":"purchase","country":"DE","amount":3.02},
5{"event_id":4,"event":"logout","country":"FR","amount":4.03},
6{"event_id":5,"event":"login","country":"GB","amount":5.04}
7]

This avoids json.loads over a million lines and the memory cost of holding the whole list, since chDB streams the file the same way the CLI does.

Reverse direction? #

Going the other way — splitting a JSON array back into one object per line — is convert JSON to JSONL. The same file() -> INTO OUTFILE pattern, just with the formats swapped.

Run it yourself #

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSONL (including the 3M-row file used for the timing), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-jsonl-to-json

The same SQL that converts a file on your laptop runs unchanged against a ClickHouse server or ClickHouse Cloud when the data outgrows it. For more on querying these files directly, see how to run SQL on a JSONL file.

Share this resource

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!
Loading form...