To convert a JSON file to JSONL, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then read the top-level array and write it back as JSONEachRow:
clickhouse local -q "SELECT * FROM file('events.json', JSONEachRow) INTO OUTFILE 'events.jsonl' TRUNCATE FORMAT JSONEachRow"
{"id":1,"event":"login","country":"GB","amount":1,"user":{"name":"Ada","tier":"free"},"items":[0]}
{"id":2,"event":"purchase","country":"US","amount":2.01,"user":{"name":"Linus","tier":"pro"},"items":[0,1]}
{"id":3,"event":"view","country":"DE","amount":3.02,"user":{"name":"Grace","tier":"free"},"items":[0,1,2]}
{"id":4,"event":"logout","country":"FR","amount":4.03,"user":{"name":"Dennis","tier":"enterprise"},"items":[0]}
JSONEachRow reads the [ {...}, {...} ] array in place and writes one object per line, so the file is converted without any import step. Nested fields like user and items survive the round-trip as objects and arrays, not stringified values.
A JSON file of records usually holds one big array:
[
{"id":1,"event":"login","country":"GB","amount":1,"user":{"name":"Ada","tier":"free"},"items":[0]},
{"id":2,"event":"purchase","country":"US","amount":2.01,"user":{"name":"Linus","tier":"pro"},"items":[0,1]}
]
JSONL (also called NDJSON) holds the same records with the array brackets and commas removed: one self-contained JSON object per line. That format streams line by line, which is why log pipelines, jq, and most data tools prefer it. The conversion above produces exactly that (the first four of the sample's eight lines are shown):
{"id":1,"event":"login","country":"GB","amount":1,"user":{"name":"Ada","tier":"free"},"items":[0]}
{"id":2,"event":"purchase","country":"US","amount":2.01,"user":{"name":"Linus","tier":"pro"},"items":[0,1]}
{"id":3,"event":"view","country":"DE","amount":3.02,"user":{"name":"Grace","tier":"free"},"items":[0,1,2]}
{"id":4,"event":"logout","country":"FR","amount":4.03,"user":{"name":"Dennis","tier":"enterprise"},"items":[0]}
Note that you point file() at the JSONEachRow format even though the input is a single array. JSONEachRow reads a top-level JSON array as well as line-delimited objects, so the same format handles both the read and the write. The .json extension alone would also be inferred correctly here; naming the format keeps the command unambiguous.
Prefer Python? The same conversion with chDB appears further down, and how to read a JSON file in Python covers querying JSON from a DataFrame.
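To make the array-to-lines relationship concrete, here is a minimal standard-library sketch of the same format change. It is an illustration only, not how ClickHouse does it: the function name json_array_to_jsonl is hypothetical, and unlike the streaming conversion above it loads the whole array into memory, so it only suits small files.

```python
import json
from pathlib import Path


def json_array_to_jsonl(src: str, dst: str) -> int:
    """Write each element of a top-level JSON array as one compact line.

    Returns the number of records written.
    """
    # Reads the entire array into memory (illustration only, small files).
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst, "w", encoding="utf-8") as out:
        for record in records:
            # Compact separators keep each record on a single unpadded line.
            line = json.dumps(record, separators=(",", ":"), ensure_ascii=False)
            out.write(line + "\n")
    return len(records)
```

Calling json_array_to_jsonl("events.json", "events.jsonl") would write one line per array element and return the element count.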
Types and nesting are inferred, not flattened
clickhouse-local reads the array, infers a schema, and carries it through. Check what it found with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.json', JSONEachRow)"
id Nullable(Int64)
event Nullable(String)
country Nullable(String)
amount Nullable(Float64)
user Tuple(name Nullable(String), tier Nullable(String))
items Array(Nullable(Int64))
This is the difference from converting JSON to a flat format like CSV. A nested object (user) and a nested array (items) have nowhere to go in a single CSV cell, so they get stringified or dropped. JSONL is still JSON, so the structure survives the round-trip untouched: user stays an object, items stays an array. No flattening, no lossy casts.
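The contrast can be seen on a single record with the Python standard library. This is a sketch of the general point, not of ClickHouse's CSV output: Python's csv.writer is used here only as a stand-in for "a flat format", and the record is a trimmed version of the first sample row.

```python
import csv
import io
import json

record = {"id": 1, "user": {"name": "Ada", "tier": "free"}, "items": [0]}

# CSV: every cell is text, so csv.writer falls back to str() for nested values.
csv_buf = io.StringIO()
csv.writer(csv_buf).writerow(record.values())
csv_cells = next(csv.reader(io.StringIO(csv_buf.getvalue())))

# JSONL: the line is still JSON, so parsing it restores the original types.
jsonl_line = json.dumps(record, separators=(",", ":"))
restored = json.loads(jsonl_line)

print(type(csv_cells[1]).__name__)      # str
print(type(restored["user"]).__name__)  # dict
print(restored == record)               # True
```

After the CSV round-trip the user column is an opaque string; after the JSONL round-trip it is still an object with name and tier.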
A quick sanity check: the number of array elements in equals the number of lines out.
clickhouse local -q "SELECT count() FROM file('events.json', JSONEachRow)" # 8
wc -l < events.jsonl # 8
One array element, one line. Nothing dropped, nothing merged.
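The same check can be scripted without ClickHouse. The sketch below is an assumption about how one might automate it, not part of the original example: counts_match is a hypothetical helper, and it loads the input array fully into memory, so it fits small files only.

```python
import json


def counts_match(json_path: str, jsonl_path: str) -> bool:
    """Return True when the array length equals the number of JSONL lines."""
    with open(json_path, encoding="utf-8") as f:
        n_elements = len(json.load(f))  # whole array in memory; small files only

    n_lines = 0
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                json.loads(line)  # raises ValueError if a line is not valid JSON
                n_lines += 1
    return n_elements == n_lines
```

As a side effect it also confirms that every output line parses on its own, which is the property that makes JSONL streamable.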
Online JSON-to-JSONL tools work for a small paste, but they typically require uploading your data, cap the file size, and offer no transforms. Doing it locally gives you the whole SQL surface for free.
Filter and reshape while you convert. The conversion is a SELECT, so you can project columns, rename them, and filter rows in the same pass:
clickhouse local -q "
SELECT id, country, amount, user.name AS name
FROM file('events.json', JSONEachRow)
WHERE event = 'purchase'
INTO OUTFILE 'purchases.jsonl' TRUNCATE FORMAT JSONEachRow"
{"id":2,"country":"US","amount":2.01,"name":"Linus"}
{"id":6,"country":"US","amount":6.05,"name":"Linus"}
Write compressed JSONL directly. Add a .gz (or .zst, .lz4) suffix and clickhouse-local picks the codec from the extension, with no separate compress step:
clickhouse local -q "SELECT * FROM file('events.json', JSONEachRow) INTO OUTFILE 'events.jsonl.gz' TRUNCATE FORMAT JSONEachRow"
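A downstream consumer can read the compressed file line by line without unpacking it first. The following is a minimal sketch for the .gz case only, using Python's gzip module; iter_jsonl_gz is a hypothetical helper name, and .zst or .lz4 files would need a different decompressor.

```python
import gzip
import json
from typing import Iterator


def iter_jsonl_gz(path: str) -> Iterator[dict]:
    """Yield one parsed record per line from a gzip-compressed JSONL file."""
    # "rt" opens the gzip stream in text mode, decoding lines as they are read,
    # so only one record needs to be held in memory at a time.
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

For example, for row in iter_jsonl_gz("events.jsonl.gz") iterates over the records in file order.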
No size limit. It streams, so the input array can be much larger than memory. That is the case the upload sites cannot handle.
If you would rather stay in Python, chDB is the same engine embedded in-process. Same SQL, same JSONEachRow read and write:
import chdb
chdb.query(
"SELECT * FROM file('events.json', JSONEachRow) "
"INTO OUTFILE 'events.jsonl' TRUNCATE FORMAT JSONEachRow"
)
{"id":1,"event":"login","country":"GB","amount":1,"user":{"name":"Ada","tier":"free"},"items":[0]}
{"id":2,"event":"purchase","country":"US","amount":2.01,"user":{"name":"Linus","tier":"pro"},"items":[0,1]}
No pandas round-trip and no manual json.loads over the array. The file is written by the engine.
Fast enough that the conversion is not the bottleneck. Converting a 2,000,000-element array (a ~226 MB events_large.json) to JSONL:
clickhouse local -q "SELECT * FROM file('events_large.json', JSONEachRow) INTO OUTFILE 'events_large.jsonl' TRUNCATE FORMAT JSONEachRow"
~0.83 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses the JSON array, infers the schema, and serialises 2,000,000 lines back out. The number may shift slightly under concurrent load; the point is that it is parse-bound, not tool-bound.
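One way to take a best-of-three measurement like this yourself is sketched below. It is an assumption about the harness, not the script from the examples repo: best_of and convert_large are hypothetical names, the timing includes process start-up, and convert_large requires clickhouse on your PATH and events_large.json in the working directory.

```python
import subprocess
import time
from typing import Callable


def best_of(run: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time, in seconds, over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return min(timings)


def convert_large() -> None:
    """Run the conversion command shown above as a subprocess."""
    query = (
        "SELECT * FROM file('events_large.json', JSONEachRow) "
        "INTO OUTFILE 'events_large.jsonl' TRUNCATE FORMAT JSONEachRow"
    )
    # check=True raises CalledProcessError if clickhouse exits with an error.
    subprocess.run(["clickhouse", "local", "-q", query], check=True)
```

Calling best_of(convert_large) returns the fastest of three runs in seconds. Taking the minimum rather than the mean reduces the influence of one-off noise such as a cold page cache on the first run.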
Going the other way, JSONL back into a single JSON array, is the same idea with the formats swapped. See how to convert JSONL to JSON.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSON array (including the ~226 MB file used for the timing), run.sh with every command on this page, a run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-json-to-jsonl
The same SQL that converts a file on your laptop runs unchanged against a ClickHouse server or ClickHouse Cloud when the data outgrows it. Want to query the JSON in place instead of converting it? See how to run SQL on a JSON file and how to run SQL on a JSONL file.