To convert a CSV file to JSONL, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then write the CSV to JSONL in one command:
clickhouse local -q "SELECT * FROM file('events.csv') INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"
{"event_date":"2026-01-01","event_id":1,"country":"GB","action":"click","amount":5,"quantity":1}
{"event_date":"2026-01-02","event_id":2,"country":"US","action":"view","amount":6.01,"quantity":2}
{"event_date":"2026-01-03","event_id":3,"country":"DE","action":"purchase","amount":7.02,"quantity":3}
ClickHouse reads the CSV header for column names, infers a real type per column, and streams each row out as a JSON object. The file is read in place with no import step, and because conversion streams, memory use stays flat regardless of file size.
What JSONL is, and why JSONEachRow
JSONL (also written NDJSON or .ndjson) is one JSON object per line. No outer array, no commas between records, so a reader can process it line by line and a writer can append to it. That line-per-record shape is exactly what ClickHouse's JSONEachRow format produces, which is why the conversion is a single command.
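As a rough illustration of why that shape is convenient, here is a minimal Python sketch using only the standard library. The sample records and the read_jsonl helper are made up for this example and are not part of ClickHouse. It reads JSONL one line at a time and appends a record by writing one more line:

```python
import io
import json

# Hypothetical sample: two records, one JSON object per line, no outer array.
jsonl_text = (
    '{"event_id":1,"country":"GB","amount":5}\n'
    '{"event_id":2,"country":"US","amount":6.01}\n'
)


def read_jsonl(stream):
    """Yield one dict per non-empty line; only one line is parsed at a time."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


records = list(read_jsonl(io.StringIO(jsonl_text)))

# Appending needs no rewrite of the existing data: just add another line.
new_record = {"event_id": 3, "country": "DE", "amount": 7.02}
appended_text = jsonl_text + json.dumps(new_record, separators=(",", ":")) + "\n"
appended_records = list(read_jsonl(io.StringIO(appended_text)))

print(len(records), len(appended_records))
```

With a JSON array, the same append would mean rewriting the closing bracket, and a reader would typically need the whole document before yielding the first record.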
Types are inferred from the CSV and carried into the JSON
A CSV is all text. The win here is that clickhouse local infers a real type per column and emits the matching JSON type, so numbers come out as JSON numbers and only strings get quoted. Check what was inferred with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.csv')"
event_date Nullable(Date)
event_id Nullable(Int64)
country Nullable(String)
action Nullable(String)
amount Nullable(Float64)
quantity Nullable(Int64)
In the output above, event_id, amount, and quantity are bare JSON numbers; event_date, country, and action are quoted strings (JSON has no date type, so a Date is written as a string). A web converter that requires an upload and treats every CSV cell as a string would quote all six. Getting the types right at conversion time means the JSONL is correct for whatever reads it next, with no post-processing.
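For contrast, here is a minimal sketch of the all-strings behaviour, using only Python's standard library. The one-row CSV text is a hypothetical stand-in for events.csv. csv.DictReader does no type inference, so every value reaches json.dumps as a string:

```python
import csv
import io
import json

# Hypothetical one-row stand-in for events.csv.
csv_text = (
    "event_date,event_id,country,action,amount,quantity\n"
    "2026-01-01,1,GB,click,5,1\n"
)

# csv.DictReader yields every cell as str, so the JSON quotes every value.
row = next(csv.DictReader(io.StringIO(csv_text)))
naive_line = json.dumps(row, separators=(",", ":"))
print(naive_line)
```

Every downstream reader of that output would have to cast event_id, amount, and quantity back to numbers itself.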
The default format for file('events.csv') is CSVWithNames: first row is column names, types come from the values. If inference guesses wrong (a ZIP code or an ID that should stay a string), pass the format and an explicit structure as the second and third arguments to file(), exactly as in how to run SQL on a CSV file.
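A hedged sketch of that explicit-schema form, written for chDB in Python. The column types are assumptions chosen for illustration (here event_id is kept as a String), and the output name events_typed.jsonl is hypothetical. The file() argument order is path, format, structure. The query is only built at import time; it runs when convert_with_schema() is called, which assumes chdb is installed and events.csv exists in the working directory:

```python
# Structure string: comma-separated "name Type" pairs.
# Assumption for illustration: event_id should stay a string.
STRUCTURE = (
    "event_date Date, event_id String, country String, "
    "action String, amount Float64, quantity Int64"
)

QUERY = f"""
SELECT * FROM file('events.csv', 'CSVWithNames', '{STRUCTURE}')
INTO OUTFILE 'events_typed.jsonl' TRUNCATE
FORMAT JSONEachRow
"""


def convert_with_schema():
    """Run the conversion; requires chdb and an events.csv in the working directory."""
    import chdb  # imported lazily so the query can be inspected without chdb

    chdb.query(QUERY)
```

With this structure, event_id should come out as a quoted JSON string while amount and quantity stay numeric.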
JSONEachRow is self-describing, so reading the result needs no schema. The file is a table:
clickhouse local -q "
SELECT country, count() AS events, round(sum(amount),2) AS amount
FROM file('events.jsonl')
GROUP BY country ORDER BY amount DESC"
US 4 60.4
GB 4 56.36
AU 3 48.33
IN 3 45.3
FR 3 42.27
DE 3 39.24
For more on querying the result, see how to run SQL on a JSONL file.
This is where a one-line command beats a web converter: you control the output.
Keep 64-bit integers JS-safe. Integers above 2^53 can lose precision when JavaScript's JSON.parse reads them as numbers, because a JavaScript number is an IEEE 754 double. If the JSONL will be consumed by JavaScript, quote the 64-bit integers so they arrive as exact strings:
clickhouse local -q "
SELECT * FROM file('events.csv') LIMIT 2
FORMAT JSONEachRow
SETTINGS output_format_json_quote_64bit_integers=1"
{"event_date":"2026-01-01","event_id":"1","country":"GB","action":"click","amount":5,"quantity":"1"}
{"event_date":"2026-01-02","event_id":"2","country":"US","action":"view","amount":6.01,"quantity":"2"}
event_id and quantity are now quoted; amount (a float) is untouched.
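The precision loss is easy to reproduce outside JavaScript, since a JavaScript number is an IEEE 754 double, the same representation as a Python float. A minimal sketch follows; the big_id value is a made-up example just above 2^53, and parse_int=float is used here only to imitate how JSON.parse treats integers:

```python
import json

big_id = 2**53 + 1  # 9007199254740993, hypothetical ID just above 2^53

# Unquoted: a reader that stores numbers as doubles rounds the value.
unquoted = json.loads(f'{{"event_id":{big_id}}}', parse_int=float)
lost = int(unquoted["event_id"])

# Quoted (the shape output_format_json_quote_64bit_integers=1 emits): exact.
quoted = json.loads(f'{{"event_id":"{big_id}"}}')
exact = int(quoted["event_id"])

print(lost == big_id, exact == big_id)
```

The unquoted value silently becomes a neighbouring integer, which is the failure mode quoting avoids; the consumer then converts the string with an arbitrary-precision type such as BigInt.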
Convert and compress in one step. Name the output .jsonl.gz and the file is gzipped as it is written. Reading it back is transparent too — no decompress step:
clickhouse local -q "SELECT * FROM file('events.csv') INTO OUTFILE 'events.jsonl.gz' FORMAT JSONEachRow"
clickhouse local -q "SELECT count() FROM file('events.jsonl.gz')"
.zst, .lz4, and .xz work the same way; the codec is taken from the file name.
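Consumers outside ClickHouse can read the compressed file as a stream too. A minimal sketch with Python's standard gzip module; the in-memory buffer stands in for events.jsonl.gz and its two records are made up for the example:

```python
import gzip
import io
import json

# Build a small gzipped JSONL payload in memory (stand-in for events.jsonl.gz).
raw = b'{"event_id":1,"country":"GB"}\n{"event_id":2,"country":"US"}\n'
buffer = io.BytesIO()
with gzip.open(buffer, "wb") as gz:
    gz.write(raw)
buffer.seek(0)

# Read it back line by line; text mode decompresses on the fly.
with gzip.open(buffer, "rt", encoding="utf-8") as lines:
    countries = [json.loads(line)["country"] for line in lines]

print(countries)
```

For a real file, pass the path instead of the buffer, for example gzip.open("events.jsonl.gz", "rt", encoding="utf-8").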
Select or reshape on the way out. Because the conversion is a SELECT, you can project columns, filter rows, rename fields, or compute new ones in the same command rather than fixing the JSONL afterwards.
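For instance, a hedged sketch via chDB in Python: keep only purchases, rename a field, and compute a new one. The output file name purchases.jsonl, the alias names, and the amount * quantity calculation are invented for illustration; the column names come from the sample output above. The query runs only when reshape() is called, which assumes chdb is installed and events.csv exists in the working directory:

```python
QUERY = """
SELECT
    event_date,
    country AS country_code,         -- rename a field
    amount * quantity AS line_total  -- compute a new one
FROM file('events.csv')
WHERE action = 'purchase'            -- filter rows
INTO OUTFILE 'purchases.jsonl' TRUNCATE
FORMAT JSONEachRow
"""


def reshape():
    """Run the query; requires chdb and an events.csv in the working directory."""
    import chdb  # lazy import so the SQL can be read without chdb installed

    chdb.query(QUERY)
```

Each output line would then carry only event_date, country_code, and line_total.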
To go back, swap the formats: read the JSONL and write CSVWithNames. INTO OUTFILE refuses to overwrite an existing file, so write to a new name (or add TRUNCATE after the file name to replace the original). See how to convert JSONL to CSV for the round-trip, including how a column that holds nested JSON flattens when it lands in a flat CSV.
clickhouse local -q "SELECT * FROM file('events.jsonl') INTO OUTFILE 'events_roundtrip.csv' FORMAT CSVWithNames"
On a 3,000,000-row, ~120 MB CSV (events_large.csv), converting the whole file to JSONL — parsing every CSV cell and re-encoding it as JSON — completes in ~0.34 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The output is a 309 MB .jsonl file. Conversion is streamed, so memory use stays flat regardless of file size.
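As back-of-the-envelope arithmetic on those figures (all inputs are the approximate numbers quoted above, so the results are rough):

```python
rows = 3_000_000
input_mb = 120     # approximate CSV size
output_mb = 309    # JSONL size
seconds = 0.34     # approximate best-of-three time

rows_per_second = rows / seconds          # about 8.8 million rows/s
input_mb_per_second = input_mb / seconds  # about 353 MB/s of CSV read
expansion = output_mb / input_mb          # JSONL is about 2.6x the CSV size

print(f"{rows_per_second:,.0f} rows/s, {input_mb_per_second:.0f} MB/s, {expansion:.2f}x")
```

The size growth is expected: JSONL repeats every field name on every line, whereas the CSV states the names once in the header.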
If you work in Python, chDB is the same engine in-process — same SQL, same JSONEachRow format, same one-command conversion:
import chdb
# JSONEachRow = one JSON object per line (JSONL)
chdb.query("""
SELECT * FROM file('data/events.csv')
INTO OUTFILE 'data/events.jsonl' TRUNCATE
FORMAT JSONEachRow
""")
That writes data/events.jsonl directly, with no pandas round-trip required. See also how to read a CSV file in Python with chDB.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample CSVs (including the ~120 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-csv-to-jsonl
The same SQL that converts one file on your laptop runs unchanged against a ClickHouse server or ClickHouse Cloud when the data outgrows it. If you query the data more than once, consider converting to Parquet instead, which is columnar, typed, and far faster to scan.