To convert a CSV file to NDJSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then write the rows out as JSONEachRow:
clickhouse local -q "SELECT * FROM file('events.csv') INTO OUTFILE 'events.ndjson' FORMAT JSONEachRow"
{"event_date":"2026-01-01","event_id":1,"country":"GB","action":"click","value":5,"count":1}
{"event_date":"2026-01-02","event_id":2,"country":"US","action":"view","value":6.01,"count":2}
{"event_date":"2026-01-03","event_id":3,"country":"DE","action":"signup","value":7.02,"count":3}
{"event_date":"2026-01-04","event_id":4,"country":"FR","action":"purchase","value":8.03,"count":4}
{"event_date":"2026-01-05","event_id":5,"country":"IN","action":"click","value":9.04,"count":5}
The CSV is read in place with no import step: clickhouse local reads the header for column names, infers each column's type from the data, and emits one JSON object per line. Numbers stay numbers in the output; dates come out as quoted ISO strings.
NDJSON, JSONL, and the .ndjson extension
NDJSON (newline-delimited JSON) and JSONL (JSON Lines) are the same format: one self-contained JSON object per line, no enclosing array, no commas between records. ClickHouse writes both with the JSONEachRow format. The only thing that changes is the file extension you choose. Use .ndjson here, or .jsonl if that's what your downstream tool expects:
clickhouse local -q "SELECT * FROM file('events.csv') INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"
This is different from ClickHouse's JSON format, which writes one top-level JSON object holding the rows in a data array plus metadata. NDJSON is the line-oriented variant that streaming tools, log pipelines, and jq expect, because each line stands alone.
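To make "each line stands alone" concrete, here is a minimal Python sketch using only the standard library. The two records are shortened copies of the sample rows above. The contrast is with a plain JSON array, which is a generic illustration and not the exact layout of any ClickHouse output format.

```python
import json

# Two records in NDJSON form: one complete JSON object per line.
ndjson_text = (
    '{"event_id":1,"country":"GB","value":5}\n'
    '{"event_id":2,"country":"US","value":6.01}\n'
)

# Each line parses independently, so a reader can process the file as a stream.
records = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]

# The same records as a single JSON document: an enclosing array that has
# to be parsed as a whole before any record is available.
as_array = json.dumps(records)

print(len(records), records[1]["country"])
print(as_array)
```

Appending a record to NDJSON means appending a line. Appending to the array form means rewriting the closing bracket.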
The reason to convert with a SQL engine rather than a text-munging script is type fidelity. The CSV header gives column names; the data gives types. Check what was inferred with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.csv')"
event_date Nullable(Date)
event_id Nullable(Int64)
country Nullable(String)
action Nullable(String)
value Nullable(Float64)
count Nullable(Int64)
Those types decide how each value is rendered in the JSON. Numbers come out as JSON numbers ("value":6.01), not quoted strings. Dates come out as quoted strings in ISO form. A generic CSV-to-JSON converter that treats every field as text would quote everything, and you'd have to re-cast on the other side.
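For comparison, this is roughly what a text-only converter does. It is a sketch built on Python's csv and json modules, not a reproduction of any particular tool, and the three-column row is a trimmed copy of the sample data.

```python
import csv
import io
import json

csv_text = "event_date,event_id,value\n2026-01-02,2,6.01\n"

# csv.DictReader yields every field as a str, so every value gets quoted.
row = next(csv.DictReader(io.StringIO(csv_text)))
naive_line = json.dumps(row, separators=(",", ":"))
print(naive_line)
# {"event_date":"2026-01-02","event_id":"2","value":"6.01"}

# The consumer then has to re-cast each numeric field by hand.
typed = {**row, "event_id": int(row["event_id"]), "value": float(row["value"])}
typed_line = json.dumps(typed, separators=(",", ":"))
print(typed_line)
# {"event_date":"2026-01-02","event_id":2,"value":6.01}
```

The hand-written casts in the second step are the part that schema inference replaces.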
Options: things an upload-and-convert site can't do
This is where a local SQL engine goes beyond a browser-based converter.
Force a type when inference guesses wrong. A zero-padded order id or a US ZIP code looks like an integer and loses its leading zeros if inferred as one. Pass the format and an explicit schema as the second and third arguments to file() to keep it a string:
clickhouse local -q "
SELECT * FROM file('events.csv', 'CSVWithNames',
'event_date Date, event_id String, country String, action String, value Float64, count UInt8')
INTO OUTFILE 'events_typed.ndjson' FORMAT JSONEachRow"
{"event_date":"2026-01-01","event_id":"1","country":"GB","action":"click","value":5,"count":1}
{"event_date":"2026-01-02","event_id":"2","country":"US","action":"view","value":6.01,"count":2}
event_id is now "1" (quoted) instead of 1.
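The leading-zero problem is easy to reproduce in isolation. In this small Python sketch, the ZIP code is a hypothetical value and does not come from events.csv.

```python
import json

zip_code = "02134"  # hypothetical value, used only for illustration

# Treated as an integer, the leading zero is gone before JSON is written.
as_integer = json.dumps({"zip": int(zip_code)})

# Kept as a string, the value survives exactly as it appeared in the CSV.
as_string = json.dumps({"zip": zip_code})

print(as_integer)  # {"zip": 2134}
print(as_string)   # {"zip": "02134"}
```

Once the value has been written as 2134, nothing downstream can tell how many zeros were dropped, which is why the fix belongs at read time.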
Transform on the way out. Because it's SQL, you can filter, rename, reshape, or compute columns in the same pass, for example SELECT country, value FROM file('events.csv') WHERE action = 'purchase'. A typical converter site offers nothing comparable.
No upload. The file never leaves your machine. For anything with customer data, that alone settles it.
The conversion drops no rows or values, and the output reads straight back. Every JSONEachRow line carries its own column names, so ClickHouse re-infers the schema with no extra arguments:
clickhouse local -q "
SELECT country, count() AS events, round(sum(value),2) AS total
FROM file('events.ndjson')
GROUP BY country ORDER BY total DESC"
US 4 60.4
GB 4 56.36
AU 3 48.33
IN 3 45.3
FR 3 42.27
DE 3 39.24
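If you want an independent check outside ClickHouse, one option is to compare record counts and confirm that every output line parses as a JSON object. The sketch below uses only the Python standard library. The helper names count_csv_rows and count_ndjson_rows are hypothetical, and the temporary files stand in for events.csv and events.ndjson. It verifies counts and well-formedness, not that individual values match.

```python
import csv
import json
import tempfile
from pathlib import Path


def count_csv_rows(path):
    """Count data rows in a CSV file, excluding the header row."""
    with open(path, newline="") as f:
        reader = csv.reader(f)  # handles quoted fields that contain newlines
        next(reader, None)      # skip the header
        return sum(1 for _ in reader)


def count_ndjson_rows(path):
    """Count NDJSON records; raise ValueError if a line is not a JSON object."""
    count = 0
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            if not isinstance(json.loads(line), dict):
                raise ValueError(f"line {line_number} is not a JSON object")
            count += 1
    return count


# Stand-in files; point the two helpers at the real paths in practice.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "sample.csv"
    ndjson_path = Path(tmp) / "sample.ndjson"
    csv_path.write_text("event_id,country\n1,GB\n2,US\n")
    ndjson_path.write_text(
        '{"event_id":1,"country":"GB"}\n{"event_id":2,"country":"US"}\n'
    )
    rows_in = count_csv_rows(csv_path)
    rows_out = count_ndjson_rows(ndjson_path)

print(rows_in, rows_out)
```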
Reverse direction? See how to convert NDJSON to CSV.
If you'd rather stay in Python, chDB is the same ClickHouse engine as an in-process module. The SQL is identical; you capture the bytes and write them to a file:
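import chdb
# Run the same query in-process and capture the JSONEachRow output as bytes.
ndjson = chdb.query("SELECT * FROM file('events.csv') FORMAT JSONEachRow").bytes()
# Write the bytes unchanged; binary mode avoids any newline translation.
with open("events.ndjson", "wb") as f:
    f.write(ndjson)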
No server, no temporary table, and it slots into an existing pandas or pyarrow pipeline.
On a 3,000,000-row, ~123 MB CSV (events_large.csv), the full conversion to a ~312 MB NDJSON file completes in ~0.35 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That's about 350 MB/s of input parsed, typed, and re-serialized as JSON. The row count matches exactly: 3,000,000 in, 3,000,000 out.
run 1: real 0.35
run 2: real 0.35
run 3: real 0.35
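The throughput figure follows directly from the numbers quoted above. Here is the arithmetic as a short Python sketch. The file sizes are the approximate values from the text, so the results are approximate too, and the output-side and rows-per-second figures are derived here rather than measured separately.

```python
# Figures quoted in the text; the sizes are approximate.
input_mb = 123
output_mb = 312
rows = 3_000_000
seconds = 0.35

input_mb_per_s = input_mb / seconds    # CSV parsed per second
output_mb_per_s = output_mb / seconds  # NDJSON written per second
rows_per_s = rows / seconds            # rows converted per second

print(f"input:  ~{input_mb_per_s:.0f} MB/s")
print(f"output: ~{output_mb_per_s:.0f} MB/s")
print(f"rows:   ~{rows_per_s / 1e6:.1f} million/s")
```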
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample CSVs (including the ~123 MB file used for the timing), run.sh with every command on this page, a run.py / run.ipynb for the chDB path, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-csv-to-ndjson
The same SELECT ... FORMAT JSONEachRow runs unchanged against a file on your laptop, against a ClickHouse server, and against ClickHouse Cloud when the data outgrows one machine. Related: convert CSV to JSON, convert CSV to Parquet, and run SQL on a JSONL file.