To convert Parquet to NDJSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the Parquet file and write the output as JSONEachRow:
clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events.ndjson' FORMAT JSONEachRow"
{"event_id":1,"ts":"2026-05-31 23:00:00.000","action":"login","country":"GB","amount":1,"is_member":1,"attrs":{"os":"mac","plan":"free"}}
{"event_id":2,"ts":"2026-06-01 00:00:00.000","action":"purchase","country":"US","amount":2.01,"is_member":0,"attrs":{"os":"win","plan":"pro"}}
{"event_id":3,"ts":"2026-06-01 01:00:00.000","action":"logout","country":"DE","amount":3.02,"is_member":1,"attrs":{"os":"linux","plan":"free"}}
clickhouse local reads the Parquet schema from the file footer, so there are no columns to declare. The Parquet file is read in place with no import step, and rows stream out as NDJSON, one JSON object per line, so files larger than RAM convert without being buffered whole.
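The one-object-per-line layout also helps whatever reads the file next. As a minimal sketch, the Python below reads an NDJSON file lazily, one row at a time, so memory use stays flat however large the file is. The helper name iter_ndjson is hypothetical and is not part of ClickHouse or chDB.

```python
import json
from typing import Any, Dict, Iterator


def iter_ndjson(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed row per line without loading the whole file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

Assuming events.ndjson was written by the command above, a loop such as for row in iter_ndjson("events.ndjson") would then visit each event in file order.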
The schema ClickHouse reads from the Parquet footer is fully typed. You can see exactly what it found with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.parquet')"
event_id UInt64
ts DateTime64(3, 'UTC')
action String
country String
amount Float64
is_member UInt8
attrs Map(String, String)
Those types decide how each value is written in the NDJSON. Numbers stay numbers, strings stay strings, and a nested Map becomes a nested JSON object ("attrs":{"os":"mac","plan":"free"}). This is the advantage of converting Parquet rather than a flat text format: the nesting and types survive the trip, instead of being flattened or guessed.
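To see what a downstream consumer actually receives, here is a small sketch that parses the first sample row from earlier on this page with Python's standard json module and reports the type of each field. It uses only the sample data shown above. Two details are worth noticing: the DateTime64 column arrives as a string, and a Float64 that happens to be a whole number is written as 1, which a JSON parser reads as an integer.

```python
import json

# First row of the sample output shown earlier on this page.
line = (
    '{"event_id":1,"ts":"2026-05-31 23:00:00.000","action":"login",'
    '"country":"GB","amount":1,"is_member":1,'
    '"attrs":{"os":"mac","plan":"free"}}'
)

row = json.loads(line)

# Map each field name to the Python type the JSON parser produced.
parsed_types = {name: type(value).__name__ for name, value in row.items()}
print(parsed_types)

# The Map column arrives as a nested object, not a flattened string.
print(row["attrs"]["plan"])
```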
This is where a one-line conversion beats an upload-required online converter: you control the output, and nothing leaves your machine.
Booleans. A Parquet boolean usually arrives as ClickHouse UInt8, so it serialises as 1 or 0. If you want literal true / false in the JSON, cast it to Bool:
clickhouse local -q "SELECT event_id, is_member::Bool AS is_member FROM file('events.parquet') LIMIT 2 FORMAT JSONEachRow"
{"event_id":1,"is_member":true}
{"event_id":2,"is_member":false}
Large integers. JSONEachRow writes 64-bit integers unquoted by default. Many JSON parsers (including JavaScript's JSON.parse) lose precision above 2^53. If your IDs are large Int64/UInt64 values, set output_format_json_quote_64bit_integers=1 to emit them as quoted strings and keep them exact.
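The precision problem is easy to reproduce. The sketch below uses a made-up ID of 2^53 + 1 and asks Python's json module to parse integers as floats, which imitates a parser that stores every number as an IEEE 754 double, as JavaScript does. The quoted form survives intact; the unquoted form comes back off by one.

```python
import json

big_id = 2**53 + 1  # 9007199254740993, the smallest positive integer a double cannot represent

unquoted = '{"event_id":9007199254740993}'
quoted = '{"event_id":"9007199254740993"}'

# Python keeps integers exact by default, so parse_int=float is used here
# to imitate a parser that stores every number as a double.
as_double = json.loads(unquoted, parse_int=float)["event_id"]
print(int(as_double))  # 9007199254740992, off by one

# A quoted ID is just a string, so it converts back without loss.
as_string = json.loads(quoted)["event_id"]
print(int(as_string) == big_id)  # True
```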
Project and filter while converting. Because the read is a normal SELECT, you can rename columns, drop the ones you don't need, filter rows, and sort, all in the same pass. No second tool:
clickhouse local -q "
SELECT event_id, country, amount
FROM file('events.parquet')
WHERE action = 'purchase'
ORDER BY amount DESC
LIMIT 3
INTO OUTFILE 'purchases.ndjson' FORMAT JSONEachRow"
{"event_id":18,"country":"DE","amount":18.17}
{"event_id":14,"country":"FR","amount":14.13}
{"event_id":10,"country":"IN","amount":10.09}
Compress on the way out. Add a .gz (or .zst, .lz4, .xz) suffix to the output name and ClickHouse compresses as it writes. NDJSON is verbose, so this matters. Reading it back needs no flag either, because the codec is inferred from the file name:
clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events.ndjson.gz' FORMAT JSONEachRow"
clickhouse local -q "SELECT count() FROM file('events.ndjson.gz')"
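Consumers outside ClickHouse can read the compressed file without unpacking it first. As a rough sketch, Python's standard gzip module decompresses while it iterates, so the rows still arrive one line at a time. The helper names iter_ndjson_gz and count_rows are hypothetical.

```python
import gzip
import json
from typing import Any, Dict, Iterator


def iter_ndjson_gz(path: str) -> Iterator[Dict[str, Any]]:
    """Yield rows from a gzip-compressed NDJSON file, decompressing as it reads."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                yield json.loads(line)


def count_rows(path: str) -> int:
    """Count rows without holding more than one of them in memory."""
    return sum(1 for _ in iter_ndjson_gz(path))
```

Assuming events.ndjson.gz was produced by the command above, count_rows("events.ndjson.gz") should agree with the count() query.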
Conversion is I/O-bound and streams, so it is quick. This command converts a 3,000,000-row Parquet file to NDJSON (a ~448 MB .ndjson):
clickhouse local -q "SELECT * FROM file('events_large.parquet') INTO OUTFILE 'events_large.ndjson' FORMAT JSONEachRow"
It completes in ~0.29 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The number may shift a little under concurrent load, but the point holds: the conversion is not the slow part of your pipeline. The same command streams files far larger than memory, since rows are read and written in batches rather than buffered whole; throughput then depends on disk speed rather than the page cache.
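For a sense of scale, the figures quoted above work out to roughly 10 million rows and about 1.5 GB of NDJSON per second. The arithmetic is sketched below; the inputs are the approximate numbers from this page, so treat the result as an order of magnitude rather than a benchmark.

```python
rows = 3_000_000   # rows in the sample Parquet file
output_mb = 448    # approximate size of the .ndjson output
elapsed_s = 0.29   # best-of-three wall time quoted above

rows_per_s = rows / elapsed_s
mb_per_s = output_mb / elapsed_s

print(f"{rows_per_s / 1e6:.1f} million rows/s")   # 10.3 million rows/s
print(f"{mb_per_s:.0f} MB/s of NDJSON written")   # 1545 MB/s of NDJSON written
```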
Going the other way is the same idea with the formats swapped. NDJSON is the schema-light interchange format; Parquet is the compact, typed, columnar one you query repeatedly. See convert NDJSON to Parquet for that pass, including how the JSON types are inferred.
If you live in Python, chDB is the same ClickHouse engine in-process, so the SQL is identical. The same SELECT ... INTO OUTFILE ... FORMAT JSONEachRow writes the file; the TRUNCATE keyword in this version overwrites events.ndjson if it already exists:
import chdb
chdb.query(
"SELECT * FROM file('events.parquet') "
"INTO OUTFILE 'events.ndjson' TRUNCATE FORMAT JSONEachRow"
)
Drop the INTO OUTFILE and chDB returns the NDJSON bytes instead, ready to stream into a request body or another process:
ndjson = chdb.query(
"SELECT event_id, country, amount FROM file('events.parquet') "
"WHERE action = 'purchase' ORDER BY amount DESC LIMIT 2 FORMAT JSONEachRow"
)
print(str(ndjson).rstrip())
{"event_id":18,"country":"DE","amount":18.17}
{"event_id":14,"country":"FR","amount":14.13}
This is also a clean alternative to calling pandas.read_parquet and then to_json(orient="records", lines=True) when the file is bigger than memory: chDB streams it instead of materialising a DataFrame.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample Parquet files (including the 3M-row file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-parquet-to-ndjson
The same SQL that converts one local file runs unchanged against a directory of files, a Parquet file on S3, a ClickHouse server, or ClickHouse Cloud when the data outgrows your laptop. Related guides: how to query a Parquet file, what is NDJSON, and run SQL on a JSON Lines file.