To run SQL on a JSONL file, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the file with the file() function:
clickhouse local -q "SELECT * FROM file('events.jsonl') LIMIT 5"
┌─event_date─┬─event_id─┬─country─┬─event_type─┬─revenue─┬─quantity─┐
1. │ 2026-01-01 │ 1 │ GB │ click │ 5 │ 1 │
2. │ 2026-01-02 │ 2 │ US │ view │ 6.01 │ 2 │
3. │ 2026-01-03 │ 3 │ DE │ purchase │ 7.02 │ 3 │
4. │ 2026-01-04 │ 4 │ FR │ refund │ 8.03 │ 4 │
5. │ 2026-01-05 │ 5 │ IN │ click │ 9.04 │ 5 │
└────────────┴──────────┴─────────┴────────────┴─────────┴──────────┘
A .jsonl file holds one JSON object per line; ClickHouse reads it with the JSONEachRow format, turns each key into a column, and infers the types from the values. The file is read in place with no import step.
Prefer Python? See How to read a JSONL file in Python and query it with SQL for the same queries against a pandas DataFrame.
JSONL is one JSON object per line, no enclosing array and no commas between objects:
{"event_date":"2026-01-01","event_id":1,"country":"GB","event_type":"click","revenue":5,"quantity":1}
{"event_date":"2026-01-02","event_id":2,"country":"US","event_type":"view","revenue":6.01,"quantity":2}
That layout goes by several names. JSONL, NDJSON (newline-delimited JSON), and JSON Lines are the same thing, and you will see the extensions .jsonl, .ndjson, and sometimes a line-delimited .json. ClickHouse reads all of them with one format, JSONEachRow, and detects it from the .jsonl and .ndjson extensions automatically. The only time you name the format yourself is when the extension does not give it away (for example a line-delimited file ending in .json, which would otherwise be read as a single JSON document):
clickhouse local -q "SELECT count() FROM file('events.json', JSONEachRow)"
If your data is instead a single top-level JSON array or a pretty-printed document, that is a different shape — see how to run SQL on a JSON file.
You never write CREATE TABLE. ClickHouse scans the objects, collects the keys, and infers a type per column. DESCRIBE shows what it found:
clickhouse local -q "DESCRIBE file('events.jsonl') FORMAT PrettyCompact"
┌─name───────┬─type──────────────┐
1. │ event_date │ Nullable(Date) │
2. │ event_id │ Nullable(Int64) │
3. │ country │ Nullable(String) │
4. │ event_type │ Nullable(String) │
5. │ revenue │ Nullable(Float64) │
6. │ quantity │ Nullable(Int64) │
└────────────┴───────────────────┘
The keys gave the names; the values gave the types. Columns are Nullable because a JSONL stream can omit a key on any line, and a missing key reads back as NULL. If inference guesses a type you don't want, pass an explicit structure as the third argument: file('events.jsonl', 'JSONEachRow', 'event_id UInt32, country String, ...').
Filter, aggregate, and group by
Because the file is a table, the full SQL surface works on it (WHERE, GROUP BY, aggregate and window functions, joins) with no load step in between:
clickhouse local -q "
SELECT country,
count() AS purchases,
round(sum(revenue), 2) AS revenue,
round(avg(quantity), 3) AS avg_qty
FROM file('events.jsonl')
WHERE event_type = 'purchase'
GROUP BY country
ORDER BY revenue DESC
FORMAT PrettyCompact"
┌─country─┬─purchases─┬─revenue─┬─avg_qty─┐
1. │ GB │ 2 │ 34.24 │ 3 │
2. │ DE │ 2 │ 26.16 │ 4 │
3. │ IN │ 1 │ 15.1 │ 1 │
└─────────┴───────────┴─────────┴─────────┘
The same SQL runs unchanged across dozens of formats and remote sources, and against a ClickHouse server or ClickHouse Cloud when the data outgrows your machine.
You don't unzip anything first. clickhouse local detects the .gz suffix and decompresses on the fly, so a .jsonl.gz file is queried exactly like a .jsonl:
clickhouse local -q "SELECT country, count() FROM file('events.jsonl.gz') GROUP BY country ORDER BY country FORMAT PrettyCompact"
┌─country─┬─count()─┐
1. │ AU │ 3 │
2. │ DE │ 3 │
3. │ FR │ 3 │
4. │ GB │ 4 │
5. │ IN │ 3 │
6. │ US │ 4 │
└─────────┴─────────┘
The same applies to .jsonl.zst, .jsonl.xz, and other supported codecs. The compression is read from the file name. (Query a compressed file.)
JSONL is convenient to produce and stream, but slow to scan repeatedly: every query re-parses text and re-infers types. If you'll query the data more than once, convert it to Parquet once and read that instead — SELECT from the JSONL, INTO OUTFILE as Parquet:
clickhouse local -q "SELECT * FROM file('events.jsonl') INTO OUTFILE 'events.parquet' TRUNCATE FORMAT Parquet"
The result is a real Parquet file with the columns and types ClickHouse inferred. Read it back with the same file() call. See how to query a Parquet file and the dedicated convert JSONL to Parquet guide for the typing and compression options.
Small files are instant in anything; the difference shows up at scale. On a 3,000,000-row, ~342 MB JSONL file (events_large.jsonl), a filter-and-group-by over every line, parsing the JSON text from scratch on each run, completes in:
clickhouse local -q "
SELECT country, count(), round(sum(revenue), 2) AS revenue, round(avg(quantity), 3)
FROM file('events_large.jsonl')
WHERE event_type = 'purchase'
GROUP BY country ORDER BY revenue DESC"
~0.54 seconds, best of three with the file warm in the OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM; clickhouse local 26.6.1.117). That includes parsing every JSON object from scratch; there's no cached table. The number may move a little under concurrent load. Convert the same data to Parquet and the scan gets faster again, because Parquet is columnar and typed.
The query you just ran on a laptop file is the same SQL you would run on a ClickHouse server, or in ClickHouse Cloud. Nothing about SELECT ... WHERE ... GROUP BY changes. You swap file('events.jsonl') for a table name and the rest stays put. You prototype against a file on your machine and ship the identical logic to production.
The complete, runnable example lives in the ClickHouse examples repo. It has generate.sh (builds the demo file, a gzipped copy, and the ~342 MB perf file), run.sh (every command on this page), and expected_output.txt:
github.com/ClickHouse/examples → local-analytics/clickhouse-local-jsonl
git clone https://github.com/ClickHouse/examples
cd examples/local-analytics/clickhouse-local-jsonl
./generate.sh && ./run.sh