How to convert CSV to JSON

To convert a CSV file to JSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

1curl https://clickhouse.com/cli | sh   # install clickhousectl
2clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then select from the CSV and write the result with FORMAT JSONEachRow:

1clickhouse local -q "SELECT * FROM file('orders.csv') INTO OUTFILE 'orders.jsonl' FORMAT JSONEachRow"

1{"order_date":"2026-01-01","order_id":1,"country":"GB","product":"widget","revenue":5,"quantity":1}
2{"order_date":"2026-01-02","order_id":2,"country":"US","product":"gadget","revenue":6.01,"quantity":2}
3{"order_date":"2026-01-03","order_id":3,"country":"DE","product":"gizmo","revenue":7.02,"quantity":3}

The CSV is read in place with no import step: the header supplies column names, the data supplies types, and those types carry straight into the JSON values.

One object per line, or one JSON array?

This is the choice that trips people up. "JSON" can mean two different file shapes, and CSV-to-JSON converters often pick one without telling you. ClickHouse exposes both, so you pick.

One object per line (NDJSON, also called JSON Lines) uses FORMAT JSONEachRow. Each row is an independent JSON object on its own line. There is no enclosing array, so a reader never has to hold the whole file in memory:

1clickhouse local -q "SELECT * FROM file('orders.csv') INTO OUTFILE 'orders.jsonl' FORMAT JSONEachRow"

A single JSON array of objects uses FORMAT JSON. ClickHouse emits one document: a data array of row objects, plus a meta block describing the columns and a statistics footer.

1clickhouse local -q "SELECT * FROM file('orders.csv') FORMAT JSON" > orders.json

1{
2	"meta":
3	[
4		{
5			"name": "order_date",
6			"type": "Nullable(Date)"
7		},
8		...
9	],
10
11	"data":
12	[
13		{
14			"order_date": "2026-01-01",
15			"order_id": 1,
16			"country": "GB",
17			"product": "widget",
18			"revenue": 5,
19			"quantity": 1
20		},
21		...
22	],
23
24	"rows": 2,
25	...
26}

Use JSONEachRow for anything streaming or large, and for APIs that ingest line-delimited records. Use JSON when a downstream tool expects one self-describing document. If you want a bare array of arrays with no metadata wrapper, FORMAT JSONCompactEachRow writes one array per row:

1["2026-01-01", 1, "GB", "widget", 5, 1]
2["2026-01-02", 2, "US", "gadget", 6.01, 2]

For more on the line-delimited shape, see what is NDJSON.

Types are preserved, not stringified

This is the part a generic converter usually gets wrong: it reads every CSV cell as text and writes every JSON value as a quoted string. clickhouse-local infers the type of each column first, then writes the matching JSON type. Check what it inferred with DESCRIBE:

1clickhouse local -q "DESCRIBE file('orders.csv')"

1order_date	Nullable(Date)
2order_id	Nullable(Int64)
3country	Nullable(String)
4product	Nullable(String)
5revenue	Nullable(Float64)
6quantity	Nullable(Int64)

In the resulting JSON, order_id and quantity are bare integers and revenue is a float. No quotes, no manual cleanup downstream:

1{"order_date":"2026-01-01","order_id":1,"country":"GB","product":"widget","revenue":5,"quantity":1}

Options that matter

A web-based converter takes a file, returns a file, and gives you no control over the two things that actually bite you. From the command line you control both.

Keep a column as a string when inference guesses wrong. A zero-padded ID like 007 or a US ZIP code looks numeric, and an automatic converter will silently turn it into 7, dropping the leading zeros. Pin the column type by passing the format and an explicit schema as the second and third arguments to file():

1clickhouse local -q "
2SELECT * FROM file('orders.csv', 'CSVWithNames',
3  'order_date Date, order_id String, country String, product String, revenue Float64, quantity UInt8')
4LIMIT 1 FORMAT JSONEachRow"

1{"order_date":"2026-01-01","order_id":"1","country":"GB","product":"widget","revenue":5,"quantity":1}

order_id is now a quoted string, so a value like "007" survives intact.

Transform on the way out. Because the CSV is a table, you can filter, rename, compute, and reshape in the same SELECT that does the conversion. Rename columns with AS, drop ones you do not need, add a computed field, or WHERE out the rows you do not want. The conversion and the cleanup are one pass.

Reverse direction?

Going the other way, from JSON back to CSV, is the mirror image: SELECT from the JSON file and write FORMAT CSVWithNames. See how to convert JSON to CSV.

In Python with chDB

If your pipeline is Python, chDB is the same engine in-process. Run the same SQL with the same FORMAT and write to a file:

1import chdb
2
3# One object per line (NDJSON).
4chdb.query(
5    "SELECT * FROM file('orders.csv') "
6    "INTO OUTFILE 'orders.jsonl' TRUNCATE FORMAT JSONEachRow"
7)
8
9# Or capture a single JSON array document as a string.
10doc = chdb.query("SELECT * FROM file('orders.csv') FORMAT JSON")

For reading JSON into a DataFrame from Python, see how to read a JSON file in Python with chDB.

How fast is it?

On a 2,000,000-row, ~84 MB CSV (orders_large.csv), converting the whole file to line-delimited JSON:

1clickhouse local -q "SELECT * FROM file('orders_large.csv') INTO OUTFILE 'orders_large.jsonl' FORMAT JSONEachRow"

~0.29 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses every CSV row and writes 2,000,000 JSON objects, no cached table involved. The JSON output is larger than the CSV (here ~222 MB vs ~84 MB) because every field repeats its key on every row; pipe it through a .gz filename to compress on the fly.

Run it yourself

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample CSVs (including the ~84 MB file used for the timing), run.sh with every command on this page, a chDB run.py / run.ipynb, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-csv-to-json

The same SQL scales from a file on your laptop to a ClickHouse server to ClickHouse Cloud with no rewrite. Want to query the CSV in place rather than convert it? See how to run SQL on a CSV file.

One object per line, or one JSON array?

Types are preserved, not stringified

Options that matter

Reverse direction?

In Python with chDB

How fast is it?

Run it yourself

Subscribe to our newsletter

More like this

How to engineer cost-efficient open source observability with ClickHouse (ClickStack) - 2026 technical playbook

Build a dashboard in Python with ClickHouse and Streamlit

Instrumenting OpenAI with OpenTelemetry (OTel)

Real-time data visualization

How to convert CSV to JSON

One object per line, or one JSON array? #

Types are preserved, not stringified #

Options that matter #

Reverse direction? #

In Python with chDB #

How fast is it? #

Run it yourself #

Subscribe to our newsletter

More like this

How to engineer cost-efficient open source observability with ClickHouse (ClickStack) - 2026 technical playbook

Build a dashboard in Python with ClickHouse and Streamlit

Instrumenting OpenAI with OpenTelemetry (OTel)

Real-time data visualization

One object per line, or one JSON array?

Types are preserved, not stringified

Options that matter

Reverse direction?

In Python with chDB

How fast is it?

Run it yourself