To convert a TSV file to JSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then write the result to a JSON file with INTO OUTFILE:
clickhouse local -q "SELECT * FROM file('events.tsv') INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"
{"event_date":"2026-01-01","event_id":1,"country":"GB","action":"click","value":5,"quantity":1}
{"event_date":"2026-01-02","event_id":2,"country":"US","action":"view","value":6.01,"quantity":2}
{"event_date":"2026-01-03","event_id":3,"country":"DE","action":"purchase","value":7.02,"quantity":3}
The TSV header becomes the JSON keys and column types are inferred from the data, so value lands as 6.01 (a number) not "6.01" (a string). The file is read in place with no upload or import step, and because the read is streamed it handles files larger than RAM.
A TSV is just text: tab-separated values with a header row. clickhouse-local reads that header for the key names and infers each column's type from the values, so numbers come out as JSON numbers and strings as JSON strings. Check what was inferred with DESCRIBE:
clickhouse local -q "DESCRIBE file('events.tsv')"
event_date Nullable(Date)
event_id Nullable(Int64)
country Nullable(String)
action Nullable(String)
value Nullable(Float64)
quantity Nullable(Int64)
This is the information gain over a generic upload-a-file converter site: types are carried through, so value lands as 6.01 (a number) and not "6.01" (a string), and nothing leaves your machine. If inference guesses wrong (a leading-zero ID that should stay a string, say), pass an explicit schema as the third argument to file(), exactly as you would when querying the TSV directly.
This is the one decision that trips people up. There are two shapes commonly called "JSON", and they are not interchangeable.
JSONEachRow writes one object per line. It streams, so a billion-row file converts in constant memory, and downstream tools read it line by line:
clickhouse local -q "SELECT * FROM file('events.tsv') INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"
A single JSON array wraps every row in one [ ... ] document. That is what JSON.parse() in a browser wants, but it forces the whole result into one value. Get it by keeping JSONEachRow and turning on output_format_json_array_of_rows:
clickhouse local -q "SELECT * FROM file('events.tsv') INTO OUTFILE 'events.json' FORMAT JSONEachRow" \
--output_format_json_array_of_rows=1
[
{"event_date":"2026-01-01","event_id":1,"country":"GB","action":"click","value":5,"quantity":1},
{"event_date":"2026-01-02","event_id":2,"country":"US","action":"view","value":6.01,"quantity":2},
{"event_date":"2026-01-03","event_id":3,"country":"DE","action":"purchase","value":7.02,"quantity":3}
]
Pick the array when something downstream must load the document whole; pick JSONEachRow for everything else, especially large files and streaming pipelines. (ClickHouse also has a plain JSON format, but it wraps the rows in a meta/data/statistics envelope meant for query results, not for a clean data file, so it is rarely what you want for a conversion.)
Because the TSV is a table, the full SQL surface is available during the conversion. Project columns, filter rows, rename, derive — the JSON only contains what you select:
clickhouse local -q "
SELECT event_date, country, upper(action) AS action_upper, value
FROM file('events.tsv')
WHERE action = 'purchase'
ORDER BY value DESC
LIMIT 3
INTO OUTFILE 'purchases.jsonl' FORMAT JSONEachRow"
{"event_date":"2026-01-19","country":"GB","action_upper":"PURCHASE","value":23.18}
{"event_date":"2026-01-15","country":"DE","action_upper":"PURCHASE","value":19.14}
{"event_date":"2026-01-11","country":"IN","action_upper":"PURCHASE","value":15.1}
A flat TSV becomes nested JSON just as easily. Build an object with map() (or a tuple / named tuple) and it serializes as a nested JSON value:
clickhouse local -q "
SELECT event_id, map('country', country, 'action', action) AS attrs, value
FROM file('events.tsv')
LIMIT 3
INTO OUTFILE 'nested.jsonl' FORMAT JSONEachRow"
{"event_id":1,"attrs":{"country":"GB","action":"click"},"value":5}
{"event_id":2,"attrs":{"country":"US","action":"view"},"value":6.01}
{"event_id":3,"attrs":{"country":"DE","action":"purchase"},"value":7.02}
Add a .gz (or .zst, .lz4, .xz) suffix to the output name and the file is compressed as it is written. No extra flag, and the same suffix is decompressed transparently on read:
clickhouse local -q "SELECT * FROM file('events.tsv') INTO OUTFILE 'events.jsonl.gz' FORMAT JSONEachRow"
clickhouse local -q "SELECT count() FROM file('events.jsonl.gz', 'JSONEachRow')"
On a 3,000,000-row, ~105 MB TSV (events_large.tsv), converting the whole file to JSONEachRow completes in:
clickhouse local -q "SELECT * FROM file('events_large.tsv') INTO OUTFILE 'events_large.jsonl' FORMAT JSONEachRow"
~0.27 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses the TSV text and serializes ~312 MB of JSON from scratch each run, with no cached table. Numbers may shift slightly under concurrent load; the point is that the conversion is I/O-bound, not a bottleneck. An online converter would have you upload the file first.
chDB is the same engine embedded in Python, so the conversion is the same SQL with the result written to a file:
import chdb
# JSONEachRow (NDJSON): one object per line.
with open("events.jsonl", "w") as f:
f.write(str(chdb.query("SELECT * FROM file('events.tsv')", "JSONEachRow")))
# A single JSON array instead.
with open("events.json", "w") as f:
f.write(str(chdb.query(
"SELECT * FROM file('events.tsv') SETTINGS output_format_json_array_of_rows = 1",
"JSONEachRow",
)))
{"event_date":"2026-01-01","event_id":1,"country":"GB","action":"click","value":5,"quantity":1}
{"event_date":"2026-01-02","event_id":2,"country":"US","action":"view","value":6.01,"quantity":2}
{"event_date":"2026-01-03","event_id":3,"country":"DE","action":"purchase","value":7.02,"quantity":3}
Reverse direction? See how to convert JSON to TSV.
clickhouse-local runs the same SELECT ... INTO OUTFILE ... FORMAT unchanged across dozens of formats, against remote files, and against a ClickHouse server or ClickHouse Cloud when the data outgrows your laptop, with no rewrite. Once the data is JSON, you can run SQL on the JSONL output the same way.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample TSVs (including the ~105 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-tsv-to-json