How to convert JSON to TSV

Al Brown
Last updated: Jun 15, 2026

To convert JSON to TSV, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

1curl https://clickhouse.com/cli | sh   # install clickhousectl
2clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then write the JSON file out as TSV:

1clickhouse local -q "SELECT * FROM file('events.jsonl') INTO OUTFILE 'events.tsv' TRUNCATE FORMAT TSVWithNames"
1event_id	event_type	ts	user	source	amount
21	login	2026-06-01 00:00:00	('1','free')	web	0.5
32	purchase	2026-06-01 00:02:17	('2','pro')	ios	1.5
43	logout	2026-06-01 00:04:34	('3','team')	android	2.5

clickhouse local reads the JSON in place, infers a schema and column types, and streams rows straight into the tab-separated file with no import step first. TSVWithNames writes a header row of column names; use TSV if you want data only.

The gotcha: nested objects don't fit a flat cell #

Look at the user column above. The input had a nested object:

1{"event_id":1,"event_type":"login","ts":"2026-06-01 00:00:00","user":{"id":"1","plan":"free"},"source":"web","amount":0.5}

TSV is flat: one value per column, no structure. JSON is not. When ClickHouse meets a nested object, it infers a Tuple and serializes the whole thing into one TSV cell as ('1','free'). Run DESCRIBE to see it:

1clickhouse local -q "DESCRIBE file('events.jsonl')"
1event_id	Nullable(Int64)
2event_type	Nullable(String)
3ts	Nullable(DateTime)
4user	Tuple(id Nullable(String), plan Nullable(String))
5source	Nullable(String)
6amount	Nullable(Float64)

That user tuple is technically valid TSV, but it's not what a downstream spreadsheet or load script expects. Tab-separated tables are columnar and flat; a tuple crammed into one column defeats the point of converting.

Flatten before TSV #

Promote the nested fields to top-level columns by selecting them by path. user.id and user.plan each become their own column:

1clickhouse local -q "
2SELECT
3  event_id,
4  event_type,
5  ts,
6  user.id   AS user_id,
7  user.plan AS user_plan,
8  source,
9  amount
10FROM file('events.jsonl')
11INTO OUTFILE 'events.tsv' TRUNCATE FORMAT TSVWithNames"
1event_id	event_type	ts	user_id	user_plan	source	amount
21	login	2026-06-01 00:00:00	1	free	web	0.5
32	purchase	2026-06-01 00:02:17	2	pro	ios	1.5
43	logout	2026-06-01 00:04:34	3	team	android	2.5
54	signup	2026-06-01 00:06:51	4	free	web	3.5

Now every column holds a single scalar value, which is what TSV is for. This is the rule for JSON to TSV in general: convert with SELECT * only when your JSON is already flat. The moment it nests, name the leaf fields you actually want and let the rest drop.

For deeper or array-valued JSON, you have the full SQL surface to reshape during the convert: JSONExtractString to pull a field out of a raw JSON string, arrayJoin to explode an array into rows, tupleElement to reach into a tuple. See how to query nested JSON with SQL for the reshaping patterns. The principle holds: flatten to scalars before you write TSV.

Types carry over, with one thing to watch #

The schema was inferred from the JSON values, and those types land in the TSV as-is: ts stays a formatted DateTime, amount stays a number, the ids stay strings. TSV itself is untyped text, so the type information lives in how each value is written, not in the file. If a consumer needs the schema too, keep TSVWithNames for the header (or TSVWithNamesAndTypes, which writes a second row of types).

One lossy case to know: a JSON null inside a nested field, once flattened, is written as \N (TSV's null marker). That round-trips back through ClickHouse correctly, but some other readers treat \N as the literal text. Check your downstream tool if nulls matter.

Options worth knowing #

This is where a one-line local convert beats a paste-into-a-website converter: you control the output, and nothing leaves your machine.

  • Pick the columns and their order. SELECT is the projection. Rename with AS, reorder freely, drop noise columns, cast a field with CAST(amount AS Decimal(10,2)).
  • Filter while you convert. Add a WHERE so the TSV holds only the rows you want: WHERE event_type = 'purchase'. No second pass.
  • Header or no header. TSVWithNames writes column names; TSV writes data only; TSVWithNamesAndTypes adds a types row.
  • A top-level JSON array works too. If your file is one big [ {...}, {...} ] array rather than line-delimited, JSONEachRow reads it directly. Point file('events.json', 'JSONEachRow') at it.
  • Compressed in, compressed out. A .jsonl.gz input is decompressed automatically from its extension; name the output events.tsv.gz and it's gzipped on write. No flags.

clickhouse-local runs the same SQL unchanged across dozens of formats, against remote files, and against a ClickHouse server or ClickHouse Cloud when the data outgrows your laptop — no rewrite.

How fast is it? #

On a 1,000,000-row, ~137 MB JSONL file (events_large.jsonl), flattening the nested user object and writing the full TSV completes in:

1clickhouse local -q "
2SELECT event_id, event_type, ts, user.id AS user_id, user.plan AS user_plan, source, amount
3FROM file('events_large.jsonl')
4INTO OUTFILE 'events_large.tsv' TRUNCATE FORMAT TSVWithNames"

~0.38 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). That parses every JSON line and writes a 57 MB TSV from scratch on each run. A concurrent load on the machine can nudge the number; the point is that a local convert is I/O-bound, not a queue-and-upload round trip.

Reverse direction? #

Going the other way, from a tab-separated file back to JSON, is the same shape with the formats swapped. See how to convert TSV to JSON. If you want comma-separated output instead of tabs, see how to convert JSON to CSV. New to the format itself? What is a TSV file covers the basics, and how to query a TSV file shows running SQL on the result.

In Python with chDB #

chDB is the same ClickHouse engine embedded in Python, so the SQL is identical, flatten included. Run the SELECT ... INTO OUTFILE ... FORMAT TSVWithNames straight from a script:

1import chdb
2
3sql = """
4SELECT
5  event_id,
6  event_type,
7  ts,
8  user.id   AS user_id,
9  user.plan AS user_plan,
10  source,
11  amount
12FROM file('events.jsonl')
13INTO OUTFILE 'events.tsv' TRUNCATE FORMAT TSVWithNames
14"""
15chdb.query(sql)
1event_id	event_type	ts	user_id	user_plan	source	amount
21	login	2026-06-01 00:00:00	1	free	web	0.5
32	purchase	2026-06-01 00:02:17	2	pro	ios	1.5
43	logout	2026-06-01 00:04:34	3	team	android	2.5

Same output, no server. If you'd rather read the JSON into a DataFrame and work with it there first, see how to read a JSONL file in Python with chDB.

Run it yourself #

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample JSONL files (including the ~137 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-json-to-tsv

Share this resource

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!
Loading form...