To convert MessagePack to JSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
Then point it at the file and write the result as JSON. MsgPack carries no schema, so you give the columns once in the file() call:
{"event_id":1,"ts":"2026-01-01 00:00:00","event_type":"login","country":"GB","amount":5}
{"event_id":2,"ts":"2026-01-01 00:02:17","event_type":"click","country":"US","amount":6.01}
{"event_id":3,"ts":"2026-01-01 00:04:34","event_type":"purchase","country":"DE","amount":7.02}
The file is read in place with no import step. ClickHouse knows the schema you declared, so numbers come out as JSON numbers rather than quoted strings.
Most formats describe themselves. Parquet and Avro embed their column names and types; a CSV has a header row. MessagePack does none of this. It is a compact binary encoding of values with no table-level metadata, so clickhouse-local cannot guess the shape of a .msgpack file. Read one without a structure and it tells you so:
The fix is the second and third arguments to file(): the format name MsgPack and an explicit column list. The columns are read positionally in the order they were packed, so the names are yours to choose but the order and types must match the data:
Once that structure is set, the file behaves like any other table.
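To make "no table-level metadata" concrete, the sketch below hand-packs the first sample row using only the Python standard library. It assumes the layout described above, with each row's values written back to back in column order, and it assumes ts is stored as a UTC Unix timestamp. The helper names (pack_uint, pack_str, pack_float64, pack_row) are illustrative and not part of any library.

```python
import struct
from datetime import datetime, timezone


def pack_uint(n: int) -> bytes:
    """Encode a non-negative integer in the smallest MessagePack form."""
    if n < 0:
        raise ValueError("only unsigned values are handled in this sketch")
    if n <= 0x7F:
        return struct.pack("B", n)             # positive fixint
    if n <= 0xFF:
        return b"\xcc" + struct.pack(">B", n)  # uint 8
    if n <= 0xFFFF:
        return b"\xcd" + struct.pack(">H", n)  # uint 16
    if n <= 0xFFFFFFFF:
        return b"\xce" + struct.pack(">I", n)  # uint 32
    return b"\xcf" + struct.pack(">Q", n)      # uint 64


def pack_str(s: str) -> bytes:
    """Encode a short UTF-8 string (fixstr or str 8 only)."""
    raw = s.encode("utf-8")
    if len(raw) <= 31:
        return bytes([0xA0 | len(raw)]) + raw     # fixstr
    if len(raw) <= 0xFF:
        return b"\xd9" + bytes([len(raw)]) + raw  # str 8
    raise ValueError("strings over 255 bytes are out of scope for this sketch")


def pack_float64(x: float) -> bytes:
    """Encode a float as MessagePack float 64 (big-endian IEEE 754)."""
    return b"\xcb" + struct.pack(">d", x)


def pack_row(event_id, ts, event_type, country, amount) -> bytes:
    """Pack one row as consecutive values: no keys, no header, no column names."""
    # Assumption: the timestamp is UTC and stored as seconds since the epoch.
    parsed = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    epoch = int(parsed.timestamp())
    return (
        pack_uint(event_id)
        + pack_uint(epoch)
        + pack_str(event_type)
        + pack_str(country)
        + pack_float64(amount)
    )


row = pack_row(1, "2026-01-01 00:00:00", "login", "GB", 5.0)
print(row.hex(" "))       # raw bytes of the row
print(len(row), "bytes")  # 24 bytes
```

Nothing in those 24 bytes says "event_id" or "amount". The only way to label the values is the column list passed to file(), which is why the order and types in that list have to match the order in which the values were packed.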
There are two common shapes for "JSON", and clickhouse-local writes both. The format you pick decides the layout; the file() read is identical.
JSONEachRow writes one JSON object per line (NDJSON / JSON Lines). It streams, stays append-friendly, and is the right default for logs and data pipelines:
JSON writes a single document: a data array plus a meta block listing each column and its type. Use it when a consumer wants one well-formed JSON value:
{
    "meta":
    [
        { "name": "event_id", "type": "UInt64" },
        { "name": "ts", "type": "DateTime" },
        { "name": "event_type", "type": "String" },
        { "name": "country", "type": "String" },
        { "name": "amount", "type": "Float64" }
    ],
    "data":
    [
        { "event_id": 1, "ts": "2026-01-01 00:00:00", "event_type": "login", "country": "GB", "amount": 5 },
        { "event_id": 2, "ts": "2026-01-01 00:02:17", "event_type": "click", "country": "US", "amount": 6.01 }
    ]
}
Notice the types survive the trip. event_id stays an integer, amount stays a float, and the meta block records exactly what each column is. That is the difference from a generic byte-for-byte converter: the declared schema decides how each value is written, so numbers stay JSON numbers instead of turning into quoted strings.
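On the consuming side, the two layouts are read differently. The sketch below uses only Python's standard json module and sample text copied from the outputs on this page; the variable names are illustrative.

```python
import json

# Two rows in JSONEachRow layout: one object per line.
ndjson_text = (
    '{"event_id":1,"ts":"2026-01-01 00:00:00","event_type":"login","country":"GB","amount":5}\n'
    '{"event_id":2,"ts":"2026-01-01 00:02:17","event_type":"click","country":"US","amount":6.01}\n'
)

# The same two rows in JSON layout: a single document with meta and data.
document_text = """
{
    "meta": [
        {"name": "event_id", "type": "UInt64"},
        {"name": "ts", "type": "DateTime"},
        {"name": "event_type", "type": "String"},
        {"name": "country", "type": "String"},
        {"name": "amount", "type": "Float64"}
    ],
    "data": [
        {"event_id": 1, "ts": "2026-01-01 00:00:00", "event_type": "login", "country": "GB", "amount": 5},
        {"event_id": 2, "ts": "2026-01-01 00:02:17", "event_type": "click", "country": "US", "amount": 6.01}
    ]
}
"""

# JSONEachRow: parse line by line, so a large file never has to fit in memory at once.
rows_from_lines = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]

# JSON: parse the whole document, then pick out the column types and the rows.
doc = json.loads(document_text)
column_types = {col["name"]: col["type"] for col in doc["meta"]}
rows_from_doc = doc["data"]

print(column_types)
print(rows_from_lines == rows_from_doc)
```

One detail worth knowing: a whole-number amount such as 5 is written without a decimal point, so Python's json module parses it as an int. When the distinction matters downstream, the meta block in the JSON layout is the place to look up the declared type.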
A converter that is also a SQL engine gives you more than a format swap. A few things that browser upload tools can't do:
- Filter and reshape in the same command. The conversion is just a SELECT, so project columns, rename them, filter rows, or aggregate before you write. Convert only what you need:
clickhouse local -q "SELECT event_id, ts, country, amount
FROM file('events.msgpack', MsgPack,
'event_id UInt64, ts DateTime, event_type String, country String, amount Float64')
WHERE event_type = 'purchase' ORDER BY amount DESC LIMIT 3
FORMAT JSONEachRow"
{"event_id":19,"ts":"2026-01-01 00:41:06","country":"FR","amount":23.18}
{"event_id":15,"ts":"2026-01-01 00:31:58","country":"IN","amount":19.14}
{"event_id":11,"ts":"2026-01-01 00:22:50","country":"GB","amount":15.1}
- Cast types on the way out. If a packed value should be a different type in the JSON, set it in the structure (amount Decimal(10,2)) or cast it in the SELECT.
- Compress the output for free. Give the output file a .gz, .zst, or .lz4 suffix and clickhouse-local compresses as it writes, no extra flag: INTO OUTFILE 'events.jsonl.zst'.
- Nothing leaves your machine. The data is never uploaded. Run it on a laptop, in CI, or on a locked-down box with no internet.
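For the compressed-output case in the list above, a downstream consumer can read the file without decompressing it to disk first. This is a minimal sketch for the .gz suffix using Python's standard library. The helper name read_ndjson_gz is illustrative, and the sample file written here is a stand-in for one produced with INTO OUTFILE 'events.jsonl.gz'.

```python
import gzip
import json
import os
import tempfile


def read_ndjson_gz(path):
    """Yield one dict per line from a gzip-compressed NDJSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)


# Stand-in data: two rows taken from the filtered output shown above.
sample = [
    {"event_id": 19, "ts": "2026-01-01 00:41:06", "country": "FR", "amount": 23.18},
    {"event_id": 15, "ts": "2026-01-01 00:31:58", "country": "IN", "amount": 19.14},
]

# Write the stand-in file into a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "events.jsonl.gz")
with gzip.open(path, "wt", encoding="utf-8") as fh:
    for record in sample:
        fh.write(json.dumps(record) + "\n")

rows = list(read_ndjson_gz(path))
print(len(rows), rows[0]["country"])
```

Because the reader is a generator, rows are decompressed and parsed one at a time, which keeps memory use flat on large files.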
On a 3,000,000-row, ~93 MB events.msgpack, the full conversion to NDJSON completes in ~0.52 seconds and writes a 301 MB .jsonl file. That is the best of three runs with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The time covers reading every MsgPack value and serializing every JSON line; there is no cached table. Concurrent load on the machine can shift the result slightly.
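A "best of three" figure can be reproduced with a small timing helper. The sketch below is one way to do it in Python; best_of and workload are illustrative names, and workload is a placeholder to replace with the real conversion.

```python
import time


def best_of(fn, runs: int = 3) -> float:
    """Run fn `runs` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def workload():
    # Placeholder workload so the sketch runs anywhere.
    # Replace the body with the real conversion to time it.
    sum(range(100_000))


print(f"best of 3: {best_of(workload):.4f} s")
```

Taking the minimum rather than the mean matches the warm-cache setup described above: the first run pulls the file into the OS page cache, and later runs measure the conversion itself.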
chDB is ClickHouse as an in-process Python library, so the exact same SQL runs without a server. Same file() call, same explicit MsgPack structure:
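A minimal sketch of that call, assuming chDB is installed (pip install chdb) and events.msgpack sits in the working directory. build_query and convert are illustrative helper names, not part of chDB. The sketch collects the result with chdb.query and writes the bytes from Python, which is an assumption about how to produce the file rather than the only way to do it.

```python
# Column list for the sample file; order and types must match the packed data.
STRUCTURE = (
    "event_id UInt64, ts DateTime, event_type String, "
    "country String, amount Float64"
)


def build_query(path: str, structure: str = STRUCTURE) -> str:
    """Return the SELECT that reads a MsgPack file with an explicit structure."""
    # No quoting or escaping is done here; pass trusted paths only.
    return f"SELECT * FROM file('{path}', MsgPack, '{structure}')"


def convert(src: str, dst: str) -> int:
    """Convert src (MsgPack) to dst (NDJSON) and return the number of bytes written."""
    import chdb  # imported lazily so build_query works without chDB installed

    result = chdb.query(build_query(src), "JSONEachRow")
    payload = result.bytes()  # the whole result is held in memory in this sketch
    with open(dst, "wb") as fh:
        fh.write(payload)
    return len(payload)


print(build_query("events.msgpack"))
```

Calling convert("events.msgpack", "events.jsonl") runs the query in-process and saves the result. Since this version buffers the full output in memory before writing, it suits small and medium files best.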
That writes the same NDJSON file the CLI does. To read a MsgPack file into a DataFrame instead of converting it, see how to read a MessagePack file in Python with chDB.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample MsgPack files (including the ~93 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-messagepack-to-json
The same SELECT ... INTO OUTFILE works whether you point it at one file on a laptop, a directory of files, or a ClickHouse Cloud table; the SQL doesn't change as the data grows. To convert to a different target, see how to convert MessagePack to CSV; to keep working in JSON, see how to run SQL on a JSON Lines file.