How to convert MessagePack to JSON

To convert MessagePack to JSON, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

curl https://clickhouse.com/cli | sh   # install clickhousectl
clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then point it at the file and write the result as JSON. MsgPack carries no schema, so you give the columns once in the file() call:

clickhouse local -q "SELECT * FROM file('events.msgpack', MsgPack,
  'event_id UInt64, ts DateTime, event_type String, country String, amount Float64')
  INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"

{"event_id":1,"ts":"2026-01-01 00:00:00","event_type":"login","country":"GB","amount":5}
{"event_id":2,"ts":"2026-01-01 00:02:17","event_type":"click","country":"US","amount":6.01}
{"event_id":3,"ts":"2026-01-01 00:04:34","event_type":"purchase","country":"DE","amount":7.02}

The file is read in place with no import step. ClickHouse knows the schema you declared, so numbers come out as JSON numbers rather than quoted strings.

The one gotcha: MsgPack has no schema

Most formats describe themselves. Parquet and Avro embed their column names and types; a CSV has a header row. MessagePack does none of this. It is a compact binary encoding of values with no table-level metadata, so clickhouse-local cannot guess the shape of a .msgpack file. Read one without a structure and it tells you so:

Code: 636. DB::Exception: The table structure cannot be extracted from a MsgPack format file.
You must specify setting input_format_msgpack_number_of_columns to extract table schema from MsgPack data.
You can specify the structure manually.

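The error offers two ways out: set input_format_msgpack_number_of_columns so ClickHouse can infer a schema, or specify the structure yourself. The explicit route is the second and third arguments to file(): the format name MsgPack and a column list. The columns are read positionally in the order they were packed, so the names are yours to choose but the order and types must match the data: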

clickhouse local -q "SELECT * FROM file('events.msgpack', MsgPack,
  'event_id UInt64, ts DateTime, event_type String, country String, amount Float64') LIMIT 3"

Once that structure is set, the file behaves like any other table.
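The positional rule can be sketched in a few lines of Python. This is an illustration only, not ClickHouse's reader: it assumes a row is stored as consecutive MessagePack values with no names attached and that the timestamp is packed as a 32-bit unsigned integer. The helpers `pack_value`, `unpack_value`, and `read_rows` are hypothetical and handle only the four MessagePack types this example needs.

```python
import struct

def pack_value(v):
    """Encode a small int, short str, or float as one MessagePack value."""
    if isinstance(v, int) and 0 <= v <= 0x7F:
        return bytes([v])                          # positive fixint
    if isinstance(v, int) and 0 <= v <= 0xFFFFFFFF:
        return b"\xce" + struct.pack(">I", v)      # uint 32, big-endian
    if isinstance(v, str) and len(v.encode()) <= 31:
        raw = v.encode()
        return bytes([0xA0 | len(raw)]) + raw      # fixstr
    if isinstance(v, float):
        return b"\xcb" + struct.pack(">d", v)      # float 64, big-endian
    raise ValueError(f"unsupported value: {v!r}")

def unpack_value(buf, pos):
    """Decode one value starting at pos; return (value, next_pos)."""
    tag = buf[pos]
    if tag <= 0x7F:
        return tag, pos + 1
    if 0xA0 <= tag <= 0xBF:
        n = tag & 0x1F
        return buf[pos + 1:pos + 1 + n].decode(), pos + 1 + n
    if tag == 0xCE:
        return struct.unpack_from(">I", buf, pos + 1)[0], pos + 5
    if tag == 0xCB:
        return struct.unpack_from(">d", buf, pos + 1)[0], pos + 9
    raise ValueError(f"unsupported tag: {tag:#x}")

def read_rows(buf, names):
    """Group a flat value stream into dicts, assigning names by position."""
    rows, pos = [], 0
    while pos < len(buf):
        row = {}
        for name in names:
            row[name], pos = unpack_value(buf, pos)
        rows.append(row)
    return rows

# One row: id, epoch seconds, event type, country, amount (assumed layout).
packed = b"".join(pack_value(v) for v in (1, 1767225600, "login", "GB", 5.0))

print(b"event_id" in packed)  # column names are not part of the bytes
print(read_rows(packed, ["event_id", "ts", "event_type", "country", "amount"]))
print(read_rows(packed, ["id", "when", "kind", "cc", "value"]))
```

Both calls to `read_rows` return the same values under different keys, which is the sense in which the names are yours to choose. Listing the names in a different order would silently attach the wrong label to each value.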

Write line-delimited JSON or one JSON array

There are two common shapes for "JSON", and clickhouse-local writes both. The format you pick decides the layout; the file() read is identical.

JSONEachRow writes one JSON object per line (NDJSON / JSON Lines). It streams, stays append-friendly, and is the right default for logs and data pipelines:

clickhouse local -q "SELECT * FROM file('events.msgpack', MsgPack,
  'event_id UInt64, ts DateTime, event_type String, country String, amount Float64')
  INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"

JSON writes a single document: a data array plus a meta block listing each column and its type. Use it when a consumer wants one well-formed JSON value:

clickhouse local -q "SELECT * FROM file('events.msgpack', MsgPack,
  'event_id UInt64, ts DateTime, event_type String, country String, amount Float64')
  INTO OUTFILE 'events.json' FORMAT JSON"

{
	"meta":
	[
		{ "name": "event_id", "type": "UInt64" },
		{ "name": "ts", "type": "DateTime" },
		{ "name": "event_type", "type": "String" },
		{ "name": "country", "type": "String" },
		{ "name": "amount", "type": "Float64" }
	],
	"data":
	[
		{ "event_id": 1, "ts": "2026-01-01 00:00:00", "event_type": "login", "country": "GB", "amount": 5 },
		{ "event_id": 2, "ts": "2026-01-01 00:02:17", "event_type": "click", "country": "US", "amount": 6.01 }
	]
}

Notice the types survive the trip. event_id stays an integer, amount stays a float, and the meta block records exactly what each column is. That is the difference from a generic byte-for-byte converter: ClickHouse knows the schema, so numbers come out as JSON numbers rather than quoted strings.
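On the consuming side, the two shapes call for slightly different parsing. The following is a minimal sketch using only Python's standard `json` module. The inline strings are shortened stand-ins for events.jsonl and events.json, not real output, and the variable names are illustrative.

```python
import json

# Stand-in for a JSONEachRow file: one object per line.
ndjson_text = (
    '{"event_id":1,"country":"GB","amount":5}\n'
    '{"event_id":2,"country":"US","amount":6.01}\n'
)

# Stand-in for a JSON file: one document with "meta" and "data".
document_text = """{
  "meta": [{"name": "event_id", "type": "UInt64"},
           {"name": "country", "type": "String"},
           {"name": "amount", "type": "Float64"}],
  "data": [{"event_id": 1, "country": "GB", "amount": 5},
           {"event_id": 2, "country": "US", "amount": 6.01}]
}"""

# JSONEachRow: parse one line at a time, so a consumer can stream.
rows_from_lines = [json.loads(line) for line in ndjson_text.splitlines() if line]

# JSON: parse the whole document, then take the "data" array.
doc = json.loads(document_text)
rows_from_doc = doc["data"]
column_types = {col["name"]: col["type"] for col in doc["meta"]}

print(rows_from_lines == rows_from_doc)
print(type(rows_from_lines[0]["amount"]).__name__, column_types["amount"])
```

Python's parser reads the bare `5` as an `int`, so a consumer that needs the declared type can look it up in the meta block, which is only present in the single-document shape.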

Options worth knowing

A converter that is also a SQL engine gives you more than a format swap. A few things that browser upload tools can't do:

Filter and reshape in the same command. The conversion is just a SELECT, so project columns, rename them, filter rows, or aggregate before you write. Convert only what you need:

clickhouse local -q "SELECT event_id, ts, country, amount
  FROM file('events.msgpack', MsgPack,
    'event_id UInt64, ts DateTime, event_type String, country String, amount Float64')
  WHERE event_type = 'purchase' ORDER BY amount DESC LIMIT 3
  FORMAT JSONEachRow"

{"event_id":19,"ts":"2026-01-01 00:41:06","country":"FR","amount":23.18}
{"event_id":15,"ts":"2026-01-01 00:31:58","country":"IN","amount":19.14}
{"event_id":11,"ts":"2026-01-01 00:22:50","country":"GB","amount":15.1}

Cast types on the way out. If a packed value should be a different type in the JSON, set it in the structure (amount Decimal(10,2)) or cast it in the SELECT.
Compress the output for free. Give the output file a .gz, .zst, or .lz4 suffix and clickhouse-local compresses as it writes, no extra flag: INTO OUTFILE 'events.jsonl.zst'.
Nothing leaves your machine. The data is never uploaded. Run it on a laptop, in CI, or on a locked-down box with no internet.
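A compressed NDJSON file stays easy to consume downstream. As a rough sketch, assuming the output was written with a .gz suffix, Python's standard `gzip` module can stream it back line by line. The helper `read_ndjson_gz` is hypothetical, and the sample file written here is a stand-in so the snippet runs without ClickHouse installed.

```python
import gzip
import json
import os
import tempfile

def read_ndjson_gz(path):
    """Yield one dict per non-empty line of a gzip-compressed NDJSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)

# Stand-in for a file produced with INTO OUTFILE 'events.jsonl.gz'.
sample_path = os.path.join(tempfile.mkdtemp(), "events.jsonl.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as fh:
    fh.write('{"event_id":19,"country":"FR","amount":23.18}\n')
    fh.write('{"event_id":15,"country":"IN","amount":19.14}\n')

rows = list(read_ndjson_gz(sample_path))
print(len(rows), rows[0]["country"])
```

Because the reader is a generator, it never holds more than one decoded line in memory, which suits the line-delimited layout.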

How fast is it?

On a 3,000,000-row, ~93 MB events.msgpack, the full convert-to-NDJSON (read every value, write a 301 MB .jsonl) completes in ~0.52 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). The number includes reading every MsgPack value and serializing every JSON line; there is no cached table. Concurrent load on the machine can move it slightly.
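Those figures translate into throughput with simple division. The sketch below uses only the numbers quoted above and treats 1 MB as 10^6 bytes, which is an assumption; since the inputs are approximate, the results are rough estimates.

```python
rows = 3_000_000      # rows in events.msgpack
input_mb = 93         # approximate MsgPack size
output_mb = 301       # NDJSON size written
seconds = 0.52        # approximate best-of-three wall time

rows_per_second = rows / seconds
read_mb_per_second = input_mb / seconds
write_mb_per_second = output_mb / seconds
bytes_per_input_row = input_mb * 1_000_000 / rows
bytes_per_output_row = output_mb * 1_000_000 / rows

print(f"{rows_per_second / 1e6:.2f} M rows/s")
print(f"{read_mb_per_second:.0f} MB/s read, {write_mb_per_second:.0f} MB/s written")
print(f"{bytes_per_input_row:.0f} B/row in, {bytes_per_output_row:.0f} B/row out")
```

That works out to roughly 5.8 million rows per second, with each row growing from about 31 bytes of MsgPack to about 100 bytes of JSON.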

Prefer Python? Same conversion with chDB

chDB is ClickHouse as an in-process Python library, so the exact same SQL runs without a server. Same file() call, same explicit MsgPack structure:

import chdb

struct = "event_id UInt64, ts DateTime, event_type String, country String, amount Float64"
chdb.query(
    f"SELECT * FROM file('events.msgpack', MsgPack, '{struct}') "
    "INTO OUTFILE 'events.jsonl' FORMAT JSONEachRow"
)

That writes the same NDJSON file the CLI does. To read a MsgPack file into a DataFrame instead of converting it, see how to read a MessagePack file in Python with chDB.
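If you convert several files with different layouts, it can help to build the statement from a column list instead of editing the string by hand. This is a small sketch: `build_convert_query` is a hypothetical helper, not part of chDB, and it does no quoting or escaping, so it assumes trusted file names and column definitions.

```python
def build_convert_query(src, columns, dest, out_format="JSONEachRow"):
    """Return a SELECT ... INTO OUTFILE statement for a MsgPack source file.

    columns is a list of (name, ClickHouse type) pairs in packed order.
    """
    structure = ", ".join(f"{name} {ch_type}" for name, ch_type in columns)
    return (
        f"SELECT * FROM file('{src}', MsgPack, '{structure}') "
        f"INTO OUTFILE '{dest}' FORMAT {out_format}"
    )

columns = [
    ("event_id", "UInt64"),
    ("ts", "DateTime"),
    ("event_type", "String"),
    ("country", "String"),
    ("amount", "Float64"),
]

query = build_convert_query("events.msgpack", columns, "events.jsonl")
print(query)
# With chDB installed, the string can be passed to chdb.query(query).
```

Keeping the columns as ordered pairs makes the positional contract visible in the code: reordering the list changes which packed value each name receives.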

Run it yourself

The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample MsgPack files (including the ~93 MB file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.

github.com/ClickHouse/examples/tree/main/local-analytics/convert-messagepack-to-json

The same SELECT ... INTO OUTFILE works whether you point it at one file on a laptop, a directory of files, or a ClickHouse Cloud table — the SQL doesn't change as the data grows. To convert to a different target, see how to convert MessagePack to CSV; to keep working in JSON, see how to run SQL on a JSON Lines file.
