How to query a mysqldump file without importing

Al Brown
Last updated: Jun 15, 2026

To query a mysqldump file without importing it, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

1curl https://clickhouse.com/cli | sh   # install clickhousectl
2clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then point it at the .sql file with the MySQLDump format and name the table you want:

1clickhouse local -q "
2SELECT * FROM file('shop.sql', MySQLDump)
3SETTINGS input_format_mysql_dump_table_name = 'orders'
4FORMAT PrettyCompact"
1   ┌─id─┬─customer_id─┬─product─┬─revenue─┬─quantity─┐
21. │  1 │           1 │ widget  │       5 │        1 │
32. │  2 │           2 │ gadget  │    6.01 │        2 │
43. │  3 │           3 │ gizmo   │    7.02 │        3 │
54. │  4 │           1 │ widget  │    8.03 │        4 │
6   └────┴─────────────┴─────────┴─────────┴──────────┘

The MySQLDump format reads the INSERT statements straight out of the dump and exposes them as a table, so the file is queried in place with no restore step first.

A dump usually has several tables — list them first #

A typical mysqldump file holds every table in a database, one CREATE TABLE plus one INSERT INTO per table. Before you query, it helps to see what's in there. The table names live on the INSERT INTO lines, so read the file as raw lines and pull them out:

1clickhouse local -q "
2SELECT extract(line, 'INSERT INTO .(\w+).') AS table_name
3FROM file('shop.sql', LineAsString)
4WHERE line LIKE 'INSERT INTO%'"
1customers
2orders

That tells you which names you can pass to input_format_mysql_dump_table_name below.

See the schema without a CREATE TABLE #

mysqldump writes the CREATE TABLE DDL right above the data, and clickhouse local reads it. DESCRIBE shows the column names and the types it mapped from the MySQL definitions, with no schema to declare yourself:

1clickhouse local -q "
2DESCRIBE file('shop.sql', MySQLDump)
3SETTINGS input_format_mysql_dump_table_name = 'orders'
4FORMAT PrettyCompact"
1   ┌─name────────┬─type─────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
21. │ id          │ Int32                    │              │                    │         │                  │                │
32. │ customer_id │ Nullable(Int32)          │              │                    │         │                  │                │
43. │ product     │ Nullable(String)         │              │                    │         │                  │                │
54. │ revenue     │ Nullable(Decimal(10, 2)) │              │                    │         │                  │                │
65. │ quantity    │ Nullable(Int32)          │              │                    │         │                  │                │
7   └─────────────┴──────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

The MySQL int NOT NULL primary key became Int32, the nullable int columns became Nullable(Int32), and decimal(10,2) was preserved as Decimal(10, 2). Reading the rows is then a plain SELECT:

1clickhouse local -q "
2SELECT * FROM file('shop.sql', MySQLDump)
3SETTINGS input_format_mysql_dump_table_name = 'orders'
4FORMAT PrettyCompact"
1   ┌─id─┬─customer_id─┬─product─┬─revenue─┬─quantity─┐
21. │  1 │           1 │ widget  │       5 │        1 │
32. │  2 │           2 │ gadget  │    6.01 │        2 │
43. │  3 │           3 │ gizmo   │    7.02 │        3 │
54. │  4 │           1 │ widget  │    8.03 │        4 │
6   └────┴─────────────┴─────────┴─────────┴──────────┘

Picking a table — the one setting that matters #

input_format_mysql_dump_table_name is the whole trick. The dump may contain twenty tables; this setting tells clickhouse local which one's INSERT rows to return. Switch the value to read a different table out of the same file:

1clickhouse local -q "
2SELECT * FROM file('shop.sql', MySQLDump)
3SETTINGS input_format_mysql_dump_table_name = 'customers'"
11	GB	2026-01-01
22	US	2026-01-02
33	DE	2026-01-03

Leave the setting off and clickhouse local reads the first table it finds in the file. That's a handy default for single-table dumps (the kind mysqldump db t produces), but with a multi-table dump you almost always want to be explicit:

1clickhouse local -q "SELECT count() FROM file('shop.sql', MySQLDump)"
13

The setting applies to the whole query, so you read one table per query. To combine tables, query each into a Parquet or Native file once and join those, or load the dump into a ClickHouse server.

Aggregate straight on the dump #

Because the dump is just a SQL source, the full ClickHouse dialect works on it — WHERE, GROUP BY, aggregates, window functions — with nothing loaded in between:

1clickhouse local -q "
2SELECT product, count() AS orders, round(sum(revenue), 2) AS revenue
3FROM file('shop.sql', MySQLDump)
4GROUP BY product
5ORDER BY revenue DESC
6SETTINGS input_format_mysql_dump_table_name = 'orders'
7FORMAT PrettyCompact"
1   ┌─product─┬─orders─┬─revenue─┐
21. │ widget  │      2 │   13.03 │
32. │ gizmo   │      1 │    7.02 │
43. │ gadget  │      1 │    6.01 │
5   └─────────┴────────┴─────────┘

This is the everyday win: a colleague hands you a .sql dump, you answer a question about it in one command instead of restoring it into a database first. A .sql.gz dump works the same way. clickhouse local detects the .gz extension and decompresses on the fly, so you never unzip anything (querying compressed files).

How fast is it? #

The dump is plain text, so reading it means parsing SQL on every run, with no pre-built table behind it. Even so, parsing is the only cost. On a single-table dump of 2,000,000 rows (events_large.sql, ~64 MB of INSERT text generated by the example folder below), a full GROUP BY country with sum and avg over every row runs in:

1clickhouse local -q "
2SELECT country, count() AS events, round(sum(revenue), 2) AS revenue, round(avg(quantity), 3) AS avg_qty
3FROM file('events_large.sql', MySQLDump)
4GROUP BY country
5ORDER BY revenue DESC
6LIMIT 5
7SETTINGS input_format_mysql_dump_table_name = 'events'
8FORMAT PrettyCompact"
1   ┌─country─┬─events─┬─────revenue─┬─avg_qty─┐
21. │ IN      │ 200674 │ 50256630.75 │   2.997 │
32. │ CA      │ 200217 │ 50087310.66 │   2.996 │
43. │ BR      │ 200349 │ 50075129.05 │   2.997 │
54. │ NL      │ 200017 │ 50053869.17 │   2.999 │
65. │ AU      │ 200001 │ 50030477.09 │   2.999 │
7   └─────────┴────────┴─────────────┴─────────┘

~0.35 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM; clickhouse local 26.6.1.117). That includes parsing the whole dump from text on every run. If you'll query the data more than once, convert the table to Parquet once with INTO OUTFILE 'events.parquet' FORMAT Parquet and read that instead. Parquet is columnar and typed, so repeat scans get faster again. See how to query a Parquet file.

The same SQL scales unchanged #

The SELECT ... GROUP BY you ran on a dump file is the same SQL you'd run on a ClickHouse server, or in ClickHouse Cloud. When the data outgrows a laptop, you swap file('shop.sql', MySQLDump) for a table name and the rest of the query stays put. You prototype against the dump on your machine and ship the identical logic to production.

Run it yourself #

The complete, runnable example lives here. It has generate.sh (builds a two-table shop.sql dump and the 2M-row events_large.sql perf dump, both in real mysqldump layout), run.sh (every command above), and expected_output.txt:

github.com/ClickHouse/examples → local-analytics/clickhouse-local-mysqldump

1git clone https://github.com/ClickHouse/examples
2cd examples/local-analytics/clickhouse-local-mysqldump
3./generate.sh && ./run.sh
Share this resource

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!
Loading form...