How to read a Feather file from the command line

Al Brown
Last updated: Jun 15, 2026

To read a Feather file from the command line, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.

Install it with clickhousectl:

curl https://clickhouse.com/cli | sh   # install clickhousectl
clickhousectl local use latest         # download ClickHouse and put it on your PATH

Then query the file directly:

clickhouse local -q "SELECT * FROM file('events.feather') LIMIT 10"
   ┌──────────────event_time─┬─event_id─┬─country─┬─event_type─┬─revenue─┬─quantity─┐
1. │ 2026-01-01 00:00:00.000 │        1 │ GB      │ click      │       5 │        1 │
2. │ 2026-01-01 01:00:00.000 │        2 │ US      │ view       │    6.01 │        2 │
3. │ 2026-01-01 02:00:00.000 │        3 │ DE      │ purchase   │    7.02 │        3 │
4. │ 2026-01-01 03:00:00.000 │        4 │ FR      │ refund     │    8.03 │        4 │
5. │ 2026-01-01 04:00:00.000 │        5 │ IN      │ click      │    9.04 │        5 │
   └─────────────────────────┴──────────┴─────────┴────────────┴─────────┴──────────┘

Feather V2 is the Arrow IPC file format, so ClickHouse recognizes the .feather extension automatically and reads the file in place with no import step.

Feather is Arrow: read it with FORMAT Arrow

This is the one thing to know. "Feather" is not a separate format with its own reader. Feather V2 is the on-disk Arrow IPC file format, the same columnar layout the Arrow project uses for files. A .feather file written by pandas, pyarrow, R, or Polars is an Arrow IPC file with a different extension.

So ClickHouse reads it with FORMAT Arrow. It infers the format from the file extension, but you can also name it outright:

clickhouse local -q "SELECT count() FROM file('events.feather', 'Arrow')"
20

Both forms read the same file. If you ever rename a .feather to something ClickHouse doesn't recognize, pass 'Arrow' as the second argument and it reads exactly the same.

See the schema without declaring one

Arrow files carry their own schema, so you never write CREATE TABLE. DESCRIBE prints the column names and the types ClickHouse read from the file:

clickhouse local -q "DESCRIBE file('events.feather') FORMAT PrettyCompact"
   ┌─name───────┬─type─────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
1. │ event_time │ DateTime64(3, 'UTC') │              │                    │         │                  │                │
2. │ event_id   │ UInt64               │              │                    │         │                  │                │
3. │ country    │ String               │              │                    │         │                  │                │
4. │ event_type │ String               │              │                    │         │                  │                │
5. │ revenue    │ Float64              │              │                    │         │                  │                │
6. │ quantity   │ UInt8                │              │                    │         │                  │                │
   └────────────┴──────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

The types come straight from the file's Arrow schema: the timestamp lands as DateTime64, the integers keep their widths, and the floats stay Float64. (The extra empty columns are CREATE TABLE metadata, like defaults, codecs and TTLs, which an Arrow file doesn't carry.)

Filter, aggregate, and group by

A viewer shows you rows. A query engine answers questions. Because the file is just a SQL source, you have the full ClickHouse dialect (WHERE, GROUP BY, aggregate functions, window functions, joins):

clickhouse local -q "
SELECT country,
       count() AS events,
       round(sum(revenue), 2) AS revenue,
       round(avg(quantity), 3) AS avg_qty
FROM file('events.feather')
WHERE event_type = 'purchase'
GROUP BY country
ORDER BY revenue DESC
FORMAT PrettyCompact"
   ┌─country─┬─events─┬─revenue─┬─avg_qty─┐
1. │ GB      │      2 │   34.24 │       3 │
2. │ DE      │      2 │   26.16 │       4 │
3. │ IN      │      1 │    15.1 │       1 │
   └─────────┴────────┴─────────┴─────────┘

Arrow is columnar, so a query that touches a few columns reads only those columns. The rest are never decoded.

The Feather V1 gotcha

There are two things people call "Feather". Feather V2 is the Arrow IPC format, and it is what every current tool writes by default; its files begin with the magic bytes ARROW1. There is also a legacy Feather V1 format from 2016, which predates the Arrow IPC spec and starts with FEA1. They are different on-disk layouts.

ClickHouse's Arrow reader handles the Arrow IPC format (V2). Hand it a legacy V1 file and it tells you plainly:

clickhouse local -q "SELECT * FROM file('events_v1.feather', 'Arrow')"
Code: 636. DB::Exception: The table structure cannot be extracted from a Arrow format file. Error:
Code: 1002. DB::Exception: Error while opening a table: Invalid: Not an Arrow file. (UNKNOWN_EXCEPTION)

If you hit this, you have an old V1 file. Re-save it as the default (V2) in whatever tool produced it (for example pyarrow.feather.write_feather(table, "out.feather") writes V2) and ClickHouse reads it. In practice almost every .feather you meet today is already V2.
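
Before re-saving anything, it can help to confirm which variant a file actually is. The sketch below reads the leading magic bytes described above and classifies the file. It assumes a shell where head -c and tr are available, and the function name feather_version is hypothetical, not part of ClickHouse or Arrow:

```shell
# Classify a Feather file by its leading magic bytes.
# feather_version is a hypothetical helper name used for illustration.
feather_version() {
  # tr strips NUL bytes so the shell can hold the value without warnings.
  if [ "$(head -c 6 "$1" | tr -d '\000')" = "ARROW1" ]; then
    echo "v2"        # Arrow IPC file format (Feather V2)
  elif [ "$(head -c 4 "$1" | tr -d '\000')" = "FEA1" ]; then
    echo "v1"        # legacy Feather V1
  else
    echo "unknown"   # neither magic matched (or the file is unreadable)
  fi
}

# Usage (the file name is an example):
# feather_version events_v1.feather
```

A result of v1 means the file needs re-saving before ClickHouse can read it. A result of unknown suggests the file is not Feather at all.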

Convert Feather to Parquet in one line

Feather and Parquet are both columnar Arrow-adjacent formats, but they trade off differently: Feather (Arrow IPC) is optimized for fast read/write and zero-copy interchange between tools, while Parquet compresses harder for long-term storage. Converting between them is one command: SELECT from the Feather file, INTO OUTFILE as Parquet:

clickhouse local -q "SELECT * FROM file('events.feather') INTO OUTFILE 'events.parquet' TRUNCATE FORMAT Parquet"
20

Read it back with the same file() call. See how to query a Parquet file for the typing and compression options.

How fast is it on a real file?

Small files are instant in anything. The difference shows up at scale. On a 3,000,000-row, ~77 MB Feather file (events_large.feather, built by the example folder below), the same filter-and-group-by query runs in:

clickhouse local --time -q "
SELECT country, count(), round(sum(revenue), 2), round(avg(quantity), 3)
FROM file('events_large.feather')
WHERE event_type = 'purchase'
GROUP BY country ORDER BY 3 DESC
FORMAT Null"
0.067

~0.07 seconds of query execution (the --time flag reports the query time, not process startup), best of three with the file warm in the OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM; clickhouse local 26.6.1.117). End to end, including launching the binary, the same run finishes in about 0.2 seconds. Arrow stores columns contiguously, so the scan reads only the four columns the query needs and runs across all cores.
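
A best-of-N measurement like this one can be approximated with a small loop. This is a sketch under assumptions, not the harness that produced the number above: best_of is a hypothetical helper name, and it relies on --time writing the elapsed seconds to stderr, which is how the ClickHouse client documentation describes the flag.

```shell
# Run a command several times and print the smallest number it wrote to stderr.
# best_of is a hypothetical helper name; it assumes the command's last stderr
# line is a bare number of seconds (as with clickhouse local --time).
best_of() {
  runs=$1
  shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Send stderr down the pipe, discard stdout, and keep only the last line.
    "$@" 2>&1 >/dev/null | tail -n 1
    i=$((i + 1))
  done | LC_ALL=C sort -n | head -n 1
}

# Usage (assumes events_large.feather exists in the current directory):
# best_of 3 clickhouse local --time -q "SELECT country, count() FROM file('events_large.feather') WHERE event_type = 'purchase' GROUP BY country FORMAT Null"
```

The first run also pulls the file into the OS page cache, so taking the minimum over a few runs tends to reflect the warm-cache case.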

clickhouse local reads Feather from the command line with one binary that also reads Parquet, CSV, JSON, ORC and many other formats, talks to S3, MySQL and Postgres, and runs the same SQL unchanged when you move from a file to a server to the Cloud.

The same SQL scales unchanged

The query you just ran on a laptop file is the same SQL you would run on a ClickHouse server, or in ClickHouse Cloud. Nothing about SELECT ... WHERE ... GROUP BY changes. You swap file('events.feather') for a table name and the rest stays put. You prototype against a file on your machine and ship the identical logic to production.

Run it yourself

The complete, runnable example lives here. It has generate.sh (builds the demo file, a legacy V1 file, and the ~77 MB perf file), run.sh (every command above), and expected_output.txt:

github.com/ClickHouse/examples → local-analytics/clickhouse-local-feather

git clone https://github.com/ClickHouse/examples
cd examples/local-analytics/clickhouse-local-feather
./generate.sh && ./run.sh