To convert Parquet to Avro, use clickhouse local. It runs SQL directly on files from the command line, with no server to install. It's part of ClickHouse, so the same query scales to billions of rows when you outgrow your laptop.
Install it with clickhousectl:
curl https://clickhouse.com/cli | sh # install clickhousectl
clickhousectl local use latest # download ClickHouse and put it on your PATH
Then point it at the Parquet file and write the result out as Avro:
clickhouse local -q "SELECT * FROM file('events.parquet') INTO OUTFILE 'events.avro' FORMAT Avro"
clickhouse local reads the schema straight from the Parquet footer, so you name no columns and no types. The file is read in place with no import step, and it streams rather than loading into memory, so inputs larger than RAM convert fine.
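To make "reads the schema from the footer" concrete, here is a minimal sketch of where that footer sits. The Parquet format ends every file with the metadata, a 4-byte little-endian metadata length, and the magic bytes PAR1. The function name `locate_footer` and the synthetic bytes are illustrative assumptions; this is not how clickhouse local is implemented, and the sketch does not decode the Thrift-encoded metadata itself.

```python
import io
import struct

PARQUET_MAGIC = b"PAR1"

def locate_footer(f):
    """Return (offset, length) of the metadata footer in a Parquet file object."""
    f.seek(-8, io.SEEK_END)          # last 8 bytes: footer length + magic
    tail = f.read(8)
    if tail[4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (length,) = struct.unpack("<I", tail[:4])
    file_size = f.seek(0, io.SEEK_END)
    return file_size - 8 - length, length

# Synthetic stand-in for a real file: magic, fake column data, fake footer.
fake_footer = b"schema-goes-here"
fake_file = io.BytesIO(
    PARQUET_MAGIC + b"column-data" + fake_footer
    + struct.pack("<I", len(fake_footer)) + PARQUET_MAGIC
)
offset, length = locate_footer(fake_file)
fake_file.seek(offset)
print(fake_file.read(length))  # b'schema-goes-here'
```

A reader that only needs column names and types can seek to the end, read this block, and skip the data pages entirely.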
What changes: columnar to row-oriented
Parquet is columnar: values for one column sit together on disk, which keeps single-column scans cheap and lets similar values compress well. Avro is row-oriented: each record is written whole, one after another, with the schema stored once in the file header. The conversion is a transpose plus a re-encode, and clickhouse local derives the Avro schema from the column types it read out of the Parquet footer. You don't write a .avsc by hand.
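The transpose is the easiest part to picture in code. The sketch below is a toy model in plain Python, using three of the sample columns; the helper name `columns_to_rows` is made up for illustration and says nothing about how ClickHouse does it internally.

```python
# Columnar layout: one list per column, as Parquet groups values on disk.
columns = {
    "event_id": [1, 2, 3],
    "country": ["GB", "US", "DE"],
    "event_type": ["click", "view", "purchase"],
}

def columns_to_rows(columns):
    """Transpose a dict of equal-length columns into a list of row dicts."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]

# Row layout: one complete record after another, as Avro writes them.
rows = columns_to_rows(columns)
for row in rows:
    print(row)
```

Each printed dict corresponds to one Avro record; the field names would live once in the file header rather than being repeated per row.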
Check what was read from the Parquet file:
clickhouse local -q "DESCRIBE file('events.parquet')"
event_id UInt64
ts DateTime64(3, 'UTC')
country String
event_type String
amount Float64
items UInt8
device Tuple(
`1` String,
`2` UInt8)
Now read the same columns back from the Avro file that was produced:
clickhouse local -q "DESCRIBE file('events.avro')"
event_id Int64
ts DateTime64(3)
country String
event_type String
amount Float64
items Int32
device Tuple(
`1` String,
`2` Int32)
Look closely at the two schemas. event_id went from UInt64 to Int64, and items went from UInt8 to Int32. That isn't a bug. Avro's integer types are int (32-bit) and long (64-bit), both signed; there is no unsigned integer in the Avro spec. So clickhouse local writes the UInt8 column as an int and the UInt64 column as a long. A UInt8 (0–255) fits in a signed 32-bit int with room to spare, so nothing is lost there. UInt64 is the exception: a signed 64-bit long tops out at 2^63 - 1 (9,223,372,036,854,775,807), so any value above that cannot be stored as the same positive number. If a UInt64 column may hold values that large, check its maximum before converting. The ts column also changes slightly: it reads back as DateTime64(3), without the explicit 'UTC' time zone the Parquet schema carried. Those are the type changes worth knowing about before you convert; strings and floats carry across unchanged.
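The ranges are easy to check by hand. The sketch below is plain Python and does not call ClickHouse; `fits` and `zigzag_varint` are illustrative names. The second function follows the variable-length zig-zag coding the Avro spec uses for both int and long, which is why a small value stored as an int does not cost four bytes on disk.

```python
def fits(unsigned_bits: int, signed_bits: int) -> bool:
    """True if every unsigned value of that width fits in the signed width."""
    return 2**unsigned_bits - 1 <= 2**(signed_bits - 1) - 1

def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer with zig-zag + varint coding."""
    z = (n << 1) ^ (n >> 63)      # zig-zag: small magnitudes become small numbers
    out = bytearray()
    while True:
        byte = z & 0x7F           # low 7 bits
        z >>= 7
        if z:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

print(fits(8, 32))    # True:  UInt8 -> Avro int
print(fits(64, 64))   # False: UInt64 -> Avro long loses the top half of the range
print(len(zigzag_varint(63)), len(zigzag_varint(255)))  # 1 2
```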
The nested structure survives too. The device column is a Tuple(String, UInt8) in Parquet, and it becomes a nested Avro record. Read three rows back as JSON to see it intact:
clickhouse local -q "SELECT * FROM file('events.avro') ORDER BY event_id LIMIT 3 FORMAT JSONEachRow"
{"event_id":1,"ts":"2026-01-01 00:00:00.000","country":"GB","event_type":"click","amount":5,"items":1,"device":{"1":"mobile","2":1}}
{"event_id":2,"ts":"2026-01-01 01:00:00.000","country":"US","event_type":"view","amount":6.01,"items":2,"device":{"1":"desktop","2":0}}
{"event_id":3,"ts":"2026-01-01 02:00:00.000","country":"DE","event_type":"purchase","amount":7.02,"items":3,"device":{"1":"tablet","2":1}}
The nested fields are addressable after conversion, so nothing is flattened away:
clickhouse local -q "SELECT event_id, device.1 AS device_type, device.2 AS device_flag FROM file('events.avro') ORDER BY event_id LIMIT 3"
1 mobile 1
2 desktop 0
3 tablet 1
A converter site that requires an upload won't give you any of the following, which is why converting locally pays off:
- Compress the Avro. Avro compresses its data blocks with a codec. Set it before the conversion to shrink the output:
clickhouse local -q "SET output_format_avro_codec='deflate'; SELECT * FROM file('events.parquet') INTO OUTFILE 'events.avro' TRUNCATE FORMAT Avro"
TRUNCATE overwrites the events.avro written earlier; without it, INTO OUTFILE fails when the file already exists. snappy and null (uncompressed) are also valid. Row-oriented Avro typically lands larger than the columnar Parquet it came from, even compressed, because Parquet's per-column encoding is hard to beat for analytical data. Expect the Avro to be bigger, not smaller.
- Convert a subset. Because file() exposes the Parquet file as a SQL table, you filter and project in the same statement instead of converting and then trimming: SELECT event_id, country FROM file('events.parquet') WHERE country = 'US' INTO OUTFILE 'us.avro' FORMAT Avro.
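- Pin the types. To choose the types yourself, cast in the SELECT (CAST(items AS Int16)) or pass an explicit structure to file(). Keep in mind that the Avro file itself stores integers only as int or long, so an unsigned or narrower type has to be reapplied the same way when the Avro is read back. The schema is yours to control, not whatever a black-box converter decided.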
- It's scriptable and offline. A single command in a Makefile or cron job, with no file leaving your machine. Useful when the Parquet holds data you can't upload anywhere.
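The options in this list compose into a single statement, which makes them easy to drive from a script. Below is a minimal sketch of such a wrapper. The function names (`sql_quote`, `build_query`, `convert`) and their parameters are hypothetical; the SQL it emits uses only the pieces shown on this page (the SET for the codec, file(), WHERE, INTO OUTFILE ... TRUNCATE FORMAT Avro). The `columns` and `where` arguments are pasted into the query as-is, so they are assumed to come from trusted input.

```python
import subprocess

def sql_quote(value: str) -> str:
    """Quote a Python string as a single-quoted SQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

def build_query(src, dst, columns="*", where=None, codec=None):
    """Build the Parquet -> Avro conversion statement as one string."""
    parts = []
    if codec:
        parts.append(f"SET output_format_avro_codec={sql_quote(codec)};")
    select = f"SELECT {columns} FROM file({sql_quote(src)})"
    if where:
        select += f" WHERE {where}"
    select += f" INTO OUTFILE {sql_quote(dst)} TRUNCATE FORMAT Avro"
    parts.append(select)
    return " ".join(parts)

def convert(src, dst, **options):
    """Run the conversion; assumes `clickhouse` is on the PATH."""
    subprocess.run(
        ["clickhouse", "local", "-q", build_query(src, dst, **options)],
        check=True,
    )

print(build_query("events.parquet", "us.avro",
                  columns="event_id, country",
                  where="country = 'US'",
                  codec="deflate"))
```

Calling `convert("events.parquet", "events.avro", codec="deflate")` from a cron job or a Makefile target would then run the same statement as the command line above, with a non-zero exit status raising an exception.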
On a 3,000,000-row events_large.parquet (~41 MB), the full Parquet-to-Avro conversion (read the columnar file, transpose to rows, encode Avro, write it out) completes in:
clickhouse local -q "SELECT * FROM file('events_large.parquet') INTO OUTFILE 'events_large.avro' FORMAT Avro"
~0.6 seconds, best of three with a warm OS page cache, on an Apple M4 Pro laptop (14 cores, 24 GB RAM). Measured with /usr/bin/time -p; figures may vary slightly under concurrent load. The conversion streams, so memory stays flat regardless of file size.
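To repeat a best-of-three measurement on your own file, a small harness is enough. This is a sketch, not the method used for the figure above (that was /usr/bin/time -p); `best_of` is an illustrative name, and the stand-in command exists only so the block runs on a machine without ClickHouse installed.

```python
import subprocess
import sys
import time

def best_of(argv, runs=3):
    """Run a command `runs` times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)

# Stand-in command so the sketch runs anywhere; swap in the conversion to time it.
elapsed = best_of([sys.executable, "-c", "pass"])
print(f"best of 3: {elapsed:.3f}s")
```

To time the conversion itself, pass `["clickhouse", "local", "-q", "SELECT * FROM file('events_large.parquet') INTO OUTFILE 'events_large.avro' TRUNCATE FORMAT Avro"]` instead. The TRUNCATE keyword lets the second and third runs overwrite the output of the first.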
Need to go the other way? See how to convert Avro to Parquet — the same one-liner with the formats swapped.
If you're already in a notebook, chDB is the same engine in-process. The SQL is identical; only the wrapper changes:
import chdb
# Convert Parquet -> Avro: the same SELECT ... FORMAT Avro, into a file.
chdb.query("SELECT * FROM file('events.parquet') INTO OUTFILE 'events.avro' TRUNCATE FORMAT Avro")
# Read it back to confirm.
print(chdb.query("SELECT count() AS rows FROM file('events.avro')", "DataFrame"))
For reading Avro into a DataFrame afterwards, see how to read an Avro file in Python with chDB, and for querying Avro from the terminal, how to read an Avro file.
The complete, runnable example lives in the ClickHouse examples repo: generate.sh to create the sample Parquet files (including the 3M-row file used for the timing above), run.sh with every command on this page, run.py / run.ipynb for the chDB version, and expected_output.txt.
github.com/ClickHouse/examples/tree/main/local-analytics/convert-parquet-to-avro
The same SQL that converts one file on your laptop runs unchanged against a directory of Parquet files, a remote object store, or a ClickHouse Cloud service when the data outgrows your machine. See also how to query a Parquet file.