Paimon table engine
This engine provides a read-only integration with existing Apache Paimon tables in Amazon S3, Azure, HDFS and locally stored tables. It supports snapshot reads, incremental reads, and basic partition pruning provided by the engine.
Create table
Note that the Paimon table must already exist in the storage, this command does not take DDL parameters to create a new table.
Creating Paimon* tables is gated by allow_experimental_paimon_storage_engine (disabled by default), so enable it before running CREATE TABLE.
Engine arguments
Description of the arguments coincides with description of arguments in engines S3, AzureBlobStorage, HDFS and File correspondingly.
format stands for the format of data files in the Paimon table.
Engine parameters can be specified using Named Collections
Example
Using named collections:
Capabilities
- Snapshot reads from the latest table snapshot.
- Incremental reads based on committed snapshot id when enabled.
- Partition pruning when
use_paimon_partition_pruningis enabled. - Optional background refresh of metadata when configured.
- Stable table UUID when using Atomic/Replicated databases, enabling
{uuid}macros in Keeper paths.
Settings
This engine uses the same settings as the corresponding object storage engines and adds Paimon-specific settings:
allow_experimental_paimon_storage_engine— enables creation ofPaimon,PaimonS3,PaimonAzure,PaimonHDFS, andPaimonLocaltable engines. Default:0(disabled).paimon_incremental_read— enable incremental read mode.paimon_metadata_refresh_interval_sec— background metadata refresh interval in seconds. When set to a value greater than 0, a background task periodically pulls the latest snapshot and schema from object storage. Default: 30.paimon_keeper_path— Keeper path for incremental read state. Must be set and unique per table; supports macros such as{database},{table},{uuid}.paimon_replica_name— Replica name for incremental read state. Must be set and unique per replica; supports macros such as{replica}.
Incremental read examples
Incremental read with Keeper state:
Targeted snapshot delta read (query-level):
Paimon to MergeTree via Refreshable Materialized View
You can build an end-to-end pipeline that continuously syncs data from a Paimon table into a MergeTree table using a refreshable Materialized View in APPEND mode. Each refresh cycle reads only new incremental data from Paimon and appends it to the destination table.
Step 1 — Create the Paimon source table with incremental read and metadata refresh enabled.
The example below uses PaimonLocal. Replace the engine with PaimonS3, PaimonAzure, PaimonHDFS, or the Paimon alias as appropriate for your storage backend:
paimon_metadata_refresh_interval_sec sets the background metadata refresh interval in seconds. When greater than 0, a background task periodically pulls the latest snapshot and schema from object storage, so that the MV refresh cycle can see newly committed data without waiting for a query to trigger the metadata update. Default is 30. Use cautiously on many tables to avoid excessive object storage and Keeper I/O.
Step 2 — Create the MergeTree destination table (schema cloned from the Paimon table):
Step 3 — Create the refreshable Materialized View:
Every 10 seconds the MV fires a SELECT * FROM paimon_mv_source, which returns only the rows added since the last committed snapshot, and appends them to paimon_mv_dest.
Cleanup:
Stop the MV before dropping it to prevent background refresh from blocking DDL operations.
Limitations
- Incremental read requires Keeper (ZooKeeper) to be configured.
- Incremental read requires
paimon_keeper_pathto be set and unique per table. paimon_replica_namemust be unique per replica within the same Keeper path.- The table engine is read-only; data modification is not supported.
- Incremental read does not handle historical data deletions from the Paimon source. If upstream Paimon data is deleted or updated, the corresponding rows already written to a ClickHouse MergeTree destination table will not be automatically removed. You must manually issue
ALTER TABLE ... DELETEon the MergeTree table to clean up stale data.
Aliases
Table engine Paimon is an alias to PaimonS3 now.
Virtual Columns
_path— Path to the file. Type:LowCardinality(String)._file— Name of the file. Type:LowCardinality(String)._size— Size of the file in bytes. Type:Nullable(UInt64). If the file size is unknown, the value isNULL._time— Last modified time of the file. Type:Nullable(DateTime). If the time is unknown, the value isNULL._etag— The etag of the file. Type:LowCardinality(String). If the etag is unknown, the value isNULL.
Data Types supported
| Paimon Data Type | ClickHouse Data Type |
|---|---|
| BOOLEAN | Int8 |
| TINYINT | Int8 |
| SMALLINT | Int16 |
| INTEGER | Int32 |
| BIGINT | Int64 |
| FLOAT | Float32 |
| DOUBLE | Float64 |
| STRING,VARCHAR,BYTES,VARBINARY | String |
| DATE | Date |
| TIME(p),TIME | Time('UTC') |
| TIMESTAMP(p) WITH LOCAL TIME ZONE | DateTime64 |
| TIMESTAMP(p) | DateTime64('UTC') |
| CHAR | FixedString(1) |
| BINARY(n) | FixedString(n) |
| DECIMAL(P,S) | Decimal(P,S) |
| ARRAY | Array |
| MAP | Map |
Partition supported
Data types supported in Paimon partition keys:
CHARVARCHARBOOLEANDECIMALTINYINTSMALLINTINTEGERDATETIMETIMESTAMPTIMESTAMP WITH LOCAL TIME ZONEBIGINTFLOATDOUBLE