Date: 2016-01-01
DataLakeCatalog
The DataLakeCatalog database engine lets you connect ClickHouse to external
data catalogs and query open table format data without duplicating it.
ClickHouse then acts as a query engine on top of your existing data lake
infrastructure.
Supported catalogs
The DataLakeCatalog engine supports the following data catalogs:
- AWS Glue Catalog - For Iceberg tables in AWS environments
- Databricks Unity Catalog - For Delta Lake and Iceberg tables
- Hive Metastore - Traditional Hadoop ecosystem catalog
- REST Catalogs - Any catalog supporting the Iceberg REST specification
Creating a database
You will need to enable the relevant settings below to use the DataLakeCatalog engine:
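As a rough sketch, the per-catalog flags could be enabled at session level as shown here. The setting names are assumptions based on the experimental flags used in some ClickHouse releases; they may be named differently in your version, so verify them against the documentation for the release you run.

```sql
-- Assumed setting names; enable only the one that matches your catalog type.
SET allow_experimental_database_iceberg = 1;        -- REST (Iceberg) catalogs
SET allow_experimental_database_unity_catalog = 1;  -- Databricks Unity Catalog
SET allow_experimental_database_glue_catalog = 1;   -- AWS Glue Catalog
SET allow_experimental_database_hms_catalog = 1;    -- Hive Metastore
```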
Databases with the DataLakeCatalog engine can be created using the following syntax:
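A minimal sketch of such a statement for an Iceberg REST catalog, assuming the catalog endpoint is passed as the engine argument and the remaining options go in the SETTINGS clause. The database name `demo`, the warehouse name, and both URLs are hypothetical placeholders, not values from this page.

```sql
-- Hypothetical names and endpoints, for illustration only.
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://localhost:8181/v1')       -- assumed REST catalog endpoint
SETTINGS
    catalog_type = 'rest',                                  -- one of: glue, unity, rest, hive
    warehouse = 'demo',                                     -- warehouse name in the catalog
    storage_endpoint = 'http://localhost:9000/warehouse';   -- assumed object storage endpoint
```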
The following settings are supported:
| Setting | Description |
|---|---|
| catalog_type | Type of catalog: glue, unity (Delta), rest (Iceberg), hive |
| warehouse | The warehouse/database name to use in the catalog. |
| catalog_credential | Authentication credential for the catalog (e.g., API key or token) |
| auth_header | Custom HTTP header for authentication with the catalog service |
| auth_scope | OAuth2 scope for authentication (if using OAuth) |
| storage_endpoint | Endpoint URL for the underlying storage |
| oauth_server_uri | URI of the OAuth2 authorization server for authentication |
| vended_credentials | Boolean indicating whether to use vended credentials (AWS-specific) |
| aws_access_key_id | AWS access key ID for S3/Glue access (if not using vended credentials) |
| aws_secret_access_key | AWS secret access key for S3/Glue access (if not using vended credentials) |
| region | AWS region for the service (e.g., us-east-1) |
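To show how several of these settings combine, here is a hedged sketch for an AWS Glue catalog using static credentials rather than vended ones. The database name `glue_db`, the table name, and the credential placeholders are hypothetical. The sketch also assumes that Glue needs no endpoint argument and that catalog tables are exposed as `namespace.table`, which is why the name is quoted with backticks.

```sql
-- Hypothetical database name and placeholder credentials.
CREATE DATABASE glue_db
ENGINE = DataLakeCatalog
SETTINGS
    catalog_type = 'glue',
    region = 'us-east-1',
    aws_access_key_id = '<access-key>',
    aws_secret_access_key = '<secret-key>';

-- List the tables the catalog exposes, then query one in place.
SHOW TABLES FROM glue_db;

-- Assumed naming: `namespace.table`, quoted because it contains a dot.
SELECT count() FROM glue_db.`my_namespace.my_table`;
```

Because the data is read directly from the underlying storage, no copy of the table is made in ClickHouse.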
Examples
See the pages below for examples of using the DataLakeCatalog engine: