Skip to main content
Skip to main content

Integrating BigQuery with ClickHouse Cloud

Private preview in ClickHouse Cloud

The BigQuery ClickPipe provides a fully-managed and resilient way to ingest data from BigQuery into ClickHouse Cloud. In Private Preview, it supports the initial load replication method to help you bulk load BigQuery datasets for exploration and prototyping. CDC will be supported in the future — in the meantime, we recommend using the Google Cloud Storage ClickPipe to continuously sync BigQuery data exports into ClickHouse Cloud once the initial load is completed.

BigQuery ClickPipes can be deployed and managed manually using the ClickPipes UI, as well as programmatically using OpenAPI and Terraform.

Features

Initial load

The BigQuery ClickPipe will load selected tables in a BigQuery dataset into the ClickHouse destination table(s) in a single batch operation. Once the ingestion task completes, the ClickPipe stops automatically. The initial load ingestion process requires a user-provided Google Cloud Storage (GCS) bucket for staging. In the future, the intermediary bucket will be provided and managed by ClickPipes.

Note

ClickPipes relies on batch extract jobs to fetch data from BigQuery into the staging GCS bucket. This operations incurs no processing charges in BigQuery.

CDC (Change Data Capture)

CDC is not supported in Private Preview, but will be supported in the future. In the meantime, we recommend using the Google Cloud Storage ClickPipe to continuously sync BigQuery data exports into ClickHouse Cloud once the initial load is completed.

Data type mapping

BigQuery data types.

BigQuery Data TypeClickHouse Data TypeDetails
BOOLBool
INT64Int64
FLOAT64Float64
NUMERICDecimal(P, S)Precision up to 38, scale up to 9. Precision/scale is preserved.
BIGNUMERICDecimal(P, S)Precision up to 76, scale up to 38. Precision/scale is preserved.
STRINGString
BYTESString
JSONString (JSON)
DATEDate
TIMEStringMicrosecond precision.
DATETIMEDateTimeMicrosecond precision.
TIMESTAMPDateTime64(6)Microsecond precision.
GEOGRAPHYString
GEOMETRYString
UUIDString
ARRAY<T>Array(T)
ARRAY<DATE>Array(Date)
STRUCT (RECORD)String

Access control

Authentication

Service account credentials

ClickPipes authenticates to your Google Cloud project using a service account key. We recommend creating a dedicated service account with the minimum required set of permissions to allow ClickPipes to export data from BigQuery, load it into the staging GCS bucket, and read it into ClickHouse.

Permissions

BigQuery

The service account must have the following BigQuery roles:

To further scope access, we recommend using IAM conditions to restrict the resources the role has access to. For example, you can restrict the dataViewer role to the specific dataset containing the tables you want to sync:

resource.name.startsWith("projects/<PROJECT_ID>/datasets/<DATASET_NAME>")

Cloud Storage

The service account must have the following Cloud Storage roles:

To further scope access, we recommend using IAM conditions to restrict the resources the role has access to. For example, you can restrict the objectAdmin and bucketViewer roles to the dedicated bucket created for ClickPipes syncs.

resource.name.startsWith("projects/_/buckets/<BUCKET_NAME>")