Back to integrations

Google Cloud Storage (GCS)

core

Reading data from Google Cloud Storage

Using ClickHouse S3 table function, users can query GCS data as a source without requiring persistence in ClickHouse. The following example illustrates how to read 10 rows of the NYC Taxi dataset.

SELECT 
    trip_id, 
    total_amount, 
    pickup_longitude, 
    pickup_latitude, 
    dropoff_longitude, 
    dropoff_latitude, 
    pickup_datetime, 
    dropoff_datetime,
    trip_distance
FROM s3(
    'https://storage.googleapis.com/clickhouse-public-datasets/nyc-taxi/trips_{0..2}.gz',
    'TabSeparatedWithNames'
) LIMIT 10;

Inserting Data from Google Cloud Storage

To transfer data from GCS to ClickHouse, users can combine the s3 table function with INSERT statement. Let's create an empty trips table:

CREATE TABLE trips ORDER BY tuple
(
) EMPTY AS SELECT * FROM s3(
    'https://storage.googleapis.com/clickhouse-public-datasets/nyc-taxi/trips_{0..2}.gz',
    'TabSeparatedWithNames'
);

This creates an empty table using the schema inferred from the data. We can then insert the first 1 million rows from the remote dataset

INSERT INTO trips SELECT *
FROM 
s3(
    'https://storage.googleapis.com/clickhouse-public-datasets/nyc-taxi/trips_{0..2}.gz',
    'TabSeparatedWithNames'
    )
LIMIT 1000000;

Other integrations

Get started with ClickHouse Cloud for free

We'll get you started on a 30 day trial and $300 credits to spend at your own pace.