Backing up a specific partition
· 3 min read
How can I backup a specific partition in ClickHouse?
Question
How can I backup a specific partition in ClickHouse?
Answer
See the below example, this uses the S3(Minio) disk configuration listed in our docker compose examples page.
Note
This does NOT apply to ClickHouse Cloud
Create a table:
ch_minio_s3 :) CREATE TABLE my_table
(
`event_time` DateTime,
`field_foo` String,
`field_bar` String,
`number` UInt256
)
ENGINE = MergeTree
PARTITION BY number % 2
ORDER BY tuple()
CREATE TABLE my_table
(
`event_time` DateTime,
`field_foo` String,
`field_bar` String,
`number` UInt256
)
ENGINE = MergeTree
PARTITION BY number % 2
ORDER BY tuple()
Query id: a1a54a5a-eac0-477c-b847-b40acaa62780
Ok.
0 rows in set. Elapsed: 0.016 sec.
Add some data that will fill both partitions equally:
ch_minio_s3 :) INSERT INTO my_table SELECT
toDateTime(now() + number) AS event_time,
randomPrintableASCII(10) AS field_foo,
randomPrintableASCII(20) AS field_bar,
number
FROM numbers(1000000)
INSERT INTO my_table SELECT
toDateTime(now() + number) AS event_time,
randomPrintableASCII(10) AS field_foo,
randomPrintableASCII(20) AS field_bar,
number
FROM numbers(1000000)
Query id: bf6ef803-5747-4ea1-ad00-a17967e349b6
Ok.
0 rows in set. Elapsed: 0.282 sec. Processed 1.00 million rows, 8.00 MB (3.55 million rows/s., 28.39 MB/s.)
verify data:
ch_minio_s3 :) SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
Query id: d8febfb0-5339-4f97-aefa-ef0003128526
┌─partition_id─┬─cityHash64(sum(number))─┬──count─┐
│ 0 │ 15460940821314360342 │ 500000 │
│ 1 │ 11827822647069388611 │ 500000 │
└──────────────┴─────────────────────────┴────────┘
2 rows in set. Elapsed: 0.025 sec. Processed 1.00 million rows, 32.00 MB (39.97 million rows/s., 1.28 GB/s.)
backup partition with id 1
to configured s3
disk:
ch_minio_s3 :) BACKUP TABLE my_table PARTITION 1 TO Disk('s3','backups/');
BACKUP TABLE my_table PARTITION 1 TO Disk('s3', 'backups/')
Query id: 810f6144-e282-42e2-99d0-9a80c75a927d
┌─id───────────────────────────────────┬─status─────────┐
│ 4d1da197-c4c9-4b6e-966c-76202eadbd53 │ BACKUP_CREATED │
└──────────────────────────────────────┴────────────────┘
1 row in set. Elapsed: 0.095 sec.
Drop the table:
ch_minio_s3 :) DROP TABLE my_table
DROP TABLE my_table
Query id: c3456044-4689-406e-82ac-8d08b8b618fe
Ok.
0 rows in set. Elapsed: 0.007 sec.
restore just partition with id 1
from backup:
ch_minio_s3 :) RESTORE TABLE my_table PARTITION 1 FROM Disk('s3','backups/');
RESTORE TABLE my_table PARTITION 1 FROM Disk('s3', 'backups/')
Query id: ea306c73-83c5-479f-9c0c-391594facc69
┌─id───────────────────────────────────┬─status───┐
│ ec6841a8-0607-465e-bc4d-d446f960d40a │ RESTORED │
└──────────────────────────────────────┴──────────┘
1 row in set. Elapsed: 0.065 sec.
validate the restored data:
ch_minio_s3 :) SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
SELECT
_partition_id AS partition_id,
cityHash64(sum(number)) AS hash,
count() AS count
FROM my_table
GROUP BY partition_id
Query id: a916176d-6a6e-47fc-ba7d-79bb33b152d8
┌─partition_id─┬─────────────────hash─┬──count─┐
│ 1 │ 11827822647069388611 │ 500000 │
└──────────────┴──────────────────────┴────────┘
1 row in set. Elapsed: 0.012 sec. Processed 500.00 thousand rows, 16.00 MB (41.00 million rows/s., 1.31 GB/s.)