Postgres with TimescaleDB Source Setup Guide
Background
TimescaleDB is an open-source Postgres extension developed by Timescale Inc that aims to boost the performance of analytics queries without having to move away from Postgres. This is achieved by creating "hypertables" which are managed by the extension and support automatic partitioning into "chunks". Hypertables also support transparent compression and hybrid row-columnar storage (known as "hypercore"), although these features require a version of the extension that has a proprietary license.
Timescale Inc also offers two managed services for TimescaleDB:
Managed Service for Timescale
Timescale Cloud
.
There are third-party vendors offering managed services that allow you to use the TimescaleDB extension, but due to licensing, these vendors only support the open-source version of the extension.
Timescale hypertables behave differently from regular Postgres tables in several ways. This poses some complications to the process of replicating them, which is why the ability to replicate Timescale hypertables should be considered as best effort.
Supported Postgres versions
ClickPipes supports Postgres version 12 and later.
Enable Logical Replication
The steps to be follow depend on how your Postgres instance with TimescaleDB is deployed.
- If you're using a managed service and your provider is listed in the sidebar, please follow the guide for that provider.
- If you're deploying TimescaleDB yourself, follow the generic guide.
For other managed services, please raise a support ticket with your provider to help in enabling logical replication if it isn't already.
Timescale Cloud does not support enabling logical replication, which is needed for Postgres pipes in CDC mode.
As a result, users of Timescale Cloud can only perform a one-time load of their data (Initial Load Only
) with the
Postgres ClickPipe.
Configuration
Timescale hypertables don't store any data inserted into them. Instead, the data is stored in multiple corresponding
"chunk" tables which are in the _timescaledb_internal
schema. For running queries on the hypertables, this is not an
issue. But during logical replication, instead of detecting changes in the hypertable we detect them in the chunk table
instead. The Postgres ClickPipe has logic to automatically remap changes from the chunk tables to the parent hypertable,
but this requires additional steps.
If you'd like to only perform a one-time load of your data (Initial Load Only
), please skip steps 2 onward.
- Create a Postgres user for the pipe and grant it permissions to
SELECT
the tables you wish to replicate.
Make sure to replace clickpipes_user
and clickpipes_password
with your desired username and password.
- As a Postgres superuser/admin user, create a publication on the source instance that has the tables and hypertables
you want to replicate and also includes the entire
_timescaledb_internal
schema. While creating the ClickPipe, you need to select this publication.
We don't recommend creating a publication FOR ALL TABLES
, this leads to more traffic from Postgres to ClickPipes (to sending changes for other tables not in the pipe) and reduces overall efficiency.
Some managed services don't give their admin users the required permissions to create a publication for an entire schema. If this is the case, please raise a support ticket with your provider. Alternatively, you can skip this step and the following steps and perform a one-time load of your data.
- Grant replication permissions to the user created earlier.
After these steps, you should be able to proceed with creating a ClickPipe.
Configure Network Access
If you want to restrict traffic to your Timescale instance, please allowlist the documented static NAT IPs. Instructions to do this will vary across providers, please consult the sidebar if your provider is listed or raise a ticket with them.