BYOC Network Security
Tailscale Private Network
Tailscale provides a zero-trust, private network connection between ClickHouse Cloud's management services and your BYOC deployment. This secure channel enables ClickHouse engineers to perform troubleshooting and management operations without requiring public internet access or complex VPN configurations.
Overview
Tailscale creates an encrypted, private network tunnel between the ClickHouse control plane (in ClickHouse's VPC) and your BYOC data plane (in your VPC). This connection is used exclusively for:
- Management operations: ClickHouse management services coordinating with your BYOC infrastructure
- Troubleshooting access: ClickHouse engineers accessing Kubernetes API servers and ClickHouse system tables for diagnostics
- Metrics access: ClickHouse’s centralized monitoring dashboards access metrics from the Prometheus stack deployed within your BYOC VPC, providing ClickHouse engineers observability into the environment.
Tailscale is used only for management and troubleshooting operations. It is never used for query traffic or customer data access. All customer data remains within your VPC and is never transmitted through Tailscale connections.
How Tailscale Works in BYOC
For each service or endpoint that needs to be accessed via Tailscale, ClickHouse BYOC deploys:
-
Tailnet Address Registration: Each endpoint registers a unique tailnet address (e.g.,
k8s.xxxx.us-east-1.aws.byoc.clickhouse-prd.comfor the Kubernetes API server) -
Tailscale Agent Container: A Tailscale agent container runs in your EKS cluster, responsible for:
- Connecting to the Tailscale coordination server
- Registering services to make them discoverable
- Coordinating network setup with Nginx pods
-
Nginx Pod: An Nginx pod that:
- Terminates TLS traffic from Tailscale
- Routes traffic to the appropriate IPs within your EKS cluster
Network Connection Process
The Tailscale connection establishment follows these steps:
-
Initial Connection:
- Tailscale agents on both ends (ClickHouse engineer's environment and your BYOC EKS cluster) connect to the Tailscale coordination server
- The EKS cluster agent registers the Kubernetes service to make it discoverable
- ClickHouse engineers must escalate internally to gain visibility to the service
-
Connection Mode:
- Direct Mode: Agents attempt to establish a direct connection via NAT traversal tunnel
- Relay Mode: If direct mode fails, communication falls back to relay mode through a Tailscale DERP (Distributed Encrypted Relay Protocol) server
-
Encryption:
- All communication is encrypted end-to-end
- Each Tailscale agent generates its own public-private key pair (similar to PKI)
- Traffic remains encrypted regardless of whether it uses direct or relay mode
Security Features
Outbound-Only Connections:
- Tailscale agents in your EKS cluster initiate outbound connections to the Tailscale coordination/relay servers
- No inbound connections are required—no security group rules need to allow inbound traffic to Tailscale agents
- This reduces the attack surface and simplifies network security configuration
Access Control:
- Access is controlled through ClickHouse's internal approval system
- Engineers must request access through designated approval workflows
- Access is time-bound and automatically expires
- All access is audited and logged
Certificate-Based Authentication:
- For ClickHouse system table access, engineers use temporary, time-bound certificates
- Certificate-based authentication replaces password-based access for all human access in BYOC
- Access is restricted to system tables only (not customer data)
- All access attempts are logged in ClickHouse's
query_logtable
Troubleshooting Access via Tailscale
When ClickHouse engineers need to troubleshoot issues in your BYOC deployment, they use Tailscale to access:
- Kubernetes API Server: For diagnosing EBS mount failures, node-level network issues, and cluster health problems
- ClickHouse System Tables: For query performance analysis and diagnostic queries (read-only access to system tables only)
The troubleshooting access process:
- Access Request: On-call engineers within a designated group request access to the customer ClickHouse instance
- Approval: The request goes through an internal approval system with designated approvers
- Certificate Generation: A time-bound certificate is generated for the approved engineer
- ClickHouse Configuration: The ClickHouse operator configures ClickHouse to accept the certificate
- Connection: Engineers access the instance via Tailscale using the certificate
- Automatic Expiration: Access automatically expires after the set time period
Management Services Access
By default, ClickHouse management services access your BYOC Kubernetes cluster via the EKS API server's public IP address, which is restricted to ClickHouse's NAT gateway IP addresses only.
Optional Private Endpoint Configuration:
- You can configure the EKS API server to use only a private endpoint
- In this case, management services access the API server via Tailscale (similar to human troubleshooting access)
- Public access is kept as a backup mechanism for emergency investigation and support needs
Network Traffic Flow
Tailscale Connection Flow:
- Tailscale agent in EKS cluster → Tailscale coordination server (outbound)
- Tailscale agent on engineer's machine → Tailscale coordination server (outbound)
- Direct or relayed connection established between agents
- Encrypted traffic flows through the established tunnel
- Nginx pod in EKS terminates TLS and routes to internal services
No Customer Data Transmission:
- Tailscale connections are used only for management and troubleshooting
- Query traffic and customer data never flow through Tailscale
- All customer data remains within your VPC
Monitoring and Auditing
Both ClickHouse and customers can audit Tailscale access activity:
- ClickHouse Monitoring: ClickHouse monitors access requests and logs all Tailscale connections
- Customer Auditing: Customers can track activity from ClickHouse engineers within their own systems
- Query Logs: All system table access via Tailscale is logged in ClickHouse's
query_logtable
For more technical details about how Tailscale is implemented in BYOC, see the Building ClickHouse BYOC on AWS blog post.
Network boundaries
This section covers different network traffic to and from the customer BYOC VPC:
- Inbound: Traffic entering the customer BYOC VPC.
- Outbound: Traffic originating from the customer BYOC VPC and sent to an external destination.
- Public: A network endpoint accessible from the public internet.
- Private: A network endpoint accessible only through private connections, such as VPC peering, VPC Private Link, or Tailscale.
Istio ingress is deployed behind an AWS NLB to accept ClickHouse client traffic.
Inbound, Public or Private
The Istio ingress gateway terminates TLS. The certificate, provisioned by CertManager with Let's Encrypt, is stored as a secret within the EKS cluster. Traffic between Istio and ClickHouse is encrypted by AWS since they reside in the same VPC.
By default, ingress is publicly accessible with IP allow list filtering. Customers can configure VPC peering to make it private and disable public connections. We highly recommend setting up an IP filter to restrict access.
Troubleshooting access
Inbound, Private
ClickHouse Cloud engineers require troubleshooting access via Tailscale. They are provisioned with just-in-time certificate-based authentication for BYOC deployments.
Billing scraper
Outbound, Private
The Billing scraper collects billing data from ClickHouse and sends it to an S3 bucket owned by ClickHouse Cloud.
It runs as a sidecar alongside the ClickHouse server container, periodically scraping CPU and memory metrics. Requests within the same region are routed through VPC gateway service endpoints.
Alerts
Outbound, Public
AlertManager is configured to send alerts to ClickHouse Cloud when the customer's ClickHouse cluster is unhealthy.
Metrics and logs are stored within the customer's BYOC VPC. Logs are currently stored locally in EBS. In a future update, they will be stored in LogHouse, a ClickHouse service within the BYOC VPC. Metrics use a Prometheus and Thanos stack, stored locally in the BYOC VPC.
Service state
Outbound, Public
State Exporter sends ClickHouse service state information to an SQS owned by ClickHouse Cloud.