ClickHouse Connect driver API
Passing keyword arguments is recommended for most api methods given the number of possible arguments, most of which are optional.
Methods not documented here are not considered part of the API, and may be removed or changed.
Client Initialization
The clickhouse_connect.driver.client
class provides the primary interface between a Python application and the ClickHouse database server. Use the clickhouse_connect.get_client
function to obtain a Client instance, which accepts the following arguments:
Connection arguments
Parameter | Type | Default | Description |
---|---|---|---|
interface | str | http | Must be http or https. |
host | str | localhost | The hostname or IP address of the ClickHouse server. If not set, localhost will be used. |
port | int | 8123 or 8443 | The ClickHouse HTTP or HTTPS port. If not set will default to 8123, or to 8443 if secure=True or interface=https. |
username | str | default | The ClickHouse user name. If not set, the default ClickHouse user will be used. |
password | str | <empty string> | The password for username. |
database | str | None | The default database for the connection. If not set, ClickHouse Connect will use the default database for username. |
secure | bool | False | Use HTTPS/TLS. This overrides inferred values from the interface or port arguments. |
dsn | str | None | A string in standard DSN (Data Source Name) format. Other connection values (such as host or user) will be extracted from this string if not set otherwise. |
compress | bool or str | True | Enable compression for ClickHouse HTTP inserts and query results. See Additional Options (Compression) |
query_limit | int | 0 (unlimited) | Maximum number of rows to return for any query response. Set this to zero to return unlimited rows. Note that large query limits may result in out of memory exceptions if results are not streamed, as all results are loaded into memory at once. |
query_retries | int | 2 | Maximum number of retries for a query request. Only "retryable" HTTP responses will be retried. command or insert requests are not automatically retried by the driver to prevent unintended duplicate requests. |
connect_timeout | int | 10 | HTTP connection timeout in seconds. |
send_receive_timeout | int | 300 | Send/receive timeout for the HTTP connection in seconds. |
client_name | str | None | client_name prepended to the HTTP User Agent header. Set this to track client queries in the ClickHouse system.query_log. |
pool_mgr | obj | <default PoolManager> | The urllib3 library PoolManager to use. For advanced use cases requiring multiple connection pools to different hosts. |
http_proxy | str | None | HTTP proxy address (equivalent to setting the HTTP_PROXY environment variable). |
https_proxy | str | None | HTTPS proxy address (equivalent to setting the HTTPS_PROXY environment variable). |
apply_server_timezone | bool | True | Use server timezone for timezone aware query results. See Timezone Precedence |
show_clickhouse_errors | bool | True | Include detailed ClickHouse server error messages and exception codes in client exceptions. |
autogenerate_session_id | bool | None | Override the global autogenerate_session_id setting. If True, automatically generate a UUID4 session ID when none is provided. |
proxy_path | str | <empty string> | Optional path prefix to add to the ClickHouse server URL for proxy configurations. |
form_encode_query_params | bool | False | Send query parameters as form-encoded data in the request body instead of URL parameters. Useful for queries with large parameter sets that might exceed URL length limits. |
rename_response_column | str | None | Optional callback function or column name mapping to rename response columns in query results. |
HTTPS/TLS arguments
Parameter | Type | Default | Description |
---|---|---|---|
verify | bool | True | Validate the ClickHouse server TLS/SSL certificate (hostname, expiration, etc.) if using HTTPS/TLS. |
ca_cert | str | None | If verify=True, the file path to Certificate Authority root to validate ClickHouse server certificate, in .pem format. Ignored if verify is False. This is not necessary if the ClickHouse server certificate is a globally trusted root as verified by the operating system. |
client_cert | str | None | File path to a TLS Client certificate in .pem format (for mutual TLS authentication). The file should contain a full certificate chain, including any intermediate certificates. |
client_cert_key | str | None | File path to the private key for the Client Certificate. Required if the private key is not included the Client Certificate key file. |
server_host_name | str | None | The ClickHouse server hostname as identified by the CN or SNI of its TLS certificate. Set this to avoid SSL errors when connecting through a proxy or tunnel with a different hostname |
tls_mode | str | None | Controls advanced TLS behavior. proxy and strict do not invoke ClickHouse mutual TLS connection, but do send client cert and key. mutual assumes ClickHouse mutual TLS auth with a client certificate. None/default behavior is mutual |
Settings argument
Finally, the settings
argument to get_client
is used to pass additional ClickHouse settings to the server for each client request. Note that in most cases, users with readonly=1 access cannot alter settings sent with a query, so ClickHouse Connect will drop such settings in the final request and log a warning. The following settings apply only to HTTP queries/sessions used by ClickHouse Connect, and are not documented as general ClickHouse settings.
Setting | Description |
---|---|
buffer_size | Buffer size (in bytes) used by the ClickHouse server before writing to the HTTP channel. |
session_id | A unique session ID to associate related queries on the server. Required for temporary tables. |
compress | Whether the ClickHouse server should compress the POST response data. This setting should only be used for "raw" queries. |
decompress | Whether the data sent to the ClickHouse server must be decompressed. This setting should only be used for "raw" inserts. |
quota_key | The quota key associated with this request. See the ClickHouse server documentation on quotas. |
session_check | Used to check the session status. |
session_timeout | Number of seconds of inactivity before the session identified by the session ID will time out and no longer be considered valid. Defaults to 60 seconds. |
wait_end_of_query | Buffers the entire response on the ClickHouse server. This setting is required to return summary information, and is set automatically on non-streaming queries. |
role | ClickHouse role to be used for the session. Valid transport setting that can be included in query context. |
For other ClickHouse settings that can be sent with each query, see the ClickHouse documentation.
Client creation examples
- Without any parameters, a ClickHouse Connect client will connect to the default HTTP port on
localhost
with the default user and no password:
- Connecting to a secure (HTTPS) external ClickHouse server
- Connecting with a session ID and other custom connection parameters and ClickHouse settings.
Client Lifecycle and Best Practices
Creating a ClickHouse Connect client is an expensive operation that involves establishing a connection, retrieving server metadata, and initializing settings. Follow these best practices for optimal performance:
Core principles
- Reuse clients: Create clients once at application startup and reuse them throughout the application lifetime
- Avoid frequent creation: Don't create a new client for each query or request (this wastes hundreds of milliseconds per operation)
- Clean up properly: Always close clients when shutting down to release connection pool resources
- Share when possible: A single client can handle many concurrent queries through its connection pool (see threading notes below)
Basic patterns
✅ Good: Reuse a single client
❌ Bad: Creating clients repeatedly
Multi-threaded applications
Client instances are NOT thread-safe when using session IDs. By default, clients have an auto-generated session ID, and concurrent queries within the same session will raise a ProgrammingError
.
To share a client across threads safely:
Alternative for sessions: If you need sessions (e.g., for temporary tables), create a separate client per thread:
Proper cleanup
Always close clients at shutdown. Note that client.close()
disposes the client and closes pooled HTTP connections only when the client owns its pool manager (for example, when created with custom TLS/proxy options). For the default shared pool, use client.close_connections()
to proactively clear sockets; otherwise, connections are reclaimed automatically via idle expiration and at process exit.
Or use a context manager:
When to use multiple clients
Multiple clients are appropriate for:
- Different servers: One client per ClickHouse server or cluster
- Different credentials: Separate clients for different users or access levels
- Different databases: When you need to work with multiple databases
- Isolated sessions: When you need separate sessions for temporary tables or session-specific settings
- Per-thread isolation: When threads need independent sessions (as shown above)
Common method arguments
Several client methods use one or both of the common parameters
and settings
arguments. These keyword arguments are described below.
Parameters argument
ClickHouse Connect Client query*
and command
methods accept an optional parameters
keyword argument used for binding Python expressions to a ClickHouse value expression. Two sorts of binding are available.
Server-side binding
ClickHouse supports server side binding for most query values, where the bound value is sent separate from the query as an HTTP query parameter. ClickHouse Connect will add the appropriate query parameters if it detects a binding expression of the form {<name>:<datatype>}
. For server side binding, the parameters
argument should be a Python dictionary.
- Server-side binding with Python dictionary, DateTime value, and string value
This generates the following query on the server:
Server-side binding is only supported (by the ClickHouse server) for SELECT
queries. It does not work for ALTER
, DELETE
, INSERT
, or other types of queries. This may change in the future; see https://github.com/ClickHouse/ClickHouse/issues/42092.
Client-side binding
ClickHouse Connect also supports client-side parameter binding, which can allow more flexibility in generating templated SQL queries. For client-side binding, the parameters
argument should be a dictionary or a sequence. Client-side binding uses the Python "printf" style string formatting for parameter substitution.
Note that unlike server-side binding, client-side binding does not work for database identifiers such as database, table, or column names, since Python-style formatting cannot distinguish between the different types of strings, and they need to be formatted differently (backticks or double quotes for database identifiers, single quotes for data values).
- Example with Python Dictionary, DateTime value and string escaping
This generates the following query on the server:
- Example with Python Sequence (Tuple), Float64, and IPv4Address
This generates the following query on the server:
To bind DateTime64 arguments (ClickHouse types with sub-second precision) requires one of two custom approaches:
- Wrap the Python
datetime.datetime
value in the new DT64Param class, e.g.- If using a dictionary of parameter values, append the string
_64
to the parameter name
- If using a dictionary of parameter values, append the string
Settings argument
All the key ClickHouse Connect Client "insert" and "select" methods accept an optional settings
keyword argument to pass ClickHouse server user settings for the included SQL statement. The settings
argument should be a dictionary. Each item should be a ClickHouse setting name and its associated value. Note that values will be converted to strings when sent to the server as query parameters.
As with client level settings, ClickHouse Connect will drop any settings that the server marks as readonly=1, with an associated log message. Settings that apply only to queries via the ClickHouse HTTP interface are always valid. Those settings are described under the get_client
API.
Example of using ClickHouse settings:
Client command
Method
Use the Client.command
method to send SQL queries to the ClickHouse server that do not normally return data or that return a single primitive or array value rather than a full dataset. This method takes the following parameters:
Parameter | Type | Default | Description |
---|---|---|---|
cmd | str | Required | A ClickHouse SQL statement that returns a single value or a single row of values. |
parameters | dict or iterable | None | See parameters description. |
data | str or bytes | None | Optional data to include with the command as the POST body. |
settings | dict | None | See settings description. |
use_database | bool | True | Use the client database (specified when creating the client). False means the command will use the default ClickHouse server database for the connected user. |
external_data | ExternalData | None | An ExternalData object containing file or binary data to use with the query. See Advanced Queries (External Data) |
Command examples
DDL statements
Simple queries returning single values
Commands with parameters
Commands with settings
Client query
Method
The Client.query
method is the primary way to retrieve a single "batch" dataset from the ClickHouse server. It utilizes the Native ClickHouse format over HTTP to transmit large datasets (up to approximately one million rows) efficiently. This method takes the following parameters:
Parameter | Type | Default | Description |
---|---|---|---|
query | str | Required | The ClickHouse SQL SELECT or DESCRIBE query. |
parameters | dict or iterable | None | See parameters description. |
settings | dict | None | See settings description. |
query_formats | dict | None | Datatype formatting specification for result values. See Advanced Usage (Read Formats) |
column_formats | dict | None | Datatype formatting per column. See Advanced Usage (Read Formats) |
encoding | str | None | Encoding used to encode ClickHouse String columns into Python strings. Python defaults to UTF-8 if not set. |
use_none | bool | True | Use Python None type for ClickHouse nulls. If False, use a datatype default (such as 0) for ClickHouse nulls. Note - defaults to False for NumPy/Pandas for performance reasons. |
column_oriented | bool | False | Return the results as a sequence of columns rather than a sequence of rows. Helpful for transforming Python data to other column oriented data formats. |
query_tz | str | None | A timezone name from the zoneinfo database. This timezone will be applied to all datetime or Pandas Timestamp objects returned by the query. |
column_tzs | dict | None | A dictionary of column name to timezone name. Like query_tz , but allows specifying different timezones for different columns. |
use_extended_dtypes | bool | True | Use Pandas extended dtypes (like StringArray), and pandas.NA and pandas.NaT for ClickHouse NULL values. Applies only to query_df and query_df_stream methods. |
external_data | ExternalData | None | An ExternalData object containing file or binary data to use with the query. See Advanced Queries (External Data) |
context | QueryContext | None | A reusable QueryContext object can be used to encapsulate the above method arguments. See Advanced Queries (QueryContexts) |
Query examples
Basic query
Accessing query results
Query with client-side parameters
Query with server-side parameters
Query with settings
The QueryResult
object
The base query
method returns a QueryResult
object with the following public properties:
result_rows
-- A matrix of the data returned in the form of a Sequence of rows, with each row element being a sequence of column values.result_columns
-- A matrix of the data returned in the form of a Sequence of columns, with each column element being a sequence of the row values for that columncolumn_names
-- A tuple of strings representing the column names in theresult_set
column_types
-- A tuple of ClickHouseType instances representing the ClickHouse data type for each column in theresult_columns
query_id
-- The ClickHouse query_id (useful for examining the query in thesystem.query_log
table)summary
-- Any data returned by theX-ClickHouse-Summary
HTTP response headerfirst_item
-- A convenience property for retrieving the first row of the response as a dictionary (keys are column names)first_row
-- A convenience property to return the first row of the resultcolumn_block_stream
-- A generator of query results in column oriented format. This property should not be referenced directly (see below).row_block_stream
-- A generator of query results in row oriented format. This property should not be referenced directly (see below).rows_stream
-- A generator of query results that yields a single row per invocation. This property should not be referenced directly (see below).summary
-- As described under thecommand
method, a dictionary of summary information returned by ClickHouse
The *_stream
properties return a Python Context that can be used as an iterator for the returned data. They should only be accessed indirectly using the Client *_stream
methods.
The complete details of streaming query results (using StreamContext objects) are outlined in Advanced Queries (Streaming Queries).
Consuming query results with NumPy, Pandas or Arrow
ClickHouse Connect provides specialized query methods for NumPy, Pandas, and Arrow data formats. For detailed information on using these methods, including examples, streaming capabilities, and advanced type handling, see Advanced Querying (NumPy, Pandas and Arrow Queries).
Client streaming query methods
For streaming large result sets, ClickHouse Connect provides multiple streaming methods. See Advanced Queries (Streaming Queries) for details and examples.
Client insert
Method
For the common use case of inserting multiple records into ClickHouse, there is the Client.insert
method. It takes the following parameters:
Parameter | Type | Default | Description |
---|---|---|---|
table | str | Required | The ClickHouse table to insert into. The full table name (including database) is permitted. |
data | Sequence of Sequences | Required | The matrix of data to insert, either a Sequence of rows, each of which is a sequence of column values, or a Sequence of columns, each of which is a sequence of row values. |
column_names | Sequence of str, or str | '*' | A list of column_names for the data matrix. If '*' is used instead, ClickHouse Connect will execute a "pre-query" to retrieve all of the column names for the table. |
database | str | '' | The target database of the insert. If not specified, the database for the client will be assumed. |
column_types | Sequence of ClickHouseType | None | A list of ClickHouseType instances. If neither column_types or column_type_names is specified, ClickHouse Connect will execute a "pre-query" to retrieve all the column types for the table. |
column_type_names | Sequence of ClickHouse type names | None | A list of ClickHouse datatype names. If neither column_types or column_type_names is specified, ClickHouse Connect will execute a "pre-query" to retrieve all the column types for the table. |
column_oriented | bool | False | If True, the data argument is assumed to be a Sequence of columns (and no "pivot" will be necessary to insert the data). Otherwise data is interpreted as a Sequence of rows. |
settings | dict | None | See settings description. |
context | InsertContext | None | A reusable InsertContext object can be used to encapsulate the above method arguments. See Advanced Inserts (InsertContexts) |
transport_settings | dict | None | Optional dictionary of transport-level settings (HTTP headers, etc.) |
This method returns a "query summary" dictionary as described under the "command" method. An exception will be raised if the insert fails for any reason.
For specialized insert methods that work with Pandas DataFrames, PyArrow Tables, and Arrow-backed DataFrames, see Advanced Inserting (Specialized Insert Methods).
A NumPy array is a valid Sequence of Sequences and can be used as the data
argument to the main insert
method, so a specialized method is not required.
Examples
The examples below assume an existing table users
with schema (id UInt32, name String, age UInt8)
.
Basic row-oriented insert
Column-oriented insert
Insert with explicit column types
Insert into specific database
File Inserts
For inserting data directly from files into ClickHouse tables, see Advanced Inserting (File Inserts).
Raw API
For advanced use cases requiring direct access to ClickHouse HTTP interfaces without type transformations, see Advanced Usage (Raw API).
Utility classes and functions
The following classes and functions are also considered part of the "public" clickhouse-connect
API and are, like the classes and methods documented above, stable across minor releases. Breaking changes to these classes and functions will only occur with a minor (not patch) release and will be available with a deprecated status for at least one minor release.
Exceptions
All custom exceptions (including those defined in the DB API 2.0 specification) are defined in the clickhouse_connect.driver.exceptions
module. Exceptions actually detected by the driver will use one of these types.
ClickHouse SQL utilities
The functions and the DT64Param class in the clickhouse_connect.driver.binding
module can be used to properly build and escape ClickHouse SQL queries. Similarly, the functions in the clickhouse_connect.driver.parser
module can be used to parse ClickHouse datatype names.
Multithreaded, multiprocess, and async/event driven use cases
For information on using ClickHouse Connect in multithreaded, multiprocess, and async/event-driven applications, see Advanced Usage (Multithreaded, multiprocess, and async/event driven use cases).
AsyncClient wrapper
For information on using the AsyncClient wrapper for asyncio environments, see Advanced Usage (AsyncClient wrapper).
Managing ClickHouse Session IDs
For information on managing ClickHouse session IDs in multi-threaded or concurrent applications, see Advanced Usage (Managing ClickHouse Session IDs).
Customizing the HTTP connection pool
For information on customizing the HTTP connection pool for large multi-threaded applications, see Advanced Usage (Customizing the HTTP connection pool).