HTTP Interface
Prerequisites
For the examples in this article you will need:
- to have a running instance of ClickHouse server
- have
curl
installed. On Ubuntu or Debian, runsudo apt install curl
or refer to this documentation for installation instructions.
Overview
The HTTP interface lets you use ClickHouse on any platform from any programming language in the form of a REST API. The HTTP interface is more limited than the native interface, but it has better language support.
By default, clickhouse-server
listens on the following ports:
- port 8123 for HTTP
- port 8443 for HTTPS can be enabled
If you make a GET /
request without any parameters, a 200 response code is returned along with the string "Ok.":
"Ok." is the default value defined in http_server_default_response
and can be changed if desired.
Also see: HTTP response codes caveats.
Web user interface
ClickHouse includes a web user interface, which can be accessed from the following address:
The web UI supports displaying progress during query runtime, query cancellation, and result streaming. It has a secret feature for displaying charts and graphs for query pipelines.
The web UI is designed for professionals like you.
In health-check scripts use the GET /ping
request. This handler always returns "Ok." (with a line feed at the end). Available from version 18.12.13. See also /replicas_status
to check replica's delay.
Querying over HTTP/HTTPS
To query over HTTP/HTTPS there are three options:
- send the request as a URL 'query' parameter
- using the POST method.
- Send the beginning of the query in the 'query' parameter, and the rest using POST
The size of the URL is limited to 1 MiB by default, this can be changed with the http_max_uri_size
setting.
If successful, you receive the 200 response code and the result in the response body. If an error occurs, you receive the 500 response code and an error description text in the response body.
Requests using GET are 'readonly'. This means that for queries that modify data, you can only use the POST method. You can send the query itself either in the POST body or in the URL parameter. Let's look at some examples.
In the example below curl is used to send the query SELECT 1
. Note the use of URL encoding for the space: %20
.
In this example wget is used with the -nv
(non-verbose) and -O-
parameters to output the result to the terminal.
In this case it is not necessary to use URL encoding for the space:
In this example we pipe a raw HTTP request into netcat:
As you can see, the curl
command is somewhat inconvenient in that spaces must be URL escaped.
Although wget
escapes everything itself, we do not recommend using it because it does not work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.
If part of the query is sent in the parameter, and part in the POST, a line feed is inserted between these two data parts. For example, this won't work:
By default, data is returned in the TabSeparated
format.
The FORMAT
clause is used in the query to request any other format. For example:
You can use the default_format
URL parameter or the X-ClickHouse-Format
header to specify a default format other than TabSeparated
.
Insert queries over HTTP/HTTPS
The POST
method of transmitting data is necessary for INSERT
queries. In this case, you can write the beginning of the query in the URL parameter, and use POST to pass the data to insert. The data to insert could be, for example, a tab-separated dump from MySQL. In this way, the INSERT
query replaces LOAD DATA LOCAL INFILE
from MySQL.
Examples
To create a table:
To use the familiar INSERT
query for data insertion:
To send data separately from the query:
Any data format can be specified. For example, the 'Values' format, the same format as the one used when writing INSERT INTO t VALUES
, can be specified:
To insert data from a tab-separated dump, specify the corresponding format:
To read the table contents:
Data is output in a random order due to parallel query processing
To delete the table:
For successful requests that do not return a data table, an empty response body is returned.
Compression
Compression can be used to reduce network traffic when transmitting a large amount of data, or for creating dumps that are immediately compressed.
You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you need the clickhouse-compressor
program to work with it. It is installed by default with the clickhouse-client
package.
To increase the efficiency of data insertion, disable server-side checksum verification by using the http_native_compression_disable_checksumming_on_decompress
setting.
If you specify compress=1
in the URL, the server will compress the data it sends to you. If you specify decompress=1
in the URL, the server will decompress the data which you pass in the POST
method.
You can also choose to use HTTP compression. ClickHouse supports the following compression methods:
gzip
br
deflate
xz
zstd
lz4
bz2
snappy
To send a compressed POST
request, append the request header Content-Encoding: compression_method
.
In order for ClickHouse to compress the response, enable compression with the enable_http_compression
setting and append the Accept-Encoding: compression_method
header to the request.
You can configure the data compression level using the http_zlib_compression_level
setting for all compression methods.
Some HTTP clients might decompress data from the server by default (with gzip
and deflate
) and you might get decompressed data even if you use the compression settings correctly.
Examples
To send compressed data to the server:
To receive the compressed data archive from the server:
To receive compressed data from the server, using gunzip to receive decompressed data:
Default Database
You can use the database
URL parameter or the X-ClickHouse-Database
header to specify the default database.
By default, the database that is registered in the server settings is used as the default database. Out of the box, this is the database called default
. Alternatively, you can always specify the database using a dot before the table name.
Authentication
The username and password can be indicated in one of three ways:
- Using HTTP Basic Authentication.
For example:
- In the
user
andpassword
URL parameters
We do not recommend using this method as the parameter might be logged by web proxy and cached in the browser
For example:
- Using the 'X-ClickHouse-User' and 'X-ClickHouse-Key' headers
For example:
If the user name is not specified, then the default
name is used. If the password is not specified, then an empty password is used.
You can also use the URL parameters to specify any settings for processing a single query or entire profiles of settings.
For example:
For more information see:
Using ClickHouse sessions in the HTTP protocol
You can also use ClickHouse sessions in the HTTP protocol. To do this, you need to add the session_id
GET
parameter to the request. You can use any string as the session ID.
By default, the session is terminated after 60 seconds of inactivity. To change this timeout (in seconds), modify the default_session_timeout
setting in the server configuration, or add the session_timeout
GET
parameter to the request.
To check the session status, use the session_check=1
parameter. Only one query at a time can be executed within a single session.
You can receive information about the progress of a query in the X-ClickHouse-Progress
response headers. To do this, enable send_progress_in_http_headers
.
Below is an example of the header sequence:
The possible header fields can be:
Header field | Description |
---|---|
read_rows | Number of rows read. |
read_bytes | Volume of data read in bytes. |
total_rows_to_read | Total number of rows to be read. |
written_rows | Number of rows written. |
written_bytes | Volume of data written in bytes. |
Running requests do not stop automatically if the HTTP connection is lost. Parsing and data formatting are performed on the server-side, and using the network might be ineffective.
The following optional parameters exist:
Parameters | Description |
---|---|
query_id (optional) | Can be passed as the query ID (any string). replace_running_query |
quota_key (optional) | Can be passed as the quota key (any string). "Quotas" |
The HTTP interface allows passing external data (external temporary tables) for querying. For more information, see "External data for query processing".
Response Buffering
Response buffering can be enabled on the server-side. The following URL parameters are provided for this purpose:
buffer_size
wait_end_of_query
The following settings can be used:
buffer_size
determines the number of bytes in the result to buffer in the server memory. If a result body is larger than this threshold, the buffer is written to the HTTP channel, and the remaining data is sent directly to the HTTP channel.
To ensure that the entire response is buffered, set wait_end_of_query=1
. In this case, the data that is not stored in memory will be buffered in a temporary server file.
For example:
Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client-side, the error can only be detected at the parsing stage.
Setting a role with query parameters
This feature was added in ClickHouse 24.4.
In specific scenarios, setting the granted role first might be required before executing the statement itself.
However, it is not possible to send SET ROLE
and the statement together, as multi-statements are not allowed:
The command above results in an error:
To overcome this limitation, use the role
query parameter instead:
This is the equivalent of executing SET ROLE my_role
before the statement.
Additionally, it is possible to specify multiple role
query parameters:
In this case, ?role=my_role&role=my_other_role
works similarly to executing SET ROLE my_role, my_other_role
before the statement.
HTTP response codes caveats
Because of limitations of the HTTP protocol, a HTTP 200 response code does not guarantee that a query was successful.
Here is an example:
The reason for this behavior is the nature of the HTTP protocol. The HTTP header is sent first with an HTTP code of 200, followed by the HTTP body, and then the error is injected into the body as plain text.
This behavior is independent of the format used, whether it's Native
, TSV
, or JSON
; the error message will always be in the middle of the response stream.
You can mitigate this problem by enabling wait_end_of_query=1
(Response Buffering). In this case, sending of the HTTP header is delayed until the entire query is resolved. This however, does not completely solve the problem because the result must still fit within the http_response_buffer_size
, and other settings like send_progress_in_http_headers
can interfere with the delay of the header.
The only way to catch all errors is to analyze the HTTP body before parsing it using the required format.
Queries with Parameters
You can create a query with parameters and pass values for them from the corresponding HTTP request parameters. For more information, see Queries with Parameters for CLI.
Example
Tabs in URL Parameters
Query parameters are parsed from the "escaped" format. This has some benefits, such as the possibility to unambiguously parse nulls as \N
. This means the tab character should be encoded as \t
(or \
and a tab). For example, the following contains an actual tab between abc
and 123
and the input string is split into two values:
However, if you try to encode an actual tab using %09
in a URL parameter, it won't get parsed properly:
If you are using URL parameters, you will need to encode the \t
as %5C%09
. For example:
Predefined HTTP Interface
ClickHouse supports specific queries through the HTTP interface. For example, you can write data to a table as follows:
ClickHouse also supports a Predefined HTTP Interface which can help you more easily integrate with third-party tools like Prometheus exporter. Let's look at an example.
First of all, add this section to your server configuration file.
http_handlers
is configured to contain multiple rule
. ClickHouse will match the HTTP requests received to the predefined type in rule
and the first rule matched runs the handler. Then ClickHouse will execute the corresponding predefined query if the match is successful.
You can now request the URL directly for data in the Prometheus format:
Configuration options for http_handlers
work as follows.
rule
can configure the following parameters:
method
headers
url
handler
Each of these are discussed below:
-
method
is responsible for matching the method part of the HTTP request.method
fully conforms to the definition of [method
]
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods) in the HTTP protocol. It is an optional configuration. If it is not defined in the
configuration file, it does not match the method portion of the HTTP request. -
url
is responsible for matching the URL part of the HTTP request. It is compatible with RE2's regular expressions. It is an optional configuration. If it is not defined in the configuration file, it does not match the URL portion of the HTTP request. -
headers
are responsible for matching the header part of the HTTP request. It is compatible with RE2's regular expressions. It is an optional configuration. If it is not defined in the configuration file, it does not match the header portion of the HTTP request. -
handler
contains the main processing part. Nowhandler
can configuretype
,status
,content_type
,http_response_headers
,response_content
,query
,query_param_name
.type
currently supports three types:predefined_query_handler
,dynamic_query_handler
,static
.query
— use withpredefined_query_handler
type, executes query when the handler is called.query_param_name
— use withdynamic_query_handler
type, extracts and executes the value corresponding to thequery_param_name
value in HTTP request parameters.status
— use withstatic
type, response status code.content_type
— use with any type, response content-type.http_response_headers
— use with any type, response headers map. Could be used to set content type as well.response_content
— use withstatic
type, response content sent to client, when using the prefix 'file://' or 'config://', find the content from the file or configuration sends to client.
The configuration methods for different type
s are discussed next.
predefined_query_handler
predefined_query_handler
supports setting Settings
and query_params
values. You can configure query
in the type of predefined_query_handler
.
query
value is a predefined query of predefined_query_handler
, which is executed by ClickHouse when an HTTP request is matched and the result of the query is returned. It is a must configuration.
The following example defines the values of max_threads
and max_final_threads
settings, then queries the system table to check whether these settings were set successfully.
To keep the default handlers
such as query
, play
, ping
, add the <defaults/>
rule.
For example:
In one predefined_query_handler
only one query
is supported.
dynamic_query_handler
In dynamic_query_handler
, the query is written in the form of parameter of the HTTP request. The difference is that in predefined_query_handler
, the query is written in the configuration file. query_param_name
can be configured in dynamic_query_handler
.
ClickHouse extracts and executes the value corresponding to the query_param_name
value in the URL of the HTTP request. The default value of query_param_name
is /query
. It is an optional configuration. If there is no definition in the configuration file, the parameter is not passed in.
To experiment with this functionality, the following example defines the values of max_threads
and max_final_threads
and queries
whether the settings were set successfully.
Example:
static
static
can return content_type
, status and response_content
. response_content
can return the specified content.
For example, to return a message "Say Hi!":
http_response_headers
could be used to set the content type instead of content_type
.
Find the content from the configuration send to client.
To find the content from the file send to client:
HTTP Response Headers
ClickHouse allows you to configure custom HTTP response headers that can be applied to any kind of handler that can be configured. These headers can be set using the http_response_headers
setting, which accepts key-value pairs representing header names and their values. This feature is particularly useful for implementing custom security headers, CORS policies, or any other HTTP header requirements across your ClickHouse HTTP interface.
For example, you can configure headers for:
- Regular query endpoints
- Web UI
- Health check.
It is also possible to specify common_http_response_headers
. These will be applied to all http handlers defined in the configuration.
The headers will be included in the HTTP response for every configured handler.
In the example below, every server response will contain two custom headers: X-My-Common-Header
and X-My-Custom-Header
.
Valid JSON/XML response on exception during HTTP streaming
While query execution occurs over HTTP an exception can happen when part of the data has already been sent. Usually an exception is sent to the client in plain text.
Even if some specific data format was used to output data and the output may become invalid in terms of specified data format.
To prevent it, you can use setting http_write_exception_in_output_format
(enabled by default) that will tell ClickHouse to write an exception in specified format (currently supported for XML and JSON* formats).
Examples: