Перейти к основному содержанию
Перейти к основному содержанию

Java-клиент

Клиентская библиотека Java для связи с сервером БД через его протоколы. Текущая реализация поддерживает только HTTP-интерфейс. Библиотека предоставляет собственный API для отправки запросов на сервер. Библиотека также предоставляет инструменты для работы с различными форматами бинарных данных (RowBinary* и Native*).

Установка


<dependency>
    <groupId>com.clickhouse</groupId>
    <artifactId>client-v2</artifactId>
    <version>0.9.4</version>
</dependency>

Инициализация

Объект Client инициализируется методом com.clickhouse.client.api.Client.Builder#build(). Каждый клиент имеет свой собственный контекст, и объекты не разделяются между ними. Builder имеет методы конфигурации для удобной настройки.

Пример:

 Client client = new Client.Builder()
                .addEndpoint("https://clickhouse-cloud-instance:8443/")
                .setUsername(user)
                .setPassword(password)
                .build();

Client реализует AutoCloseable и должен быть закрыт, когда больше не нужен.

Аутентификация

Аутентификация настраивается для каждого клиента на этапе инициализации. Поддерживаются три метода аутентификации: по паролю, по токену доступа, по SSL-сертификату клиента.

Аутентификация по паролю требует установки имени пользователя и пароля путем вызова setUsername(String) и setPassword(String):

 Client client = new Client.Builder()
        .addEndpoint("https://clickhouse-cloud-instance:8443/")
        .setUsername(user)
        .setPassword(password)
        .build();

Аутентификация по токену доступа требует установки токена доступа путем вызова setAccessToken(String):

 Client client = new Client.Builder()
        .addEndpoint("https://clickhouse-cloud-instance:8443/")
        .setAccessToken(userAccessToken)
        .build();

Аутентификация по SSL-сертификату клиента требует установки имени пользователя, включения SSL-аутентификации, установки клиентского сертификата и клиентского ключа путем вызова методов setUsername(String), useSSLAuthentication(boolean), setClientCertificate(String) и setClientKey(String) соответственно:

Client client = new Client.Builder()
        .useSSLAuthentication(true)
        .setUsername("some_user")
        .setClientCertificate("some_user.crt")
        .setClientKey("some_user.key")
Примечание

SSL-аутентификацию может быть сложно диагностировать в продакшене, потому что многие ошибки из SSL-библиотек не предоставляют достаточно информации. Например, если клиентский сертификат и ключ не совпадают, сервер немедленно завершит соединение (в случае HTTP это будет этап инициации соединения, где HTTP-запросы еще не отправляются, поэтому ответ не отправляется).

Пожалуйста, используйте инструменты вроде openssl для проверки сертификатов и ключей:

  • проверить целостность ключа: openssl rsa -in [key-file.key] -check -noout
  • проверить, что клиентский сертификат имеет соответствующий CN для пользователя:
    • получить CN из пользовательского сертификата - openssl x509 -noout -subject -in [user.cert]
    • проверить, что то же значение установлено в базе данных select name, auth_type, auth_params from system.users where auth_type = 'ssl_certificate' (запрос выведет auth_params с чем-то вроде {"common_names":["some_user"]})

Конфигурация

Все настройки определяются методами экземпляра (также известными как методы конфигурации), которые делают область действия и контекст каждого значения понятными. Основные параметры конфигурации определены в одной области (клиент или операция) и не переопределяют друг друга.

Конфигурация определяется при создании клиента. См. com.clickhouse.client.api.Client.Builder.

Конфигурация клиента

МетодАргументыОписаниеПо умолчаниюКлюч
addEndpoint(String endpoint)endpoint - адрес сервера в формате URLДобавляет конечную точку сервера в список доступных серверов. В настоящее время поддерживается только одна конечная точка.nonenone
addEndpoint(Protocol protocol, String host, int port, boolean secure)protocol - протокол подключения
host - IP или имя хоста
secure - использовать HTTPS
Добавляет конечную точку сервера в список доступных серверов. В настоящее время поддерживается только одна конечная точка.nonenone
enableConnectionPool(boolean enable)enable - флаг для включения/отключенияУстанавливает, включен ли пул соединенийtrueconnection_pool_enabled
setMaxConnections(int maxConnections)maxConnections - количество соединенийУстанавливает, сколько соединений может открыть клиент к каждой конечной точке сервера.10max_open_connections
setConnectionTTL(long timeout, ChronoUnit unit)timeout - значение тайм-аута
unit - единица времени
Устанавливает TTL соединения, после которого соединение будет считаться неактивным-1connection_ttl
setKeepAliveTimeout(long timeout, ChronoUnit unit)timeout - значение тайм-аута
unit - единица времени
Устанавливает тайм-аут keep-alive HTTP-соединения. Установите 0, чтобы отключить Keep-Alive.-http_keep_alive_timeout
setConnectionReuseStrategy(ConnectionReuseStrategy strategy)strategy - LIFO или FIFOВыбирает стратегию, которую должен использовать пул соединенийFIFOconnection_reuse_strategy
setDefaultDatabase(String database)database - имя базы данныхУстанавливает базу данных по умолчанию.defaultdatabase

Server Settings

Server side settings can be set on the client level once while creation (see serverSetting method of the Builder) and on operation level (see serverSetting for operation settings class).

 try (Client client = new Client.Builder().addEndpoint(Protocol.HTTP, "localhost", mockServer.port(), false)
        .setUsername("default")
        .setPassword(ClickHouseServerForTest.getPassword())
        .compressClientRequest(true)

        // Client level
        .serverSetting("max_threads", "10")
        .serverSetting("async_insert", "1")
        .serverSetting("roles", Arrays.asList("role1", "role2"))

        .build()) {

	// Operation level
	QuerySettings querySettings = new QuerySettings();
	querySettings.serverSetting("session_timezone", "Europe/Zurich");

	...
}

⚠️ When options are set via setOption method (either the Client.Builder or operation settings class) then server settings name should be prefixed with clickhouse_setting_. The com.clickhouse.client.api.ClientConfigProperties#serverSetting() may be handy in this case.

Custom HTTP Header

Custom HTTP headers can be set for all operations (client level) or a single one (operation level).


QuerySettings settings = new QuerySettings()
    .httpHeader(HttpHeaders.REFERER, clientReferer)
    .setQueryId(qId);

When options are set via setOption method (either the Client.Builder or operation settings class) then custom header name should be prefixed with http_header_. Method com.clickhouse.client.api.ClientConfigProperties#httpHeader() may be handy in this case.

Общие определения

ClickHouseFormat

Перечисление поддерживаемых форматов. Включает все форматы, которые поддерживает ClickHouse.

  • raw - пользователь должен перекодировать сырые данные
  • full - клиент может перекодировать данные самостоятельно и принимает поток сырых данных
  • - - операция не поддерживается ClickHouse для этого формата

Эта версия клиента поддерживает:

FormatInputOutput
TabSeparatedrawraw
TabSeparatedRawrawraw
TabSeparatedWithNamesrawraw
TabSeparatedWithNamesAndTypesrawraw
TabSeparatedRawWithNamesrawraw
TabSeparatedRawWithNamesAndTypesrawraw
Templaterawraw
TemplateIgnoreSpacesraw-
CSVrawraw
CSVWithNamesrawraw
CSVWithNamesAndTypesrawraw
CustomSeparatedrawraw
CustomSeparatedWithNamesrawraw
CustomSeparatedWithNamesAndTypesrawraw
SQLInsert-raw
Valuesrawraw
Vertical-raw
JSONrawraw
JSONAsStringraw-
JSONAsObjectraw-
JSONStringsrawraw
JSONColumnsrawraw
JSONColumnsWithMetadatarawraw
JSONCompactrawraw
JSONCompactStrings-raw
JSONCompactColumnsrawraw
JSONEachRowrawraw
PrettyJSONEachRow-raw
JSONEachRowWithProgress-raw
JSONStringsEachRowrawraw
JSONStringsEachRowWithProgress-raw
JSONCompactEachRowrawraw
JSONCompactEachRowWithNamesrawraw
JSONCompactEachRowWithNamesAndTypesrawraw
JSONCompactStringsEachRowrawraw
JSONCompactStringsEachRowWithNamesrawraw
JSONCompactStringsEachRowWithNamesAndTypesrawraw
JSONObjectEachRowrawraw
BSONEachRowrawraw
TSKVrawraw
Pretty-raw
PrettyNoEscapes-raw
PrettyMonoBlock-raw
PrettyNoEscapesMonoBlock-raw
PrettyCompact-raw
PrettyCompactNoEscapes-raw
PrettyCompactMonoBlock-raw
PrettyCompactNoEscapesMonoBlock-raw
PrettySpace-raw
PrettySpaceNoEscapes-raw
PrettySpaceMonoBlock-raw
PrettySpaceNoEscapesMonoBlock-raw
Prometheus-raw
Protobufrawraw
ProtobufSinglerawraw
ProtobufListrawraw
Avrorawraw
AvroConfluentraw-
Parquetrawraw
ParquetMetadataraw-
Arrowrawraw
ArrowStreamrawraw
ORCrawraw
Oneraw-
Npyrawraw
RowBinaryfullfull
RowBinaryWithNamesfullfull
RowBinaryWithNamesAndTypesfullfull
RowBinaryWithDefaultsfull-
Nativefullraw
Null-raw
XML-raw
CapnProtorawraw
LineAsStringrawraw
Regexpraw-
RawBLOBrawraw
MsgPackrawraw
MySQLDumpraw-
DWARFraw-
Markdown-raw
Formraw-

API для вставки

insert(String tableName, InputStream data, ClickHouseFormat format)

Принимает данные как InputStream байтов в указанном формате. Ожидается, что data закодированы в format.

Сигнатуры

CompletableFuture<InsertResponse> insert(String tableName, InputStream data, ClickHouseFormat format, InsertSettings settings)
CompletableFuture<InsertResponse> insert(String tableName, InputStream data, ClickHouseFormat format)

Параметры

tableName - имя целевой таблицы.

data - входной поток закодированных данных.

format - формат, в котором закодированы данные.

settings - настройки запроса.

Возвращаемое значение

Future типа InsertResponse - результат операции и дополнительная информация, такая как метрики на стороне сервера.

Примеры

try (InputStream dataStream = getDataStream()) {
    try (InsertResponse response = client.insert(TABLE_NAME, dataStream, ClickHouseFormat.JSONEachRow,
            insertSettings).get(3, TimeUnit.SECONDS)) {

        log.info("Insert finished: {} rows written", response.getMetrics().getMetric(ServerMetrics.NUM_ROWS_WRITTEN).getLong());
    } catch (Exception e) {
        log.error("Failed to write JSONEachRow data", e);
        throw new RuntimeException(e);
    }
}

insert(String tableName, List<?> data, InsertSettings settings)

Sends a write request to database. The list of objects is converted into an efficient format and then is sent to a server. The class of the list items should be registered up-front using register(Class, TableSchema) method.

Signatures

client.insert(String tableName, List<?> data, InsertSettings settings)
client.insert(String tableName, List<?> data)

Parameters

tableName - name of the target table.

data - collection DTO (Data Transfer Object) objects.

settings - request settings.

Return value

Future of InsertResponse type - the result of the operation and additional information like server side metrics.

Examples

// Important step (done once) - register class to pre-compile object serializer according to the table schema.
client.register(ArticleViewEvent.class, client.getTableSchema(TABLE_NAME));

List<ArticleViewEvent> events = loadBatch();

try (InsertResponse response = client.insert(TABLE_NAME, events).get()) {
    // handle response, then it will be closed and connection that served request will be released.
}

InsertSettings

Configuration options for insert operations.

Configuration methods

MethodDescription
setQueryId(String queryId)Sets query ID that will be assigned to the operation. Default: null.
setDeduplicationToken(String token)Sets the deduplication token. This token will be sent to the server and can be used to identify the query. Default: null.
setInputStreamCopyBufferSize(int size)Copy buffer size. The buffer is used during write operations to copy data from user-provided input stream to an output stream. Default: 8196.
serverSetting(String name, String value)Sets individual server settings for an operation.
serverSetting(String name, Collection values)Sets individual server settings with multiple values for an operation. Items of the collection should be String values.
setDBRoles(Collection dbRoles)Sets DB roles to be set before executing an operation. Items of the collection should be String values.
setOption(String option, Object value)Sets a configuration option in raw format. This is not a server setting.

InsertResponse

Response object that holds result of insert operation. It is only available if the client got response from a server.

Примечание

This object should be closed as soon as possible to release a connection because the connection cannot be re-used until all data of previous response is fully read.

MethodDescription
OperationMetrics getMetrics()Returns object with operation metrics.
String getQueryId()Returns query ID assigned for the operation by the application (through operation settings or by server).

API для запросов

query(String sqlQuery)

Отправляет sqlQuery как есть. Формат ответа задается настройками запроса. QueryResponse будет содержать ссылку на поток ответа, который должен быть обработан читателем для поддерживаемого формата.

Сигнатуры

CompletableFuture<QueryResponse> query(String sqlQuery, QuerySettings settings)
CompletableFuture<QueryResponse> query(String sqlQuery)

Parameters

sqlQuery - a single SQL statement. The Query is sent as is to a server.

settings - request settings.

Return value

Future of QueryResponse type - a result dataset and additional information like server side metrics. The Response object should be closed after consuming the dataset.

Examples

final String sql = "select * from " + TABLE_NAME + " where title <> '' limit 10";

// Default format is RowBinaryWithNamesAndTypesFormatReader so reader have all information about columns
try (QueryResponse response = client.query(sql).get(3, TimeUnit.SECONDS);) {

    // Create a reader to access the data in a convenient way
    ClickHouseBinaryFormatReader reader = client.newBinaryFormatReader(response);

    while (reader.hasNext()) {
        reader.next(); // Read the next record from stream and parse it

        // get values
        double id = reader.getDouble("id");
        String title = reader.getString("title");
        String url = reader.getString("url");

        // collecting data
    }
} catch (Exception e) {
    log.error("Failed to read data", e);
}

// put business logic outside of the reading block to release http connection asap.

query(String sqlQuery, Map<String, Object> queryParams, QuerySettings settings)

Sends sqlQuery as is. Additionally will send query parameters so the server can compile the SQL expression.

Signatures

CompletableFuture<QueryResponse> query(String sqlQuery, Map<String, Object> queryParams, QuerySettings settings)

Parameters

sqlQuery - sql expression with placeholders {}.

queryParams - map of variables to complete the sql expression on server.

settings - request settings.

Return value

Future of QueryResponse type - a result dataset and additional information like server side metrics. The Response object should be closed after consuming the dataset.

Examples


// define parameters. They will be sent to the server along with the request.
Map<String, Object> queryParams = new HashMap<>();
queryParams.put("param1", 2);

try (QueryResponse response =
        client.query("SELECT * FROM " + table + " WHERE col1 >= {param1:UInt32}", queryParams, new QuerySettings()).get()) {

    // Create a reader to access the data in a convenient way
    ClickHouseBinaryFormatReader reader = client.newBinaryFormatReader(response);

    while (reader.hasNext()) {
        reader.next(); // Read the next record from stream and parse it

        // reading data
    }

} catch (Exception e) {
    log.error("Failed to read data", e);
}

queryAll(String sqlQuery)

Queries a data in RowBinaryWithNamesAndTypes format. Returns the result as a collection. Read performance is the same as with the reader but more memory is required to hold the whole dataset.

Signatures

List<GenericRecord> queryAll(String sqlQuery)

Parameters

sqlQuery - sql expression to query data from a server.

Return value

Complete dataset represented by a list of GenericRecord objects that provide access in row style for the result data.

Examples

try {
    log.info("Reading whole table and process record by record");
    final String sql = "select * from " + TABLE_NAME + " where title <> ''";

    // Read whole result set and process it record by record
    client.queryAll(sql).forEach(row -> {
        double id = row.getDouble("id");
        String title = row.getString("title");
        String url = row.getString("url");

        log.info("id: {}, title: {}, url: {}", id, title, url);
    });
} catch (Exception e) {
    log.error("Failed to read data", e);
}

QuerySettings

Configuration options for query operations.

Configuration methods

MethodDescription
setQueryId(String queryId)Sets query ID that will be assigned to the operation.
setFormat(ClickHouseFormat format)Sets response format. See RowBinaryWithNamesAndTypes for the full list.
setMaxExecutionTime(Integer maxExecutionTime)Sets operation execution time on server. Will not affect read timeout.
waitEndOfQuery(Boolean waitEndOfQuery)Requests the server to wait for the end of the query before sending a response.
setUseServerTimeZone(Boolean useServerTimeZone)Server timezone (see client config) will be used to parse date/time types in the result of an operation. Default false.
setUseTimeZone(String timeZone)Requests server to use timeZone for time conversion. See session_timezone.
serverSetting(String name, String value)Sets individual server settings for an operation.
serverSetting(String name, Collection values)Sets individual server settings with multiple values for an operation. Items of the collection should be String values.
setDBRoles(Collection dbRoles)Sets DB roles to be set before executing an operation. Items of the collection should be String values.
setOption(String option, Object value)Sets a configuration option in raw format. This is not a server setting.

QueryResponse

Response object that holds result of query execution. It is only available if the client got a response from a server.

Примечание

This object should be closed as soon as possible to release a connection because the connection cannot be re-used until all data of previous response is fully read.

MethodDescription
ClickHouseFormat getFormat()Returns a format in which data in the response is encoded.
InputStream getInputStream()Returns uncompressed byte stream of data in the specified format.
OperationMetrics getMetrics()Returns object with operation metrics.
String getQueryId()Returns query ID assigned for the operation by the application (through operation settings or by server).
TimeZone getTimeZone()Returns timezone that should be used for handling Date/DateTime types in the response.

Examples

Common API

getTableSchema(String table)

Fetches table schema for the table.

Signatures

TableSchema getTableSchema(String table)
TableSchema getTableSchema(String table, String database)

Parameters

table - table name for which schema data should be fetched.

database - database where the target table is defined.

Return value

Returns a TableSchema object with list of table columns.

getTableSchemaFromQuery(String sql)

Fetches schema from a SQL statement.

Signatures

TableSchema getTableSchemaFromQuery(String sql)

Parameters

sql - "SELECT" SQL statement which schema should be returned.

Return value

Returns a TableSchema object with columns matching the sql expression.

TableSchema

register(Class<?> clazz, TableSchema schema)

Compiles serialization and deserialization layer for the Java Class to use for writing/reading data with schema. The method will create a serializer and deserializer for the pair getter/setter and corresponding column. Column match is found by extracting its name from a method name. For example, getFirstName will be for the column first_name or firstname.

Signatures

void register(Class<?> clazz, TableSchema schema)

Parameters

clazz - Class representing the POJO used to read/write data.

schema - Data schema to use for matching with POJO properties.

Examples

client.register(ArticleViewEvent.class, client.getTableSchema(TABLE_NAME));

Примеры использования

Полный код примеров хранится в репозитории в папке 'example':

  • client-v2 - основной набор примеров.
  • demo-service - пример использования клиента в приложении Spring Boot.
  • demo-kotlin-service - пример использования клиента в приложении Ktor (Kotlin).

Руководство по миграции

Old client (V1) was using com.clickhouse.client.ClickHouseClient#builder as start point. The new client (V2) uses similar pattern with com.clickhouse.client.api.Client.Builder. Main differences are:

  • no service loader is used to grab implementation. The com.clickhouse.client.api.Client is facade class for all kinds of implementation in the future.
  • a fewer sources of configuration: one is provided to the builder and one is with operation settings (QuerySettings, InsertSettings). Previous version had configuration per node and was loading env. variables in some cases.

Configuration Parameters Match

There are 3 enum classes related to configuration in V1:

  • com.clickhouse.client.config.ClickHouseDefaults - configuration parameters that supposed to be set in most use cases. Like USER and PASSWORD.
  • com.clickhouse.client.config.ClickHouseClientOption - configuration parameters specific for the client. Like HEALTH_CHECK_INTERVAL.
  • com.clickhouse.client.http.config.ClickHouseHttpOption - configuration parameters specific for HTTP interface. Like RECEIVE_QUERY_PROGRESS.

They were designed to group parameters and provide clear separation. However in some cases it lead to a confusion (is there a difference between com.clickhouse.client.config.ClickHouseDefaults#ASYNC and com.clickhouse.client.config.ClickHouseClientOption#ASYNC). The new V2 client uses com.clickhouse.client.api.Client.Builder as single dictionary of all possible client configuration options.There is com.clickhouse.client.api.ClientConfigProperties where all configuration parameter names are listed.

Table below shows what old options are supported in the new client and their new meaning.

Legend: ✔ = supported, ✗ = dropped

V1 ConfigurationV2 Builder MethodComments
ClickHouseDefaults#HOSTClient.Builder#addEndpoint
ClickHouseDefaults#PROTOCOLOnly HTTP supported in V2
ClickHouseDefaults#DATABASE
ClickHouseClientOption#DATABASE
Client.Builder#setDefaultDatabase
ClickHouseDefaults#USERClient.Builder#setUsername
ClickHouseDefaults#PASSWORDClient.Builder#setPassword
ClickHouseClientOption#CONNECTION_TIMEOUTClient.Builder#setConnectTimeout
ClickHouseClientOption#CONNECTION_TTLClient.Builder#setConnectionTTL
ClickHouseHttpOption#MAX_OPEN_CONNECTIONSClient.Builder#setMaxConnections
ClickHouseHttpOption#KEEP_ALIVE
ClickHouseHttpOption#KEEP_ALIVE_TIMEOUT
Client.Builder#setKeepAliveTimeout
ClickHouseHttpOption#CONNECTION_REUSE_STRATEGYClient.Builder#setConnectionReuseStrategy
ClickHouseHttpOption#USE_BASIC_AUTHENTICATIONClient.Builder#useHTTPBasicAuth

General Differences

  • Client V2 uses less proprietary classes to increase portability. For example, V2 works with any implementation of java.io.InputStream for writing data to a server.
  • Client V2 async settings is off by default. It means no extra threads and more application control over client. This setting should be off for majority of use cases. Enabling async will create a separate thread for a request. It only make sense when using application controlled executor (see com.clickhouse.client.api.Client.Builder#setSharedOperationExecutor)

Writing Data

  • use any implementation of java.io.InputStream. V1 com.clickhouse.data.ClickHouseInputStream is supported but NOT recommended.
  • once end of input stream is detected it handled accordingly. Previously output stream of a request should be closed.

V1 Insert TSV formatted data.

InputStream inData = getInData();
ClickHouseRequest.Mutation request = client.read(server)
        .write()
        .table(tableName)
        .format(ClickHouseFormat.TSV);
ClickHouseConfig config = request.getConfig();
CompletableFuture<ClickHouseResponse> future;
try (ClickHousePipedOutputStream requestBody = ClickHouseDataStreamFactory.getInstance()
        .createPipedOutputStream(config)) {
    // start the worker thread which transfer data from the input into ClickHouse
    future = request.data(requestBody.getInputStream()).execute();

    // Copy data from inData stream to requestBody stream

    // We need to close the stream before getting a response
    requestBody.close();

    try (ClickHouseResponse response = future.get()) {
        ClickHouseResponseSummary summary = response.getSummary();
        Assert.assertEquals(summary.getWrittenRows(), numRows, "Num of written rows");
    }
}

V2 Insert TSV formatted data.

InputStream inData = getInData();
InsertSettings settings = new InsertSettings().setInputStreamCopyBufferSize(8198 * 2); // set copy buffer size
try (InsertResponse response = client.insert(tableName, inData, ClickHouseFormat.TSV, settings).get(30, TimeUnit.SECONDS)) {

  // Insert is complete at this point

} catch (Exception e) {
 // Handle exception
}
  • there is a single method to call. No need to create an additional request object.
  • request body stream is closed automatically when all data is copied.
  • new low-level API is available com.clickhouse.client.api.Client#insert(java.lang.String, java.util.List<java.lang.String>, com.clickhouse.client.api.DataStreamWriter, com.clickhouse.data.ClickHouseFormat, com.clickhouse.client.api.insert.InsertSettings). com.clickhouse.client.api.DataStreamWriter is designed to implement custom data writing logic. For instance, reading data from a queue.

Reading Data

  • Data is read in RowBinaryWithNamesAndTypes format by default. Currently only this format is supported when data binding is required.
  • Data can be read as a collection of records using List<GenericRecord> com.clickhouse.client.api.Client#queryAll(java.lang.String) method. It will read data to a memory and release connection. No need for extra handling. GenericRecord gives access to data, implements some conversions.
Collection<GenericRecord> records = client.queryAll("SELECT * FROM table");
for (GenericRecord record : records) {
    int rowId = record.getInteger("rowID");
    String name = record.getString("name");
    LocalDateTime ts = record.getLocalDateTime("ts");
}

Java client library to communicate with a DB server through its protocols. Current implementation supports only HTTP interface. The library provides own API to send requests to a server.

Deprecation

This library will be deprecated soon. Use the latest Java Client for new projects

Setup

<!-- https://mvnrepository.com/artifact/com.clickhouse/clickhouse-http-client -->
<dependency>
    <groupId>com.clickhouse</groupId>
    <artifactId>clickhouse-http-client</artifactId>
    <version>0.7.2</version>
</dependency>

Since version 0.5.0, the driver uses a new client http library that needs to be added as a dependency.

<!-- https://mvnrepository.com/artifact/org.apache.httpcomponents.client5/httpclient5 -->
<dependency>
    <groupId>org.apache.httpcomponents.client5</groupId>
    <artifactId>httpclient5</artifactId>
    <version>5.3.1</version>
</dependency>

Initialization

Connection URL Format: protocol://host[:port][/database][?param[=value][&param[=value]][#tag[,tag]], for example:

Connect to a single node:

ClickHouseNode server = ClickHouseNode.of("http://localhost:8123/default?compress=0");

Connect to a cluster with multiple nodes:

ClickHouseNodes servers = ClickHouseNodes.of(
    "jdbc:ch:http://server1.domain,server2.domain,server3.domain/my_db"
    + "?load_balancing_policy=random&health_check_interval=5000&failover=2");

Query API

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("select * from numbers limit :limit")
        .params(1000)
        .executeAndWait()) {
            ClickHouseResponseSummary summary = response.getSummary();
            long totalRows = summary.getTotalRowsToRead();
}

Streaming Query API

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("select * from numbers limit :limit")
        .params(1000)
        .executeAndWait()) {
            for (ClickHouseRecord r : response.records()) {
            int num = r.getValue(0).asInteger();
            // type conversion
            String str = r.getValue(0).asString();
            LocalDate date = r.getValue(0).asDate();
        }
}

See complete code example in the repo.

Insert API


try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers).write()
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("insert into my_table select c2, c3 from input('c1 UInt8, c2 String, c3 Int32')")
        .data(myInputStream) // `myInputStream` is source of data in RowBinary format
        .executeAndWait()) {
            ClickHouseResponseSummary summary = response.getSummary();
            summary.getWrittenRows();
}

See complete code example in the repo.

RowBinary Encoding

RowBinary format is described on its page.

There is an example of code.

Features

Compression

The client will by default use LZ4 compression, which requires this dependency:

<!-- https://mvnrepository.com/artifact/org.lz4/lz4-java -->
<dependency>
    <groupId>org.lz4</groupId>
    <artifactId>lz4-java</artifactId>
    <version>1.8.0</version>
</dependency>

You can choose to use gzip instead by setting compress_algorithm=gzip in the connection URL.

Alternatively, you can disable compression a few ways.

  1. Disable by setting compress=0 in the connection URL: http://localhost:8123/default?compress=0
  2. Disable via the client configuration:
ClickHouseClient client = ClickHouseClient.builder()
   .config(new ClickHouseConfig(Map.of(ClickHouseClientOption.COMPRESS, false)))
   .nodeSelector(ClickHouseNodeSelector.of(ClickHouseProtocol.HTTP))
   .build();

See the compression documentation to learn more about different compression options.

Multiple queries

Execute multiple queries in a worker thread one after another within same session:

CompletableFuture<List<ClickHouseResponseSummary>> future = ClickHouseClient.send(servers.apply(servers.getNodeSelector()),
    "create database if not exists my_base",
    "use my_base",
    "create table if not exists test_table(s String) engine=Memory",
    "insert into test_table values('1')('2')('3')",
    "select * from test_table limit 1",
    "truncate table test_table",
    "drop table if exists test_table");
List<ClickHouseResponseSummary> results = future.get();

Named Parameters

You can pass parameters by name rather than relying solely on their position in the parameter list. This capability is available using params function.

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("select * from my_table where name=:name limit :limit")
        .params("Ben", 1000)
        .executeAndWait()) {
            //...
        }
}
Parameters

All params signatures involving String type (String, String[], Map<String, String>) assume the keys being passed are valid ClickHouse SQL strings. For instance:

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("select * from my_table where name=:name")
        .params(Map.of("name","'Ben'"))
        .executeAndWait()) {
            //...
        }
}

If you prefer not to parse String objects to ClickHouse SQL manually, you can use the helper function ClickHouseValues.convertToSqlExpression located at com.clickhouse.data:

try (ClickHouseClient client = ClickHouseClient.newInstance(ClickHouseProtocol.HTTP);
     ClickHouseResponse response = client.read(servers)
        .format(ClickHouseFormat.RowBinaryWithNamesAndTypes)
        .query("select * from my_table where name=:name")
        .params(Map.of("name", ClickHouseValues.convertToSqlExpression("Ben's")))
        .executeAndWait()) {
            //...
        }
}

In the example above, ClickHouseValues.convertToSqlExpression will escape the inner single quote, and surround the variable with a valid single quotes.

Other types, such as Integer, UUID, Array and Enum will be converted automatically inside params.

Node Discovery

Java client provides the ability to discover ClickHouse nodes automatically. Auto-discovery is disabled by default. To manually enable it, set auto_discovery to true:

properties.setProperty("auto_discovery", "true");

Or in the connection URL:

jdbc:ch://my-server/system?auto_discovery=true

If auto-discovery is enabled, there is no need to specify all ClickHouse nodes in the connection URL. Nodes specified in the URL will be treated as seeds, and the Java client will automatically discover more nodes from system tables and/or clickhouse-keeper or zookeeper.

The following options are responsible for auto-discovery configuration:

PropertyDefaultDescription
auto_discoveryfalseWhether the client should discover more nodes from system tables and/or clickhouse-keeper/zookeeper.
node_discovery_interval0Node discovery interval in milliseconds, zero or negative value means one-time discovery.
node_discovery_limit100Maximum number of nodes that can be discovered at a time; zero or negative value means no limit.

Load Balancing

The Java client chooses a ClickHouse node to send requests to, according to the load-balancing policy. In general, the load-balancing policy is responsible for the following things:

  1. Get a node from a managed node list.
  2. Managing node's status.
  3. Optionally schedule a background process for node discovery (if auto-discovery is enabled) and run a health check.

Here is a list of options to configure load balancing:

PropertyDefaultDescription
load_balancing_policy""The load-balancing policy can be one of:
  • firstAlive - request is sent to the first healthy node from the managed node list
  • random - request is sent to a random node from the managed node list
  • roundRobin - request is sent to each node from the managed node list, in turn.
  • full qualified class name implementing ClickHouseLoadBalancingPolicy - custom load balancing policy
  • If it is not specified the request is sent to the first node from the managed node list
    load_balancing_tags""Load balancing tags for filtering out nodes. Requests are sent only to nodes that have the specified tags
    health_check_interval0Health check interval in milliseconds, zero or negative value means one-time.
    health_check_methodClickHouseHealthCheckMethod.SELECT_ONEHealth check method. Can be one of:
  • ClickHouseHealthCheckMethod.SELECT_ONE - check with select 1 query
  • ClickHouseHealthCheckMethod.PING - protocol-specific check, which is generally faster
  • node_check_interval0Node check interval in milliseconds, negative number is treated as zero. The node status is checked if the specified amount of time has passed since the last check.
    The difference between health_check_interval and node_check_interval is that the health_check_interval option schedules the background job, which checks the status for the list of nodes (all or faulty), but node_check_interval specifies the amount of time has passed since the last check for the particular node
    check_all_nodesfalseWhether to perform a health check against all nodes or just faulty ones.

    Failover and retry

    Java client provides configuration options to set up failover and retry behavior for failed queries:

    PropertyDefaultDescription
    failover0Maximum number of times a failover can happen for a request. Zero or a negative value means no failover. Failover sends the failed request to a different node (according to the load-balancing policy) in order to recover from failover.
    retry0Maximum number of times retry can happen for a request. Zero or a negative value means no retry. Retry sends a request to the same node and only if the ClickHouse server returns the NETWORK_ERROR error code
    repeat_on_session_locktrueWhether to repeat execution when the session is locked until timed out(according to session_timeout or connect_timeout). The failed request is repeated if the ClickHouse server returns the SESSION_IS_LOCKED error code

    Adding custom http headers

    Java client support HTTP/S transport layer in case we want to add custom HTTP headers to the request. We should use the custom_http_headers property, and the headers need to be , separated. The header key/value should be divided using =

    Java Client support

    options.put("custom_http_headers", "X-ClickHouse-Quota=test, X-ClickHouse-Test=test");
    

    JDBC Driver

    properties.setProperty("custom_http_headers", "X-ClickHouse-Quota=test, X-ClickHouse-Test=test");