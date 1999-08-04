

2020 Changelog

Disable write with AIO during merges because it can lead to extremely rare data corruption of primary key columns during merge. #18481 (alesapin).

Fixed the "value is too short" error when executing toType(...) functions (toDate, toUInt32, etc.) with an argument of type Nullable(String). Such functions now return NULL on parsing errors instead of throwing an exception. Fixes #7673. #18445 (tavplubix).

Restrict merges from wide to compact parts. In case of vertical merge it led to broken result part. #18381 (Anton Popov).

Fix filling of the system.settings_profile_elements table. This PR fixes #18231. #18379 (Vitaly Baranov).

Fix possible crashes in aggregate functions with the Distinct combinator when using two-level aggregation. Fixes #17682. #18365 (Anton Popov).

Fix an error where the query MODIFY COLUMN ... REMOVE TTL did not actually remove the column TTL. #18130 (alesapin).

Update timezones info to 2020e. #18531 (alesapin).

Enable use_compact_format_in_distributed_parts_names by default (see the documentation for the reference). #16728 (Azat Khuzhin).

Accept user settings related to file formats (e.g. format_csv_delimiter) in the SETTINGS clause when creating a table that uses the File engine, and use these settings in all INSERTs and SELECTs. File format settings changed in the current user session, or in the SETTINGS clause of a DML query itself, no longer affect the query. #16591 (Alexander Kuzmenkov).

Use the Floyd-Rivest algorithm, which is the best fit for the ClickHouse use case of partial sorting. Benchmarks are in https://github.com/danlark1/miniselect. #16825 (Danila Kutenin).
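
To illustrate what "partial sorting" means here, the sketch below returns only the k smallest values in order, which is the shape of work behind an ORDER BY with a LIMIT. It uses Python's heap-based heapq.nsmallest as a stand-in; it is not the Floyd-Rivest algorithm and not ClickHouse code, and the function name partial_sort is made up for this example.

```python
import heapq


def partial_sort(values, k):
    """Return the k smallest values in ascending order.

    Stand-in for a partial sort: only the first k positions of the
    sorted order are produced; the remaining values are never ordered.
    """
    return heapq.nsmallest(k, values)


if __name__ == "__main__":
    print(partial_sort([9, 1, 8, 2, 7, 3], 3))
```

A selection algorithm such as Floyd-Rivest targets the same result with less work on large inputs; see the linked benchmarks for actual comparisons.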

The ReplicatedMergeTree engine family now uses a separate thread pool for replicated fetches. The size of the pool is limited by the setting background_fetches_pool_size, which can be changed with a server restart. The default value is 3, which means that at most 3 fetches run in parallel (enough to utilize a 10G network). Fixes #520. #16390 (alesapin).

Fixed uncontrolled growth of the state of quantileTDigest. #16680 (hrissan).

Add VIEW subquery description to EXPLAIN. Limit push down optimisation for VIEW. Add local replicas of Distributed to query plan. #14936 (Nikolai Kochetov).

Fix optimize_read_in_order/optimize_aggregation_in_order with max_threads > 0 and expression in ORDER BY. #16637 (Azat Khuzhin).

Fix performance of reading from Merge tables over huge number of MergeTree tables. Fixes #7748. #16988 (Anton Popov).

Now we can safely prune partitions with exact match. Useful case: Suppose table is partitioned by intHash64(x) % 100 and the query has condition on intHash64(x) % 100 verbatim, not on x. #16253 (Amos Bird).

Add EmbeddedRocksDB table engine (can be used for dictionaries). #15073 (sundyli).

Improvements in test coverage building images. #17233 (alesapin).

Update embedded timezone data to version 2020d (also update cctz to the latest master). #17204 (filimonov).

Fix UBSan report in Poco. This closes #12719. #16765 (alexey-milovidov).

Do not instrument 3rd-party libraries with UBSan. #16764 (alexey-milovidov).

Fix UBSan report in cache dictionaries. This closes #12641. #16763 (alexey-milovidov).

Fix UBSan report when trying to convert infinite floating point number to integer. This closes #14190. #16677 (alexey-milovidov).

Explicitly set uid / gid of clickhouse user & group to the fixed values (101) in clickhouse-server images. #19096 (filimonov).

If some profile was specified in distributed_ddl config section, then this profile could overwrite settings of default profile on server startup. It's fixed, now settings of distributed DDL queries should not affect global server settings. #16635 (tavplubix).

Restrict the use of non-comparable data types (like AggregateFunction) in keys (sorting key, primary key, partition key, and so on). #16601 (alesapin).

Remove ANALYZE and AST queries, and make the setting enable_debug_queries obsolete, since it is now part of the full-featured EXPLAIN query. #16536 (Ivan).

Aggregate functions boundingRatio, rankCorr, retention, timeSeriesGroupSum, timeSeriesGroupRateSum, windowFunnel were erroneously made case-insensitive. Their names are now case-sensitive, as designed. Only functions that are specified in the SQL standard, made for compatibility with other DBMS, or similar to those should be case-insensitive. #16407 (alexey-milovidov).

Make the rankCorr function return nan on insufficient data #16124. #16135 (hexiaoting).

When upgrading from versions older than 20.5 with a rolling update, the cluster temporarily contains both versions older than 20.5 and versions 20.5 or greater. If ClickHouse nodes with old versions are restarted and the old version starts up in the presence of newer versions, it may lead to "Part ... intersects previous part" errors. To prevent this error, first install newer clickhouse-server packages on all cluster nodes and then do restarts (so that when clickhouse-server is restarted, it starts up with the new version).

New functions encrypt, aes_encrypt_mysql, decrypt, aes_decrypt_mysql. These functions work slowly, so we consider them an experimental feature. #11844 (Vasily Nemkov).

Do not merge parts across partitions in SELECT FINAL. #15938 (Kruglov Pavel).

Improve performance of -OrNull and -OrDefault aggregate functions. #16661 (alexey-milovidov).

Improve performance of quantileMerge. In previous versions it was obnoxiously slow. This closes #1463. #16643 (alexey-milovidov).

Improve performance of logical functions a little. #16347 (alexey-milovidov).

Improved performance of merges assignment in MergeTree table engines. Shouldn't be visible for the user. #16191 (alesapin).

Speedup hashed/sparse_hashed dictionary loading by preallocating the hash table. #15454 (Azat Khuzhin).

Now trivial count optimization becomes slightly non-trivial. Predicates that contain the exact partition expression can be optimized too. This also fixes #11092, which returned a wrong count when max_parallel_replicas > 1. #15074 (Amos Bird).

Add flaky check for stateless tests. It will detect potentially flaky functional tests in advance, before they are merged. #16238 (alesapin).

Use proper version for croaring instead of amalgamation. #16285 (sundyli).

Improve generation of build files for ya.make build system (Arcadia). #16700 (alexey-milovidov).

Add MySQL BinLog file check tool for MaterializeMySQL database engine. MaterializeMySQL is an experimental feature. #16223 (Winter Zhang).

Check for executable bit on non-executable files. People often accidentally commit executable files from Windows. #15843 (alexey-milovidov).

Check for #pragma once in headers. #15818 (alexey-milovidov).

Fix illegal code style &vector[idx] in libhdfs3. This fixes libcxx debug build. See also https://github.com/ClickHouse-Extras/libhdfs3/pull/8. #15815 (Amos Bird).

Fix build of one miscellaneous example tool on Mac OS. Note that we don't build examples on Mac OS in our CI (we build only ClickHouse binary), so there is zero chance it will not break again. This fixes #15804. #15808 (alexey-milovidov).

Simplify Sys/V init script. #14135 (alexey-milovidov).

Added boost::program_options to db_generator in order to increase its usability. This closes #15940. #15973 (Nikita Mikhaylov).

Workaround for using S3 with an nginx server as proxy. Nginx currently does not accept URLs with an empty path like http://domain.com?delete, but vanilla aws-sdk-cpp produces this kind of URL. This commit uses a patched aws-sdk-cpp version, which produces URLs with "/" as the path in these cases, like http://domain.com/?delete. #16813 (ianton-ru).

Make multiple_joins_rewriter_version obsolete. Remove first version of joins rewriter. #15472 (Artem Zuikov).

Change the default value of the format_regexp_escaping_rule setting (it's related to the Regexp format) to Raw (meaning: read the whole subpattern as a value) to make the behaviour closer to what users expect. #15426 (alexey-milovidov).

Add support for nested multiline comments /* comment /* comment */ */ in SQL. This conforms to the SQL standard. #14655 (alexey-milovidov).
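
Nested comments differ from C-style comments in that the comment ends only when every opened /* has been closed. The sketch below shows the idea with a depth counter. It is an illustration only, not ClickHouse's lexer: the function name is hypothetical and it deliberately ignores string literals and single-line comments.

```python
def strip_nested_comments(sql: str) -> str:
    """Remove /* ... */ comments, allowing them to nest.

    Simplified sketch: string literals and '--' comments are not handled.
    """
    out = []
    depth = 0  # number of currently open /* markers
    i = 0
    while i < len(sql):
        pair = sql[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(sql[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(strip_nested_comments("SELECT 1 /* a /* b */ c */ + 2"))
```

Without the depth counter, the first */ would end the comment and the trailing "c */" would leak into the query text.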

Added MergeTree settings (max_replicated_merges_with_ttl_in_queue and max_number_of_merges_with_ttl_in_pool) to control the number of merges with TTL in the background pool and replicated queue. This change breaks compatibility with older versions only if you use delete TTL. Otherwise, replication will stay compatible. You can avoid incompatibility issues if you update all shard replicas at once or execute SYSTEM STOP TTL MERGES until you finish the update of all replicas. If you get an incompatible entry in the replication queue, first execute SYSTEM STOP TTL MERGES, and then ALTER TABLE ... DETACH PARTITION ... for the partition where the incompatible TTL merge was assigned. Attach it back on a single replica. #14490 (alesapin).

Make it possible to connect to clickhouse-server secure endpoint which requires SNI. This is possible when clickhouse-server is hosted behind TLS proxy. #16938 (filimonov).

Conditional aggregate functions (for example: avgIf, sumIf, maxIf) should return NULL when no rows match the condition and the arguments are nullable. #13964 (Winter Zhang).

Added column transformers EXCEPT, REPLACE, APPLY, which can be applied to the list of selected columns (after * or COLUMNS(...)). For example, you can write SELECT * EXCEPT(URL) REPLACE(number + 1 AS number). Another example: select * apply(length) apply(max) from wide_string_table to find out the maximum length of all string columns. #14233 (Amos Bird).

Added an aggregate function rankCorr which computes a rank correlation coefficient. #11769 (antikvist) #14411 (Nikita Mikhaylov).

Added table function view which turns a subquery into a table object. This helps passing queries around. For instance, it can be used in remote/cluster table functions. #12567 (Amos Bird).

Added db-generator tool for random database generation by given SELECT queries. It may facilitate reproducing issues when there is only an incomplete bug report from the user. #14442 (Nikita Mikhaylov) #10973 (ZeDRoman).

Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key (under optimize_skip_unused_shards and optimize_distributed_group_by_sharding_key). #10373 (Azat Khuzhin).

Creating sets for multiple JOIN and IN in parallel. It may slightly improve performance for queries with several different IN subquery expressions. #14412 (Nikolai Kochetov).

Improve Kafka engine performance by providing independent thread for each consumer. Separate thread pool for streaming engines (like Kafka). #13939 (fastio).

Lower binary size in debug build by removing debug info from Functions. This is needed only for one internal project in Yandex that uses a very old linker. #14549 (alexey-milovidov).

Prepare for build with clang 11. #14455 (alexey-milovidov).

Fix the logic in backport script. In previous versions it was triggered for any labels of 100% red color. It was strange. #14433 (alexey-milovidov).

Integration tests use default base config. All config changes are explicit with main_configs, user_configs and dictionaries parameters for instance. #13647 (Ilya Yatsishin).

Unfold {database}, {table} and {uuid} macros in ReplicatedMergeTree arguments on table creation. #16159 (tavplubix).

Fix the issue when some invocations of extractAllGroups function may trigger "Memory limit exceeded" error. This fixes #13383. #14889 (alexey-milovidov).

Fix SIGSEGV for an attempt to INSERT into StorageFile(fd). #14887 (Azat Khuzhin).

Fix a rare error in SELECT queries when the queried column has a DEFAULT expression that depends on another column which also has a DEFAULT expression, is not present in the SELECT query, and does not exist on disk. Partially fixes #14531. #14845 (alesapin).

Fixed missed default database name in metadata of materialized view when executing ALTER ... MODIFY QUERY. #14664 (tavplubix).

Fix a bug where an ALTER UPDATE mutation with a Nullable column in the assignment expression and a constant value (like UPDATE x = 42) leads to an incorrect value in the column or a segfault. Fixes #13634, #14045. #14646 (alesapin).

Fix wrong Decimal multiplication result caused by wrong decimal scale of the result column. #14603 (Artem Zuikov).

Added the checker as neither calling lc->isNullable() nor calling ls->getDictionaryPtr()->isNullable() would return the correct result. #14591 (myrrc).

Cleanup data directory after Zookeeper exceptions during CreateQuery for StorageReplicatedMergeTree Engine. #14563 (Bharat Nallan).

Fix rare segfaults in functions with combinator -Resample, which could appear in result of overflow with very large parameters. #14562 (Anton Popov).

Now OPTIMIZE FINAL query does not recalculate TTL for parts that were added before TTL was created. Use ALTER TABLE ... MATERIALIZE TTL once to calculate them; after that OPTIMIZE FINAL will evaluate TTLs properly. This behavior never worked for replicated tables. #14220 (alesapin).

Extend the parallel_distributed_insert_select setting, adding an option to run INSERT into local table. The setting changes type from Bool to UInt64, so the values false and true are no longer supported. If you have these values in server configuration, the server will not start. Please replace them with 0 and 1, respectively. #14060 (Azat Khuzhin).

Remove support for the ODBCDriver input/output format. This was a deprecated format once used for communication with the ClickHouse ODBC driver, now long superseded by the ODBCDriver2 format. Resolves #13629. #13847 (hexiaoting).

Slightly optimize very short queries with LowCardinality. #14129 (Anton Popov).

Enable parallel INSERTs for table engines Null, Memory, Distributed and Buffer when the setting max_insert_threads is set. #14120 (alexey-milovidov).

Fail fast if the max_rows_to_read limit is exceeded on parts scan. The motivation behind this change is to skip the ranges scan for all selected parts if it is clear that max_rows_to_read is already exceeded. The change is quite noticeable for queries over a big number of parts. #13677 (Roman Khavronenko).

Slightly improve performance of aggregation by UInt8/UInt16 keys. #13099 (alexey-milovidov).

Optimize has(), indexOf() and countEqual() functions for Array(LowCardinality(T)) and constant right arguments. #12550 (myrrc).

When performing trivial INSERT SELECT queries, automatically set max_threads to 1 or max_insert_threads, and set max_block_size to min_insert_block_size_rows. Related to #5907. #12195 (flynn).

ClickHouse can work as MySQL replica - it is implemented by MaterializeMySQL database engine. Implements #4006. #10851 (Winter Zhang).

Add types Int128, Int256, UInt256 and related functions for them. Extend Decimals with Decimal256 (precision up to 76 digits). New types are under the setting allow_experimental_bigint_types. It is working extremely slow and bad. The implementation is incomplete. Please don't use this feature. #13097 (Artem Zuikov).

Function modulo (operator %) with at least one floating point number as argument will calculate the remainder of division directly on floating point numbers without converting both arguments to integers. This makes the behaviour compatible with most DBMS. This is also applicable to the Date and DateTime data types. Added alias mod. This closes #7323. #12585 (alexey-milovidov).
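
The difference between the two approaches can be seen with a small Python model. This is a sketch of the arithmetic only, not ClickHouse code: both function names are invented for the example, math.fmod stands in for a floating point remainder, and only positive operands are shown because sign handling is not covered by this entry.

```python
import math


def modulo_via_integers(a: float, b: float) -> int:
    """Model of the older approach: convert both arguments to integers first."""
    return int(a) % int(b)


def modulo_on_floats(a: float, b: float) -> float:
    """Model of the new approach: remainder computed directly on floats."""
    return math.fmod(a, b)


if __name__ == "__main__":
    print(modulo_via_integers(7.5, 2))  # fractional part of 7.5 is lost
    print(modulo_on_floats(7.5, 2))     # fractional part is preserved
```

With 7.5 and 2 the integer-based model yields 1, while the floating point remainder is 1.5.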

Deprecate special printing of zero Date/DateTime values as 0000-00-00 and 0000-00-00 00:00:00. #12442 (alexey-milovidov).

The function groupArrayMoving* was not working for distributed queries. Its result was calculated within an incorrect data type (without promotion to the largest type). The function groupArrayMovingAvg was returning an integer number, which was inconsistent with the avg function. This fixes #12568. #12622 (alexey-milovidov).

Add sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing detailed explanation to the user. #13153 (alexey-milovidov).

Protect from the cases when user may set background_pool_size to a value lower than number_of_free_entries_in_pool_to_execute_mutation or number_of_free_entries_in_pool_to_lower_max_size_of_merge. In these cases ALTERs won't work or the maximum size of merge will be too limited. It will throw exception explaining what to do. This closes #10897. #12728 (alexey-milovidov).

Polygon dictionary type that provides efficient "reverse geocoding" lookups - to find the region by coordinates in a dictionary of many polygons (world map). It is using carefully optimized algorithm with recursive grids to maintain low CPU and memory usage. #9278 (achulkov2).

Added support of LDAP authentication for preconfigured users ("Simple Bind" method). #11234 (Denis Glazachev).

Introduce setting alter_partition_verbose_result which outputs information about touched parts for some types of ALTER TABLE ... PARTITION ... queries (currently ATTACH and FREEZE). Closes #8076. #13017 (alesapin).

Add bayesAB function for bayesian-ab-testing. #12327 (achimbab).

Added system.crash_log table into which stack traces for fatal errors are collected. This table should be empty. #12316 (alexey-milovidov).

Added HTTP headers X-ClickHouse-Database and X-ClickHouse-Format which may be used to set default database and output format. #12981 (hcz).

Add minMap and maxMap functions support to SimpleAggregateFunction. #12662 (Ildus Kurbangaliev).

Add setting allow_non_metadata_alters which restricts execution of ALTER queries that modify data on disk. Disabled by default. Closes #11547. #12635 (alesapin).

A function formatRow is added to support turning arbitrary expressions into a string via given format. It's useful for manipulating SQL outputs and is quite versatile combined with the columns function. #12574 (Amos Bird).

Add FROM_UNIXTIME function for compatibility with MySQL, related to 12149. #12484 (flynn).

Allow Nullable types as keys in MergeTree tables if allow_nullable_key table setting is enabled. Closes #5319. #12433 (Amos Bird).

Integration with COS. #12386 (fastio).

Add mapAdd and mapSubtract functions for adding/subtracting key-mapped values. #11735 (Ildus Kurbangaliev).

Allowed to set JOIN kind and type in more standard way: LEFT SEMI JOIN instead of SEMI LEFT JOIN. For now both are correct. #12520 (Artem Zuikov).

lifetime_rows/lifetime_bytes for Buffer engine. #12421 (Azat Khuzhin).

Write the detail exception message to the client instead of 'MySQL server has gone away'. #12383 (BohuTANG).

Allows changing the charset used for printing grid borders. Available charsets are: UTF-8, ASCII. The setting output_format_pretty_grid_charset enables this feature. #12372 (Sabyanin Maxim).

1. Supported MySQL 'SELECT DATABASE()' #9336. 2. Add MySQL replacement query integration test. #12314 (BohuTANG).

Added KILL QUERY [connection_id] for the MySQL client/driver to cancel the long query, issue #12038. #12152 (BohuTANG).

Added support for %g (two digit ISO year) and %G (four digit ISO year) substitutions in formatDateTime function. #12136 (vivarum).
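
The ISO year can differ from the calendar year around New Year, which is why these substitutions exist. The sketch below uses Python's date.isocalendar() to show what the two-digit and four-digit ISO year look like; it models the concept only and does not call formatDateTime. The helper name iso_year_parts is hypothetical.

```python
from datetime import date


def iso_year_parts(d: date):
    """Return (four-digit ISO year, two-digit ISO year) as strings.

    These correspond to what %G and %g denote: the year that the
    ISO 8601 week of the date belongs to.
    """
    iso_year = d.isocalendar()[0]
    return f"{iso_year:04d}", f"{iso_year % 100:02d}"


if __name__ == "__main__":
    # 2021-01-01 falls in the last ISO week of 2020.
    print(iso_year_parts(date(2021, 1, 1)))
```

For 2021-01-01 the calendar year is 2021, but the ISO year is 2020, so the four-digit form is 2020 and the two-digit form is 20.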

Added 'type' column in system.disks. #12115 (ianton-ru).

Improved REVOKE command: it now requires grant/admin option only for the access that will be revoked. For example, executing REVOKE ALL ON *.* FROM user1 no longer requires full access rights granted with grant option. Added command REVOKE ALL FROM user1, which revokes all granted roles from user1. #12083 (Vitaly Baranov).

Added replica priority for load_balancing (for manual prioritization of the load balancing). #11995 (Azat Khuzhin).

Switched paths in S3 metadata to relative which allows to handle S3 blobs more easily. #11892 (Vladimir Chebotarev).

Improved performance of 'ORDER BY' and 'GROUP BY' by prefix of sorting key (enabled with optimize_aggregation_in_order setting, disabled by default). #11696 (Anton Popov).

Removed injective functions inside uniq*() if set optimize_injective_functions_inside_uniq=1. #12337 (Ruslan Kamalov).

Index not used for IN operator with literals, performance regression introduced around v19.3. This fixes #10574. #12062 (nvartolomei).

Implemented single part uploads for DiskS3 (experimental feature). #12026 (Vladimir Chebotarev).

Added new in-memory format of parts in MergeTree-family tables, which stores data in memory. Parts are written on disk at first merge. Part will be created in in-memory format if its size in rows or bytes is below thresholds min_rows_for_compact_part and min_bytes_for_compact_part. Also optional support of Write-Ahead-Log is available, which is enabled by default and is controlled by setting in_memory_parts_enable_wal. #10697 (Anton Popov).

Install ca-certificates before the first apt-get update in Dockerfile. #12095 (Ivan Blinkov).

Return non-Nullable result from COUNT(DISTINCT), and uniq aggregate functions family. If all passed values are NULL, return zero instead. This improves SQL compatibility. #11661 (alexey-milovidov).
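
As a rough model of this behaviour, the sketch below counts distinct non-NULL values, using Python's None for NULL. It only illustrates the described result (zero for an all-NULL input, never NULL); the function name count_distinct is an assumption for the example and this is not how ClickHouse implements the aggregate.

```python
def count_distinct(values):
    """Count distinct non-NULL values; None models NULL.

    The result is always an integer: an input consisting only of
    NULLs yields 0 rather than NULL.
    """
    return len({v for v in values if v is not None})


if __name__ == "__main__":
    print(count_distinct([None, None]))
    print(count_distinct([1, 1, None, 2]))
```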

Added a check for the case when user-level setting is specified in a wrong place. User-level settings should be specified in users.xml inside <profile> section for specific user profile (or in <default> for default settings). The server won't start with exception message in log. This fixes #9051. If you want to skip the check, you can either move settings to the appropriate place or add <skip_check_for_incorrect_settings>1</skip_check_for_incorrect_settings> to config.xml. #11449 (alexey-milovidov).

The setting input_format_with_names_use_header is enabled by default. It will affect parsing of input formats -WithNames and -WithNamesAndTypes. #10937 (alexey-milovidov).

Remove experimental_use_processors setting. It is enabled by default. #10924 (Nikolai Kochetov).

Update zstd to 1.4.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse you may see reasonable error messages "Data after merge is not byte-identical to data on another replicas." with explanation. These messages are OK and you should not worry. This change is backward compatible but we list it here in changelog in case you will wonder about these messages. #10663 (alexey-milovidov).

Added a check for meaningless codecs and a setting allow_suspicious_codecs to control this check. This closes #4966. #10645 (alexey-milovidov).

Several Kafka settings changed their defaults. See #11388.

Add data types Point (Tuple(Float64, Float64)) and Polygon (Array(Array(Tuple(Float64, Float64)))). #10678 (Alexey Ilyukhov).

Add a hasSubstr function that allows looking for subsequences in arrays. Note: this function is likely to be renamed without further notice. #11071 (Ryad Zenine).

Added OpenCL support and bitonic sort algorithm, which can be used for sorting integer types of data in a single column. Needs to be built with the flag -DENABLE_OPENCL=1. To use the bitonic sort algorithm instead of others, set the special_sort setting to bitonic_sort and make sure that OpenCL is available. This feature does not improve performance or anything else; it is only provided as an example and for demonstration purposes. It is likely to be removed in the near future if there is no further development in this direction. #10232 (Ri).

Fix several non-significant errors in unit tests. #11262 (alesapin).

Fix (false) MSan report in MergeTreeIndexFullText. The issue first appeared in #9968. #10801 (alexey-milovidov).

Fix several flaky integration tests. #11355 (alesapin).

No changes compared to v20.4.3.16-stable.

System tables (e.g. system.query_log, system.trace_log, system.metric_log) are using compact data part format for parts smaller than 10 MiB in size. Compact data part format is supported since version 20.3. If you are going to downgrade to a version older than 20.3, you should manually delete table data for system logs in /var/lib/clickhouse/data/system/.

When string comparison involves FixedString and compared arguments are of different sizes, do comparison as if the smaller string is padded to the length of the larger. This is intended for SQL compatibility if we imagine that FixedString data type corresponds to SQL CHAR. This closes #9272. #10363 (alexey-milovidov)
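
The padding rule for FixedString comparison can be illustrated with a small Python model. This is only a sketch of the described semantics, not ClickHouse code: the function name is hypothetical, and it assumes the padding byte is the null byte, which is how FixedString values are padded in storage.

```python
def compare_fixed_strings(a: bytes, b: bytes) -> int:
    """Return -1, 0 or 1, comparing as if the shorter value were padded.

    Hypothetical helper that models the changelog entry; not a ClickHouse API.
    """
    width = max(len(a), len(b))
    # Assumption: the shorter value is padded on the right with null bytes.
    a_padded = a.ljust(width, b"\x00")
    b_padded = b.ljust(width, b"\x00")
    # Bytewise lexicographic comparison of the two equal-length values.
    return (a_padded > b_padded) - (a_padded < b_padded)


if __name__ == "__main__":
    print(compare_fixed_strings(b"abc", b"abc\x00\x00"))
    print(compare_fixed_strings(b"ab", b"abc"))
```

Under this model, a value of FixedString(3) and the same text stored in FixedString(5) compare as equal, which matches how SQL CHAR values of different declared lengths are usually expected to behave.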

Make SHOW CREATE TABLE multiline. Now it is more readable and more like MySQL. #10049 (Azat Khuzhin)

Added a setting validate_polygons that is used in pointInPolygon function and enabled by default. #9857 (alexey-milovidov)

Added experimental database engine Atomic. It supports non-blocking DROP and RENAME TABLE queries and atomic EXCHANGE TABLES t1 AND t2 query #7512 (tavplubix)

Initial support for ReplicatedMergeTree over S3 (it works in suboptimal way) #10126 (Pavel Kovalenko)

Fix dictGet in sharding_key (and similar places, i.e. when the function context is stored permanently). #16205 (Azat Khuzhin).

Fix incorrect empty result for query from Distributed table if query has WHERE, PREWHERE and GLOBAL IN. Fixes #15792. #15933 (Nikolai Kochetov).

Fix missing or excessive headers in TSV/CSVWithNames formats. This fixes #12504. #13343 (Azat Khuzhin).

Mutation might hang waiting for some non-existent part after MOVE or REPLACE PARTITION or, in rare cases, after DETACH or DROP PARTITION. It's fixed. #15724, #15537 (tavplubix).

Fix hang of queries with a lot of subqueries to the same table of MySQL engine. Previously, if there were more than 16 subqueries to the same MySQL table in a query, it hung forever. #15299 (Anton Popov).

Fix 'Unknown identifier' in GROUP BY when query has JOIN over Merge table. #15242 (Artem Zuikov).

Fix to make predicate push down work when subquery contains finalizeAggregation function. Fixes #14847. #14937 (filimonov).

Concurrent ALTER ... REPLACE/MOVE PARTITION ... queries might cause deadlock. It's fixed. #13626 (tavplubix).

Fix rare error in SELECT queries when the queried column has a DEFAULT expression that depends on another column which also has DEFAULT, is not present in the select query, and does not exist on disk. Partially fixes #14531. #14845 (alesapin).

Fix bug when ALTER UPDATE mutation with Nullable column in assignment expression and constant value (like UPDATE x = 42) leads to incorrect value in column or segfault. Fixes #13634, #14045. #14646 (alesapin).

Fix wrong Decimal multiplication result caused by wrong decimal scale of result column. #14603 (Artem Zuikov).

Support custom codecs in compact parts. #12183 (Anton Popov).

Fix wrong error for long queries. It was possible to get syntax error other than Max query size exceeded for correct query. #13928 (Nikolai Kochetov).

Return NULL/zero when value is not parsed completely in parseDateTimeBestEffortOrNull/Zero functions. This fixes #7876. #11653 (alexey-milovidov).

Slightly optimize very short queries with LowCardinality. #14129 (Anton Popov).

Fix UBSan report (adding zero to nullptr) in HashTable that appeared after migration to clang-10. #10638 (alexey-milovidov).

Fix crash in JOIN with StorageMerge and set enable_optimize_predicate_expression=1. #13679 (Artem Zuikov).

Fix invalid return type for comparison of tuples with NULL elements. Fixes #12461. #13420 (Nikolai Kochetov).

Fix queries with constant columns and ORDER BY prefix of primary key. #13396 (Anton Popov).

Return passed number for numbers with MSB set in roundUpToPowerOfTwoOrZero(). #13234 (Azat Khuzhin).
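
The behavior described for roundUpToPowerOfTwoOrZero() can be modelled in Python as follows. This is a minimal sketch assuming 64-bit unsigned arithmetic and the common bit-smearing technique for rounding up; the Python function name and constants are hypothetical, and the actual C++ implementation may differ in detail.

```python
MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned wraparound
MSB64 = 1 << 63         # most significant bit of a 64-bit value


def round_up_to_power_of_two_or_zero(n: int) -> int:
    """Round n up to a power of two; 0 maps to 0 (hypothetical model)."""
    n &= MASK64
    # With the MSB set there is no larger 64-bit power of two, so the
    # passed number is returned instead of overflowing to zero.
    if n >= MSB64:
        return n
    n = (n - 1) & MASK64
    # Smear the highest set bit into every lower position.
    for shift in (1, 2, 4, 8, 16, 32):
        n |= n >> shift
    return (n + 1) & MASK64


if __name__ == "__main__":
    for value in (0, 1, 5, 1024, MSB64 + 5):
        print(value, round_up_to_power_of_two_or_zero(value))
```

Without the early return, any input above 2^63 would wrap around to zero in this model, which is the case the entry addresses.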

Now ClickHouse controls timeouts of dictionary sources on its side. Two new settings were added to cache dictionary configuration: strict_max_lifetime_seconds, which is max_lifetime by default, and query_wait_timeout_milliseconds, which is one minute by default. The first setting is also useful with the allow_read_expired_keys setting (to forbid reading very expired keys). #10337 (Nikita Mikhaylov).

Get dictionary and check access rights only once per each call of any function reading external dictionaries. #10928 (Vitaly Baranov).

Fix UBSan report in LZ4 library. #10631 (alexey-milovidov).

Fix clang-10 build. #10238. #10370 (Amos Bird).

Added failing tests about max_rows_to_sort setting. #10268 (alexey-milovidov).

Added some improvements in printing diagnostic info in input formats. Fixes #10204. #10418 (tavplubix).

Added CA certificates to clickhouse-server docker image. #10476 (filimonov).

Fix error the BloomFilter false positive must be a double number between 0 and 1 #10551. #10569 (Winter Zhang).

Improved performance of queries with explicitly defined sets at right side of IN operator and tuples in the left side. This fixes performance regression in version 20.3. #9740, #10385 (Anton Popov)

Fix Logical error: CROSS JOIN has expressions error for queries with comma and names joins mix. #10311 (Artem Zuikov).

Fix queries with max_bytes_before_external_group_by. #10302 (Artem Zuikov).

Fix move-to-prewhere optimization in presence of arrayJoin functions (in certain cases). This fixes #10092. #10195 (alexey-milovidov).

Add the ability to relax the restriction on non-deterministic functions usage in mutations with allow_nondeterministic_mutations setting. #10186 (filimonov).

Added function isConstant. This function checks whether its argument is a constant expression and returns 1 or 0. It is intended for development, debugging and demonstration purposes. #10198 (alexey-milovidov).

Remove order by stage from mutations because we read from a single ordered part in a single thread. Also add a check that the rows in a mutation are ordered by the sorting key and that this order is not violated. #9886 (alesapin).

This release also contains all bug fixes from 20.1.8.41

Fix missing rows_before_limit_at_least for queries over http (with processors pipeline). This fixes #9730. #9757 (Nikolai Kochetov)

This release also contains all bug fixes from 20.1.7.38

Fix bug in a replication that does not allow replication to work if the user has executed mutations on the previous version. This fixes #9645. #9652 (alesapin). It makes version 20.3 backward compatible again.

Add setting use_compact_format_in_distributed_parts_names which allows to write files for INSERT queries into Distributed table with more compact format. This fixes #9647. #9653 (alesapin). It makes version 20.3 backward compatible again.

Fixed the issue file name too long when sending data for Distributed tables for a large number of replicas. Fixed the issue that replica credentials were exposed in the server log. The format of directory name on disk was changed to [shard{shard_index}[_replica{replica_index}]]. #8911 (Mikhail Korotov). After you upgrade to the new version, you will not be able to downgrade without manual intervention, because the old server version does not recognize the new directory format. If you want to downgrade, you have to manually rename the corresponding directories to the old format. This change is relevant only if you have used asynchronous INSERTs to Distributed tables. In the version 20.3.3 we will introduce a setting that will allow you to enable the new format gradually.
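
To make the directory name pattern [shard{shard_index}[_replica{replica_index}]] concrete, here is a short Python sketch that expands it. The helper name is hypothetical and the code only illustrates how the bracketed optional part of the pattern reads; it is not taken from the ClickHouse sources.

```python
from typing import Optional


def distributed_dir_name(shard_index: int,
                         replica_index: Optional[int] = None) -> str:
    """Expand shard{shard_index}[_replica{replica_index}] (hypothetical helper)."""
    name = f"shard{shard_index}"
    # The replica suffix is the optional part of the pattern.
    if replica_index is not None:
        name += f"_replica{replica_index}"
    return name


if __name__ == "__main__":
    print(distributed_dir_name(1))
    print(distributed_dir_name(2, 3))
```

Names of this shape contain only indexes, so they stay short regardless of the number of replicas and carry no credentials.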

Changed the format of replication log entries for mutation commands. You have to wait for old mutations to process before installing the new version.

Implement simple memory profiler that dumps stacktraces to system.trace_log every N bytes over soft allocation limit. #8765 (Ivan) #9472 (alexey-milovidov). The column of system.trace_log was renamed from timer_type to trace_type. This will require changes in third-party performance analysis and flamegraph processing tools.

Use OS thread id everywhere instead of internal thread number. This fixes #7477. Old clickhouse-client cannot receive logs that are sent from the server when the setting send_logs_level is enabled, because the names and types of the structured log messages were changed. On the other hand, different server versions can send logs with different types to each other. When you don't use the send_logs_level setting, you should not care. #8954 (alexey-milovidov)

Remove indexHint function #9542 (alexey-milovidov)

Remove findClusterIndex, findClusterValue functions. This fixes #8641. If you were using these functions, send an email to [email protected] #9543 (alexey-milovidov)

Now it's not allowed to create columns or add columns with SELECT subquery as default expression. #9481 (alesapin)

Require aliases for subqueries in JOIN. #9274 (Artem Zuikov)

Improved ALTER MODIFY/ADD queries logic. Now you cannot ADD column without type, MODIFY default expression does not change type of column and MODIFY type does not lose default expression value. Fixes #8669. #9227 (alesapin)

Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)

The setting experimental_use_processors is enabled by default. This setting enables usage of the new query pipeline. This is internal refactoring and we expect no visible changes. If you see any issues, set it back to zero. #8768 (alexey-milovidov)

Add new compact format of parts in MergeTree-family tables in which all columns are stored in one file. It helps to increase performance of small and frequent inserts. The old format (one file per column) is now called wide. Data storing format is controlled by settings min_bytes_for_wide_part and min_rows_for_wide_part. #8290 (Anton Popov)
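
How the two thresholds might select a part format can be sketched in Python. The entry only names the settings, so the combination rule here is an assumption: a part is treated as compact when either its size in bytes or its row count is below the corresponding threshold. The function name and its parameters are hypothetical.

```python
def choose_part_format(rows: int,
                       bytes_uncompressed: int,
                       min_rows_for_wide_part: int,
                       min_bytes_for_wide_part: int) -> str:
    """Pick "compact" or "wide" for a new data part (hypothetical model)."""
    # Assumption: falling below either threshold keeps the part compact.
    if (bytes_uncompressed < min_bytes_for_wide_part
            or rows < min_rows_for_wide_part):
        return "compact"
    return "wide"


if __name__ == "__main__":
    ten_mib = 10 * 1024 * 1024
    # A small insert stays compact when a byte threshold is configured.
    print(choose_part_format(100, 1000, 0, ten_mib))
    # With both thresholds at zero every part is wide.
    print(choose_part_format(100, 1000, 0, 0))
```

In this model, setting both thresholds to zero means every part uses the wide format, while a non-zero threshold routes small inserts to the compact format.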

Support for S3 storage for Log, TinyLog and StripeLog tables. #8862 (Pavel Kovalenko)

Fix excess lock for structure during alter. #11790 (alesapin).

Fix error Size of offsets does not match size of column for queries with PREWHERE column in (subquery) and ARRAY JOIN. #11580 (Nikolai Kochetov).