Fix inconsistent formatting caused by an incorrect substitution of aliases in the formatter. This closes #82833. This closes #82832. This closes #68296. This change is potentially backward incompatible: when the analyzer is disabled, certain CREATE VIEW queries with IN referencing an alias cannot be processed. To prevent the incompatibility, enable the analyzer (it is enabled by default since 24.3). #82838 (Alexey Milovidov).
Codecs DEFLATE_QPL and ZSTD_QAT were removed. Users are advised to convert existing data compressed with DEFLATE_QPL or ZSTD_QAT to another codec before upgrade. Note that in order to use the codecs, settings enable_deflate_qpl_codec and enable_zstd_qat_codec had to be enabled. #92150 (Robert Schulze).
Improve UDF debugging by enabling stderr capture in system.query_log.exception. Previously, UDF stderr was only logged to files and not exposed in query logs, making debugging impossible. Now stderr triggers exceptions by default and is fully accumulated (up to 1MB) before throwing, so complete Python tracebacks and error messages appear in system.query_log.exception for effective troubleshooting. #92209 (Xu Jia).
Empty column list in JOIN USING () clause is now considered a syntax error. Previously it was supposed to be INVALID_JOIN_ON_EXPRESSION during query execution. In some cases such as joining with Join storage it led to LOGICAL_ERROR, close #82502. #92371 (Vladimir Cherkasov).
Revert "Allow INSERT into simple ALIAS columns" (Reverts ClickHouse/ClickHouse#84154). It does not work with custom formats, and is not guarded with a setting. #92849 (Azat Khuzhin).
Setting to throw an error if a data lake catalog doesn't have access to object storage. #93606 (Konstantin Vedernikov).
Remove the transposed_with_wide_view mode of the metric_log - it is unusable due to a bug. It is no longer possible to define system.metric_log with this mode. This partially reverts #78412. #93867 (Alexey Milovidov).
CPU scheduling for workloads is now preemptive by default. See cpu_slot_preemption server setting. #94060 (Sergei Trifonov).
Escape index filenames to prevent broken parts. With this change ClickHouse will fail to load indices with non-ascii characters in their name created by previous versions. To handle it you can use the merge tree setting escape_index_filenames. #94079 (Raúl Marín).
Format settings exact_rows_before_limit, rows_before_aggregation, cross_to_inner_join_rewrite, regexp_dict_allow_hyperscan, regexp_dict_flag_case_insensitive, regexp_dict_flag_dotall and dictionary_use_async_executor were changed to be regular (non-format) settings now. This is a purely internal change without user-visible side effects except in the (unlikely) case that you specified any of these settings in Iceberg or DeltaLake or Kafka or S3 or S3Queue or Azure or Hive or RabbitMQ or Set or FileLog or NATS table engine definitions. In these cases, these settings were previously ignored, now such definitions throw an error. #94106 (Robert Schulze).
The joinGet/joinGetOrNull functions now enforce SELECT privileges on the underlying Join table. After this change, executing joinGet('db.table', 'column', key) requires the user to have SELECT privilege on both the key columns defined in the Join table and the attribute column being retrieved. Queries lacking these privileges will fail with ACCESS_DENIED. To migrate, grant the necessary permissions using GRANT SELECT ON db.join_table TO user for full table access, or GRANT SELECT(key_col, attr_col) ON db.join_table TO user for column-level access. This change affects all users and applications relying on joinGet/joinGetOrNull where explicit SELECT grants were not previously configured. #94307 (Vladimir Cherkasov).
Check SHOW COLUMNS for CREATE TABLE ... AS ... queries. Previously, it checked SHOW TABLES, which is an incorrect grant for this type of permission check. #94556 (pufit).
Make the Hash output format independent of block sizes. #94503 (Alexey Milovidov). Note that this changes the output hash values compared to previous versions.
HTTP API and embedded Web UI for ClickHouse Keeper. #78181 (pufit).
Async insert deduplication now works with dependent materialized views. When a collision by block_id occurs, the original block is filtered to remove the rows associated with that block_id, and the remaining rows are transformed with the SELECT queries of all relevant materialized views; this rebuilds the original block without the conflicting rows. #89140 (Sema Checherinda). Deduplication can now be used with async inserts when materialized views are involved. #93957 (Sema Checherinda).
Enable use_variant_as_common_type by default, which lets you use incompatible types inside an Array, in UNION queries, and in branches of if/multiIf/case. #90677 (Alexey Milovidov).
Adds a ClickHouse_Info metric to the Prometheus /metrics endpoint containing mainly version information so it's possible to build charts tracking detailed version information over time. #91125 (Christoph Wurm).
Introduce a new four-letter rcfg command for Keeper which allows changing the cluster configuration. This command provides broader possibilities for configuration changes than the standard reconfigure request. The command takes a JSON string as an argument. The whole set of bytes sent to the TCP interface should look like this: rcfg{json_string_length_big_endian}{json_string}. An example of the command's argument: {"preconditions": {"leaders": [1, 2], "members": [1, 2, 3, 4, 5]}, "actions": [{"transfer_leadership": [3]}, {"remove_members": [1, 2]}, {"set_priority": [{"id": 4, "priority": 100}, {"id": 5, "priority": 100}]}, {"transfer_leadership": [4, 5]}, {"set_priority": [{"id": 3, "priority": 0}]}]}. #91354 (alesapin).
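As a rough illustration of the byte layout described above, the following Python sketch assembles an rcfg request. It assumes the length prefix is a 4-byte big-endian integer (the entry says only that the length is big-endian, not how wide it is), and the helper name build_rcfg_request is hypothetical.

```python
import json
import struct


def build_rcfg_request(command: dict) -> bytes:
    """Build the bytes for an rcfg request: b"rcfg" + length + JSON.

    Assumption: the length is encoded as a 4-byte big-endian integer.
    The changelog entry does not state the width, so verify it against
    the Keeper documentation before relying on this.
    """
    body = json.dumps(command).encode("utf-8")
    return b"rcfg" + struct.pack(">i", len(body)) + body


# A shortened version of the example argument from the entry above.
request = build_rcfg_request({
    "preconditions": {"leaders": [1, 2], "members": [1, 2, 3, 4, 5]},
    "actions": [{"transfer_leadership": [3]}, {"remove_members": [1, 2]}],
})
```

The resulting bytes would then be written to the Keeper TCP port; how the response is framed is not covered by the entry, so it is left out here.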
Add function reverseBySeparator which reverses the order of substrings in a string separated by a specified separator. Close #91463. #91780 (Xuewei Wang).
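The behavior described for reverseBySeparator can be modeled in a few lines of Python. This is only a sketch of the stated semantics, not the ClickHouse implementation; the function name reverse_by_separator and the handling of an empty separator are assumptions.

```python
def reverse_by_separator(text: str, separator: str) -> str:
    """Reverse the order of the substrings of `text` delimited by `separator`."""
    if separator == "":
        # Assumption: with an empty separator there is nothing to split on,
        # so the input is returned unchanged.
        return text
    return separator.join(reversed(text.split(separator)))


print(reverse_by_separator("www.google.com", "."))  # com.google.www
```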
Adds a new setting max_insert_block_size_bytes which controls the formation of inserted blocks in finer detail. #92833 (Kirill Kopnev).
It is possible to execute DDL queries with ON CLUSTER clause for a Replicated database if the ignore_on_cluster_for_replicated_database setting is enabled. In this case, the cluster name will be ignored. #92872 (Kirill).
Add files column to system.parts table that shows the number of files in each data part. #94337 (Match).
Adds a max-min fair scheduler for concurrency control. Provides better fairness under high oversubscription, where many queries compete for limited CPU slots. Short-running queries are not penalized by long-running queries that have accumulated more slots over time. Enabled by the concurrent_threads_scheduler server setting max_min_fair value. #94732 (Sergei Trifonov).
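To make the idea of max-min fairness concrete, here is a minimal Python sketch of the textbook progressive-filling allocation over integer CPU slots. It is not the ClickHouse scheduler; the function name, the query names and the slot counts are made up for illustration.

```python
def max_min_fair(capacity: int, demands: dict[str, int]) -> dict[str, int]:
    """Split `capacity` slots so that no query can get more without
    taking slots away from a query that already has fewer."""
    allocation = {name: 0 for name in demands}
    remaining = capacity
    unsatisfied = {name for name, demand in demands.items() if demand > 0}
    while remaining > 0 and unsatisfied:
        # Offer every unsatisfied query an equal share of what is left.
        share = max(remaining // len(unsatisfied), 1)
        for name in sorted(unsatisfied):
            if remaining == 0:
                break
            grant = min(share, demands[name] - allocation[name], remaining)
            allocation[name] += grant
            remaining -= grant
        # Queries whose demand is met drop out; their unused share is
        # redistributed in the next round.
        unsatisfied = {n for n in unsatisfied if allocation[n] < demands[n]}
    return allocation


slots = max_min_fair(10, {"short": 2, "medium": 4, "long": 10})
print(slots)  # {'short': 2, 'medium': 4, 'long': 4}
```

In this toy example the small requests are fully served, and the large one receives only what is left after everyone has had an equal chance.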
Added the ability for ClickHouse client to override TLS SNI when connecting to the server. #89761 (Matt Klein).
Setting use_skip_indexes_on_data_read is now enabled by default. This setting allows filtering in a streaming fashion, at the same time as reading, improving query performance and startup time. #93407 (Shankar Iyer).
Support more cases for push down from join ON condition when the filter uses inputs only from one side. Support ANY, SEMI, ANTI joins. #92584 (Dmitry Novik).
Allow using equivalent sets to push down filters for SEMI JOIN. Closes #85239. #92837 (Dmitry Novik).
Skip reading left side of hash join when right side is empty. Previously we were reading left side until first non-empty block, which might do a lot of work in case when there is heavy filtering or aggregation. #94062 (Alexander Gololobov).
Using the "fastrange" (Daniel Lemire) method for partitioning data inside the query pipeline. This could improve parallel sorting and JOINs. #93080 (Alexey Milovidov).
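For readers unfamiliar with the technique, fastrange maps a hash value to a bucket index with a multiplication and a shift instead of a modulo. The Python sketch below shows the 32-bit variant; it illustrates the general method only, and the bit width and call sites used inside ClickHouse are not stated in the entry.

```python
def fastrange32(hash_value: int, n: int) -> int:
    """Map a 32-bit hash to the range [0, n) without a division.

    The product hash_value * n is below n * 2**32, so dropping the low
    32 bits always yields a value in [0, n).
    """
    return ((hash_value & 0xFFFFFFFF) * n) >> 32


# Spread a few 32-bit values over 8 partitions.
partitions = [fastrange32(h, 8) for h in (0, 0x80000000, 0xFFFFFFFF)]
print(partitions)  # [0, 4, 7]
```

Unlike a modulo, this uses the high bits of the hash, so it relies on the hash function distributing those bits well.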
Improve performance of window functions when PARTITION BY matches or is a prefix of the sorting key. #87299 (Nikita Taranov).
Outer filter is pushed down into views which allows applying PREWHERE on local and remote nodes. Resolves #88189. #88316 (Igor Nikonov).
If a skip index used in a FINAL query is on a column that is part of the primary key, the additional step to check for primary key intersection in other parts is unnecessary and now not performed. Resolves #85897. #93899 (Shankar Iyer).
Optimize performance and memory usage for fractional LIMIT and OFFSET. #91167 (Ahmed Gouda).
Fix the use of the faster random read logic in the Parquet Reader V3 prefetcher. Closes #90890. #91435 (Arsen Muk).
Don't filter by virtual columns on constant filters. #91588 (c-end).
Reduce INSERT/merges memory usage with wide parts for very wide tables by enabling adaptive write buffers. Add support of adaptive write buffers for encrypted disks. #92250 (Azat Khuzhin).
Improved performance of full text search with text index and sparseGrams tokenizer by reducing the number of searched tokens in the index. #93078 (Anton Popov).
Function isValidASCII was optimized for positive outcomes, i.e. all-ASCII input values. #93347 (Robert Schulze).
The read-in-order optimization now recognizes when ORDER BY columns are constant due to WHERE conditions, enabling efficient reverse-order reads. This benefits multi-tenant queries like WHERE tenant='42' ORDER BY tenant, event_time DESC which can now use InReverseOrder instead of requiring a full sort. #94103 (matanper).
Introduce Enum AST specialized class to store value parameters in (string, integer) pairs instead of ASTLiteral children to optimize memory consumption. #94178 (Ilya Yatsishin).
Distributed index analysis on multiple replicas. Beneficial for shared storage and huge amount of data in cluster. This is applicable for SharedMergeTree (ClickHouse Cloud) and could be applicable for other types of MergeTree tables on a shared storage. #86786 (Azat Khuzhin).
Reduce overhead of join runtime filters by disabling them when too many bits are set in the bloom filter or when too few rows are filtered out at runtime. #91578 (Alexander Gololobov).
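The two conditions above amount to a simple cost check. The Python sketch below shows one way such a check could look; the function name and both threshold values are hypothetical and are not taken from ClickHouse.

```python
def should_disable_runtime_filter(
    bits_set: int,
    total_bits: int,
    rows_checked: int,
    rows_filtered: int,
    max_fill_ratio: float = 0.5,       # hypothetical threshold
    min_filtered_ratio: float = 0.05,  # hypothetical threshold
) -> bool:
    """Return True when probing the filter is unlikely to pay off."""
    # A bloom filter with most bits set matches almost every key.
    if total_bits > 0 and bits_set / total_bits > max_fill_ratio:
        return True
    # If very few rows are rejected, the probe costs more than it saves.
    if rows_checked > 0 and rows_filtered / rows_checked < min_filtered_ratio:
        return True
    return False
```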
Use an in-memory buffer for correlated subqueries input to avoid evaluating it multiple times. Part of #79890. #91205 (Dmitry Novik).
Allow all replicas to steal orphaned ranges in parallel replicas reading. This improves load balancing and reduces long-tail latency. #91374 (zoomxi).
External aggregation/sorting/join now respects query setting temporary_files_codec in all contexts. Fixed missing profile events for grace hash join. #92388 (Vladimir Cherkasov).
Make query memory usage detection for spilling to disk during aggregation/sorting more robust. #92500 (Azat Khuzhin).
Estimate total rows count and NDV (number of distinct values) statistics of aggregation key columns. #92812 (Alexander Gololobov).
Optimize postings list compression with simdcomp. #92871 (Peng Jian).
Refactor S3Queue Ordered mode processing with buckets. This should also improve performance, reducing the number of keeper requests. #92889 (Kseniia Sumarokova).
Functions mapContainsKeyLike and mapContainsValueLike can now leverage a text index on mapKeys() or mapValues(), respectively. #93049 (Michael Jarrett).
Reduce memory usage on non-Linux systems (enable immediate purging of jemalloc dirty pages). #93360 (Eduard Karacharov).
Force purging of jemalloc arenas in case the ratio of dirty pages size to max_server_memory_usage exceeds memory_worker_purge_dirty_pages_threshold_ratio. #93500 (Eduard Karacharov).
Parse lower and upper bounds of file names corresponding to position deletes from Iceberg manifest file entries for better selection of corresponding data files. #93980 (Daniil Ivanik).
Add two more settings to control the maximum number of dynamic subcolumns in a JSON column. The first is the MergeTree setting merge_max_dynamic_subcolumns_in_compact_part (similar to the already added merge_max_dynamic_subcolumns_in_wide_part), which limits the number of dynamic subcolumns created during a merge into a Compact part. The second is the query-level setting max_dynamic_subcolumns_in_json_type_parsing, which limits the number of dynamic subcolumns created during parsing of JSON data and so allows specifying the limit on insert. #94184 (Pavel Kruglov).
Slightly optimize squashing of JSON columns for some cases. #94247 (Pavel Kruglov).
Lower the thread pool queue sizes based on the production experience. Add an explicit memory consumption check before reading any data from the MergeTree. #94692 (Nikita Mikhaylov).
Make sure the scheduler would prefer MemoryWorker thread under the CPU starvation, because it protects ClickHouse process from an existential threat. #94864 (Nikita Mikhaylov).
Run purging of jemalloc dirty pages in a different thread from main thread of MemoryWorker. If purging is slow, it could delay updates of RSS usage which could lead to out of memory kills of the process. Introduce new config memory_worker_purge_total_memory_threshold_ratio to start purging dirty pages based on ratio of total memory usage. #94902 (Antonio Andelic).
system.blob_storage_log is now available for Azure Blob Storage. #93105 (Alexey Milovidov).
Implement blob_storage_log for Local and HDFS. Fix an error when S3Queue used something other than the disk name for logging in blob_storage_log. Add error_code column to blob_storage_log. Split the test configuration file to simplify local testing. #93106 (Alexey Milovidov).
clickhouse-client and clickhouse-local will highlight digit groups (thousands, millions, etc.) inside numeric literals while typing. This closes #93100. #93108 (Alexey Milovidov).
Adds support in clickhouse-client for command-line arguments with a space surrounding the equals sign. Closes #93077. #93174 (Cole Smith).
With <interactive_history_legacy_keymap>true</interactive_history_legacy_keymap>, the CLI client can now fall back to Ctrl-R for regular search like before, while Ctrl-T does fuzzy search. #87785 (Larry Snizek).
The statement to clear caches SYSTEM DROP [...] CACHE gave the false impression that the statement disables the cache. ClickHouse now supports statement SYSTEM CLEAR [...] CACHE which is more obvious. The old syntax remains available. #93727 (Pranav Tiwari).
Support multiple columns as primary key in EmbeddedRocksDB. Closes #32819. #33917 (usurai).
It is now possible to use non-constant IN for scalars (queries like val1 NOT IN if(cond, val2, val3)). #93495 (Yarik Briukhovetskyi).
Prevent x-amz-server-side-encryption headers from being propagated to HeadObject, UploadPart & CompleteMultipartUpload S3 requests as they're not supported. #64577 (Francisco J. Jurado Moreno).
Added syntax ALTER TABLE <table> ATTACH PART <part_name> FROM <directory_name> for ALTER query. It allows the attachment of parts from the arbitrary subdirectory of the detached/ directory. It can be useful for attaching parts with custom prefixes (such as broken-on-start, unexpected, etc.) that were detached by mistake and needed only to be attached back without manual intervention. Previously, manual renaming of directories on the filesystem was required. #74816 (Anton Popov).
Optimize space reservation in filesystem cache. FileCache::collectCandidatesForEviction will be executed without unique lock. #82764 (Kseniia Sumarokova).
Support composite rotation strategy (size + time) for server log. #87620 (Jianmei Zhang).
CLI client can now specify <warnings>false</warnings> instead of the command line --no-warnings. #87783 (Larry Snizek).
Add support for the avg aggregate function with Date, DateTime and Time values as arguments. Closes #82267. #87845 (Yarik Briukhovetskyi).
The optimization use_join_disjunctions_push_down is enabled by default. #89313 (Alexey Milovidov).
Support more table engines and data source kinds in the correlated subqueries. Closes #80775. #90175 (Dmitry Novik).
Correctly handle the gap in Keeper log entries if logs are before the last committed index. #90403 (Antonio Andelic).
Improve min_free_disk_bytes_to_perform_insert setting to work correctly with JBOD volumes. #90878 (Aleksandr Musorin).
Make it possible to specify storage_class_name setting in named collections for S3 table engine and s3 table function. #91926 (János Benjamin Antal).
Support inserting auxiliary zookeeper by system.zookeeper. #92092 (RinChanNOW).
Add new metrics for the keeper: KeeperChangelogWrittenBytes, KeeperChangelogFileSyncMicroseconds, KeeperSnapshotWrittenBytes and KeeperSnapshotFileSyncMicroseconds profile events as well as KeeperBatchSizeElements and KeeperBatchSizeBytes histogram metrics. #92149 (Miсhael Stetsyuk).
Add a new setting, trace_profile_events_list, which limits tracing with trace_profile_event to the specified list of event names. This allows more precise data collection on large workloads. #92298 (Alexey Milovidov).
Support SYSTEM NOTIFY FAILPOINT for pausable failpoints. Support SYSTEM WAIT FAILPOINT fp PAUSE/RESUME. #92368 (Shaohua Wang).
Add creation (implicit/explicit) column to system.data_skipping_indices. #92378 (Raúl Marín).
Allow passing the description of columns for YTsaurus dyn tables to the dictionary source. #92391 (MikhailBurdukov).
In #63985, we made it possible to specify all the parameters needed for TLS configuration on a per-port basis (see composable protocols), so we don't have to rely on global TLS config. However, the implementation still implicitly requires a global openSSL.server config section to exist, which conflicts with setups where different TLS configurations are needed for different ports. For example, in keeper-in-server deployments, we need separate TLS configs for inter-keeper communication and clickhouse client connections. #92457 (Miсhael Stetsyuk).
Introduce a new setting input_format_binary_max_type_complexity that limits the total number of type nodes that can be decoded in binary format to prevent malicious payloads. #92519 (Raufs Dunamalijevs).
Reflect running tasks in system.background_schedule_pool{,_log}. Add documentation. #92587 (Azat Khuzhin).
Execute current query in Ctrl+R search in client if no history match found. #92749 (Azat Khuzhin).
Support EXPLAIN indices = 1 as an alias for EXPLAIN indexes = 1. Closes #92483. #92774 (Pranav Tiwari).
Parquet reader now allows reading Tuple or Map columns as JSON: select x from file(f.parquet, auto, 'x JSON') works even if the type of column x in f.parquet is tuple or map. #92864 (Michael Kolupaev).
Fall back to read-write copy for Azure Blob Storage when native copy fails with BadRequest (e.g. invalid block list). Previously this was only done for the Unauthorized error, which was seen while copying a blob to a different storage account. But the "The specified block list is invalid" error also occurs sometimes, so the condition was updated to fall back to read and write for all native copy failures. #92888 (Smita Kulkarni).
Fix EC2 metadata endpoint throttling when running many concurrent S3 queries with EC2 instance profile credentials. Previously, each query created its own AWSInstanceProfileCredentialsProvider, causing concurrent requests to the EC2 metadata service which could result in timeouts and HTTP response code: 403 errors. Now the credentials provider is cached and shared across all queries. #92891 (Sav).
Rework insert_select_deduplicate setting to add an ability to keep backward compatibility. #92951 (Sema Checherinda).
Log background tasks that are slower than average (background_schedule_pool_log.duration_threshold_milliseconds=30) to avoid excessive tasks logging. #92965 (Azat Khuzhin).
In previous versions, some of C++ function names were displayed incorrectly ("mangled") in the system.trace_log and system.symbols, and the demangle function didn't process them well. Closes #93074. #93075 (Alexey Milovidov).
Introduced the backup_data_from_refreshable_materialized_view_targets backup setting to skip backing up refreshable materialized views. RMVs with the APPEND refresh strategy are always backed up. #93076 (Julia Kartseva). #93658 (Julia Kartseva).
Use minimal debug info instead of no debug info for heavy translation units, such as functions. #93079 (Alexey Milovidov).
Added MinIO compatibility support to AWS S3 C++ SDK by implementing error code mapping for MinIO-specific errors. This change allows ClickHouse to properly handle and retry MinIO server errors when using MinIO deployments instead of AWS S3, improving reliability for users running object storage on self-hosted MinIO clusters. #93082 (XiaoBinMu).
Write symbolized jemalloc profiles (eliminating the need for a binary during heap profile generation). #93099 (Azat Khuzhin).
Improve the UX of SYSTEM INSTRUMENT ADD/REMOVE: use String literals for function names, patch all functions that match and allow using function_name in REMOVE. #93345 (Pablo Marcos).
Add a new setting materialize_statistics_on_merge which enables/disables materializing statistics during merge. The default value is 1. #93379 (Han Fei).
ClickHouse can now parse SELECT without parentheses around DESCRIBE SELECT queries. Closes #58382. #93429 (Yarik Briukhovetskyi).
Add setting type_json_allow_duplicated_key_with_literal_and_nested_object to allow duplicated paths in JSON where one is a literal and another is a nested object, e.g. {"a" : 42, "a" : {"b" : 42}}. Some data could have been created before the restriction on duplicated paths was added in https://github.com/ClickHouse/ClickHouse/pull/79317, and further manipulation of this data can now lead to errors. With this setting, such old data can still be used without errors. #93604 (Pavel Kruglov).
Don't print values of simple types on separate lines in Pretty JSON. #93836 (Pavel Kruglov).
When there are many alter table ... modify setting ... statements, it's possible not to acquire lock for 5 seconds. Better to return timeout than logical error. #93856 (Han Fei).
Prevent excessive output on a syntax error. Before this change, it output the whole SQL script, which could contain a lot of queries. #93876 (Alexey Milovidov).
Do proper byte size calculation of the check request with stats in Keeper. #93907 (Mikhail Artemenko).
Added use_hash_table_stats_for_join_reordering setting to control whether runtime hash table size statistics are used for join reordering. This setting is enabled by default, preserving the existing behavior of collect_hash_table_stats_during_joins. #93912 (Vladimir Cherkasov).
Users can now partially view nested global server settings in the system.server_settings table (e.g. logger.level). This only covers settings with a fixed structure (no lists, enumerations, repetitions etc.). #94001 (Hechem Selmi).
When Keeper detects broken snapshot or inconsistent changelogs, throw exception instead of manually aborting or cleaning up files automatically. This should lead to a safer behaviour of Keeper relying on manual intervention. #94168 (Antonio Andelic).
Fix possible leftovers when CREATE TABLE fails. #94174 (Azat Khuzhin).
Fix uninitialized memory access (a bug in OpenSSL) when password protected TLS key is used. #94182 (Konstantin Bogdanov).
Support more generic partitioning for S3Queue ordered mode. #94321 (Bharat Nallan).
Added alias use_statistics for setting allow_statistics_optimize. This is more consistent with existing settings use_primary_key and use_skip_indexes. #94366 (Robert Schulze).
Enabled setting input_format_numbers_enum_on_conversion_error for conversion from Numbers to Enums to check whether the element exists. #94384 (Elmi Ahmadov).
In S3(Azure)Queue ordered mode clean up failed nodes by tracking limits (before that was done only in Unordered mode for both failed and processed, so now this will also be done for Ordered but only for failed nodes). #94412 (Kseniia Sumarokova).
Enable access management for default user in clickhouse-local. The default user in clickhouse-local was missing the access_management privilege, which caused operations like DROP ROW POLICY IF EXISTS to fail with ACCESS_DENIED error, even though the user should be unrestricted. #94501 (Alexey Milovidov).
Enable named collection for YTsaurus dictionaries and tables. #94582 (MikhailBurdukov).
Add support for SQL-defined named collections in BACKUP/RESTORE for S3 and Azure Blob Storage. Closes #94604. #94605 (Pablo Marcos).
Support bucketing based on partition key for S3Queue in ordered mode. #94698 (Bharat Nallan).
Add an asynchronous metric with the longest running merge elapsed time. #94825 (Raúl Marín).
Add belonging file check before apply position delete using IcebergBitmapPositionDeleteTransform. #94897 (Yang Jiang).
Now view_duration_ms shows the time when group was active, not the sum of the threads duration in it. #94966 (Sema Checherinda).
Remove the limit on the number of search tokens in the hasAnyTokens and hasAllTokens functions, which was previously 64. Example: SELECT count() FROM table WHERE hasAllTokens(text, ['token_1', 'token_2', [...], 'token_65']); Previously, this query would result in a BAD_ARGUMENTS error because there are 65 search tokens. With this PR, the limit has been removed completely and the same query runs without an error. #95152 (Elmi Ahmadov).
Add a setting input_format_numbers_enum_on_conversion_error for conversion from Numbers to Enums to check whether the element exists. Closes: #56144. #56240 (Nikolay Degterinsky).
Share format parser resources between data file and position delete file reading in Iceberg tables to reduce memory allocations. #94701 (Yang Jiang).
Bug Fix (user-visible misbehavior in an official stable release)
Fixes a bug where predefined query handlers would have trailing whitespace interpreted as data during inserts. #83604 (Fabian Ponce).
Fix INCOMPATIBLE_TYPE_OF_JOIN error for Join storage and outer to inner join optimization applied. Resolves #80794. #84292 (Vladimir Cherkasov).
Fix exception "Invalid number of rows in Chunk" when using hash join with allow_experimental_join_right_table_sorting enabled. #86440 (yanglongwei).
Always replace file names with hashes in MergeTree if the filesystem is case insensitive. Previously, on systems with a case-insensitive filesystem (like macOS), it could lead to data corruption when several column/subcolumn names differ only in case. #86559 (Pavel Kruglov).
Add a full permissions check on the create stage for the underlying query inside a materialized view. #89180 (pufit).
Cache schema only for the file it was inferred from in globs instead of all files during schema inference. Closes #91745. #92006 (Pavel Kruglov).
Fix the Couldn't pack tar archive: Failed to write all bytes error caused by an incorrect archive entry size header. Fixes #89075. #92122 (Julia Kartseva).
Release request stream in insert select to prevent closing http connection. #92175 (Sema Checherinda).
Fix logical error for queries with multiple JOINs with USING clause and join_use_nulls. #92251 (Vladimir Cherkasov).
Fix count_distinct_optimization pass over window functions and over multiple arguments. #92376 (Raúl Marín).
Fix "Cannot write to finalized buffer" error when using certain aggregate functions with window functions. Closes #91415. #92395 (Jimmy Aguilar Mena).
Fix logical error with CREATE TABLE ... AS urlCluster() and database engine Replicated. Closes #92216. #92418 (Kseniia Sumarokova).
Inherit source part serialization info settings during mutation in MergeTree. It fixes possible incorrect result of the query over mutated part after changes in data types serialization. #92419 (Pavel Kruglov).
Fix possible conflict between a column and a subcolumn with the same name, leading to the use of the wrong serialization and query failures. Closes #90219. Closes #85161. #92453 (Pavel Kruglov).
Fix LOGICAL_ERRORs caused by unwanted modification of the query plan when converting an outer join to an inner join. Also relax the requirements of the optimization so that it can be applied when injective functions are applied to the aggregating keys during joins. #92503 (János Benjamin Antal).
Fix possible error SIZES_OF_COLUMNS_DOESNT_MATCH during sorting of an empty tuple column. Closes #92422. #92520 (Pavel Kruglov).
Fix deadlock for SHOW CREATE DATABASE for Backup database. #92541 (Azat Khuzhin).
Use proper error code when validating hypothesis index. #92559 (Raúl Marín).
Fix dynamic subcolumns resolution in column aliases in analyzer. Previously dynamic subcolumn in column alias was wrapped in getSubcolumn and in some cases could be not resolved at all. Closes #91434. #92583 (Pavel Kruglov).
Prevent crash in tokens() with null second argument. #92586 (Raúl Marín).
Fix potential crash caused by in place mutation of underlying const PREWHERE columns. This could've happened at column shrinking (IColumn::shrinkToFit) or filtering (IColumn::filter), which could've triggered concurrently from several threads. #92588 (Arsen Muk).
Creating and materializing text indexes on tables containing large parts (over 4,294,967,295 rows) is temporarily disabled. This limitation prevents incorrect query results, as the current index implementation does not yet support such large parts. #92644 (Anton Popov).
Fixes a logical error Too large size (A) passed to allocator while executing JOINs. Closes #92043. #92667 (Yarik Briukhovetskyi).
Remove a bug that ngrambf_v1 indexes with ngram length (1st parameter) > 8 would throw an exception. #92672 (Robert Schulze).
Reworks incorrect logic in access grant checks for wildcard grants. The previous attempt https://github.com/ClickHouse/ClickHouse/pull/90928 addressed a critical vulnerability but ended up being too restrictive, resulting in some wildcard GRANT statements failing due to unrelated revokes. #92725 (pufit).
Fix bug in data skipping logic when not match(...) is used in WHERE causing incorrect results. Closes #92492. #92726 (Nihal Z. Miaji).
Do not attempt to delete temporary directories at startup if a MergeTree table is created over a read-only disk. #92748 (Alexey Milovidov).
Fix "Cannot add action to empty ExpressionActionsChain" for ALTER TABLE REWRITE PARTS (v2). #92754 (Azat Khuzhin).
Fix logical error Failed to set file processing within 100 retries in storage S3Queue in Ordered mode. It is now replaced with a warning. This error could happen before version 25.10 if the keeper session expired; however, it will still be a warning in 25.10+ versions, as it is still theoretically possible to get this error in case of high processing concurrency in Ordered mode. #92814 (Kseniia Sumarokova).
In the S3 table engine, we should avoid caching the partition key if there are non-deterministic functions. #92844 (Miсhael Stetsyuk).
Fix possible error FILE_DOESNT_EXIST after mutation of a sparse column with ratio_of_defaults_for_sparse_serialization=0.0. Closes #92633. #92860 (Pavel Kruglov).
Fix parquet schema inference in the old parquet reader (not used by default) when a JSON column comes after a Tuple column. Fix the old parquet reader (not used by default) failing on empty tuples. #92867 (Michael Kolupaev).
Fix logical error with multiple joins on constant condition and join_use_nulls, close #92640. #92892 (Vladimir Cherkasov).
Fix possible error NOT_FOUND_COLUMN_IN_BLOCK during insert into a table with subcolumn in partition expression. Closes #93210. Closes #83406. #92905 (Pavel Kruglov).
Fix error NO_SUCH_COLUMN_IN_TABLE in Merge engine over tables with aliases. Closes #88665. #92910 (Pavel Kruglov).
Fix NULL != NULL case for full_sorting_join on LowCardinality(Nullable(T)) column. #92924 (Vladimir Cherkasov).
Fixed several crashes during merges of text indexes in MergeTree tables. #92925 (Anton Popov).
Restore LowCardinality wrappers on SET expression results if needed during TTL aggregation to prevent exceptions during table optimization. #92971 (Seva Potapov).
Fix logical error during index analysis when empty array is used in has function. Closes #92906. #92995 (Nihal Z. Miaji).
Fix possible hang when terminating the background schedule pool (may lead to server hangs on shutdown). #93008 (Azat Khuzhin).
Fix possible error FILE_DOESNT_EXIST after sparse column mutation when setting ratio_of_defaults_for_sparse_serialization was changed to 1.0 via alter. #93016 (Pavel Kruglov).
Fix bug in data skipping logic when not materialize(...) or not CAST(...) is used in WHERE causing incorrect results. Closes #88536. #93017 (Nihal Z. Miaji).
Fix possible usage of outdated parts due to TOCTOU race for shared parts. #93022 (Azat Khuzhin).
Fix crash when deserialising malformed groupConcat aggregate state with out-of-bounds offsets. #93028 (Raufs Dunamalijevs).
Fix leaving the connection in a broken state after preliminary cancellation of distributed queries. #93029 (Azat Khuzhin).
Fix join results when the right-side join key is a sparse column. This closes #92920. The bug is only known to reproduce with set compatibility='23.3'. #93038 (Amos Bird).
Fix possible Cannot finalize buffer after cancellation in estimateCompressionRatio(). Fixes: #87380. #93068 (Azat Khuzhin).
Fixed merges of text indexes built on top of the complex expressions (such as concat(col1, col2)). #93073 (Anton Popov).
Fix logical error in some cases triggered when join runtime filters are added to query plan. It was caused by incorrectly returning duplicated const columns from one of join sides. #93144 (Alexander Gololobov).
Special function __applyFilter used by join runtime filters was returning ILLEGAL_TYPE_OF_ARGUMENT in some valid cases. #93187 (Alexander Gololobov).
Prevent different interpolated columns from collapsing into the same column in a block when interpolated columns are effectively aliases of the same column. #93197 (Yakov Olkhovskiy).
Do not add runtime filter when joining with already filled right table. #93211 (Alexander Gololobov).
Remove unused columns when the projection is rebuilt during the merge. It reduces memory usage and creates fewer temporary parts. #93233 (Nikolai Kochetov).
Fix unused columns removal from subqueries in the presence of a scalar correlated subquery. Before the fix column could have been removed if it was used only in the correlated subquery, and the query would fail with NOT_FOUND_COLUMN_IN_BLOCK error. #93273 (Dmitry Novik).
Fix possible missing subcolumn in MV during alter of source table. Closes #93231. #93276 (Pavel Kruglov).
Fix the Merge table engine query planning with the analyzer that could throw ILLEGAL_COLUMN for hostName() when merging local and remote/Distributed tables. Closes #92059. #93286 (Jinlin).
Fix a case where NOT IN with non-constant array arguments returned the wrong value, and add support for non-constant Array functions. Closes #14980. #93314 (Yarik Briukhovetskyi).
Fixed rebuilding of text indexes created on top of subcolumns. #93326 (Anton Popov).
Fixed handling of empty array as a second argument in hasAllTokens and hasAnyTokens functions. #93328 (Anton Popov).
Fix logical error when runtime filters are used in a query with totals for right side table. #93330 (Alexander Gololobov).
The server no longer crashes if function tokens is called with non-const tokenizer parameters (the 2nd, 3rd, or 4th parameter), e.g., SELECT tokens(NULL, 1, materialize(1)). #93383 (Robert Schulze).
Fixed integer overflow vulnerability in groupConcat state deserialisation that could cause memory safety issues with crafted aggregate states. #93426 (Raufs Dunamalijevs).
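The entry above does not describe the exact overflow, so the following is only a generic sketch of the class of bug, not the groupConcat code. It assumes a hypothetical state with an `offset` and a `length` read from untrusted input and a payload of `size` bytes, and emulates 64-bit unsigned wraparound in Python to show why a bounds check should subtract rather than add.

```python
U64_MASK = (1 << 64) - 1  # emulate UInt64 arithmetic, which wraps modulo 2**64


def naive_in_bounds(offset: int, length: int, size: int) -> bool:
    """Buggy check: offset + length can wrap around and look small."""
    return ((offset + length) & U64_MASK) <= size


def safe_in_bounds(offset: int, length: int, size: int) -> bool:
    """Overflow-free check: compare length against the remaining space."""
    return offset <= size and length <= size - offset


# Hypothetical crafted values: offset + length wraps to 8, which is <= size.
size = 100
offset = 16
length = U64_MASK - 7

naive_result = naive_in_bounds(offset, length, size)  # accepts the bad range
safe_result = safe_in_bounds(offset, length, size)    # rejects it
```

The safe variant never computes a sum of untrusted values, so there is nothing to overflow.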
Fixed text index analysis on array columns when the index contains no tokens (all arrays are empty or all tokens are skipped by the tokenizer). #93457 (Anton Popov).
Avoid OAuth login in ClickHouse Client when the username/password are in the connection string. #93459 (Krishna Mannem).
Fix Azure ADLS Gen2 vended credentials support in DataLakeCatalog - parse adls.sas-token.* keys from Iceberg REST catalogs and fix ABFSS URL parsing. #93477 (Karun Anantharaman).
Fix GLOBAL IN support with analyzer (previously set was created on the remote node again). #93507 (Azat Khuzhin).
Fix extracting subcolumn during deserialization directly into Sparse columns. #93512 (Pavel Kruglov).
Fixed direct reading from text index with duplicate search queries. #93516 (Anton Popov).
Fix for NOT_FOUND_COLUMN_IN_BLOCK error when runtime filter is enabled and joined tables have the same column returned multiple times (e.g. SELECT a, a, a FROM t). #93526 (Alexander Gololobov).
Fix a bug where clickhouse-client would ask for password twice when connecting using ssh. #93547 (Isak Ellmer).
Make sure that ZooKeeper is finalized on shutdown (fix possible hang on shutdown in very unlikely cases). #93602 (Azat Khuzhin).
Fix LOGICAL_ERROR when restoring ReplicatedMergeTree with deduplication race. #93612 (Pablo Marcos).
Fix using Sparse column for TTL update during direct deserialization into Sparse columns in some input formats. It fixes possible logical error Unexpected type of result TTL column. #93619 (Pavel Kruglov).
Fixed h3 index functions sometimes crashing or getting stuck when called on invalid inputs. #93657 (Michael Kolupaev).
Using an ngram_bf index on non-UTF-8 data led to an uninitialized memory read, with values that could reside in the resulting index structure. Closes #92576. #93663 (Alexey Milovidov).
Validate that the decompressed buffer size is as expected. #93690 (Raúl Marín).
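As an illustration of the idea in the entry above, here is a minimal sketch of size validation after decompression. It uses Python's `zlib` rather than ClickHouse's codecs, and `decompress_checked` and its `expected_size` argument (standing in for a size declared in a block header) are hypothetical names.

```python
import zlib


def decompress_checked(compressed: bytes, expected_size: int) -> bytes:
    """Decompress and verify the output has exactly the declared size."""
    if expected_size < 0:
        raise ValueError("expected_size must be non-negative")
    decompressor = zlib.decompressobj()
    # Cap the output at expected_size + 1 bytes so an oversized stream is
    # detected without producing much more data than was declared.
    data = decompressor.decompress(compressed, expected_size + 1)
    if len(data) != expected_size or not decompressor.eof:
        raise ValueError(
            f"decompressed size mismatch: expected {expected_size} bytes"
        )
    return data


payload = b"abc" * 100
compressed = zlib.compress(payload)
restored = decompress_checked(compressed, len(payload))
```

A stream that yields fewer or more bytes than declared raises `ValueError` instead of returning a buffer of unexpected length.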
Prevent users from getting the list of columns of a table through the Merge table engine without a SHOW COLUMNS permission check. #93695 (János Benjamin Antal).
Fixed materialization of skip indexes created on top of subcolumns. #93708 (Anton Popov).
We store storages' shared pointers in QueryPipeline::resources::storage_holders to make sure that the IStorage objects are not destroyed while PipelineExecutor is alive. #93746 (Miсhael Stetsyuk).
Fix attaching Replicated DBs when the interserver host changed after restarting. #93779 (Tuan Pham Anh).
Fix assert !read_until_position in ReadBufferFromS3 which happened when cache is enabled. #93809 (Kseniia Sumarokova).
Fix logical error in a rare case when empty tuple is used with Map column. Closes #93784. #93814 (Nihal Z. Miaji).
Fixed _part_offset corruption when projections are rebuilt during merges, and optimized projection processing by avoiding unnecessary reads of the _part_offset column and skipping unneeded columns in projection calculations. This continues the optimizations introduced in #93233. #93827 (Amos Bird).
In https://github.com/ClickHouse/ClickHouse/pull/89173, an extra field was added to the structure that TraceSender sends through an internal pipe, but the buffer size was not updated. As a result, more data was written to the buffer than buffer_size, which caused multiple flushes. Because TraceSender::send is called from different threads, flushes from different threads could interleave, breaking the invariant that the receiving end (TraceCollector) relies on. This is now fixed. #93966 (Miсhael Stetsyuk).
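The failure mode described above can be sketched with a small deterministic simulation. This is not the TraceSender code: the record size of 12 bytes, the stale buffer size of 8 bytes, and the helper names are assumptions chosen only to show how a too-small buffer splits one record across two flushes that another writer can interleave with.

```python
from itertools import zip_longest
from typing import List


def flush_chunks(message: bytes, buffer_size: int) -> List[bytes]:
    """Chunks a buffered writer emits: one flush per filled buffer, plus the tail."""
    return [message[i:i + buffer_size] for i in range(0, len(message), buffer_size)]


def interleave(a: List[bytes], b: List[bytes]) -> bytes:
    """Worst-case schedule: two threads alternate flushes into the same pipe."""
    out = bytearray()
    for x, y in zip_longest(a, b, fillvalue=b""):
        out += x + y
    return bytes(out)


RECORD_SIZE = 12  # hypothetical record size after the extra field was added
msg_a = b"A" * RECORD_SIZE
msg_b = b"B" * RECORD_SIZE

# Buffer still sized for the old record: each message needs two flushes.
stale = interleave(flush_chunks(msg_a, 8), flush_chunks(msg_b, 8))
# Buffer sized for the new record: one flush per message.
fixed = interleave(flush_chunks(msg_a, RECORD_SIZE), flush_chunks(msg_b, RECORD_SIZE))

# The receiver reads fixed-size records from the pipe.
first_record_stale = stale[:RECORD_SIZE]  # mixes bytes from both senders
first_record_fixed = fixed[:RECORD_SIZE]  # a whole record from one sender
```

With a buffer at least as large as the record, each record reaches the pipe in a single write, so the reader never sees a record assembled from two senders.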
Fix type conversion to super type during the join operation of the storage Join with USING clause. Fixes #91672. Fixes #78572. #94000 (Dmitry Novik).
Fix for FilterStep not properly added when join runtime filter is applied over Merge table. #94021 (Alexander Gololobov).
A SELECT query with a predicate over multiple columns that have bloom filter skip indexes, combining both OR and NOT conditions, could return inconsistent results. This is now fixed. #94026 (Shankar Iyer).
Fix the crash during filter analysis in the presence of OUTER JOIN. Fixes #90979. #94080 (Dmitry Novik).
Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel (max_threads > 1, which is the default). #94095 (Azat Khuzhin).
Fix crash caused by exception thrown from a socket.setBlocking(true) call inside SCOPE_EXIT. #94100 (Miсhael Stetsyuk).
Fix data loss when DROP PARTITION removes parts created by later log entries in ReplicatedMergeTree. #94123 (Tuan Pham Anh).
Fixed parquet reader v3 incorrectly handling arrays that cross page boundaries. This happens, e.g., for files written by Arrow without enabling page statistics or page index. It affects only columns of Array data type. The likely symptom is that one array in every ~1 MB of data gets truncated. Before this fix, use this setting as a workaround: input_format_parquet_use_native_reader_v3 = 0. #94125 (Michael Kolupaev).
Fix too many watches in ReplicatedMergeTree while waiting for log entry. #94133 (Azat Khuzhin).
Functions arrayShuffle, arrayPartialShuffle, and arrayRandomSample now materialize const columns so that different rows get different results. #94134 (Joanna Hulboj).
Fix data race in evaluating table functions in materialized views. #94171 (Alexey Milovidov).
Fix nullptr dereference in PostgreSQL database engines (when the query is incorrect). Closes #92887. #94180 (Alexey Milovidov).
Fix memory leak in refreshable materialized views using SELECT queries with multiple subqueries. #94200 (Antonio Andelic).
Remove the wrong noexcept specifier at HashTable copy assignment that may lead to crash (std::terminate) on memory exceptions. #94275 (Nikita Taranov).
Fix std::length_error that occurred when creating a projection with duplicate columns in GROUP BY (e.g., GROUP BY c0, c0) and inserting data with optimize_row_order enabled. Closes #94065. #94277 (Alexey Milovidov).
Fix obscure bug in ZooKeeper client on connect that could lead to hangs and crashes. #94320 (Azat Khuzhin).
Fix function to subcolumns optimization not applied to subcolumns. #94323 (Pavel Kruglov).
Fix possibly incorrect result in nested RIGHT JOINs when enable_lazy_columns_replication is enabled. The bug caused all rows in replicated columns to incorrectly return the same value instead of their distinct values. Closes #93891. #94339 (Vladimir Cherkasov).
Fix filter pushdown for SEMI JOIN using equivalence sets. Do not push the filter down if argument types have changed. Fixes #93264. #94340 (Dmitry Novik).
Fix usage of DeltaLake CDF with the DataLake database engine (delta lake catalogs integration). Closes #94122. #94342 (Kseniia Sumarokova).
Fix incorrect value of current metric FilesystemCacheSizeLimit in case SLRU cache policy was used. #94363 (Kseniia Sumarokova).
Creating a Backup database engine with fewer than two arguments now returns a more descriptive error message (Wrong number of arguments instead of std::out_of_range: InlinedVector::at(size_type) const failed bounds check.). #94374 (Robert Schulze).
Ignore impossible revokes of global grants on the database level for grants with grant option. #94386 (pufit).
Fix nullptr dereference with disabled send_profile_events. This feature was introduced recently for the ClickHouse Python driver. Closes #92488. #94466 (Alexey Milovidov).
Fix text index .mrk incompatibility during merges. #94494 (Peng Jian).
Fix use-after-free / uninitialized memory access when read_in_order_use_virtual_row is enabled: the code accessed index columns based on the full primary key size without checking whether the index was truncated. Closes #85596. #94500 (Alexey Milovidov).
Fix an error due to a type mismatch when sending external tables for subqueries with GLOBAL IN if the types are Nullable. Closes #94097. #94511 (Alexey Milovidov).
In previous versions, queries with multiple index conditions over the same expression could erroneously throw an exception Not found column. This is now fixed. Closes #60660. #94515 (Alexey Milovidov).
Creating a workload in another workload that is currently in use no longer causes a crash. #94599 (Sergei Trifonov).
Fix a crash during ANY LEFT JOIN optimization when isNotNull is evaluated on a missing column. #94600 (Molly).
Fix default expression evaluation when referencing other columns with computed defaults. #94615 (Alexey Milovidov).
Fix permission issues in BACKUP/RESTORE operations. #94617 (Pablo Marcos).
Fix crash due to incorrect type cast when the data type is Nullable(DateTime64). #94627 (Miсhael Stetsyuk).
Fixes a bug where certain distributed queries with ORDER BY could return ALIAS columns with swapped values (i.e., column a showing column b’s data and vice versa). #94644 (filimonov).
Preserve constant index granularity (use_const_adaptive_granularity) after Vertical merges. #94725 (Azat Khuzhin).
Fix mutation bug with scalar subqueries and table dependencies. If a table had dependencies (index or projections) over a column, scalar subqueries might be evaluated and cached without data and lead to incorrect changes. #94731 (Raúl Marín).
Fix AsynchronousMetrics cpu_pressure fallback on error. #94827 (Raúl Marín).
Add missing bounds checks to the getURLHostRFC function before dereferencing pointers. Previously, when an empty string was passed to domainRFC, it would read uninitialized memory, triggering MSan errors. #94851 (Alexey Milovidov).
Fix logical error in fractional LIMIT/OFFSET when using the old analyzer with Distributed tables. Closes #94712. #94999 (Ahmed Gouda).
Fix crash under some conditions when join runtime filters are enabled by default. #95000 (Alexander Gololobov).
Improve masking of passwords in URLs used in the table engine URL() and the table function url(). #95006 (Vitaly Baranov).
Function toStartOfInterval now works in the same way as toStartOfX, where X is Day, Week, Month, Quarter, or Year, when enable_extended_results_for_datetime_functions is enabled. #95011 (Kirill Kopnev).
Fix constant string comparisons not respecting the settings cast_string_to_date_time_mode, bool_true_representation, bool_false_representation, and input_format_null_as_default. Closes #91681. #95040 (Nihal Z. Miaji).
Fix incorrect monotonicity information for conversions from DateTime/integers to Time64. The conversion extracts the time-of-day component using toTime, which is not monotonic, but the ToDateTimeMonotonicity template incorrectly claimed it was, causing an "Invalid binary search result in MergeTreeSetIndex" exception in debug builds. #95125 (Alexey Milovidov).
Recreate the list of manifest file entries only when necessary (previously it was recreated on each iteration). #95162 (Daniil Ivanik).
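To see why time-of-day extraction is not monotonic, consider the following minimal sketch. It models timestamps as seconds since the epoch in UTC and time of day as the remainder modulo one day. This is an assumed simplification of toTime, not its implementation.

```python
SECONDS_PER_DAY = 86_400


def time_of_day(timestamp: int) -> int:
    """Seconds since midnight (UTC) for a Unix timestamp."""
    return timestamp % SECONDS_PER_DAY


def is_non_decreasing(values) -> bool:
    """True if every element is >= the previous one."""
    return all(a <= b for a, b in zip(values, values[1:]))


# 23:00 on day 0, then 01:00 on day 1: the timestamps increase...
timestamps = [82_800, SECONDS_PER_DAY + 3_600]
# ...but the extracted time of day decreases across midnight.
times = [time_of_day(t) for t in timestamps]

input_sorted = is_non_decreasing(timestamps)
output_sorted = is_non_decreasing(times)
```

An index that assumes the conversion preserves order would binary-search a sequence that is not actually sorted after conversion, which is consistent with the exception named in the entry above.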
Add a set of tools for profiling memory allocations in the ClickHouse SQL parser using jemalloc's heap profiling capabilities. #94072 (Ilya Yatsishin).
Added a tool that simplifies debugging of memory allocations in the parser. It reads the jemalloc stats.allocated metric before and after parsing a query into its AST representation to show what was allocated. It also supports a memory profiling mode that dumps a profile before and after parsing to build reports on where allocations occurred. #93523 (Ilya Yatsishin).