ClickHouse Release 26.5

ClickHouse
Jun 1, 2026 Β· 19 minutes read

Another month goes by, which means it’s time for another release!

The ClickHouse 26.5 release contains 38 new features 🌹 51 performance optimizations πŸ¦‹ 224 bug fixes 🐞

This release sees a record number of performance optimizations, with highlights including ORDER BY … LIMIT pushdown through joins (up to 20Γ— faster), a new GROUP BY … LIMIT shortcut that avoids building unnecessary groups, a new filesystem table function for running SQL directly against your local file system, and more!

New contributors #

A special welcome to all the new contributors in 26.5! The growth of ClickHouse's community is humbling, and we are always grateful for the contributions that have made ClickHouse so popular.

Below are the names of the new contributors:

Abhinav Agarwal, Ahaan, Alex Kuleshov, Ashrith Bandla, Asish Kumar, Callum C, Felix Bernhard, Flavio Malavazi, Ian Rakhmatullin, Ilya Perstenev, JackFielding, Joe Redfern, Larry Snizek, Luc Leray, Rahul Nair, Roy Sindre Norangshol, Venkata Vineel, Vincent Voyer, Yue, Yue Ni, functioncrafter, ibrahim karimeddin, mohaidoss, perst20, peter15914, sayondeep, zhangzhibiao, zxuhan7

Hint: if you’re curious how we generate this list… here.

You can also view the slides from the presentation.

Push ORDER BY … LIMIT through JOIN #

Contributed by Alexey Milovidov #

β€œWe optimize ClickHouse in every version, we optimize it more, and there is no end in optimizations” – Alexey Milovidov during the ClickHouse release 26.5 webinar

Moving more work before joins #

In recent releases, ClickHouse has been steadily moving more work before joins, so less data has to pass through them. For example, ClickHouse already pushes down complex OR conditions in JOIN queries to filter each table earlier, before the join happens. It also supports runtime filters, which are created from the right-hand side of a join and applied to the left-hand side before the join runs.

This release continues that theme, but pushes down a different kind of work: not a WHERE predicate, but the ORDER BY … LIMIT clause, a pattern that appears frequently in analytical workloads.

From β€œjoin then limit” to β€œlimit then join” #

If the outermost SELECT of a LEFT JOIN query ends with ORDER BY … LIMIT, and the sort key depends only on columns from the left table, ClickHouse can push that ORDER BY … LIMIT below the join.

The same applies to RIGHT JOIN queries when the sort key depends only on columns from the right table.

For example, this query running over TPC-H tables asks for the 100 most recent orders, enriched with customer information:

1SELECT
2    o_orderkey,
3    o_orderdate,
4    o_totalprice,
5    c_name,
6    c_mktsegment
7FROM orders
8LEFT JOIN customer ON o_custkey = c_custkey
9ORDER BY
10    o_orderdate DESC,
11    o_orderkey DESC
12LIMIT 100;

Here, the ORDER BY uses only columns from orders, the preserved side of the LEFT JOIN. That means ClickHouse does not need to join every order with its customer before applying the limit.

Without the optimization, the plan is forced to do the expensive join first:

Blog-release-26.05.001.png

With the new optimization, ClickHouse can flip the work around: it can first find the top 100 rows from orders, and then join only those few rows with customer.

Blog-release-26.05.002.png

You can also see the change in the query plan obtained via EXPLAIN. With the optimization enabled, the plan contains a Limit and Sorting step on the orders table side, before the join with the customer table:

Join

  ...

    Limit

      Sorting

        ReadFromMergeTree (sf100.orders)

  ...

ReadFromMergeTree (sf100.customer)

A nice side effect is that ClickHouse already treats the pushed-down ORDER BY … LIMIT part as a first-class query pattern. As covered in our dedicated Top-N optimization post, ClickHouse has accumulated several engine-level optimizations for this pattern.

This optimization is controlled by the new query_plan_top_k_through_join setting, which is enabled by default.

Benchmark: 20Γ— faster and 175Γ— less memory #

To evaluate the impact, we created and loaded the TPC-H schema with a scale factor of 100 on an AWS EC2 m6i.8xlarge instance with 32 vCPUs and 128 GiB of RAM.

First, we ran the query with the new ORDER BY … LIMIT pushdown disabled by setting query_plan_top_k_through_join = 0. We executed the query three times and used the fastest run as the baseline:

Elapsed: 2.153 sec. Processed 165.00 million rows, 3.23 GB (76.65 million rows/s., 1.50 GB/s.)
Peak memory usage: 1.87 GiB.

Elapsed: 1.878 sec. Processed 165.00 million rows, 3.23 GB (87.87 million rows/s., 1.72 GB/s.)
Peak memory usage: 1.88 GiB.

Elapsed: 2.197 sec. Processed 165.00 million rows, 3.23 GB (75.10 million rows/s., 1.47 GB/s.)
Peak memory usage: 1.87 GiB.

Then we ran the same query with the optimization enabled by setting query_plan_top_k_through_join = 1:

Elapsed: 0.093 sec. Processed 165.22 million rows, 2.18 GB (1.78 billion rows/s., 23.45 GB/s.)
Peak memory usage: 11.46 MiB.

Elapsed: 0.092 sec. Processed 165.22 million rows, 2.18 GB (1.80 billion rows/s., 23.70 GB/s.)
Peak memory usage: 13.72 MiB.


Elapsed: 0.092 sec. Processed 165.22 million rows, 2.18 GB (1.79 billion rows/s., 23.53 GB/s.)
Peak memory usage: 10.98 MiB.

Using the fastest run from each configuration, the difference is significant:

SettingFastest runtimePeak memoryData read
Pushdown disabled1.878 sec1.88 GiB3.23 GB
Pushdown enabled0.092 sec10.98 MiB2.18 GB
Improvement20.4Γ— faster~175Γ— less memory1.5Γ— less data read

This benchmark already shows a 20.4Γ— runtime improvement and around 175Γ— lower peak memory usage.

These numbers are not a fixed ceiling. The benefit depends on the size of the input tables, the width of the joined rows, the selected columns, and the LIMIT value.

GROUP BY … LIMIT with no ORDER BY #

Contributed by Amos Bird #

Extending Top-N optimizations to GROUP BY #

ClickHouse already treats Top-N queries as a first-class query pattern. As covered in our dedicated Top-N optimization post, ClickHouse has accumulated several engine-level optimizations for queries with ORDER BY … LIMIT, including streaming execution, read-in-order, lazy reading, and data-skipping-based Top-N pruning.

This release extends the same idea to another shape: GROUP BY … LIMIT queries without ORDER BY.

Consider a query that groups by a key and then applies LIMIT, but has no ORDER BY, no HAVING clause, and no window function. In that case, the query does not ask for the smallest keys, the largest keys, the most frequent keys, or keys in any particular order. It only asks for any N distinct grouping keys.

For example, because we already had the TPC-H dataset loaded for the previous section’s benchmark, we can reuse it here. This query asks for any 100 distinct order keys from the lineitem table:

1SELECT l_orderkey
2FROM lineitem
3GROUP BY l_orderkey
4LIMIT 100;

From β€œgroup everything, then limit” to β€œkeep only N groups” #

In TPC-H scale factor 100, lineitem contains 600 million rows and 150 million distinct l_orderkey values.

Without the new optimization, ClickHouse treats the query like a regular GROUP BY: as it scans the input, every new l_orderkey creates a new entry in the aggregation hash table. Only after the aggregation result has been built does LIMIT 100 reduce the output to 100 rows.

Blog-release-26.05.003.png

With this release, ClickHouse recognizes this special pattern and avoids building groups that cannot affect the result. The optimization is controlled by the new optimize_trivial_group_by_limit_query setting, which is enabled by default.

For eligible queries, ClickHouse internally sets the aggregation limit to LIMIT + OFFSET and uses group_by_overflow_mode = 'any'. In practice, this means that once the aggregation hash table contains the first 100 distinct l_orderkey values, new keys are ignored instead of being added as new groups.

Blog-release-26.05.004.png

The scan still processes the input, but the aggregation state in main memory stays tiny: 100 groups instead of growing toward 150 million.

Benchmark: 11.9Γ— faster and 185Γ— less memory #

To evaluate the impact, we ran the query again on an AWS EC2 m6i.8xlarge instance with 32 vCPUs and 128 GiB RAM. First, we disabled the optimization by setting optimize_trivial_group_by_limit_query = 0 and used the fastest of three runs as the baseline:

Elapsed: 0.853 sec. Processed 600.04 million rows, 2.40 GB (703.29 million rows/s., 2.81 GB/s.)
Peak memory usage: 8.60 GiB.

Elapsed: 0.806 sec. Processed 600.04 million rows, 2.40 GB (744.07 million rows/s., 2.98 GB/s.)
Peak memory usage: 8.58 GiB.

Elapsed: 0.809 sec. Processed 600.04 million rows, 2.40 GB (742.06 million rows/s., 2.97 GB/s.)
Peak memory usage: 8.57 GiB.

Then we ran the same query with the optimization enabled by setting optimize_trivial_group_by_limit_query = 1:

Elapsed: 0.069 sec. Processed 600.04 million rows, 2.40 GB (8.76 billion rows/s., 35.03 GB/s.)
Peak memory usage: 47.54 MiB.

Elapsed: 0.070 sec. Processed 600.04 million rows, 2.40 GB (8.54 billion rows/s., 34.16 GB/s.)
Peak memory usage: 47.54 MiB.

Elapsed: 0.068 sec. Processed 600.04 million rows, 2.40 GB (8.79 billion rows/s., 35.17 GB/s.)
Peak memory usage: 47.55 MiB.

Using the fastest run from each configuration:

SettingFastest runtimeRows processedData readPeak memory
Optimization disabled0.806 sec600.04 million2.40 GB8.58 GiB
Optimization enabled0.068 sec600.04 million2.40 GB47.55 MiB
Improvement11.9Γ— fastersamesame~185Γ— less memory

The optimized query is 11.9Γ— faster and uses about 185Γ— less peak memory.

The filesystem table function #

Contributed by Ilya Perstenev, Ilya Yatsishin, Alexey Milovidov #

ClickHouse 25.6 also introduces the filesystem table function, which lets us list and analyze a directory as a queryable table.

The full schema exposed by filesystem covers everything you'd expect for filesystem introspection:

1DESCRIBE filesystem();
β”Œβ”€name──────────────┬─type───────────────────────────────────────────────┐
β”‚ path              β”‚ String                                             β”‚
β”‚ name              β”‚ String                                             β”‚
β”‚ type              β”‚ Enum8('none' = 0, 'not_found' = 1, 'regular' = 2, β‹―β”‚
β”‚ size              β”‚ Nullable(UInt64)                                   β”‚
β”‚ depth             β”‚ UInt16                                             β”‚
β”‚ modification_time β”‚ Nullable(DateTime64(6))                            β”‚
β”‚ is_symlink        β”‚ Bool                                               β”‚
β”‚ content           β”‚ Nullable(String)                                   β”‚
β”‚ owner_read        β”‚ Bool                                               β”‚
β”‚ owner_write       β”‚ Bool                                               β”‚
β”‚ owner_exec        β”‚ Bool                                               β”‚
β”‚ group_read        β”‚ Bool                                               β”‚
β”‚ group_write       β”‚ Bool                                               β”‚
β”‚ group_exec        β”‚ Bool                                               β”‚
β”‚ others_read       β”‚ Bool                                               β”‚
β”‚ others_write      β”‚ Bool                                               β”‚
β”‚ others_exec       β”‚ Bool                                               β”‚
β”‚ set_gid           β”‚ Bool                                               β”‚
β”‚ set_uid           β”‚ Bool                                               β”‚
β”‚ sticky_bit        β”‚ Bool                                               β”‚
β”‚ file              β”‚ String                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

If we call it with no arguments, using clickhouse-local, it will list files in the current directory:

1SELECT path, name FROM filesystem();
β”Œβ”€path──────────────────────────────────────────────┬─name──────────────────────┐
β”‚ /Users/markhneedham/projects/release-posts/26.5   β”‚ clickhouse                β”‚
β”‚ /Users/markhneedham/projects/release-posts/26.5   β”‚ .claude                   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

It has access to the same parts of the file system as the user who launched ClickHouse. If you call it via ClickHouse Server, it will list the files in the user_files directory.

I have a lot of large video files on my machine, and I (or rather Claude!) usually have to run a bunch of Unix commands to find them. With this new function, it’s as simple as the following query:

1SELECT path, name, formatReadableSize(size), modification_time
2FROM filesystem('/Users/markhneedham/projects/videos')
3WHERE type = 'regular' AND name LIKE '%.braw'
4ORDER BY size DESC
5LIMIT 3
6FORMAT Vertical;
Row 1:
──────
path:                     /Users/markhneedham/projects/videos/20260212-Sample
name:                     A001_10150625_C183 2.braw
formatReadableSize(size): 26.75 GiB
modification_time:        2025-10-15 06:25:08.529999

Row 2:
──────
path:                     /Users/markhneedham/projects/videos/20260217-AsyncInserts
name:                     A001_09290151_C176.braw
formatReadableSize(size): 21.70 GiB
modification_time:        2025-09-29 01:51:47.820000

Row 3:
──────
path:                     /Users/markhneedham/projects/videos/20260123-PGCHStack
name:                     A001_08021314_C119.braw
formatReadableSize(size): 21.54 GiB
modification_time:        2025-08-02 13:14:33.260000

And I’ve wrapped this query up into a skill that Claude can use to more quickly find files to delete to free up space.

url_base for the url table function #

Contributed by Alexey Milovidov #

If you use the url table function regularly, you've probably typed the same base URL dozens of times. The new url_base setting lets you set it once and use relative paths everywhere instead.

Working with the Amazon customer review dataset, we could set the URL base like this:

1SET url_base = 'https://datasets-documentation.s3.eu-west-3.amazonaws.com/amazon_reviews/';

We could then query the 2014 reviews like this:

1SELECT
2    count(),
3    round(avg(star_rating), 2) AS stars,
4    round(avg(helpful_votes), 2) AS votes
5FROM url('amazon_reviews_2014.snappy.parquet')
β”Œβ”€β”€count()─┬─stars─┬─votes─┐
β”‚ 44127569 β”‚  4.23 β”‚  0.96 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”˜

And if we want to query 2015:

1SELECT
2    count(),
3    round(avg(star_rating), 2) AS stars,
4    round(avg(helpful_votes), 2) AS votes
5FROM url('amazon_reviews_2015.snappy.parquet')
β”Œβ”€β”€count()─┬─stars─┬─votes─┐
β”‚ 41905631 β”‚  4.25 β”‚  0.74 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”˜

Negative LIMIT BY #

Contributed by Nihal Z. Miaji #

The 26.5 release also adds negative limit by, which lets us pick rows from the end of each group, rather than the beginning.

We’ll use my favorite UK property prices dataset to demonstrate how it works, starting with the following query that finds the median price by district for all the counties that contain the term Yorkshire:

1SELECT county, district, median(price)
2FROM uk_price_paid
3WHERE county ILIKE '%Yorkshire%'
4GROUP BY ALL
5ORDER BY median(price) DESC;
β”Œβ”€county───────────────────┬─district─────────────────┬─median(price)─┐
β”‚ NORTH YORKSHIRE          β”‚ NORTH YORKSHIRE          β”‚        263000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ HARROGATE                β”‚        185000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ HAMBLETON                β”‚        170000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ RYEDALE                  β”‚        160000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ RICHMONDSHIRE            β”‚        150000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ CRAVEN                   β”‚        149250 β”‚
β”‚ NORTH YORKSHIRE          β”‚ SELBY                    β”‚        144995 β”‚
β”‚ EAST RIDING OF YORKSHIRE β”‚ EAST RIDING OF YORKSHIRE β”‚        132000 β”‚
β”‚ WEST YORKSHIRE           β”‚ LEEDS                    β”‚        129997 β”‚
β”‚ NORTH YORKSHIRE          β”‚ SCARBOROUGH              β”‚        120000 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ SHEFFIELD                β”‚        115000 β”‚
β”‚ WEST YORKSHIRE           β”‚ KIRKLEES                 β”‚        114950 β”‚
β”‚ WEST YORKSHIRE           β”‚ WAKEFIELD                β”‚      112997.5 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ ROTHERHAM                β”‚        102500 β”‚
β”‚ WEST YORKSHIRE           β”‚ CALDERDALE               β”‚        101000 β”‚
β”‚ WEST YORKSHIRE           β”‚ BRADFORD                 β”‚        100000 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ DONCASTER                β”‚         98500 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ BARNSLEY                 β”‚         95000 β”‚
β”‚ WEST YORKSHIRE           β”‚ EAST YORKSHIRE           β”‚         94950 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

We could already select the first two rows per county group, i.e., the two districts with the highest median price per county:

1SELECT county, district, median(price)
2FROM uk_price_paid
3WHERE county ILIKE '%Yorkshire%'
4GROUP BY ALL
5ORDER BY median(price) DESC
6LIMIT 2 BY county
β”Œβ”€county───────────────────┬─district─────────────────┬─median(price)─┐
β”‚ NORTH YORKSHIRE          β”‚ NORTH YORKSHIRE          β”‚        262000 β”‚
β”‚ NORTH YORKSHIRE          β”‚ HARROGATE                β”‚        185000 β”‚
β”‚ EAST RIDING OF YORKSHIRE β”‚ EAST RIDING OF YORKSHIRE β”‚      130972.5 β”‚
β”‚ WEST YORKSHIRE           β”‚ LEEDS                    β”‚        130000 β”‚
β”‚ WEST YORKSHIRE           β”‚ KIRKLEES                 β”‚        115000 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ SHEFFIELD                β”‚        115000 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ ROTHERHAM                β”‚        105000 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

But with negative limit by, we can also select the last two rows per county group, i.e., the two districts with the lowest median price per county.

1SELECT county, district, median(price)
2FROM uk_price_paid
3WHERE county ILIKE '%Yorkshire%'
4GROUP BY ALL
5ORDER BY median(price) DESC
6LIMIT -2 BY county;
β”Œβ”€county───────────────────┬─district─────────────────┬─median(price)─┐
β”‚ NORTH YORKSHIRE          β”‚ SELBY                    β”‚        145000 β”‚
β”‚ EAST RIDING OF YORKSHIRE β”‚ EAST RIDING OF YORKSHIRE β”‚        132500 β”‚
β”‚ NORTH YORKSHIRE          β”‚ SCARBOROUGH              β”‚        122000 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ DONCASTER                β”‚         99000 β”‚
β”‚ WEST YORKSHIRE           β”‚ BRADFORD                 β”‚         97500 β”‚
β”‚ SOUTH YORKSHIRE          β”‚ BARNSLEY                 β”‚         94950 β”‚
β”‚ WEST YORKSHIRE           β”‚ EAST YORKSHIRE           β”‚         94950 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Multi-path SQL/JSON #

Contributed by Kevinyhzou, Alexey Milovidov #

When using the JSON_VALUE and JSON_QUERY functions, we can now pass a tuple or array of paths and receive a tuple or array of strings, with JSON parsed only once.

We’re going to work with a JSON string representing the Open House conference, printed out using the new prettyPrintJSON function:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT prettyPrintJSON(conf)FORMAT Raw;
{
    "name": "Open House 2026",
    "tagline": "The real-time database for AI conference",
    "dates": {
        "workshops": "2026-05-26",
        "conference": [
            "2026-05-27",
            "2026-05-28"
        ]
    },
    "venue": {
        "name": "Convene 100 Stockton",
        "address": "40 O'Farrell St, San Francisco, CA 94108"
    }
}

1 row in set. Elapsed: 0.003 sec.

To return strings, for example, if we want to return a tuple containing the name and venue, we use the JSON_VALUE function:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT JSON_VALUE(conf, ('$.name', '$.venue.name'));
β”Œβ”€JSON_VALUE(conf, ('$.name', '$.venue.name'))─┐
β”‚ ('Open House 2026','Convene 100 Stockton')   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

We can also pass in the JSON paths as an array rather than a tuple:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT JSON_VALUE(conf, ['$.name', '$.venue.name']);
β”Œβ”€JSON_VALUE(conf, ['$.name', '$.venue.name'])─┐
β”‚ ['Open House 2026','Convene 100 Stockton']   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

But dates.conference is an array, so if we try to retrieve that using JSON_VALUE, we’ll return an empty string:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT JSON_VALUE(conf, ('$.name', '$.dates.conference'));
β”Œβ”€JSON_VALUE(cβ‹―nference'))─┐
β”‚ ('Open House 2026','')   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

We can read the individual values from that array using zero-based array indexing:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT JSON_VALUE(conf, ('$.dates.conference[0]', '$.dates.conference[1]'));
β”Œβ”€JSON_VALUE(coβ‹―ference[1]'))─┐
β”‚ ('2026-05-27','2026-05-28') β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Alternatively, if we want to return the dates as an array and the whole venue object, we should rather use JSON_QUERY:

1WITH '{
2  "name": "Open House 2026",
3  "tagline": "The real-time database for AI conference",
4  "dates": {
5    "workshops": "2026-05-26",
6    "conference": ["2026-05-27", "2026-05-28"]
7  },
8  "venue": {
9    "name": "Convene 100 Stockton",
10    "address": "40 O''Farrell St, San Francisco, CA 94108"
11  }
12}' AS conf
13SELECT JSON_QUERY(conf, ('$.dates.conference', '$.venue'))
14FORMAT Raw;

The output, formatted for readability, is shown below:

(
  '[["2026-05-27","2026-05-28"]]',
  '[{"name":"Convene 100 Stockton","address":"40 O\'Farrell St, San Francisco, CA 94108"}]'
)

Note that JSON_QUERY always wraps its result in [], so an array value gets double-wrapped.

Web Terminal #

Contributed by Alexey Milovidov #

The 26.5 release also sees the introduction of an experimental in-browser clickhouse-client. You can enabled it by adding the following to a config file:

config.d/webterminal.yaml

1allow_experimental_webterminal: true

You can then navigate to http://localhost:8123/webterminal, where you'll see something like this:

Screenshot 2026-06-01 at 11.06.21.png

Query cache for subqueries #

Contributed by Nikita Barannik, Vincent Voyer #

It's now possible to control query caching on a per-subquery basis.

It's also been possible to enabled the query cache fo the outmost query, using the use_query_cache setting like this:

1SELECT * FROM (SELECT * FROM table) 
2SETTINGS use_query_cache = 1;

If we want to to enable query cache for subquery, from 26.5, we can use that setting as a suffix to the subquery:

1SELECT * 
2FROM (
3  SELECT * 
4  FROM table 
5  SETTINGS use_query_cache = 1
6);

We can also enable propagation of the query cache into all subqueries using the use_query_cache_for_subqueries setting:

1SELECT * FROM (SELECT * FROM table)
2SETTINGS use_query_cache_for_subqueries = 1;

Or, we could enable propagation of query cache into all subqueries but disable it in one of them:

1SELECT * 
2FROM (SELECT * FROM table1) t1
3NATURAL JOIN (SELECT * FROM table2 SETTINGS use_query_cache = 0) t2
4SETTINGS use_query_cache_for_subqueries = 1;

Get started today

Interested in seeing how ClickHouse works on your data? Get started with ClickHouse Cloud in minutes and receive $300 in free credits.
Share this post

Subscribe to our newsletter

Stay informed on feature releases, product roadmap, support, and cloud offerings!
Loading form...