Our ClickHouse 25.6 release post was already packed, but it turns out we left one very cool new feature in the cooler. π
So hereβs your summer bonus blog, just in time for your beach read: a dedicated spotlight on CoalescingMergeTree, a brand new table engine designed to consolidate sparse updates and reduce your row count without sacrificing fidelity.
If youβre the type to check ClickHouse updates between sunscreen applications, this oneβs for you. ποΈ
Letβs dive in.
CoalescingMergeTree table engine #
Contributed by Konstantin Vedernikov #
Itβs not every day the MergeTree table engine family gets a new member, so when it does, itβs worth celebrating with a dedicated post.
The newest addition, CoalescingMergeTree, gradually consolidates sparse records over time and is especially useful when:
- You want to efficiently retain only the most complete version of each entity.
- Youβre okay with eventual (on-disk) consolidation during merges.
- You want to avoid full row overwrites (e.g. via ReplacingMergeTree) and just fill in missing values.
A great example: IoT device state or configuration snapshots. (Think: observability for a car fleet, e.g., Tesla)
In modern connected vehicles like electric cars, telemetry updates are fragmented across subsystems:
- The battery reports its charge
- The GPS module sends location
- The software updater reports firmware
- Sensors periodically update temperature and speed
We want to combine these incremental, sparse updates into a complete per-vehicle view using CoalescingMergeTree.
Table Definition #
CREATE TABLE electric_vehicle_state
(
    vin String, -- vehicle identification number
    last_update DateTime64 Materialized now64(), -- optional (used with argMax)
    battery_level Nullable(UInt8), -- in %
    lat Nullable(Float64), -- latitude (Β°)
    lon Nullable(Float64), -- longitude (Β°)
    firmware_version Nullable(String),
    cabin_temperature Nullable(Float32), -- in Β°C
    speed_kmh Nullable(Float32) -- from sensor
)
ENGINE = CoalescingMergeTree
ORDER BY vin;
Example Inserts #
-- β Initial battery and firmware readings INSERT INTO electric_vehicle_state VALUES ('5YJ3E1EA7KF000001', 82, NULL, NULL, '2024.14.5', NULL, NULL); -- β‘ GPS reports in later INSERT INTO electric_vehicle_state VALUES ('5YJ3E1EA7KF000001', NULL, 37.7749, -122.4194, NULL, NULL, NULL); -- β’ Sensor update: temperature + speed INSERT INTO electric_vehicle_state VALUES ('5YJ3E1EA7KF000001', NULL, NULL, NULL, NULL, 22.5, 67.3); -- β£ Battery drops to 78% INSERT INTO electric_vehicle_state VALUES ('5YJ3E1EA7KF000001', 78, NULL, NULL, NULL, NULL, NULL); -- β€ Another car, initial firmware and temp readings INSERT INTO electric_vehicle_state VALUES ('5YJ3E1EA7KF000099', NULL, NULL, NULL, '2024.14.5', 19.2, NULL);
Why not just use UPDATE? #
In many databases, handling state changes like these would involve updating the existing row, for example, issuing an UPDATE to set the latest temperature or firmware version. ClickHouse itself does support fast and frequent single-row UPDATEs, watch our update (ha!) on that at the recent Open House 2025 user conference, and for some workloads, theyβre perfectly viable.
But in high-throughput IoT scenarios, that pattern can be inefficient: an UPDATE requires locating the row with the previous state, rewriting it, and often locking or rewriting more data than necessary.
ClickHouse encourages a simpler, append-only model: just insert the new fields as they arrive. CoalescingMergeTree takes that one step further by letting the engine coalesce those sparse inserts into a complete, compact record automatically, during background merges (we will dive into that below). Itβs a better fit for high-ingest, high-cardinality use cases where updates are frequent but partial.
ClickHouse is especially optimized for insert workloads: inserts are fully isolated, run in parallel without interfering with each other, and hit disk at full speed, because there are no global structures to lock or update (e.g., a global B++ index). Inserts are kept lightweight by deferring additional workβsuch as record consolidationβto background merges.
This architecture supports extremely high-throughput ingestion; in one production deployment, ClickHouse sustained over 1 billion rows per second with stable memory and CPU usage.
The catch: trillions of rows #
In high-volume IoT setups, data arrives constantly from thousands (or millions) of devices. Each update often modifies only one or two fields, leading to massive tables full of sparse, redundant rows.
To reconstruct the latest full state per device, you need to pull the most recent non-null value for each column. ClickHouse already supports this using the argMax() aggregate function, even if the table uses a regular MergeTree engine.
Hereβs the current content of the electric_vehicle_state table after a few sparse updates (assuming no part merges have occurred yet, weβll explain this further below). last_update timestamps are shortened for brevity:
SELECT vin, right(toString(last_update), 12) AS last_update, battery_level AS batt, lat, lon, firmware_version AS fw, cabin_temperature AS temp, speed_kmh AS speed FROM electric_vehicle_state ORDER BY vin ASC;
ββvinββββββββββββββββ¬βlast_updateβββ¬βbattββ¬βββββlatββ¬βββββββlonββ¬βfwβββββββββ¬βtempββ¬βspeedββ β 5YJ3E1EA7KF000001 β 10:41:37.731 β 82 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β β 5YJ3E1EA7KF000001 β 10:41:37.734 β α΄Ία΅α΄Έα΄Έ β 37.7749 β -122.4194 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β β 5YJ3E1EA7KF000001 β 10:41:37.737 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 22.5 β 67.3 β β 5YJ3E1EA7KF000001 β 10:41:37.739 β 78 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β β 5YJ3E1EA7KF000099 β 10:41:37.742 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β 19.2 β α΄Ία΅α΄Έα΄Έ β βββββββββββββββββββββ΄βββββββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββ΄ββββββββ
And hereβs how youβd query the latest state per device using argMax():
SELECT vin, argMax(battery_level, last_update) AS batt, argMax(lat, last_update) AS lat, argMax(lon, last_update) AS lon, argMax(firmware_version, last_update) AS fw, argMax(cabin_temperature, last_update) AS temp, argMax(speed_kmh, last_update) AS speed FROM electric_vehicle_state GROUP BY vin ORDER BY vin;
ββvinββββββββββββββββ¬βbattββ¬βββββlatββ¬βββββββlonββ¬βfwβββββββββ¬βtempββ¬βspeedββ β 5YJ3E1EA7KF000001 β 78 β 37.7749 β -122.4194 β 2024.14.5 β 22.5 β 67.3 β β 5YJ3E1EA7KF000099 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β 19.2 β α΄Ία΅α΄Έα΄Έ β βββββββββββββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββ΄ββββββββ
argMax works well here because it keeps only the latest non-null value per column.
Specifically,argMax(x, t)returns the value ofxfrom the row with the largestt(e.g., latest timestampt) where x is not null. If rows with even largertvalues (e.g., newer rows) exist but x is null in those, they are skipped.
However, relying purely on regular MergeTree and argMax() means:
- All rows must be scanned
- Aggregation must be computed on the fly, every time
This becomes inefficient at scale, especially when the table grows to billions or trillions of rows.
Why CoalescingMergeTree helps #
With CoalescingMergeTree, ClickHouse physically consolidates rows that share the same sorting key (e.g. VIN). During background part merges, it keeps only the latest non-null value per column, effectively pre-aggregating the state on disk:

The diagram above shows how sparse updates from different subsystems (battery, GPS, sensors) are gradually merged into a single consolidated state by CoalescingMergeTree. Each update only modifies a subset of columns, and ClickHouse automatically combines rows that share the same sorting key (vin in this example) during background merges to form the latest complete state per vehicle.
We visualized seven data parts: the original inserts (β ββ£), intermediate merged parts (β€, β₯), and the final active part β¦. After merging, parts β ββ₯ become inactive and are automatically removed. All queries now run efficiently over the single, compact part β¦.
Note: βLatestβ here is not determined by a row's timestamp column, but by the insertion order of rows. During merges,
CoalescingMergeTreeuses the physical on-disk order of rows from the parts being merged. For each column, it keeps the latest non-null value for a given sorting key, where βlatestβ means the value from the last (i.e., newest) applicable row in the most recently written part among the parts being merged.
For example, in the diagram above, parts β ββ¦ are shown in write order, with β being the earliest and β¦ the latest.
This drastically reduces:
- The number of rows per device
- The amount of data scanned during queries
Reading the latest state: argMax or FINAL #
In high-ingest scenarios like IoT, background merges are perpetually behind (i.e., merges run continuously, but inserts outpace them, think of it as a lagging consolidation window).
To maintain accuracy despite that, we still use GROUP BY and argMax(), but now over far fewer rows, since background merges have already handled most of the consolidation.
(In the diagram above, we skipped the
last_updatecolumn for clarity. It plays no role in CoalescingMergeTreeβs merge logic. Like all other columns, its latest non-null value is retained automatically during merges. However, if youβre usingargMax(), a timestamp likelast_updateis required to determine which row is considered βlatestβ at query time.)
SELECT vin, argMax(battery_level, last_update) AS batt, argMax(lat, last_update) AS lat, argMax(lon, last_update) AS lon, argMax(firmware_version, last_update) AS fw, argMax(cabin_temperature, last_update) AS temp, argMax(speed_kmh, last_update) AS speed FROM electric_vehicle_state GROUP BY vin ORDER BY vin;
ββvinββββββββββββββββ¬βbattββ¬βββββlatββ¬βββββββlonββ¬βfwβββββββββ¬βtempββ¬βspeedββ β 5YJ3E1EA7KF000001 β 78 β 37.7749 β -122.4194 β 2024.14.5 β 22.5 β 67.3 β β 5YJ3E1EA7KF000099 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β 19.2 β α΄Ία΅α΄Έα΄Έ β βββββββββββββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββ΄ββββββββ
Just for completeness: if all parts have been fully merged and no new rows are being inserted, a simple SELECT * without GROUP BY is sufficient:
OPTIMIZE TABLE electric_vehicle_state FINAL; -- force merge all parts into a single part SELECT -- select from the fully consolidated, merged part vin, battery_level AS batt, lat AS lat, lon AS lon, firmware_version AS fw, cabin_temperature AS temp, speed_kmh AS speed FROM electric_vehicle_state ORDER BY vin;
ββvinββββββββββββββββ¬βbattββ¬βββββlatββ¬βββββββlonββ¬βfwβββββββββ¬βtempββ¬βspeedββ β 5YJ3E1EA7KF000001 β 78 β 37.7749 β -122.4194 β 2024.14.5 β 22.5 β 67.3 β β 5YJ3E1EA7KF000099 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β 19.2 β α΄Ία΅α΄Έα΄Έ β βββββββββββββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββ΄ββββββββ
Alternatively, in scenarios with a constant stream of updates, you can avoid aggregating at query time by applying CoalescingMergeTree's coalescing logic (described above with the diagram) ephemerally using the FINAL modifier. This doesnβt finalize merges on diskβit performs an in-memory merge of all relevant parts as they existed when the query began, producing fully consolidated rows without needing GROUP BY or argMax():
SELECT vin, battery_level AS batt, lat AS lat, lon AS lon, firmware_version AS fw, cabin_temperature AS temp, speed_kmh AS speed FROM electric_vehicle_state FINAL ORDER BY vin;
ββvinββββββββββββββββ¬βbattββ¬βββββlatββ¬βββββββlonββ¬βfwβββββββββ¬βtempββ¬βspeedββ β 5YJ3E1EA7KF000001 β 78 β 37.7749 β -122.4194 β 2024.14.5 β 22.5 β 67.3 β β 5YJ3E1EA7KF000099 β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β α΄Ία΅α΄Έα΄Έ β 2024.14.5 β 19.2 β α΄Ία΅α΄Έα΄Έ β βββββββββββββββββββββ΄βββββββ΄ββββββββββ΄ββββββββββββ΄ββββββββββββ΄βββββββ΄ββββββββ
Note: With the
FINALmodifier in the query, the tableβslast_updatefield becomes optional becauseFINALapplies CoalescingMergeTreeβs own merge logic at query time. That logic determines recency based on the insertion order of rows, not a timestamp column.
However, if you query withoutFINALand rely onargMax()instead, a timestamp likelast_updateis still needed to correctly identify the latest non-null values at query time when parts havenβt fully mergedβbecause at that point, the query engine doesnβt have direct access to the write timestamps of data parts or physical row numbers to reliably infer insertion order.
Recommended storage pattern: raw + coalesced tables #
We recommend using CoalescingMergeTree together with a regular MergeTree table (typically via a materialized view):
- 
Use the MergeTree table to store the raw event stream, keeping the full history of all incoming updates from each vehicle or device. 
- 
Use a CoalescingMergeTree table to store a coalesced view, for example, the latest known state per vehicle, ideal for dashboards, periodic reports, or analytical queries. 
This dual-table pattern ensures you retain all granular data while still benefiting from the efficiency of physical row consolidation. It also provides a safety net: if the sorting key doesnβt uniquely identify each event, your raw table still holds the complete record for recovery or reprocessing.
More use cases #
Beyond IoT snapshots, CoalescingMergeTree is helpful anywhere sparse, append-only updates gradually enrich a record. For example:
- 
User profile enrichment β fields like email, phone, or location get filled in as the user interacts. 
- 
Security audit trails β events slowly add context (e.g., actor identity, affected system). 
- 
ETL pipelines with late-arriving dimensions β enrichment steps populate missing fields. 
- 
Patient health records β lab results, doctor notes, and vitals arrive over time from different systems. 
- 
Ad or campaign tracking β impressions and clicks arrive from different systems at different times. 
- 
Customer support cases β tickets evolve as more info (severity, resolution notes) is gathered. 
In all these cases, CoalescingMergeTree reduces storage costs and query latency without sacrificing completeness.
Thatβs it, back to your regularly scheduled summer #
Now that your ClickHouse knowledge is refreshed, itβs time to refresh yourself.
Weβll be here when you get back. ποΈπΉ
Happy summer, and happy querying!




