Skip to main content
Skip to main content

uniqCombined

Calculates the approximate number of different argument values.

The uniqCombined function is a good choice for calculating the number of different values.

Arguments

  • HLL_precision: The base-2 logarithm of the number of cells in HyperLogLog. Optional, you can use the function as uniqCombined(x[, ...]). The default value for HLL_precision is 17, which is effectively 96 KiB of space (2^17 cells, 6 bits each).
  • X: A variable number of parameters. Parameters can be Tuple, Array, Date, DateTime, String, or numeric types.

Returned value

  • A number UInt64-type number.

Implementation details

The uniqCombined function:

  • Calculates a hash (64-bit hash for String and 32-bit otherwise) for all parameters in the aggregate, then uses it in calculations.
  • Uses a combination of three algorithms: array, hash table, and HyperLogLog with an error correction table.
    • For a small number of distinct elements, an array is used.
    • When the set size is larger, a hash table is used.
    • For a larger number of elements, HyperLogLog is used, which will occupy a fixed amount of memory.
  • Provides the result deterministically (it does not depend on the query processing order).
note

Since it uses a 32-bit hash for non-String types, the result will have very high error for cardinalities significantly larger than UINT_MAX (error will raise quickly after a few tens of billions of distinct values), hence in this case you should use uniqCombined64.

Compared to the uniq function, the uniqCombined function:

  • Consumes several times less memory.
  • Calculates with several times higher accuracy.
  • Usually has slightly lower performance. In some scenarios, uniqCombined can perform better than uniq, for example, with distributed queries that transmit a large number of aggregation states over the network.

Example

Query:

Result:

See the example section of uniqCombined64 for an example of the difference between uniqCombined and uniqCombined64 for much larger inputs.

See Also

Try ClickHouse Cloud for FREE

Separation of storage and compute, automatic scaling, built-in SQL console, and lots more. $300 in free credits when signing up.

Try it for Free