Aggregate Functions

ClickHouse supports all standard SQL aggregate functions (sum, avg, min, max, count), as well as a wide range of other aggregate functions.

Page	Description
aggThrow	This function can be used for the purpose of testing exception safety. It will throw an exception on creation with the specified probability.
analysisOfVariance	Provides a statistical test for one-way analysis of variance (ANOVA test). It is a test over several groups of normally distributed observations to find out whether all groups have the same mean or not.
any	Selects the first encountered value of a column.
anyHeavy	Selects a frequently occurring value using the heavy hitters algorithm. If there is a value that occurs more than in half the cases in each of the query execution threads, this value is returned. Normally, the result is nondeterministic.
anyLast	Selects the last encountered value of a column.
approx_top_k	Returns an array of the approximately most frequent values and their counts in the specified column.
approx_top_sum	Returns an array of the approximately most frequent values and their counts in the specified column.
argMax	Calculates the `arg` value for a maximum `val` value.
argMin	Calculates the `arg` value for a minimum `val` value. If there are multiple rows with equal `val` being the maximum, which of the associated `arg` is returned is not deterministic.
avg	Calculates the arithmetic mean.
avgWeighted	Calculates the weighted arithmetic mean.
boundingRatio	Aggregate function that calculates the slope between the leftmost and rightmost points across a group of values.
categoricalInformationValue	Calculates the value of `(P(tag = 1) - P(tag = 0))(log(P(tag = 1)) - log(P(tag = 0)))` for each category.
contingency	The `contingency` function calculates the contingency coefficient, a value that measures the association between two columns in a table. The computation is similar to the `cramersV` function but with a different denominator in the square root.
corr	Calculates the Pearson correlation coefficient.
corrMatrix	Computes the correlation matrix over N variables.
corrStable	Calculates the Pearson correlation coefficient, but uses a numerically stable algorithm.
count	Counts the number of rows or not-NULL values.
covarPop	Calculates the population covariance
covarPopMatrix	Returns the population covariance matrix over N variables.
covarPopStable	Calculates the value of the population covariance
covarSamp	Calculates the value of `Σ((x - x̅)(y - y̅)) / (n - 1)`
covarSampMatrix	Returns the sample covariance matrix over N variables.
covarSampStable	Similar to covarSamp but works slower while providing a lower computational error.
cramersV	The result of the `cramersV` function ranges from 0 (corresponding to no association between the variables) to 1 and can reach 1 only when each value is completely determined by the other. It may be viewed as the association between two variables as a percentage of their maximum possible variation.
cramersVBiasCorrected	Calculates Cramer's V, but uses a bias correction.
deltaSum	Sums the arithmetic difference between consecutive rows.
deltaSumTimestamp	Adds the difference between consecutive rows. If the difference is negative, it is ignored.
distinctDynamicTypes	Calculates the list of distinct data types stored in Dynamic column.
distinctJSONPaths	Calculates a list of distinct paths stored in a JSON column.
entropy	Calculates Shannon entropy of for a column of values.
estimateCompressionRatio	Estimates the compression ratio of a given column without compressing it.
exponentialMovingAverage	Calculates the exponential moving average of values for the determined time.
exponentialTimeDecayedAvg	Returns the exponentially smoothed weighted moving average of values of a time series at point `t` in time.
exponentialTimeDecayedCount	Returns the cumulative exponential decay over a time series at the index `t` in time.
exponentialTimeDecayedMax	Returns the maximum of the computed exponentially smoothed moving average at index `t` in time with that at `t-1`.
exponentialTimeDecayedSum	Returns the sum of exponentially smoothed moving average values of a time series at the index `t` in time.
first_value	It is an alias for any but it was introduced for compatibility with Window Functions, where sometimes it is necessary to process `NULL` values (by default all ClickHouse aggregate functions ignore NULL values).
flameGraph	Aggregate function which builds a flamegraph using the list of stacktraces.
groupArray	Creates an array of argument values. Values can be added to the array in any (indeterminate) order.
groupArrayArray	Aggregates arrays into a larger array of those arrays.
groupArrayInsertAt	Inserts a value into the array at the specified position.
groupArrayIntersect	Return an intersection of given arrays (Return all items of arrays, that are in all given arrays).
groupArrayLast	Creates an array of the last argument values.
groupArrayMovingAvg	Calculates the moving average of input values.
groupArrayMovingSum	Calculates the moving sum of input values.
groupArraySample	Creates an array of sample argument values. The size of the resulting array is limited to `max_size` elements. Argument values are selected and added to the array randomly.
groupArraySorted	Returns an array with the first N items in ascending order.
groupBitAnd	Applies bit-wise `AND` for series of numbers.
groupBitmap	Bitmap or Aggregate calculations from a unsigned integer column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object
groupBitmapAnd	Calculations the AND of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object.
groupBitmapOr	Calculations the OR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object. This is equivalent to `groupBitmapMerge`.
groupBitmapXor	Calculates the XOR of a bitmap column, and returns the cardinality of type UInt64, if used with suffix -State, then it returns a bitmap object
groupBitOr	Applies bit-wise `OR` to a series of numbers.
groupBitXor	Applies bit-wise `XOR` for series of numbers.
groupConcat	Calculates a concatenated string from a group of strings, optionally separated by a delimiter, and optionally limited by a maximum number of elements.
groupUniqArray	Creates an array from different argument values.
intervalLengthSum	Calculates the total length of union of all ranges (segments on numeric axis).
kolmogorovSmirnovTest	Applies Kolmogorov-Smirnov's test to samples from two populations.
kurtPop	Computes the kurtosis of a sequence.
kurtSamp	Computes the sample kurtosis of a sequence.
largestTriangleThreeBuckets	Applies the Largest-Triangle-Three-Buckets algorithm to the input data.
last_value	Selects the last encountered value, similar to `anyLast`, but could accept NULL.
mannWhitneyUTest	Applies the Mann-Whitney rank test to samples from two populations.
max	Aggregate function that calculates the maximum across a group of values.
maxIntersections	Aggregate function that calculates the maximum number of times that a group of intervals intersects each other (if all the intervals intersect at least once).
maxIntersectionsPosition	Aggregate function that calculates the positions of the occurrences of the maxIntersections function.
maxMap	Calculates the maximum from `value` array according to the keys specified in the `key` array.
meanZTest	Applies mean z-test to samples from two populations.
median	The `median` functions are the aliases for the corresponding `quantile` functions. They calculate median of a numeric data sample.
min	Aggregate function that calculates the minimum across a group of values.
minMap	Calculates the minimum from `value` array according to the keys specified in the `key` array.
quantile	Computes an approximate quantile of a numeric data sequence.
quantileGK	Computes the quantile of a numeric data sequence using the Greenwald-Khanna algorithm.
quantileBFloat16	Computes an approximate quantile of a sample consisting of bfloat16 numbers.
quantileDD	Computes an approximate quantile of a sample with relative-error guarantees.
quantileDeterministic	Computes an approximate quantile of a numeric data sequence.
quantileExact Functions	quantileExact, quantileExactLow, quantileExactHigh, quantileExactExclusive, quantileExactInclusive functions
quantileExactWeighted	Exactly computes the quantile of a numeric data sequence, taking into account the weight of each element.
quantileExactWeightedInterpolated	Computes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element.
quantileInterpolatedWeighted	Computes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element.
quantiles Functions	quantiles, quantilesExactExclusive, quantilesExactInclusive, quantilesGK
quantileTDigest	Computes an approximate quantile of a numeric data sequence using the t-digest algorithm.
quantileTDigestWeighted	Computes an approximate quantile of a numeric data sequence using the t-digest algorithm.
quantileTiming	With the determined precision computes the quantile of a numeric data sequence.
quantileTimingWeighted	With the determined precision computes the quantile of a numeric data sequence according to the weight of each sequence member.
rankCorr	Computes a rank correlation coefficient.
simpleLinearRegression	Performs simple (unidimensional) linear regression.
singleValueOrNull	The aggregate function `singleValueOrNull` is used to implement subquery operators, such as `x = ALL (SELECT ...)`. It checks if there is only one unique non-NULL value in the data.
skewPop	Computes the skewness of a sequence.
skewSamp	Computes the sample skewness of a sequence.
sparkbar	The function plots a frequency histogram for values `x` and the repetition rate `y` of these values over the interval `[min_x, max_x]`.
stddevPop	The result is equal to the square root of varPop.
stddevPopStable	The result is equal to the square root of varPop. Unlike stddevPop, this function uses a numerically stable algorithm.
stddevSamp	The result is equal to the square root of varSamp
stddevSampStable	The result is equal to the square root of varSamp. Unlike this function uses a numerically stable algorithm.
stochasticLinearRegression	This function implements stochastic linear regression. It supports custom parameters for learning rate, L2 regularization coefficient, mini-batch size, and has a few methods for updating weights (Adam, simple SGD, Momentum, Nesterov.)
stochasticLogisticRegression	This function implements stochastic logistic regression. It can be used for binary classification problem, supports the same custom parameters as stochasticLinearRegression and works the same way.
studentTTest	Applies the student t-test to samples from two populations.
sum	Calculates the sum. Only works for numbers.
sumCount	Calculates the sum of the numbers and counts the number of rows at the same time. The function is used by ClickHouse query optimizer: if there are multiple `sum`, `count` or `avg` functions in a query, they can be replaced to single `sumCount` function to reuse the calculations. The function is rarely needed to use explicitly.
sumKahan	Calculates the sum of the numbers with Kahan compensated summation algorithm
sumMap	Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys without overflow.
sumMapWithOverflow	Totals a `value` array according to the keys specified in the `key` array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys. Differs from the sumMap function in that it does summation with overflow.
sumWithOverflow	Computes the sum of the numbers, using the same data type for the result as for the input parameters. If the sum exceeds the maximum value for this data type, it is calculated with overflow.
theilsU	The `theilsU` function calculates Theils' U uncertainty coefficient, a value that measures the association between two columns in a table.
topK	Returns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves).
topKWeighted	Returns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves). Additionally, the weight of the value is taken into account.
uniq	Calculates the approximate number of different values of the argument.
uniqCombined	Calculates the approximate number of different argument values.
uniqCombined64	Calculates the approximate number of different argument values. It is the same as uniqCombined, but uses a 64-bit hash for all data types rather than just for the String data type.
uniqExact	Calculates the exact number of different argument values.
uniqHLL12	Calculates the approximate number of different argument values, using the HyperLogLog algorithm.
uniqTheta	Calculates the approximate number of different argument values, using the Theta Sketch Framework.
varPop	Calculates the population variance.
varPopStable	Returns the population variance. Unlike varPop , this function uses a numerically stable algorithm. It works slower but provides a lower computational error.
varSamp	Calculate the sample variance of a data set.
varSampStable	Calculate the sample variance of a data set. Unlike `varSamp` , this function uses a numerically stable algorithm. It works slower but provides a lower computational error.
welchTTest	Applies Welch's t-test to samples from two populations.