Skip to main content
Skip to main content

Aggregate Functions

ClickHouse supports all standard SQL aggregate functions (sum, avg, min, max, count), as well as a wide range of other aggregate functions.

PageDescription
maxMapCalculates the maximum from value array according to the keys specified in the key array.
meanZTestApplies mean z-test to samples from two populations.
medianThe median* functions are the aliases for the corresponding quantile* functions. They calculate median of a numeric data sample.
minAggregate function that calculates the minimum across a group of values.
minMapCalculates the minimum from value array according to the keys specified in the key array.
quantileComputes an approximate quantile of a numeric data sequence.
quantileGKComputes the quantile of a numeric data sequence using the Greenwald-Khanna algorithm.
quantileBFloat16Computes an approximate quantile of a sample consisting of bfloat16 numbers.
quantileDDComputes an approximate quantile of a sample with relative-error guarantees.
quantileDeterministicComputes an approximate quantile of a numeric data sequence.
quantileExact FunctionsquantileExact, quantileExactLow, quantileExactHigh, quantileExactExclusive, quantileExactInclusive functions
quantileExactWeightedExactly computes the quantile of a numeric data sequence, taking into account the weight of each element.
quantileExactWeightedInterpolatedComputes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element.
quantileInterpolatedWeightedComputes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element.
quantiles Functionsquantiles, quantilesExactExclusive, quantilesExactInclusive, quantilesGK
quantileTDigestComputes an approximate quantile of a numeric data sequence using the t-digest algorithm.
quantileTDigestWeightedComputes an approximate quantile of a numeric data sequence using the t-digest algorithm.
quantileTimingWith the determined precision computes the quantile of a numeric data sequence.
quantileTimingWeightedWith the determined precision computes the quantile of a numeric data sequence according to the weight of each sequence member.
rankCorrComputes a rank correlation coefficient.
simpleLinearRegressionPerforms simple (unidimensional) linear regression.
singleValueOrNullThe aggregate function singleValueOrNull is used to implement subquery operators, such as x = ALL (SELECT ...). It checks if there is only one unique non-NULL value in the data.
skewPopComputes the skewness of a sequence.
skewSampComputes the sample skewness of a sequence.
sparkbarThe function plots a frequency histogram for values x and the repetition rate y of these values over the interval [min_x, max_x].
stddevPopThe result is equal to the square root of varPop.
stddevPopStableThe result is equal to the square root of varPop. Unlike stddevPop, this function uses a numerically stable algorithm.
stddevSampThe result is equal to the square root of varSamp
stddevSampStableThe result is equal to the square root of varSamp. Unlike this function uses a numerically stable algorithm.
stochasticLinearRegressionThis function implements stochastic linear regression. It supports custom parameters for learning rate, L2 regularization coefficient, mini-batch size, and has a few methods for updating weights (Adam, simple SGD, Momentum, Nesterov.)
stochasticLogisticRegressionThis function implements stochastic logistic regression. It can be used for binary classification problem, supports the same custom parameters as stochasticLinearRegression and works the same way.
studentTTestApplies Student's t-test to samples from two populations.
sumCalculates the sum. Only works for numbers.
sumCountCalculates the sum of the numbers and counts the number of rows at the same time. The function is used by ClickHouse query optimizer: if there are multiple sum, count or avg functions in a query, they can be replaced to single sumCount function to reuse the calculations. The function is rarely needed to use explicitly.
sumKahanCalculates the sum of the numbers with Kahan compensated summation algorithm
sumMapTotals a value array according to the keys specified in the key array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys without overflow.
sumMapWithOverflowTotals a value array according to the keys specified in the key array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys. Differs from the sumMap function in that it does summation with overflow.
sumWithOverflowComputes the sum of the numbers, using the same data type for the result as for the input parameters. If the sum exceeds the maximum value for this data type, it is calculated with overflow.
theilsUThe theilsU function calculates the Theil's U uncertainty coefficient, a value that measures the association between two columns in a table.
topKReturns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves).
topKWeightedReturns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves). Additionally, the weight of the value is taken into account.
uniqCalculates the approximate number of different values of the argument.
uniqCombinedCalculates the approximate number of different argument values.
uniqCombined64Calculates the approximate number of different argument values. It is the same as uniqCombined, but uses a 64-bit hash for all data types rather than just for the String data type.
uniqExactCalculates the exact number of different argument values.
uniqHLL12Calculates the approximate number of different argument values, using the HyperLogLog algorithm.
uniqThetaCalculates the approximate number of different argument values, using the Theta Sketch Framework.
varPopCalculates the population variance.
varPopStableReturns the population variance. Unlike varPop , this function uses a numerically stable algorithm. It works slower but provides a lower computational error.
varSampCalculate the sample variance of a data set.
varSampStableCalculate the sample variance of a data set. Unlike varSamp , this function uses a numerically stable algorithm. It works slower but provides a lower computational error.
welchTTestApplies Welch's t-test to samples from two populations.
anyHeavySelects a frequently occurring value using the heavy hitters algorithm. If there is a value that occurs more than in half the cases in each of the query's execution threads, this value is returned. Normally, the result is nondeterministic.
anyLastSelects the last encountered value of a column.
approx_top_kReturns an array of the approximately most frequent values and their counts in the specified column.
approx_top_sumReturns an array of the approximately most frequent values and their counts in the specified column.
argMaxCalculates the arg value for a maximum val value.
argMinCalculates the arg value for a minimum val value. If there are multiple rows with equal val being the maximum, which of the associated arg is returned is not deterministic.
avgCalculates the arithmetic mean.
avgWeightedCalculates the weighted arithmetic mean.
boundingRatioAggregate function that calculates the slope between the leftmost and rightmost points across a group of values.
categoricalInformationValueCalculates the value of (P(tag = 1) - P(tag = 0))(log(P(tag = 1)) - log(P(tag = 0))) for each category.
contingencyThe contingency function calculates the contingency coefficient, a value that measures the association between two columns in a table. The computation is similar to the cramersV function but with a different denominator in the square root.
corrCalculates the Pearson correlation coefficient.
corrMatrixComputes the correlation matrix over N variables.
corrStableCalculates the Pearson correlation coefficient, but uses a numerically stable algorithm.
countCounts the number of rows or not-NULL values.
covarPopCalculates the population covariance
covarPopMatrixReturns the population covariance matrix over N variables.
covarPopStableCalculates the value of the population covariance
covarSampCalculates the value of Σ((x - x̅)(y - y̅)) / (n - 1)
covarSampMatrixReturns the sample covariance matrix over N variables.
covarSampStableSimilar to covarSamp but works slower while providing a lower computational error.
cramersVThe result of the cramersV function ranges from 0 (corresponding to no association between the variables) to 1 and can reach 1 only when each value is completely determined by the other. It may be viewed as the association between two variables as a percentage of their maximum possible variation.
cramersVBiasCorrectedCalculates Cramer's V, but uses a bias correction.
deltaSumSums the arithmetic difference between consecutive rows.
deltaSumTimestampAdds the difference between consecutive rows. If the difference is negative, it is ignored.
distinctDynamicTypesCalculates the list of distinct data types stored in Dynamic column.
distinctJSONPathsCalculates a list of distinct paths stored in a JSON column.
entropyCalculates Shannon entropy of for a column of values.
exponentialMovingAverageCalculates the exponential moving average of values for the determined time.
exponentialTimeDecayedAvgReturns the exponentially smoothed weighted moving average of values of a time series at point t in time.
exponentialTimeDecayedCountReturns the cumulative exponential decay over a time series at the index t in time.
exponentialTimeDecayedMaxReturns the maximum of the computed exponentially smoothed moving average at index t in time with that at t-1.
exponentialTimeDecayedSumReturns the sum of exponentially smoothed moving average values of a time series at the index t in time.
first_valueIt is an alias for any but it was introduced for compatibility with Window Functions, where sometimes it's necessary to process NULL values (by default all ClickHouse aggregate functions ignore NULL values).
flameGraphAggregate function which builds a flamegraph using the list of stacktraces.
groupArrayCreates an array of argument values. Values can be added to the array in any (indeterminate) order.
groupArrayArrayAggregates arrays into a larger array of those arrays.
groupArrayInsertAtInserts a value into the array at the specified position.
groupArrayIntersectReturn an intersection of given arrays (Return all items of arrays, that are in all given arrays).
groupArrayLastCreates an array of the last argument values.
groupArrayMovingAvgCalculates the moving average of input values.
groupArrayMovingSumCalculates the moving sum of input values.
groupArraySampleCreates an array of sample argument values. The size of the resulting array is limited to max_size elements. Argument values are selected and added to the array randomly.
groupArraySortedReturns an array with the first N items in ascending order.
groupBitAndApplies bit-wise AND for series of numbers.
groupBitmapBitmap or Aggregate calculations from a unsigned integer column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object
groupBitmapAndCalculations the AND of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object.
groupBitmapOrCalculations the OR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object. This is equivalent to groupBitmapMerge.
groupBitmapXorCalculates the XOR of a bitmap column, and returns the cardinality of type UInt64, if used with suffix -State, then it returns a bitmap object
groupBitOrApplies bit-wise OR to a series of numbers.
groupBitXorApplies bit-wise XOR for series of numbers.
groupConcatCalculates a concatenated string from a group of strings, optionally separated by a delimiter, and optionally limited by a maximum number of elements.
groupUniqArrayCreates an array from different argument values.
intervalLengthSumCalculates the total length of union of all ranges (segments on numeric axis).
kolmogorovSmirnovTestApplies Kolmogorov-Smirnov's test to samples from two populations.
kurtPopComputes the kurtosis of a sequence.
kurtSampComputes the sample kurtosis of a sequence.
largestTriangleThreeBucketsApplies the Largest-Triangle-Three-Buckets algorithm to the input data.
last_valueSelects the last encountered value, similar to anyLast, but could accept NULL.
mannWhitneyUTestApplies the Mann-Whitney rank test to samples from two populations.
maxAggregate function that calculates the maximum across a group of values.
maxIntersectionsAggregate function that calculates the maximum number of times that a group of intervals intersects each other (if all the intervals intersect at least once).
maxIntersectionsPositionAggregate function that calculates the positions of the occurrences of the maxIntersections function.
aggThrowThis function can be used for the purpose of testing exception safety. It will throw an exception on creation with the specified probability.
analysisOfVarianceProvides a statistical test for one-way analysis of variance (ANOVA test). It is a test over several groups of normally distributed observations to find out whether all groups have the same mean or not.
anySelects the first encountered value of a column.