Aggregate Functions
ClickHouse supports all standard SQL aggregate functions (sum, avg, min, max, count), as well as a wide range of other aggregate functions.
Page | Description |
---|---|
maxMap | Calculates the maximum from value array according to the keys specified in the key array. |
meanZTest | Applies mean z-test to samples from two populations. |
median | The median* functions are the aliases for the corresponding quantile* functions. They calculate median of a numeric data sample. |
min | Aggregate function that calculates the minimum across a group of values. |
minMap | Calculates the minimum from value array according to the keys specified in the key array. |
quantile | Computes an approximate quantile of a numeric data sequence. |
quantileGK | Computes the quantile of a numeric data sequence using the Greenwald-Khanna algorithm. |
quantileBFloat16 | Computes an approximate quantile of a sample consisting of bfloat16 numbers. |
quantileDD | Computes an approximate quantile of a sample with relative-error guarantees. |
quantileDeterministic | Computes an approximate quantile of a numeric data sequence. |
quantileExact Functions | quantileExact, quantileExactLow, quantileExactHigh, quantileExactExclusive, quantileExactInclusive functions |
quantileExactWeighted | Exactly computes the quantile of a numeric data sequence, taking into account the weight of each element. |
quantileExactWeightedInterpolated | Computes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element. |
quantileInterpolatedWeighted | Computes quantile of a numeric data sequence using linear interpolation, taking into account the weight of each element. |
quantiles Functions | quantiles, quantilesExactExclusive, quantilesExactInclusive, quantilesGK |
quantileTDigest | Computes an approximate quantile of a numeric data sequence using the t-digest algorithm. |
quantileTDigestWeighted | Computes an approximate quantile of a numeric data sequence using the t-digest algorithm. |
quantileTiming | With the determined precision computes the quantile of a numeric data sequence. |
quantileTimingWeighted | With the determined precision computes the quantile of a numeric data sequence according to the weight of each sequence member. |
rankCorr | Computes a rank correlation coefficient. |
simpleLinearRegression | Performs simple (unidimensional) linear regression. |
singleValueOrNull | The aggregate function singleValueOrNull is used to implement subquery operators, such as x = ALL (SELECT ...) . It checks if there is only one unique non-NULL value in the data. |
skewPop | Computes the skewness of a sequence. |
skewSamp | Computes the sample skewness of a sequence. |
sparkbar | The function plots a frequency histogram for values x and the repetition rate y of these values over the interval [min_x, max_x] . |
stddevPop | The result is equal to the square root of varPop. |
stddevPopStable | The result is equal to the square root of varPop. Unlike stddevPop, this function uses a numerically stable algorithm. |
stddevSamp | The result is equal to the square root of varSamp |
stddevSampStable | The result is equal to the square root of varSamp. Unlike this function uses a numerically stable algorithm. |
stochasticLinearRegression | This function implements stochastic linear regression. It supports custom parameters for learning rate, L2 regularization coefficient, mini-batch size, and has a few methods for updating weights (Adam, simple SGD, Momentum, Nesterov.) |
stochasticLogisticRegression | This function implements stochastic logistic regression. It can be used for binary classification problem, supports the same custom parameters as stochasticLinearRegression and works the same way. |
studentTTest | Applies Student's t-test to samples from two populations. |
sum | Calculates the sum. Only works for numbers. |
sumCount | Calculates the sum of the numbers and counts the number of rows at the same time. The function is used by ClickHouse query optimizer: if there are multiple sum , count or avg functions in a query, they can be replaced to single sumCount function to reuse the calculations. The function is rarely needed to use explicitly. |
sumKahan | Calculates the sum of the numbers with Kahan compensated summation algorithm |
sumMap | Totals a value array according to the keys specified in the key array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys without overflow. |
sumMapWithOverflow | Totals a value array according to the keys specified in the key array. Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys. Differs from the sumMap function in that it does summation with overflow. |
sumWithOverflow | Computes the sum of the numbers, using the same data type for the result as for the input parameters. If the sum exceeds the maximum value for this data type, it is calculated with overflow. |
theilsU | The theilsU function calculates the Theil's U uncertainty coefficient, a value that measures the association between two columns in a table. |
topK | Returns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves). |
topKWeighted | Returns an array of the approximately most frequent values in the specified column. The resulting array is sorted in descending order of approximate frequency of values (not by the values themselves). Additionally, the weight of the value is taken into account. |
uniq | Calculates the approximate number of different values of the argument. |
uniqCombined | Calculates the approximate number of different argument values. |
uniqCombined64 | Calculates the approximate number of different argument values. It is the same as uniqCombined, but uses a 64-bit hash for all data types rather than just for the String data type. |
uniqExact | Calculates the exact number of different argument values. |
uniqHLL12 | Calculates the approximate number of different argument values, using the HyperLogLog algorithm. |
uniqTheta | Calculates the approximate number of different argument values, using the Theta Sketch Framework. |
varPop | Calculates the population variance. |
varPopStable | Returns the population variance. Unlike varPop , this function uses a numerically stable algorithm. It works slower but provides a lower computational error. |
varSamp | Calculate the sample variance of a data set. |
varSampStable | Calculate the sample variance of a data set. Unlike varSamp , this function uses a numerically stable algorithm. It works slower but provides a lower computational error. |
welchTTest | Applies Welch's t-test to samples from two populations. |
anyHeavy | Selects a frequently occurring value using the heavy hitters algorithm. If there is a value that occurs more than in half the cases in each of the query's execution threads, this value is returned. Normally, the result is nondeterministic. |
anyLast | Selects the last encountered value of a column. |
approx_top_k | Returns an array of the approximately most frequent values and their counts in the specified column. |
approx_top_sum | Returns an array of the approximately most frequent values and their counts in the specified column. |
argMax | Calculates the arg value for a maximum val value. |
argMin | Calculates the arg value for a minimum val value. If there are multiple rows with equal val being the maximum, which of the associated arg is returned is not deterministic. |
avg | Calculates the arithmetic mean. |
avgWeighted | Calculates the weighted arithmetic mean. |
boundingRatio | Aggregate function that calculates the slope between the leftmost and rightmost points across a group of values. |
categoricalInformationValue | Calculates the value of (P(tag = 1) - P(tag = 0))(log(P(tag = 1)) - log(P(tag = 0))) for each category. |
contingency | The contingency function calculates the contingency coefficient, a value that measures the association between two columns in a table. The computation is similar to the cramersV function but with a different denominator in the square root. |
corr | Calculates the Pearson correlation coefficient. |
corrMatrix | Computes the correlation matrix over N variables. |
corrStable | Calculates the Pearson correlation coefficient, but uses a numerically stable algorithm. |
count | Counts the number of rows or not-NULL values. |
covarPop | Calculates the population covariance |
covarPopMatrix | Returns the population covariance matrix over N variables. |
covarPopStable | Calculates the value of the population covariance |
covarSamp | Calculates the value of Σ((x - x̅)(y - y̅)) / (n - 1) |
covarSampMatrix | Returns the sample covariance matrix over N variables. |
covarSampStable | Similar to covarSamp but works slower while providing a lower computational error. |
cramersV | The result of the cramersV function ranges from 0 (corresponding to no association between the variables) to 1 and can reach 1 only when each value is completely determined by the other. It may be viewed as the association between two variables as a percentage of their maximum possible variation. |
cramersVBiasCorrected | Calculates Cramer's V, but uses a bias correction. |
deltaSum | Sums the arithmetic difference between consecutive rows. |
deltaSumTimestamp | Adds the difference between consecutive rows. If the difference is negative, it is ignored. |
distinctDynamicTypes | Calculates the list of distinct data types stored in Dynamic column. |
distinctJSONPaths | Calculates a list of distinct paths stored in a JSON column. |
entropy | Calculates Shannon entropy of for a column of values. |
exponentialMovingAverage | Calculates the exponential moving average of values for the determined time. |
exponentialTimeDecayedAvg | Returns the exponentially smoothed weighted moving average of values of a time series at point t in time. |
exponentialTimeDecayedCount | Returns the cumulative exponential decay over a time series at the index t in time. |
exponentialTimeDecayedMax | Returns the maximum of the computed exponentially smoothed moving average at index t in time with that at t-1 . |
exponentialTimeDecayedSum | Returns the sum of exponentially smoothed moving average values of a time series at the index t in time. |
first_value | It is an alias for any but it was introduced for compatibility with Window Functions, where sometimes it's necessary to process NULL values (by default all ClickHouse aggregate functions ignore NULL values). |
flameGraph | Aggregate function which builds a flamegraph using the list of stacktraces. |
groupArray | Creates an array of argument values. Values can be added to the array in any (indeterminate) order. |
groupArrayArray | Aggregates arrays into a larger array of those arrays. |
groupArrayInsertAt | Inserts a value into the array at the specified position. |
groupArrayIntersect | Return an intersection of given arrays (Return all items of arrays, that are in all given arrays). |
groupArrayLast | Creates an array of the last argument values. |
groupArrayMovingAvg | Calculates the moving average of input values. |
groupArrayMovingSum | Calculates the moving sum of input values. |
groupArraySample | Creates an array of sample argument values. The size of the resulting array is limited to max_size elements. Argument values are selected and added to the array randomly. |
groupArraySorted | Returns an array with the first N items in ascending order. |
groupBitAnd | Applies bit-wise AND for series of numbers. |
groupBitmap | Bitmap or Aggregate calculations from a unsigned integer column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object |
groupBitmapAnd | Calculations the AND of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object. |
groupBitmapOr | Calculations the OR of a bitmap column, return cardinality of type UInt64, if add suffix -State, then return a bitmap object. This is equivalent to groupBitmapMerge . |
groupBitmapXor | Calculates the XOR of a bitmap column, and returns the cardinality of type UInt64, if used with suffix -State, then it returns a bitmap object |
groupBitOr | Applies bit-wise OR to a series of numbers. |
groupBitXor | Applies bit-wise XOR for series of numbers. |
groupConcat | Calculates a concatenated string from a group of strings, optionally separated by a delimiter, and optionally limited by a maximum number of elements. |
groupUniqArray | Creates an array from different argument values. |
intervalLengthSum | Calculates the total length of union of all ranges (segments on numeric axis). |
kolmogorovSmirnovTest | Applies Kolmogorov-Smirnov's test to samples from two populations. |
kurtPop | Computes the kurtosis of a sequence. |
kurtSamp | Computes the sample kurtosis of a sequence. |
largestTriangleThreeBuckets | Applies the Largest-Triangle-Three-Buckets algorithm to the input data. |
last_value | Selects the last encountered value, similar to anyLast , but could accept NULL. |
mannWhitneyUTest | Applies the Mann-Whitney rank test to samples from two populations. |
max | Aggregate function that calculates the maximum across a group of values. |
maxIntersections | Aggregate function that calculates the maximum number of times that a group of intervals intersects each other (if all the intervals intersect at least once). |
maxIntersectionsPosition | Aggregate function that calculates the positions of the occurrences of the maxIntersections function. |
aggThrow | This function can be used for the purpose of testing exception safety. It will throw an exception on creation with the specified probability. |
analysisOfVariance | Provides a statistical test for one-way analysis of variance (ANOVA test). It is a test over several groups of normally distributed observations to find out whether all groups have the same mean or not. |
any | Selects the first encountered value of a column. |