Analysis tips
MEDIAN() and PERCENTILE() functions
- In Live mode the MEDIAN() and PERCENTILE() functions (since connector v0.1.3 release) use the ClickHouse quantile()() function, which significantly speeds up the calculation, but uses sampling. If you want to get accurate calculation results, then use functions
MEDIAN_EXACT()
andPERCENTILE_EXACT()
(based on quantileExact()()). - In Extract mode you can't use MEDIAN_EXACT() and PERCENTILE_EXACT() because MEDIAN() and PERCENTILE() are always accurate (and slow).
Additional functions for Calculated Fields in Live mode
ClickHouse has a huge number of functions that can be used for data analysis — much more than Tableau supports. For the convenience of users, we have added new functions that are available for use in Live mode when creating Calculated Fields. Unfortunately, it is not possible to add descriptions to these functions in the Tableau interface, so we will add a description for them right here.
-If
Aggregation Combinator (added in v0.2.3) - allows to have Row-Level Filters right in the Aggregate Calculation.SUM_IF(), AVG_IF(), COUNT_IF(), MIN_IF() & MAX_IF()
functions have been added.BAR([my_int], [min_val_int], [max_val_int], [bar_string_length_int])
(added in v0.2.1) — Forget about boring bar charts! UseBAR()
function instead (equivalent ofbar()
in ClickHouse). For example, this calculated field returns nice bars as String:BAR([my_int], [min_val_int], [max_val_int], [bar_string_length_int]) + " " + FORMAT_READABLE_QUANTITY([my_int])
== BAR() ==
██████████████████▊ 327.06 million
█████ 88.02 million
███████████████ 259.37 millionCOUNTD_UNIQ([my_field])
(added in v0.2.0) — Calculates the approximate number of different values of the argument. Equivalent of uniq(). Much faster than COUNTD().DATE_BIN('day', 10, [my_datetime_or_date])
(added in v0.2.1) — equivalent oftoStartOfInterval()
in ClickHouse. Rounds down a Date or Date & Time to the given interval, for example:== my_datetime_or_date == | == DATE_BIN('day', 10, [my_datetime_or_date]) ==
28.07.2004 06:54:50 | 21.07.2004 00:00:00
17.07.2004 14:01:56 | 11.07.2004 00:00:00
14.07.2004 07:43:00 | 11.07.2004 00:00:00FORMAT_READABLE_QUANTITY([my_integer])
(added in v0.2.1) — Returns a rounded number with a suffix (thousand, million, billion, etc.) as a string. It is useful for reading big numbers by human. Equivalent offormatReadableQuantity()
.FORMAT_READABLE_TIMEDELTA([my_integer_timedelta_sec], [optional_max_unit])
(added in v0.2.1) — Accepts the time delta in seconds. Returns a time delta with (year, month, day, hour, minute, second) as a string.optional_max_unit
is maximum unit to show. Acceptable values:seconds
,minutes
,hours
,days
,months
,years
. Equivalent offormatReadableTimeDelta()
.GET_SETTING([my_setting_name])
(added in v0.2.1) — Returns the current value of a custom setting. Equivalent ofgetSetting()
.HEX([my_string])
(added in v0.2.1) — Returns a string containing the argument’s hexadecimal representation. Equivalent ofhex()
.KURTOSIS([my_number])
— Computes the sample kurtosis of a sequence. Equivalent ofkurtSamp()
.KURTOSISP([my_number])
— Computes the kurtosis of a sequence. The equivalent ofkurtPop()
.MEDIAN_EXACT([my_number])
(added in v0.1.3) — Exactly computes the median of a numeric data sequence. Equivalent ofquantileExact(0.5)(...)
.MOD([my_number_1], [my_number_2])
— Calculates the remainder after division. If arguments are floating-point numbers, they are pre-converted to integers by dropping the decimal portion. Equivalent ofmodulo()
.PERCENTILE_EXACT([my_number], [level_float])
(added in v0.1.3) — Exactly computes the percentile of a numeric data sequence. The recommended level range is [0.01, 0.99]. Equivalent ofquantileExact()()
.PROPER([my_string])
(added in v0.2.5) - Converts a text string so the first letter of each word is capitalized and the remaining letters are in lowercase. Spaces and non-alphanumeric characters such as punctuation also act as separators. For example:PROPER("PRODUCT name") => "Product Name"
PROPER("darcy-mae") => "Darcy-Mae"
RAND()
(added in v0.2.1) — returns integer (UInt32) number, for example3446222955
. Equivalent ofrand()
.RANDOM()
(added in v0.2.1) — unofficialRANDOM()
Tableau function, which returns float between 0 and 1.RAND_CONSTANT([optional_field])
(added in v0.2.1) — Produces a constant column with a random value. Something like{RAND()}
Fixed LOD, but faster. Equivalent ofrandConstant()
.REAL([my_number])
— Casts field to float (Float64). Detailshere
.SHA256([my_string])
(added in v0.2.1) — Calculates SHA-256 hash from a string and returns the resulting set of bytes as a string (FixedString). Convenient to use with theHEX()
function, for example,HEX(SHA256([my_string]))
. Equivalent ofSHA256()
.SKEWNESS([my_number])
— Computes the sample skewness of a sequence. Equivalent ofskewSamp()
.SKEWNESSP([my_number])
— Computes the skewness of a sequence. Equivalent ofskewPop()
.TO_TYPE_NAME([field])
(added in v0.2.1) — Returns a string containing the ClickHouse type name of the passed argument. Equivalent oftoTypeName()
.TRUNC([my_float])
— It is the same as theFLOOR([my_float])
function. Equivalent oftrunc()
.UNHEX([my_string])
(added in v0.2.1) — Performs the opposite operation ofHEX()
. Equivalent ofunhex()
.