Aggregate Functions

Applies To: WatchGuard Advanced EPDR

Aggregate functions accumulate data from multiple rows.

In the Advanced SQL Query tool, you can use these aggregate functions to create an SQL query:

Function Description

count()

Counts the number of rows in a table. It accepts zero arguments and returns UInt64. The COUNT (DISTINCT x) syntax is not supported. Use the uniq function for this purpose.

SELECT count() FROM table statements are not optimized because the number of entries in the table are not stored separately.

any(x)

Selects the first encountered value. You can run the query in any order, and in a different order each time. To get a determinate result, use the min or max function instead of any.

In some cases, the order of execution is maintained. This applies to cases when SELECT comes from a sub-query that uses ORDER BY.

When a SELECT query has the GROUP BY clause or at least one aggregate function, it is required that all expressions in the SELECT, HAVING, and ORDER BY clauses are calculated from aggregate functions. Each column selected from the table must be used either in keys or inside aggregate functions.

anyHeavy(x)

Selects a frequently occurring value. If there is a value that occurs in more than half the cases in each of the query execution threads, this value is returned. Usually, the result is different each time.

anyLast(x)

Returns the last value.

min(x)

Returns the minimum value.

max(x)

Returns the maximum value.

argMin(arg, val)

Calculates the arg value for a minimum val value. If there are several different values of arg for minimum values of val, the tool returns the first value encountered.

argMax(arg, val)

Calculates the arg value for a maximum val value. If there are several different values of arg for maximum values of val, the tool returns the first value encountered.

sum(x)

Calculates the sum of the numbers.

sumWithOverflow(x)

Calculates the sum of the numbers, using the same data type for the result as for the input parameters. If the sum exceeds the maximum value for this data type, the function returns an error.

avg(x)

Calculates the arithmetic mean of the numbers. The result is always Float64.

uniq(x)

Calculates the approximate number of different values of the argument. Works for numbers, strings, dates, dateTimes, and multiple arguments and tuple type arguments.

Provides the result deterministically (does not depend on the query processing order).

uniqExact(x)

Calculates the exact number of different argument values. We recommend that you use the uniq function. Use the uniqExact function if you require an exact result.

quantile(level)(x)

Calculates the x quantile of level order. level is a constant floating-point number from 0 to 1.

If you omit the level parameter, then 0.5 is taken by default (median calculation).

This function takes numbers, dates, and dateTimes, and returns:

  • For numbers — Float64
  • For Dates — Date
  • For DateTimes — DateTime

The precision of this function is relatively low. Use quantileExact(level)(x) for maximum precision.

The result is non-deterministic (depends on the query processing order).

quantileExact(level)(x)

Exactly computes the quantile of level level order.

median(x)

Calculates the median of a numeric data sample.

varSamp(x)

Represents an unbiased estimate of the variance of a random variable. The values passed as arguments represent a sample of the total population. The function returns Float64.

varPop(x)

Calculates the variance of the population passed as an argument.

stddevSamp(x)

Represents an unbiased estimate of the standard deviation of a random variable. The values passed as arguments represent a sample of the total population. The function returns Float64.

stddevPop(x)

Calculates the standard deviation of the population passed as an argument.

covarSamp(x, y)

Represents an unbiased estimate of the covariance of two random variables. The values passed as arguments represent two samples of the total population. The function returns Float64.

covarPop(x, y)

Calculation of the covariance of two random variables. The values passed as arguments represent two populations. The function returns Float64.

corr(x, y)

Calculates the Pearson correlation coefficient.

Related Topics

Advanced Query SQL Syntax

Select Clause Syntax

Supported Data Types

Regular Functions