Aggregation operations

In this article, you can see all the operations that you can use when aggregating a new function.  

Remember that you have to group the data before applying any of the following aggregation operations.

1. Average

Calculates the average of the grouped values of the selected argument.

How does it work?

This operation needs at least one argument:

Argument Format
Average of (mandatory) Integer number, number

Example

Using the demo.ecommerce.data table, we can calculate the average number of bytes transferred during each five minute period. First, the table must be grouped in five minute intervals. Then, select Aggregation in the query toolbar and select the Average operation. The arguments needed are:

  • Average of - bytesTransferred column

The new column is added to show the average of the bytes transferred during each five minute period.

2. Count

This performs a simple count and can work with or without an argument. It returns the number of rows in a group when no argument is specified. Or, when you specify an argument, it returns the number of non-null argument values in a group.

How does it work?

This operation takes up to one argument, and also works with no argument.

Argument Format
Count Any

Example

Imagine you want to check if there are any null IP addresses every 5 minutes in the demo.ecommerce.data table. First, you must click the Group button and select 5 minutes. Then, select Aggregation in the query toolbar and select the Count operation. The arguments needed are:

  • Count - clientIpAddress column

The result is a column showing the number of non-null client IP address every five minutes.

3. First

Returns the first value of a group.

How does it work?

This operation needs one argument:

Argument Format
First of (mandatory) Any

Example

Suppose you want to get the first client IP address in every 5-minute group in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the First operation. The arguments needed are:

  • First of - clientIpAddress column


The result is a column showing the first IP address sent every 5 minutes.

4. First not null

Returns the first non-null value of a group. 

How does it work?

This operation needs one argument:

Argument Format
First of (mandatory) Any

Example

Suppose you want to get the first not null client IP address in every 5-minute group in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the First not null operation. The arguments needed are:

  • First of - clientIpAddress column


The result is a column showing the first not null IP address sent every 5 minutes.

5. HyperLogLog++

Returns the HyperLogLog distinct estimation data structure of each group of a column.

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Any

Example

Imagine you want to get the HyperLogLog distinct estimation data structure of the bytes transferred every 5 minutes in the demo.ecommerce.data table. Click Group and select 5 minutes. Then, click Aggregation and select the HyperLogLog++ operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the HyperLogLog distinct estimation data structure of the bytes transferred every 5 minutes.

6. HyperLogLog++ Count Estimation

Returns the HyperLogLog distinct estimation of each group of a column.

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Any

Example

To get the HyperLogLog distinct estimation of the bytes transferred every 5 minutes in the demo.ecommerce.data table, click Group and select 5 minutes. Then, click Aggregation and select the HyperLogLog++ Count Estimation operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the HyperLogLog distinct estimation of the bytes transferred every 5 minutes.

7. Last

Returns the last value of a group. 

How does it work?

This operation needs one argument:

Argument Format
Last of (mandatory) Any

Example

To get the last client IP address in every 5-minute group in the demo.ecommerce.data table, click the Group button and select 5 minutes. Then, click Aggregation and select the Last operation. The arguments needed are:

  • Last of - clientIpAddress column

The result is a column showing the last IP address sent every 5 minutes.

8. Last not null

Returns the last non-null value of a group.

How does it work?

This operation needs one argument:

Argument Format
Last of (mandatory) Any

Example

To get the last non-null client IP address in every 5-minute group in the demo.ecommerce.data table, click the Group button and select 5 minutes. Then, click Aggregation and select the Last not null operation. The arguments needed are:

  • Last of - clientIpAddress column

The result is a column showing the last non-null IP address every 5 minutes.


9. Maximum

Returns the highest value of the groups of the selected column. 

How does it work?

This operation needs one argument:

Argument Format
Sum (mandatory) Integer number

Example

In this example, you will get the highest number of bytes sent every 5-minute group in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Maximum operation. The arguments needed are:

  • Sum - bytesTransferred column

The result is a column showing the event with the highest number of bytes sent every 5 minutes.


10. Median / 2nd quartile / Percentile 50

Returns the median of the groups of the selected column.

How does it work?

This operation needs one argument: 

Argument Format
Source (mandatory) Integer number

Example

In this example, you will get the median of the number of bytes sent every 5-minute group in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Median / 2nd quartile / Percentile 50 operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the median of the number of bytes sent every 5 minutes.

11. Minimum

Returns the lowest value of the groups of the selected column.

How does it work?

This operation needs one argument:

Argument Format
Sum (mandatory) Integer number

Example

In this example, you will get the lowest number of bytes sent every 5-minute group in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Minimum operation. The arguments needed are:

  • Sum - bytesTransferred column

The result is a column showing the event with the lowest number of bytes sent every 5 minutes.

12. Non-null average

Returns the average of the non-null values of the selected column.

How does it work?

This operation needs one argument:

Argument Format
Average of (mandatory) Number

Example

Imagine you want to get the average of the bytes transferred every 5 minutes in the demo.ecommerce.data table, only considering non-null values. First, you must click the Group button and select 5 minutes. Then, click Aggregation and select the Non-null average operation. The arguments needed are:

  • Non-null average bytesTransferred column

The result is a column showing the average of the non-null values in the bytesTransferred column every 5 minutes.

13. Non-null standard deviation (biased)

Returns the biased standard deviation of the non-null values from the selected column.

How does it work?

This operation needs one argument:

Argument Format
Biased standard deviation of (mandatory) Integer number, number

Example

In this example, you will get the biased standard deviation of the number of bytes sent every 5 minutes in the demo.ecommerce.data table, only considering non-null values. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Non-null standard deviation (biased) operation. The arguments needed are:

  • Biased standard deviation of - bytesTransferred column

The result is a column showing the biased standard deviation of the non-null values in the bytesTransferred column sent every 5 minutes.


14. Non-null standard deviation (unbiased)

Returns the unbiased standard deviation of the non-null values from the selected column.

How does it work?

This operation needs one argument:

Argument Format
Unbiased standard deviation of (mandatory) Integer number, number

Example

In this example, you will get the unbiased standard deviation of the number of bytes sent every 5 minutes in the demo.ecommerce.data table, only considering non-null values. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Non-null standard deviation (unbiased) operation. The arguments needed are:

  • Unbiased standard deviation of - bytesTransferred column

The result is a column showing the unbiased standard deviation of the non-null values in the bytesTransferred column sent every 5 minutes.

15. Non-null variance (biased)

Returns the biased variance of the non-null values of the selected column.

How does it work?

This operation needs one argument:

Argument Format
Biased variance of (mandatory) Number

Example

In this example, you will get the biased variance of the bytes transferred every 5 minutes in the demo.ecommerce.data table, only considering non-null values. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Non-null variance (biased) operation. The arguments needed are: 

  • Biased variance of - bytesTransferred column

The result is a column showing the biased variance of the of the non-null values in the bytesTransferred column sent every 5 minutes.

16. Non-null variance (unbiased)

Returns the unbiased variance of the non-null values of the selected column.

How does it work?

This operation needs one argument:

Argument Format
Unbiased variance of (mandatory) Integer number, number

Example

In this example, you will get the unbiased variance of the bytes transferred sent every 5 minutes in the demo.ecommerce.data table, only considering non-null values. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Non-null variance (unbiased) operation. The arguments needed are: 

  • Unbiased variance of - statusCode column 


The result is a column showing the unbiased variance of the of the non-null values in the bytesTransferred column sent every 5 minutes.

17. Percentile 25 / 1st quartile

Returns the maximum value of the 25th percentile using a linear interpolation. 

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Number

Example

In this example, you will get the maximum value of the 25th percentile of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Percentile 25 / 1st quartile operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the maximum value of the 25th percentile of the number of bytes sent every 5 minutes.


18. Percentile 5

Returns the maximum value of the 5th percentile using a linear interpolation.

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Integer number

Example

In this example, you will get the maximum value of the 5th percentile of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Percentile 5 operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the maximum value of the 5th percentile of the number of bytes sent every 5 minutes.


19. Percentile 75 / 3rd quartile

Returns the maximum value of the 75th percentile using a linear interpolation.

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Number

Example

In this example, you will get the maximum value of the 75th percentile of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Percentile 75 / 3rd quartile operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the maximum value of the 75th percentile of the number of bytes sent every 5 minutes.

20. Percentile 95

Returns the maximum value of the 95th percentile using a linear interpolation.

How does it work?

This operation needs one argument:

Argument Format
Source (mandatory) Integer number

Example

In this example, you will get the maximum value of the 95th percentile of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Percentile 95 operation. The arguments needed are:

  • Source - bytesTransferred column

The result is a column showing the maximum value of the 95th percentile of the number of bytes sent every 5 minutes.

21. Standard deviation (biased)

Returns the biased standard deviation of the values from the selected column.

How does it work?

This operation needs one argument:

Argument Format
Biased standard deviation of (mandatory) Integer number, number

Example

In this example, you will get the biased standard deviation of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Standard deviation (biased) operation. The arguments needed are:

  • Biased standard deviation of - bytesTransferred column

 

The result is a column showing the biased standard deviation of the number of bytes sent every 5 minutes.

22. Standard deviation (unbiased)

Returns the unbiased standard deviation of the values from the selected column.

How does it work?

This operation needs one argument:

Argument Format
Unbiased standard deviation of (mandatory) Integer number, number

Example

In this example, you will get the unbiased standard deviation of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Standard deviation (unbiased) operation. The arguments needed are:

  • Unbiased standard deviation of - bytesTransferred column

The result is a column showing the unbiased standard deviation of the number of bytes sent every 5 minutes.

23. Sum

Returns the total sum of a numeric column.

How does it work?

This operation needs one argument:

Argument Format
Sum (mandatory) Integer number

Example

In this example, you will get the sum of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Standard deviation (unbiased) operation. The arguments needed are:

  • Sum - bytesTransferred column

The result is a column showing the sum of the number of bytes sent every 5 minutes.

24. Sum Square

Returns the sum of squares of a numeric column.

How does it work?

This operation needs one argument:

Argument Format
Sum (mandatory) Integer number, number

Example

In this example, you will get the sum of squares of the number of bytes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Standard deviation (unbiased) operation. The arguments needed are: 

  • Sum - bytesTransferred column

The result is a column showing the sum of squares of the number of bytes sent every 5 minutes.


25. Variance (biased)

Returns the biased variance of the values of the selected column. 

How does it work?

This operation needs one argument:

Argument Format
Biased variance of (mandatory) Integer number, number

Example

In this example, you will get the biased variance of the status codes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Variance (biased) operation. The arguments needed are: 

  • Biased variance of - statusCode column

The result is a column showing the biased variance of the status codes sent every 5 minutes.


26. Variance (unbiased)

Returns the unbiased variance of the values of the selected column.

How does it work?

This operation needs one argument:

Argument Format
Unbiased variance of (mandatory) Integer number, number

Example

In this example, you will get the unbiased variance of the status codes sent every 5 minutes in the demo.ecommerce.data table. To do it, click the Group button and select 5 minutes. Then, click Aggregation and select the Variance (unbiased) operation. The arguments needed are: 

  • Unbiased variance of - statusCode column


The result is a column showing the unbiased variance of the status codes sent every 5 minutes.


Have we answered your question?

If not, please contact our technical support team via email by clicking the button below.

CONTACT US