Aggregation & Correlation
Common Syntax
Both aggregation and correlation statements have the following generic structure:
<function-command> [<function-specific-params>] <timespan-param> <group-by-clause> <where-clause>
Some example correlation statements:
count timespan=5m group_by field1 where field3 > 100
distinct_count(myfield) timespan=30s
temporal(ordered=true) [ padasRule="internal_error" || padasRule="new_network_connection" ] timespan=1m group_by internal_ip, remote_ip
Common Parameters for Functions
All correlation statements evaluate streaming events over a given time window (defined via the `timespan` parameter) and optionally group them by selected fields (defined via the `group_by` clause). For counting aggregation/correlation statements, it is also possible to limit the results by providing a query expression (defined via the `where` clause).
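The interaction of the three common parameters can be illustrated with a small Python sketch. It is not the PDL engine: it assumes fixed, non-overlapping windows, a hypothetical numeric `ts` field (seconds) on each event, and that the `where` expression filters events before they are counted. It mirrors the statement `count timespan=5m group_by field1 where field3 > 100`.

```python
from collections import Counter

def windowed_count(events, timespan_s, group_by=(), where=None):
    """Count events per (window start, group values) over fixed windows."""
    counts = Counter()
    for ev in events:
        # Assumption: `where` is applied to each event before counting.
        if where is not None and not where(ev):
            continue
        # Assumption: windows are aligned to multiples of the timespan.
        window_start = ev["ts"] - ev["ts"] % timespan_s
        key = (window_start,) + tuple(ev.get(field) for field in group_by)
        counts[key] += 1
    return dict(counts)

# Hypothetical events; `ts` is seconds since an arbitrary start.
events = [
    {"ts": 0, "field1": "a", "field3": 150},
    {"ts": 120, "field1": "a", "field3": 200},
    {"ts": 200, "field1": "b", "field3": 50},
    {"ts": 310, "field1": "a", "field3": 500},
]

result = windowed_count(
    events,
    timespan_s=300,  # timespan=5m
    group_by=("field1",),
    where=lambda ev: ev["field3"] > 100,
)
print(result)  # {(0, 'a'): 2, (300, 'a'): 1}
```

Each key in the result is a window start followed by the `group_by` values, so the same group can appear once per window.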
Argument Order
Correlation statements must start with one of the available functions, followed by function-specific parameters (if any). The order of the common arguments and their descriptions are given in the following table.
Order | Keyword | Required | Description | Example |
---|---|---|---|---|
1 | timespan | Yes | Specifies the time window over which the function is evaluated. The value must be an integer followed by one of the following identifiers: s for second(s), m for minute(s), h for hour(s), d for day(s). | timespan=5m, timespan=1h |
2 | group_by | No | Group correlation results according to specified field(s). | group_by field1, field2 |
3 | where | No | Filter events according to specified query expression. | where field1 > 100 |
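As a reading aid for the `timespan` format, the following Python sketch converts a value such as `5m` into seconds. The function name `parse_timespan` is hypothetical and not part of PDL; it only encodes the rule from the table (an integer followed by `s`, `m`, `h`, or `d`).

```python
import re

# Seconds per unit identifier, as listed in the table above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_timespan(value):
    """Convert a timespan value such as '5m' into a number of seconds."""
    match = re.fullmatch(r"([0-9]+)([smhd])", value)
    if match is None:
        raise ValueError(f"invalid timespan: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]

print(parse_timespan("5m"))   # 300
print(parse_timespan("1h"))   # 3600
print(parse_timespan("30s"))  # 30
```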
Aggregation Functions
PDL provides several built-in aggregation functions that can be used to analyze event data over a specified time window. These functions can be broadly categorized into counting and statistical aggregations:
Counting Aggregations
count
: Counts the total number of events
count(field)
: Counts events where the specified field exists
distinct_count(field) or dc(field)
: Counts unique values of the specified field
Statistical Aggregations
avg(field)
: Calculates the average value of the specified field
median(field)
: Calculates the median value of the specified field
min(field)
: Finds the minimum value of the specified field
max(field)
: Finds the maximum value of the specified field
variance(field)
: Calculates the variance of the specified field
stddev(field)
: Calculates the standard deviation of the specified field
All aggregation functions support the common parameters (timespan, group_by, and where clauses) as described above.
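To make the statistical aggregations concrete, here is a minimal Python sketch that computes the same quantities for the values of one field within a single window. It uses the standard `statistics` module and assumes population variance and standard deviation; the text above does not say whether PDL uses the population or the sample form, so treat that choice as an assumption.

```python
import statistics

def aggregate(values):
    """Compute the statistical aggregations for one field in one window."""
    return {
        "avg": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        # Assumption: population variance / standard deviation.
        "variance": statistics.pvariance(values),
        "stddev": statistics.pstdev(values),
    }

# Hypothetical field values collected in one time window.
stats = aggregate([2, 4, 4, 4, 5, 5, 7, 9])
print(stats["avg"], stats["median"], stats["variance"], stats["stddev"])
```

With `group_by`, the same computation would be applied once per group rather than once per window.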
Aggregation Function Details
For details please visit Aggregation Functions
Correlation Functions
Temporal
Description
A temporal correlation statement checks for all events matching the expression array within the defined time frame. If the boolean parameter `ordered` is set to `true`, all events are expected to occur in the given order. The result may also contain the count of events for each group specified by `group_by` separately.
Syntax and Functions
... | temporal(<ordered-param>) [ <expression> || <expression> || ... ] <timespan-param> <group-by-clause> <where-clause>
Ordered parameter: `ordered` is assigned either `true` or `false` (e.g. `ordered=true`) to specify whether the events are expected to match the order of the expression array.
Expression array: The array consists of one or more expressions separated by the double-pipe `||` operator (e.g. `[ field1="valu*" || field3 < 100 AND field4=false ]`).
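The effect of the `ordered` parameter can be sketched in Python as follows. This is an illustration under stated assumptions, not the PDL implementation: events carry a hypothetical `ts` timestamp field, each expression is modelled as a Python predicate, the list passed in already holds the events of one time window and one `group_by` group, and ordered matching is done as a greedy in-order scan.

```python
def temporal_match(events, expressions, ordered=False):
    """Return True if every expression is matched by an event in the window."""
    if not ordered:
        # Unordered: each expression only needs some matching event.
        return all(any(expr(ev) for ev in events) for expr in expressions)
    # Ordered: expressions must be matched one after another in time order.
    next_expr = 0
    for ev in sorted(events, key=lambda e: e["ts"]):
        if next_expr < len(expressions) and expressions[next_expr](ev):
            next_expr += 1
    return next_expr == len(expressions)

# Hypothetical window contents: the connection precedes the error.
window = [
    {"ts": 1, "padasRule": "new_network_connection"},
    {"ts": 2, "padasRule": "internal_error"},
]
# Mirrors [ padasRule="internal_error" || padasRule="new_network_connection" ]
expressions = [
    lambda ev: ev["padasRule"] == "internal_error",
    lambda ev: ev["padasRule"] == "new_network_connection",
]

unordered_hit = temporal_match(window, expressions, ordered=False)
ordered_hit = temporal_match(window, expressions, ordered=True)
print(unordered_hit, ordered_hit)  # True False
```

Both rules fire inside the window, so the unordered check succeeds, while the ordered check fails because the events arrive in the opposite order to the expression array.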
Temporal Examples
The following table provides examples of available functionality based on the following JSON value:
JSON Input | Expression | Expected Output |
---|---|---|