Pipeline Functions

Functions serve as the foundational components of pipeline rules. They are pre-defined methods designed to perform specific actions on log messages during processing. Each function can accept various parameters and produce outputs that help shape how messages are handled and processed by Graylog.

For a full list of all supported Graylog functions, their descriptions, and sample syntax, see Functions Reference.

Functions Syntax

Functions are written in Java and are pluggable, allowing Graylog’s pipeline processing capabilities to be easily extended.

Conceptually, a function receives parameters and the current message context, and returns a value. The data types of its return value and parameters determine where it can be used in a rule. Graylog ensures that rules are sound from a data-type perspective.
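As a rough illustration of how return types decide where a function fits, the sketch below uses has_field, which returns a Boolean, as the rule condition, and to_string, which returns a string, as an argument to set_field. The field names source_ip and source_ip_text are hypothetical.

```
rule "data type sketch"
when
    // has_field returns a Boolean, so it can be used as a condition
    has_field("source_ip")
then
    // to_string returns a string, which is passed on as the value for set_field
    // (source_ip and source_ip_text are hypothetical field names)
    set_field("source_ip_text", to_string($message.source_ip));
end
```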

Function parameters can be passed either as named pairs or by position, as long as optional parameters are declared last.

Functions Syntax Example

Let’s look at an example rule to illustrate these properties:

rule "function howto"
when
    has_field("transaction_date")
then
    // the following date format assumes there's no time zone in the string
    let new_date = parse_date(to_string($message.transaction_date), "yyyy-MM-dd HH:mm:ss");
    set_field("transaction_year", new_date.year);
end

In this example, the rule checks whether the current message contains the field transaction_date and then, after converting it to a string, tries to parse it according to the format string yyyy-MM-dd HH:mm:ss. For example, the string 2016-03-05 14:45:02 would match this format. The parse_date function returns a DateTime object from the Java Joda-Time library, giving you easier access to the date’s components.

We then add the transaction’s year to the message as a new field, transaction_year. Note that we did not specify a time zone for our date, but Graylog still had to select one. Graylog never relies on the local time of your server because that would make it difficult to figure out why date handling yielded a particular result.

Graylog knows which time zone to use because parse_date actually takes four parameters rather than the two we have given it in this example. The other two parameters are a string called timezone (with a default value of UTC) and a string called locale (with the default locale of the system running Graylog). Both parameters are optional.

Now, assume you have another message field called transaction_timezone, which is sent by the application and contains the time zone ID in which the transaction was completed.

rule "function howto"
when
    has_field("transaction_date") && has_field("transaction_timezone")
then
    // the following date format assumes there's no time zone in the string
    let new_date = parse_date(
            to_string($message.transaction_date),
            "yyyy-MM-dd HH:mm:ss",
            to_string($message.transaction_timezone)
        );
    set_field("transaction_year", new_date.year);
end

Now Graylog passes the string value of the message’s transaction_timezone field to the parse_date function as its timezone parameter.

With only one or two optional parameters, it is easy to simply omit them from the end of the function call. However, if there are many optional parameters, or so many parameters overall that it becomes difficult to keep track of which positions correspond to which parameters, you can use the named parameter variant of function calls instead. In this mode the order of the parameters does not matter, but all required ones still need to be present.

In this case the alternative version of calling parse_date would look like this:

rule "function howto"
when
    has_field("transaction_date") && has_field("transaction_timezone")
then
    // the following date format assumes there's no time zone in the string
    let new_date = parse_date(
            value: to_string($message.transaction_date),
            pattern: "yyyy-MM-dd HH:mm:ss",
            timezone: to_string($message.transaction_timezone)
        );
    set_field("transaction_year", new_date.year);
end
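
Because named parameters are matched by name rather than by position, the same call can also be written in a different order, and a further optional parameter such as locale can be supplied without counting positions. The following is a sketch only; the locale value "en" is an assumed placeholder that the original example does not specify.

```
rule "function howto"
when
    has_field("transaction_date") && has_field("transaction_timezone")
then
    // named parameters may appear in any order;
    // the required ones (value, pattern) must still be present
    let new_date = parse_date(
            timezone: to_string($message.transaction_timezone),
            // assumed placeholder locale, not part of the original example
            locale: "en",
            pattern: "yyyy-MM-dd HH:mm:ss",
            value: to_string($message.transaction_date)
        );
    set_field("transaction_year", new_date.year);
end
```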

Java Data Types

Pipeline rules can theoretically use some Java data types. This is limited to those types that are queried using the get function.

For example, the function .millis can potentially be used in Graylog pipeline rules for DateTime and Period objects.

rule "time diff calculator millis"
when
    true
then
    let time_diff = to_long((parse_date(
            value: to_string(now(timezone: "Europe/Berlin")),
            pattern: "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
            locale: "de_DE").millis)) -
        to_long(parse_date(
            value: to_string($message.timestamp),
            pattern: "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
            locale: "de_DE").millis);
    set_field("scan_age_millis", time_diff);
end

Warning: Graylog does not support the use of any functions that are not officially documented. Please exercise caution if you choose to test any unsupported function data types.

Function Types

Built-in Graylog functions can be categorized by the following function types. For a full list of all functions and their descriptions, see Functions Reference.

Anonymization

Anonymization functions obfuscate sensitive data from a dataset or log message.

Asset Enrichment

Asset Enrichment functions enhance, retrieve, or remove asset-related log data. See Asset Enrichment for more information on this Graylog Security feature.

Boolean

Boolean data is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a condition evaluates to true or false. Boolean functions determine Boolean values or operators.

Conversion

Conversion functions are used to convert a value from one format to another.
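
For instance, a field that arrives as text can be converted to a number before it is stored. A minimal sketch using to_long, assuming a hypothetical field named response_bytes:

```
rule "conversion sketch"
when
    has_field("response_bytes")
then
    // to_long converts the value to a long;
    // response_bytes is a hypothetical field name
    set_field("response_bytes", to_long($message.response_bytes));
end
```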

Date/Time

Date/time functions perform an action or calculation on a date and time value.

Debug

Debug functions are used to determine the state of your program at any point of execution.
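
A minimal sketch using the debug function, which prints the value it is given to the Graylog server log. The field name transaction_date is borrowed from the earlier example and is only illustrative:

```
rule "debug sketch"
when
    has_field("transaction_date")
then
    // print the raw field value to help troubleshoot the rule
    debug($message.transaction_date);
end
```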

Encoding

Encoding functions enable you to decode and convert strings.

List

List functions create or retrieve a collection that can be manipulated for your analysis.

Lookup

Lookup functions enable you to search a database for a value then return additional information from the same record.
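
As a sketch of how a lookup might be used for enrichment, the rule below calls lookup_value with a key taken from the message. The lookup table name user-directory and the field names user_id and user_name are hypothetical; the table would need to exist in your Graylog instance.

```
rule "lookup sketch"
when
    has_field("user_id")
then
    // "user-directory" is a hypothetical lookup table name
    let user_name = lookup_value("user-directory", to_string($message.user_id));
    set_field("user_name", user_name);
end
```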

Map

Map functions apply a given action to each or all elements in a collection.

Message Handling

Message Handling functions define what is to be done in response to a message. They are used for various enrichment, removal, retrieval, and routing operations for log data when building pipeline rules.
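
A short sketch of two message handling operations, rename_field and remove_field. The field names src, source_ip, and temp_value are hypothetical:

```
rule "message handling sketch"
when
    has_field("src")
then
    // rename the hypothetical field src to source_ip
    rename_field("src", "source_ip");
    // remove a hypothetical scratch field that is no longer needed
    remove_field("temp_value");
end
```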

Pattern Matching

Pattern matching functions specify patterns to which some data should conform and deconstruct the data according to those patterns.
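
A minimal sketch using the regex function to extract part of a message. The log layout (user= followed by lowercase letters) and the field name user_name are assumptions for illustration, and the sketch does not handle the case where the pattern does not match:

```
rule "pattern matching sketch"
when
    has_field("message")
then
    // capture the text after "user=" (a hypothetical log layout)
    let m = regex("user=([a-z]+)", to_string($message.message));
    // unnamed capture groups are addressed by number as strings, starting at "0"
    set_field("user_name", m["0"]);
end
```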

String

String functions are used to manipulate a string or query information about a string.
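
A minimal sketch using lowercase to normalize a value, assuming a hypothetical field named hostname:

```
rule "string sketch"
when
    has_field("hostname")
then
    // normalize the hypothetical hostname field to lowercase
    set_field("hostname", lowercase(to_string($message.hostname)));
end
```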

Watchlist

Watchlist functions perform actions that allow you to retrieve or modify watchlists.