Lookup Tables

The lookup tables feature allows you to look up, map, and translate message field values into new values and write them into new message fields or overwrite existing fields. A simple example is to use a static CSV file to map IP addresses to host names.

Components

The lookup table system consists of four components:

  • Data adapters
  • Caches
  • Lookup tables
  • Lookup results

Data Adapters

Data adapters perform the actual lookup for a value. They might read from a CSV file, connect to a database, or execute HTTP requests to receive the lookup result.

Data adapter implementations are pluggable and new ones can be added through plugins.

Warning: The CSV file adapter reads the entire contents of the file into heap memory. Ensure that you size the heap accordingly.

Caches

The caches are responsible for caching the lookup results to improve the lookup performance and/or to avoid overloading databases and APIs. They are separate entities to make it possible to reuse a cache implementation for different data adapters. That way, the data adapters do not have to care about caching and do not have to implement it on their own.
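The separation between cache and data adapter can be sketched in a few lines of Python. This is only an illustration, not Graylog's implementation: the class `ExpiringCache`, the function `csv_like_adapter`, and the 60-second expiry are all hypothetical. The point is that the cache wraps any lookup function, so the adapter needs no caching logic of its own.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class ExpiringCache:
    """Hypothetical in-memory cache that can wrap any adapter lookup function."""

    def __init__(
        self,
        adapter_lookup: Callable[[str], Optional[str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._adapter_lookup = adapter_lookup
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (time the entry was stored, cached value)
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: the adapter is not asked again
        value = self._adapter_lookup(key)
        self._entries[key] = (now, value)  # simplification: empty results are cached too
        return value


# Stand-in for a data adapter; it records every call so the effect is visible.
calls: List[str] = []


def csv_like_adapter(key: str) -> Optional[str]:
    calls.append(key)
    return {"10.0.0.1": "web-01"}.get(key)


cache = ExpiringCache(csv_like_adapter, ttl_seconds=60)
first = cache.get("10.0.0.1")
second = cache.get("10.0.0.1")  # served from the cache
```

After the two `get` calls, `calls` contains a single entry, because the second lookup never reached the adapter.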

Cache implementations are pluggable and new ones can be added through plugins.

Hint: The CSV file adapter refreshes its contents within each check interval if the file was changed. If the cache was purged but the check interval has not elapsed, lookups might return expired values.

Lookup Tables

The lookup table component ties a data adapter instance and a cache instance together. It is needed to enable the usage of the lookup table in extractors, converters, pipeline functions and decorators.

Lookup Results

The lookup result is returned by a lookup table through the data adapter and can contain two types of data: a single value and a multi value.

The single value can be a string, number or boolean and will be used in extractors, converters, decorators and pipeline rules. In our CSV example of looking up host names for IP addresses, this would be the host name string.

A multi value is a map or dictionary-like data structure and can contain several different values. This is useful if the data adapter can provide multiple values for a key. A good example for this would be the geo-ip data adapter which does not only provide the latitude and longitude for an IP address, but also information about the city and country of the location. Currently, the multi value can only be used in a pipeline rule when using the lookup() pipeline function.

Example 1: Output for a CSV data adapter including a single value and a multi value.

Example 2: Output for the geo-ip data adapter including a single value and a multi value.
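As a rough mental model, a lookup result can be pictured as a small record with one scalar and one map. The sketch below is hypothetical Python, not Graylog's data model: the `LookupResult` class, the Python functions named after `lookup_value()` and `lookup()`, the map keys, and all values are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union

Scalar = Union[str, int, float, bool]


@dataclass(frozen=True)
class LookupResult:
    """Hypothetical container mirroring the two kinds of result data."""

    single_value: Optional[Scalar] = None
    multi_value: Dict[str, object] = field(default_factory=dict)


def lookup_value(result: LookupResult) -> Optional[Scalar]:
    """Python stand-in for a function that returns only the single value."""
    return result.single_value


def lookup(result: LookupResult) -> Dict[str, object]:
    """Python stand-in for a function that returns the multi value map."""
    return result.multi_value


# CSV-style result: one host name. The "value" key is an assumption.
csv_result = LookupResult(single_value="web-01", multi_value={"value": "web-01"})

# Geo-style result: coordinates plus extra location fields (made-up data).
geo_result = LookupResult(
    single_value="37.751,-97.822",
    multi_value={
        "latitude": 37.751,
        "longitude": -97.822,
        "city": "Example City",
        "country": "Example Country",
    },
)

host_name = lookup_value(csv_result)
city = lookup(geo_result)["city"]
```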

Setup

Lookup tables can be configured on the System > Lookup Tables page.

You need to create at least one data adapter and one cache before you can create your first lookup table. The following example setup creates a lookup table with a CSV file data adapter and an in-memory cache.

1. Create Data Adapter

Navigate to System > Lookup Tables and click the Data Adapters button in the top right corner. Then select a data adapter type.

Every data adapter form includes data adapter specific documentation that helps you to configure it correctly.

The Time to Live (TTL) Option for MongoDB Data Adapters

Lookup tables and watchlists accumulate entries such as policy violations or threat intelligence feeds that may become irrelevant over time. In such cases, best practice is to choose the TTL (Time to Live) option on the MongoDB data adapter configuration page to automatically delete old entries. The TTL option may be selected per entry so an adapter can have multiple entries each with their own TTL or no TTL.

A sample use case for the TTL option is Indicators of Compromise (IoC) lists. These lists are made up of forensic evidence of security threats such as dubious logins or unusual traffic going in or out of a network. IoCs appear and disappear, and indicators that were critical in the past may lose their importance over time. Since IoC lists are constantly updated, selecting the TTL option is beneficial in this case.

Another scenario may be an account lockout situation. Providing a list of users who have updated their passwords in the past 72 hours would be beneficial in this case. If these users get locked out of their account, the presence of their username on a lookup table would enable analysts to treat these lockouts differently than users who have not changed their passwords recently.
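The per-entry behaviour of the TTL option can be illustrated with a small Python sketch. It is not how the MongoDB data adapter is implemented: the `TtlEntries` class, its method names, and the sample keys are hypothetical. It only shows the idea that each entry carries its own optional expiry time.

```python
from typing import Dict, List, Optional, Tuple


class TtlEntries:
    """Hypothetical key/value store where each entry has its own optional TTL."""

    def __init__(self) -> None:
        # key -> (value, absolute expiry time in seconds, or None for "never")
        self._entries: Dict[str, Tuple[str, Optional[float]]] = {}

    def put(self, key: str, value: str, now: float,
            ttl_seconds: Optional[float] = None) -> None:
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._entries[key] = (value, expires_at)

    def purge_expired(self, now: float) -> List[str]:
        """Delete every entry whose expiry time has passed and return its key."""
        expired = [
            key
            for key, (_, expires_at) in self._entries.items()
            if expires_at is not None and expires_at <= now
        ]
        for key in expired:
            del self._entries[key]
        return expired

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        return None if entry is None else entry[0]


store = TtlEntries()
# A user who changed their password stays listed for 72 hours.
store.put("alice", "password changed", now=0.0, ttl_seconds=72 * 3600)
# An entry without a TTL is kept until it is removed explicitly.
store.put("198.51.100.7", "known IoC", now=0.0)

removed = store.purge_expired(now=73 * 3600.0)
```

In this sketch, one hour after the 72-hour window only the `alice` entry is removed, while the entry without a TTL remains.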

CIDR Lookups in CSV File and MongoDB Data Adapters

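A CIDR address is an IP address followed by a slash and a number, for example 192.168.100.0/24. The number following the slash is the prefix length: the count of leading bits that identify the network. The remaining bits determine how many addresses are in the range, so a /24 IPv4 range contains 256 addresses.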

The CIDR lookup option can be found at the bottom of the data adapter's configuration page. If this option is not selected, the data adapters will perform exact key matching and look for an identical pattern. If the CIDR lookup option is selected, the lookups will compare the key (which must be an IP address) to the CIDR address keys of the adapter. The CIDR addresses will be searched to find a matching IP address.

An example list of key value pairs:

key: 192.168.100.0/24, value: "Finance Department subnet"
key: 192.168.101.0/24, value: "IT Department subnet"
key: 192.168.102.0/24, value: "HR Department subnet"

In this case, a lookup on the IP address 192.168.101.117 would return "IT Department subnet".
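The matching described above can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch of the idea, not the adapter's code: the function `cidr_lookup` is hypothetical, and returning the first matching range is an assumption about how overlapping ranges would be handled.

```python
import ipaddress
from typing import Dict, Optional


def cidr_lookup(table: Dict[str, str], ip: str) -> Optional[str]:
    """Return the value of the first CIDR key whose range contains ip."""
    address = ipaddress.ip_address(ip)
    for cidr, value in table.items():
        if address in ipaddress.ip_network(cidr):
            return value
    return None  # no range contains the address


table = {
    "192.168.100.0/24": "Finance Department subnet",
    "192.168.101.0/24": "IT Department subnet",
    "192.168.102.0/24": "HR Department subnet",
}

match = cidr_lookup(table, "192.168.101.117")
no_match = cidr_lookup(table, "10.0.0.1")

# A /24 prefix leaves 8 host bits, which gives 256 addresses in the range.
range_size = ipaddress.ip_network("192.168.101.0/24").num_addresses
```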

2. Create Cache

  1. Navigate to System > Lookup Tables.
  2. Go to the Caches tab and select the Create cache button in the top right corner.

  3. Choose a cache type.

Every cache form includes cache-specific documentation that helps you to configure it correctly.

Null values are cached unless the Ignore empty results box is checked.

3. Create Lookup Table

Now you can create a lookup table with the newly created data adapter and cache.

  1. Navigate to System > Lookup Tables.

  2. Click Create lookup table.

  3. Select the data adapter and cache instances in the creation form.

Default Values

Every lookup table can optionally be configured with default values, which are returned when a lookup operation does not yield a result for the key.

Usage

Lookup tables can be used with the following Graylog components.

Extractors

A lookup table extractor can be used to look up the value of a message field in a lookup table and write the result into a new field or overwrite an existing field.

Converters

When you use an extractor to get values out of a text message, you can use a lookup table converter to do a lookup on the extracted value.

Decorators

A lookup table decorator can be used to enrich messages by looking up values at search time.

Pipeline Rules

Two lookup functions can be used in a pipeline rule: lookup() and lookup_value(). The first returns the multi value data of the lookup result, and the second returns the single value.

Built-in Data Adapters

The following Data Adapters are shipped with Graylog by default. Detailed on-screen documentation for each is available on the Add/Edit Data Adapter page in Graylog.

CSV File Adapter

Performs key/value lookups from a CSV file.
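Conceptually, a key/value lookup from a CSV file amounts to loading the rows into a map and querying it by key. The Python sketch below illustrates this with an inline sample; the function `load_csv_table`, the column names `ipaddr` and `hostname`, and the sample rows are assumptions made for the example. As in the adapter, the whole table is held in memory.

```python
import csv
import io
from typing import Dict


def load_csv_table(text: str, key_column: str, value_column: str) -> Dict[str, str]:
    """Read CSV text with a header row into an in-memory key/value map."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[key_column]: row[value_column] for row in reader}


# Inline sample standing in for a CSV file on disk.
sample = (
    '"ipaddr","hostname"\n'
    '"127.0.0.1","localhost"\n'
    '"10.0.0.1","server1"\n'
)

table = load_csv_table(sample, key_column="ipaddr", value_column="hostname")
host_name = table.get("10.0.0.1")
missing = table.get("10.0.0.99")  # None: the key is not in the file
```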

DNS Lookup Adapter

Provides the ability to perform the following types of DNS resolutions:

  • Resolve hostname to IPv4 address (A records)
  • Resolve hostname to IPv6 address (AAAA records)
  • Resolve hostname to IPv4 and IPv6 address (A and AAAA records)
  • Reverse lookup (PTR record)
  • Text lookup (TXT records)

DSV File from HTTP Adapter

Performs key/value lookups from a DSV file. This adapter supports more advanced customization than the CSV File adapter (such as a custom delimiter and customizable key/value columns).

HTTP JSONPath Adapter

Executes HTTP GET requests to look up a key and parses the result based on configured JSONPath expressions.

Geo IP - MaxMind Databases

Provides the ability to extract geolocation information of IP addresses from MaxMind ASN, Country and City databases.

Enterprise Data Adapters

Graylog Enterprise includes an additional lookup table data adapter.

MongoDB

This data adapter stores its keys and values in the Graylog configuration database. The entries can be altered via pipeline functions and REST API calls, so you can change the result of a lookup table call based on incoming logs or an external source.

Alter from REST API

For details on how to interact with the MongoDB data adapter, see the API browser at api/api-browser/#!/Plugins/MongoDBDataAdapter. It lists the calls for adding, updating, listing, and deleting key value pairs of the data adapter.

Here is an example of how to add a key to a MongoDB adapter with an API token:

curl -u d2tirtpunshmgdsbq5k3j0g4ku230ggruhsqpa0iu7mj1lia55i:token \
    -H 'X-Requested-By: cli' -H 'Accept: application/json' \
    -X POST 'http://127.0.0.1:9000/api/plugins/org.graylog.plugins.lookup/lookup/adapters/mongodb/mongodb-data-name' \
    -H 'Content-Type: application/json' \
    --data-binary $'{\n"key": "myIP",\n"values": ["12.34.42.99"],\n"data_adapter_id": "5e578606cdda4779dd9f2611"\n}'
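The same request can be built from a script. The following Python sketch mirrors the curl call using only the standard library; the function `build_add_key_request` and the placeholder token are hypothetical, while the URL path, headers, and body fields are taken from the curl example. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request
from typing import List


def build_add_key_request(base_url: str, adapter_name: str, api_token: str,
                          key: str, values: List[str],
                          data_adapter_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that adds a key to the adapter."""
    url = (
        f"{base_url}/api/plugins/org.graylog.plugins.lookup"
        f"/lookup/adapters/mongodb/{adapter_name}"
    )
    body = json.dumps(
        {"key": key, "values": values, "data_adapter_id": data_adapter_id}
    ).encode("utf-8")
    # As with "curl -u <token>:token": the API token is the user name and
    # the literal string "token" is the password.
    credentials = base64.b64encode(f"{api_token}:token".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {credentials}",
        "X-Requested-By": "cli",
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_add_key_request(
    base_url="http://127.0.0.1:9000",
    adapter_name="mongodb-data-name",
    api_token="YOUR_API_TOKEN",  # placeholder, not a real token
    key="myIP",
    values=["12.34.42.99"],
    data_adapter_id="5e578606cdda4779dd9f2611",
)
# To actually send it against a running server: urllib.request.urlopen(request)
```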

Alter from Pipeline Function

A reference of the pipeline functions handling the lookup table values can be found in the pipeline rules functions section of the documentation.

Alter from GUI

The values of the MongoDB adapter can also be altered directly via the GUI.

Warning: To add multiple values for one key, you need to separate the values by new lines.