Create an External Data Lake Connector

Graylog's external Data Lake feature lets you connect to existing third-party data lakes by defining connectors. With a connector, you can preview and retrieve log data in much the same way as you would from an internal data lake.

Warning: Retrieving log data from an external data lake counts against license use. Previewing logs in place does not count against your license.

For complete information about internal and external data lakes, see Data Lakes.

Prerequisites

Before proceeding, ensure that the following prerequisites are met:

  • You must be a Graylog administrator to set up and manage a data lake connector.

  • To use an external data lake, you must have an existing third-party data lake source and appropriate access credentials.

Hint: Currently, only Amazon Security Lake is supported for external data lakes.

The Data Retrieval Stream

Each connector you create is associated with a system-managed stream in which data retrieved through that connector is stored. The stream is created automatically when you create the connector, and you must name it at that time. You cannot add stream rules, pipeline rules, routing destinations, or filter rules to a stream associated with a connector.

Users without the Admin role must be granted permission to view this stream in order to select this connector on the Data Lake > Preview page. For details about roles and sharing in Graylog, see Permission Management.

Create a Connector

To create a third-party data lake connector:

  1. Navigate to Data Lake > External Lake Connectors.

  2. Select Add Connector.

  3. Enter all required information:

    Data Lake Name

    Enter a unique and descriptive name for this data lake.

    S3 Output Bucket

    Enter the path to the S3 bucket where data lake query results are stored temporarily.

    AWS Region

    Select the AWS region where this service is running. To connect log data from multiple regions, either create a separate connector for each region or create a rollup region in your Amazon Security Lake.

    AWS Access Key

    Enter your AWS access key. This value should be a 20-character alphanumeric string that starts with the letters AK.

    AWS Secret Key

    Enter your AWS secret key, which is typically a 40-character, Base64-encoded string.

    AWS IAM Role (Optional)

    Enter an AWS IAM Role to be assumed for the connector.

    Stream Title

    Enter a descriptive name for the stream associated with this data lake. Remember, each external data lake has one associated system-managed stream, which is used for retrieved data.

    Stream Description (Optional)

    Use this field to provide a detailed description of the log data routed to this stream, if desired.

  4. Click Save.

The new connector is added to the list on the Data Lake Connectors page. You can use this list to preview data in the data lake or begin a data retrieval operation.
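Before filling in the form, it can help to sanity-check the values you plan to enter. The following is a minimal Python sketch, not part of Graylog or of any Graylog API: the `ConnectorForm` class, the `check_form` function, and the sample values are all hypothetical, and the patterns only encode the field descriptions given in the steps above. Real AWS credentials may not always match these patterns, so treat any finding as a prompt to double-check rather than as a definitive error.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Patterns derived from the field descriptions in this guide (assumptions,
# not an official AWS or Graylog validation rule).
ACCESS_KEY_RE = re.compile(r"AK[A-Za-z0-9]{18}")    # 20 characters, starts with AK
SECRET_KEY_RE = re.compile(r"[A-Za-z0-9+/=]{40}")   # 40 Base64-alphabet characters


@dataclass
class ConnectorForm:
    """Hypothetical container mirroring the Add Connector form fields."""
    data_lake_name: str
    s3_output_bucket: str
    aws_region: str
    aws_access_key: str
    aws_secret_key: str
    stream_title: str
    aws_iam_role: Optional[str] = None          # optional in the form
    stream_description: Optional[str] = None    # optional in the form


def check_form(form: ConnectorForm) -> List[str]:
    """Return a list of possible problems; an empty list means none were found."""
    problems = []
    required = {
        "Data Lake Name": form.data_lake_name,
        "S3 Output Bucket": form.s3_output_bucket,
        "AWS Region": form.aws_region,
        "AWS Access Key": form.aws_access_key,
        "AWS Secret Key": form.aws_secret_key,
        "Stream Title": form.stream_title,
    }
    for label, value in required.items():
        if not value.strip():
            problems.append(f"{label} is required")
    # Only check the format of credentials that were actually provided.
    if form.aws_access_key and not ACCESS_KEY_RE.fullmatch(form.aws_access_key):
        problems.append("AWS Access Key is not a 20-character string starting with AK")
    if form.aws_secret_key and not SECRET_KEY_RE.fullmatch(form.aws_secret_key):
        problems.append("AWS Secret Key is not a 40-character Base64-style string")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only; none of these are real.
    form = ConnectorForm(
        data_lake_name="Security Lake (example)",
        s3_output_bucket="example-bucket/query-results",
        aws_region="us-east-1",
        aws_access_key="AK" + "X" * 18,
        aws_secret_key="x" * 40,
        stream_title="Security Lake retrieved data",
    )
    for problem in check_form(form):
        print(problem)
```

The check runs entirely locally and never contacts AWS or Graylog, so passing it does not confirm that the credentials are valid or that they grant access to the data lake.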
