AWS S3 Input

The following article exclusively pertains to a Graylog Enterprise feature or functionality. To learn more about obtaining an Enterprise license, please contact the Graylog Sales team.

The AWS S3 input enables Graylog to ingest log files published to an Amazon S3 bucket automatically. It uses Amazon Simple Queue Service (SQS) to receive S3 bucket notifications, allowing Graylog to detect and read new log files as they become available.
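To make the notification flow concrete, the following is a minimal sketch of how a consumer could extract the bucket and object key from an S3 event notification delivered through SQS. It is not Graylog's implementation. The function name `extract_s3_objects` and the sample bucket and key are hypothetical. The handling of an SNS envelope assumes the bucket publishes to an SNS topic without raw message delivery.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for newly created objects in a notification."""
    payload = json.loads(sqs_body)
    # When an SNS topic sits between the bucket and the queue, the S3 event
    # is wrapped in an envelope whose "Message" field is a JSON string.
    if "Records" not in payload and "Message" in payload:
        payload = json.loads(payload["Message"])
    objects = []
    for record in payload.get("Records", []):
        # Only object-creation events announce a new file to read.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical notification body, trimmed to the fields used above.
sample = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-log-bucket"},
            "object": {"key": "logs/2024/app+server.json"},
        },
    }]
})
print(extract_s3_objects(sample))
```

Messages without a `Records` list, such as the test event that S3 sends when notifications are first configured, yield an empty result in this sketch.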

Prerequisites

Before proceeding, ensure that the following prerequisites are met:

  • You must have an Amazon Web Services (AWS) subscription.

  • You must set up an S3 bucket to which logs are written.

Supported Log Types

This input supports collecting the following log types:

  • Comma-Separated Values (CSV)

  • Graylog Extended Log Format (GELF)

  • Newline-delimited JSON (one message per line)

  • JSON root array (multiple messages in a single JSON array)
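The difference between the two JSON layouts listed above can be illustrated with a short sketch. The helper `parse_json_messages` is hypothetical and is not how Graylog selects a parser; in Graylog the format is chosen explicitly through the input configuration. The sketch only shows that both layouts can carry the same messages.

```python
import json


def parse_json_messages(text: str) -> list[dict]:
    """Parse either a JSON root array or newline-delimited JSON into messages."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        # JSON root array: the whole file is one array of message objects.
        return json.loads(stripped)
    # Newline-delimited JSON: each non-empty line is one complete message.
    return [json.loads(line) for line in text.splitlines() if line.strip()]


ndjson = '{"message": "first"}\n{"message": "second"}\n'
root_array = '[{"message": "first"}, {"message": "second"}]'

# Both layouts produce the same two messages.
print(parse_json_messages(ndjson) == parse_json_messages(root_array))
```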

Required Third-Party Setup

To enable integration, complete the following required setup with your third-party service:

  1. Configure a bucket for notifications (SNS topic or SQS queue).

  2. Create an SQS queue from which Graylog can receive notifications of new files to read. You can accept most of the default options, but you must define an access policy that allows the S3 bucket to publish notifications to the queue.

  3. Create an Identity and Access Management (IAM) role that Graylog uses to connect to AWS. The role needs sufficient permissions to read the target SQS queue and to read the files from S3. The input requires the following Amazon permissions to operate:

    • sqs:ReceiveMessage

    • sqs:DeleteMessage

    • s3:GetObject

  4. Enable and configure event notifications using the Amazon S3 console.
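The two policies mentioned in steps 2 and 3 could look roughly like the following sketch, written as Python dictionaries that serialize to AWS policy JSON. The ARNs, account ID, and variable names are hypothetical placeholders. The `sqs:SendMessage` action and the `aws:SourceArn` condition in the queue policy follow the usual AWS pattern for S3-to-SQS notifications and are an assumption here, not something this article specifies.

```python
import json

# Hypothetical ARNs; replace them with the values from your own AWS account.
QUEUE_ARN = "arn:aws:sqs:us-east-1:111122223333:example-graylog-queue"
BUCKET_ARN = "arn:aws:s3:::example-log-bucket"

# Access policy attached to the SQS queue (step 2): allows the S3 bucket
# to publish notifications to the queue.
queue_access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "s3.amazonaws.com"},
        "Action": "sqs:SendMessage",
        "Resource": QUEUE_ARN,
        "Condition": {"ArnLike": {"aws:SourceArn": BUCKET_ARN}},
    }],
}

# Permissions policy for the IAM role that Graylog uses (step 3): grants
# exactly the three permissions listed above.
graylog_input_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
            "Resource": QUEUE_ARN,
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            # Object-level permissions apply to keys inside the bucket.
            "Resource": f"{BUCKET_ARN}/*",
        },
    ],
}

print(json.dumps(graylog_input_policy, indent=2))
```

Scoping each statement to a single queue and bucket, rather than using a wildcard resource, keeps the role limited to what the input needs.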

Required Configuration Values

In your third-party configuration, make note of the following values that are required when configuring the input in Graylog:

  • AWS access key

  • AWS secret key

  • SQS Queue Name

Input Type

This input is a pull input type. See Inputs to learn about input types.
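As an illustration of what a pull cycle involves, the sketch below receives pending notifications from the queue, fetches each referenced object, and then deletes the handled message. It is a simplified model and not Graylog's implementation. The function name `poll_once` is hypothetical, and the `sqs` and `s3` arguments are assumed to be boto3-style clients such as those returned by `boto3.client("sqs")` and `boto3.client("s3")`.

```python
import json
from urllib.parse import unquote_plus


def poll_once(sqs, s3, queue_url: str) -> list[bytes]:
    """Receive pending notifications, fetch each new object, then acknowledge."""
    files = []
    # Requires sqs:ReceiveMessage. Long polling waits up to 20 seconds.
    response = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for message in response.get("Messages", []):
        event = json.loads(message["Body"])
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            # Object keys are URL-encoded in S3 event notifications.
            key = unquote_plus(record["s3"]["object"]["key"])
            # Requires s3:GetObject.
            obj = s3.get_object(Bucket=bucket, Key=key)
            files.append(obj["Body"].read())
        # Requires sqs:DeleteMessage. Deleting only after the objects were
        # read means a failed fetch leaves the message in the queue for retry.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
    return files
```

With boto3, the queue URL for a given queue name can be looked up with `sqs.get_queue_url(QueueName=...)["QueueUrl"]`.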

Input Configuration

Follow the input setup instructions. During setup of this input, you can configure the following options:

Configuration Option Description

Title

Provide a unique name for your new input.

AWS SQS Region

Select the AWS region the queue is in.

AWS S3 Region

Select the AWS region where the S3 bucket storing the log files resides.

SQS Queue Name

Provide the name of the SQS queue you created to receive the S3 bucket notifications.

AWS access key (optional)

The access key ID created for the AWS Identity and Access Management (IAM) user. The Credential settings retrieval order documentation provides more information.

AWS secret key (optional)

The secret access key that pairs with the access key ID, for the IAM user with permission to read the SQS queue and the S3 bucket.

AWS assume role (ARN) (optional)

The Amazon Resource Name (ARN) of an IAM role for the input to assume. This setting is often used for cross-account access.

Content Type

The format of the logs present in the S3 bucket.

  • CSV: Comma-separated values. Also supports newline values within the individual CSV fields.

Compression Type

The compression type of the log files. Supported options are GZIP and None. Select GZIP if the log files are written in compressed form.

Polling Interval

Determines how often (in minutes) Graylog checks for new data in the S3 bucket. The smallest allowable interval is 5 minutes.

Enable Throttling

If enabled, Graylog stops reading new messages from this input whenever the system falls behind on message processing, and resumes once it has caught up with its message load.
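The GZIP compression option and the CSV content type described above can be illustrated with a short sketch that decompresses a log file and parses a CSV field containing a newline. The function `read_csv_rows` and the sample data are hypothetical, and the sketch uses Python's standard library rather than anything Graylog-specific.

```python
import csv
import gzip
import io


def read_csv_rows(data: bytes, compression: str = "None") -> list[list[str]]:
    """Decompress a log file if needed and parse it as CSV."""
    if compression == "GZIP":
        data = gzip.decompress(data)
    text = data.decode("utf-8")
    # newline="" leaves line breaks untouched so that the csv module can
    # handle newlines that appear inside quoted fields.
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical log file: the second field of the data row spans two lines.
raw = b'timestamp,message\n2024-01-01T00:00:00Z,"first line\nsecond line"\n'
rows = read_csv_rows(gzip.compress(raw), compression="GZIP")
print(rows)
```

The quoted field is returned as one value with the line break preserved, so the file yields a header row and a single data row.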

Next Steps

After you complete input setup, visit Input Diagnosis for testing and validation of the new input. Use this functionality to help troubleshoot any connection issues.

Further Reading

Explore the following additional resources and recommended readings to expand your knowledge on related topics: