AWS S3 Input

The AWS S3 input collects log files that are published to an Amazon S3 bucket. As new log files are published, the input ingests them automatically. The input supports newline-delimited messages (one message per line) and JSON root array messages (multiple log messages contained in a single JSON array). The input uses Amazon Simple Queue Service (SQS) notifications from the S3 bucket to identify when new data is available for Graylog to read.

Prerequisites

  • An Amazon Web Services (AWS) subscription.

  • A defined S3 bucket to which logs may be written.

IAM Permissions

For Graylog to connect to AWS S3, you must create an Identity and Access Management (IAM) role with sufficient permissions to read the target SQS queue and to read the files from S3, and use that role for the input. The input requires the following Amazon permissions to operate:

  • s3:GetObject

  • sqs:ReceiveMessage
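
As a minimal sketch, the two permissions listed above could be expressed as an IAM policy document like the one built below. The function name `build_graylog_policy` and the example bucket and queue values are hypothetical, and scoping each action to a single bucket and queue is an assumption rather than a documented requirement.

```python
import json


def build_graylog_policy(bucket_name: str, queue_arn: str) -> dict:
    """Return an IAM policy document granting only the two listed actions.

    The resource scoping (one bucket, one queue) is an assumption made
    for illustration; adjust it to your environment.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access to objects (log files) in the bucket.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                # Permission to receive notification messages from the queue.
                "Effect": "Allow",
                "Action": "sqs:ReceiveMessage",
                "Resource": queue_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values; replace with your own bucket and queue ARN.
    policy = build_graylog_policy(
        "example-log-bucket",
        "arn:aws:sqs:us-east-1:123456789012:example-queue",
    )
    print(json.dumps(policy, indent=2))
```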

Required S3 and SQS Setup

In addition to the following steps, these articles are helpful to complete the setup:

SQS Setup

Create an SQS queue that Graylog can subscribe to in order to receive notifications of new files to read. Most default options can be accepted. However, you must define an access policy that allows the S3 bucket to publish notifications to the queue. The following is a sample policy that authorizes S3 to publish the notifications. The S3 bucket creation is described below.

{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "s3-publish-policy",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": "SQS:SendMessage",
      "Resource": "arn:aws:sqs:<region>:<account-number>:<queue-name>",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<account-number>"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:::<s3-bucket-name>"
        }
      }
    }
  ]
}
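
To avoid hand-editing the placeholders, the sample policy above could be filled in programmatically. The following is a sketch under assumptions: the helper name `build_queue_access_policy` and the example values are hypothetical. It returns the policy as a JSON string, which is the form the SQS `Policy` queue attribute takes.

```python
import json


def build_queue_access_policy(region: str, account_number: str,
                              queue_name: str, bucket_name: str) -> str:
    """Fill in the sample access policy and return it as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Id": "example-ID",
        "Statement": [
            {
                "Sid": "s3-publish-policy",
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "SQS:SendMessage",
                "Resource": f"arn:aws:sqs:{region}:{account_number}:{queue_name}",
                "Condition": {
                    # Only accept notifications from your own account ...
                    "StringEquals": {"aws:SourceAccount": account_number},
                    # ... and only from the named bucket.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket_name}"},
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_queue_access_policy(
        "us-east-1", "123456789012", "example-queue", "example-log-bucket"
    ))
```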

Set Up an S3 Bucket

You must have an S3 bucket to which log message files are written. If you do not have one, create one by following the official Amazon documentation. After you create the bucket, you must enable the Event Notifications option for it. This article provides instructions to perform the setup.

When you configure the notifications, in the "Event Types" section, choose the "All object create events" option to ensure the input is notified of all possible methods by which files are created.

In the Destination section, choose the SQS queue that was created above.

Hint: After the SQS Queue and S3 bucket are set up, the notification capability between them can be tested.
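
When testing the notification path, it can help to inspect what arrives on the queue. The sketch below parses the body of an S3 event notification message and lists the created objects. The function name `extract_created_objects` is hypothetical, and this illustrates the notification format only, not how the Graylog input is implemented. It assumes the message body is the raw S3 event JSON (queue delivered directly from S3, with no intermediate wrapper).

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def extract_created_objects(message_body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for object-created records in a message.

    Messages without a "Records" list (such as the test event S3 sends
    when notifications are first configured) yield an empty list.
    """
    event = json.loads(message_body)
    created = []
    for record in event.get("Records", []):
        # Keep only "ObjectCreated:*" events (Put, Post, Copy, ...).
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        s3 = record["s3"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(s3["object"]["key"])
        created.append((s3["bucket"]["name"], key))
    return created


if __name__ == "__main__":
    # Hand-written sample body for illustration only.
    sample = json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-log-bucket"},
                "object": {"key": "logs/app+1.log"},
            },
        }]
    })
    print(extract_created_objects(sample))
```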

Input Configuration

The following options are available when launching a new input from the Graylog Inputs tab:

Configuration Option Description
Input Name: Provide a unique name for your new input.

AWS Authentication Type: The input supports automatic authentication, which relies on the predefined authentication chain defined in the AWS software development kit (SDK). This option is commonly used when an IAM policy is defined for the instance where Graylog is running. The input automatically uses the authentication methods in the order defined in the AWS documentation.

If the Key & Secret option is selected, you can instead enter an AWS API Access Key and Secret Key.

AWS Access Key ID: The access key ID generated for the user with the required permissions on the S3 bucket and the SQS queue associated with the S3 bucket. These AWS credentials can be configured in Graylog.

SQS Queue Name: The name of the queue created for the S3 event notifications.

S3 Bucket: The S3 bucket where log files are being written.

S3 Region: The region where the S3 bucket is located.

Content Type: The format of the logs present in the S3 bucket.

  • CSV: Comma-separated values. Line breaks within individual CSV fields are also supported. For example:

"field 1", "field 2"

"same line field", "field with 
line breaks"

  • Text (Newline Delimited): One log message per line. The input creates one message in Graylog for each line.

  • JSON Array: Expects a JSON array at the root of the document containing either strings, or individual JSON documents. For example:

["log message 1", "log message 2", ...]
[{"key": "value"}, {"key2": "value2"}, ...]
Compression Type: The compression type of the log files. Supported options are GZIP and None. Use this if log files are written within compressed archives.

Polling Interval: Determines how often (in minutes) Graylog checks for new data in the S3 bucket. The smallest allowable interval is 5 minutes.

Enable Throttling: If enabled, no new messages are read from this input until Graylog catches up with its message load.
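
To check that a log file matches the Content Type and Compression Type you plan to select, you can parse a sample locally before uploading it. The following is a rough sketch using only the Python standard library. The function names are hypothetical, and the parsing rules (for example, tolerating a space after each comma in CSV, as in the example above) are assumptions about the formats described here, not a description of Graylog's own parser.

```python
import csv
import gzip
import io
import json
from typing import Any, List


def decompress(payload: bytes, compression_type: str) -> bytes:
    """Undo the file's compression; "GZIP" and "None" mirror the option names."""
    if compression_type == "GZIP":
        return gzip.decompress(payload)
    if compression_type == "None":
        return payload
    raise ValueError(f"unsupported compression type: {compression_type}")


def parse_text(text: str) -> List[str]:
    """Text (Newline Delimited): one message per non-empty line."""
    return [line for line in text.splitlines() if line.strip()]


def parse_json_array(text: str) -> List[Any]:
    """JSON Array: the document root must be an array of strings or objects."""
    items = json.loads(text)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array at the root of the document")
    return items


def parse_csv(text: str) -> List[List[str]]:
    """CSV: quoted fields may contain line breaks."""
    # newline="" leaves embedded line breaks inside quoted fields untouched.
    reader = csv.reader(io.StringIO(text, newline=""), skipinitialspace=True)
    return [row for row in reader if row]


if __name__ == "__main__":
    raw = gzip.compress(b'["log message 1", {"key": "value"}]')
    print(parse_json_array(decompress(raw, "GZIP").decode("utf-8")))
    print(parse_csv('"same line field", "field with \nline breaks"\n'))
```

If a sample fails to parse under the content type you intended, this usually points to a mismatch between the file and the selected option.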