Archive Data in an Index Set

The following article exclusively pertains to a Graylog Enterprise feature or functionality. To learn more about obtaining an Enterprise license, please contact the Graylog Sales team.

Now that you have completed setup for Graylog archiving, you may begin to archive your data as desired. You can choose to archive an index set directly or define a specific timeframe to set up automatic archiving.

Create a New Archive for an Index Set

To manually archive a specific index set:

  1. Navigate to Enterprise > Archives.

  2. Select the desired index from the drop-down menu.

  3. Click Archive Index.

The archived index set can be viewed in the Archive Catalog found on the same page in the interface.

Set Up Automatic Archiving for an Index Set

Alternatively, you can configure index sets so that their indices are archived automatically. You can set this up through the Graylog interface or with the REST API.

Automate Archiving via the Graylog Interface

  1. Navigate to System > Indices.

  2. Locate the index set you wish to archive.

  3. Click the Edit button.

  4. Scroll down to the Rotation and Retention section.

  5. Choose your rotation and retention strategy: Data Tiering or Legacy. Data tiering is the recommended model for rotation and retention of indices (see Data Tiering for more information). Legacy rotation and retention strategies are currently supported but will be deprecated in the future.

    • If you select Data Tiering, check the Archive before deletion box.

    • If you select Legacy, scroll down to Retention Strategy and select Archive index in the drop-down menu.

    You may also set the maximum number of indices allowed before deletion begins. The oldest indices are deleted first.

  6. Click Update index set.

Automate Archiving via the REST API

The Graylog REST API can also automate archive creation if you need a more flexible approach.

In the example below, a curl command is used to archive an index. This command starts a system job in the Graylog server to create an archive for index graylog_386.

The system_job.id value in the response can be used to check the progress of the job.

$ curl -s -u admin -H 'X-Requested-By: cli' -X POST http://127.0.0.1:9000/api/plugins/org.graylog.plugins.archive/archives/graylog_386
Enter host password for user 'admin': ***************
{
   "archive_job_config" : {
     "archive_path" : "/tmp/graylog-archive",
     "max_segment_size" : 524288000,
     "segment_filename_prefix" : "archive-segment",
     "metadata_filename" : "archive-metadata.json",
     "source_histogram_bucket_size" : 86400000,
     "restore_index_batch_size" : 1001,
     "segment_compression_type": "SNAPPY"
   },
   "system_job" : {
     "id" : "cd7ebfa0-079b-11e6-9e1b-fa163e6e9b8a",
     "description" : "Archives indices and deletes them",
     "name" : "org.graylog.plugins.archive.job.ArchiveCreateSystemJob",
     "info" : "Archiving documents in index: graylog_386",
     "node_id" : "c5df7bff-cafd-4546-ac0a-5ccd2ba4c847",
     "started_at" : "2016-04-21T08:34:03.034Z",
     "percent_complete" : 0,
     "provides_progress" : true,
     "is_cancelable" : true
   }
 }
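To follow up on a job, the system_job.id value can be extracted from the response and passed to the system jobs endpoint. The sketch below is a minimal illustration rather than an official client. The helper names extract_job_id and job_status are hypothetical, the GET /api/system/jobs/<id> path is an assumption that should be confirmed in the REST API browser, and the sed expression assumes a response shaped like the one shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Graylog.

# Base URL of the Graylog REST API (same host as in the example above).
GRAYLOG_API="${GRAYLOG_API:-http://127.0.0.1:9000/api}"

# Read an archive response on stdin and print the value of the "id" key.
# The quoted pattern "id" does not match the "node_id" key.
extract_job_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Fetch the current state of a system job.
# ASSUMPTION: system jobs are exposed at /system/jobs/<id>.
job_status() {
  curl -s -u admin -H 'X-Requested-By: cli' "$GRAYLOG_API/system/jobs/$1"
}

# Example usage (commented out because it contacts a running server):
#   job_id=$(curl -s -u admin -H 'X-Requested-By: cli' -X POST \
#     "$GRAYLOG_API/plugins/org.graylog.plugins.archive/archives/graylog_386" \
#     | extract_job_id)
#   job_status "$job_id"
```

Using sed keeps the sketch free of extra dependencies. If a JSON tool such as jq is available, reading the .system_job.id field with it is more robust than pattern matching.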

The REST API can automate other archive-related tasks, such as: 

  • Restoring and deleting archives

  • Updating your archive configuration

We encourage you to explore the REST API browser, which can be found under System > Nodes.

Set a Retention Period for Legacy Strategies

You may need to set different retention periods for different indices. Some indices may only require a two-month retention period, whereas others may require longer periods of storage. For example, firewall logs are often kept for two months, but other log messages may need to be kept longer to fulfill compliance regulations.

After the retention period is complete, Graylog automatically deletes older messages.
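As a rough worked example, assume a legacy time-based rotation strategy, where the effective retention period is approximately the rotation period multiplied by the maximum number of indices. The sketch below derives the maximum number of indices for a two-month target. The variable names and values are illustrative assumptions, not Graylog defaults.

```shell
#!/bin/sh
# Illustrative arithmetic only; the values are assumptions, not Graylog defaults.
rotation_days=1      # assumed: a new index is created every day
retention_days=60    # assumed: roughly two months of retention

# Round up so the retained indices cover at least the whole period.
max_indices=$(( (retention_days + rotation_days - 1) / rotation_days ))

echo "Maximum number of indices: $max_indices"
```

With these assumed values, the index set would keep 60 daily indices before the oldest one is archived or deleted.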

Next Steps

Now that you have created your first archive, let's walk through the process of restoring archived data.