Create a Warm Tier on Data Node

The following article exclusively pertains to a Graylog Enterprise feature or functionality. To learn more about obtaining an Enterprise license, please contact the Graylog Sales team.

When you use Graylog Data Tiering, you set data storage retention policies when you create or update index sets. You can set a different tier policy for each index set, based on the requirements of the data it contains. Establishing a retention policy with a warm tier lets you keep less frequently searched data in lower-cost, lower-performance storage. See Data Tiering for complete information.

If you intend to use data tiering with a warm tier, you first need to configure your backend storage. This article explains how to prepare your environment for a warm tier on a Data Node installation, and also includes steps to follow to enable a warm tier for both new and existing index sets.

Hint: If you are using a self-managed OpenSearch installation, the requirements and procedures are different. See Create a Warm Tier on Self-Managed OpenSearch for information.

Prerequisites

Before proceeding, ensure that the following prerequisites are met:

You must be a Graylog administrator to set up a warm tier backend and enable a warm tier.
For Amazon S3 and Google Cloud Storage (GCS) storage backends, you must have appropriate credentials to configure that storage.

Prepare Your Environment for a Warm Tier

Before you can enable a warm tier for your Data Node installation, you must set up a storage repository. We recommend that you locate your warm tier data in an S3 or GCS bucket, but you can also choose to store this data in any supported file system repository.

The initial steps for enabling a repository are completed outside of the Graylog web interface and differ depending on the storage backend you choose:

For a file system, follow these steps:

In the Data Node configuration file, datanode.conf, on every node, specify the location of the storage with the path_repo property. This location must be on shared storage that is accessible from all nodes, such as an NFS share.
(Optional) If your architecture includes dedicated warm tier nodes, set the node_roles property in datanode.conf on each node according to its tier:
- For hot tier nodes:
```
node_roles = cluster_manager,data,ingest,remote_cluster_client
```
- For warm tier nodes:
```
node_roles = search
```
  Hint: Using dedicated warm tier storage is a complex architectural decision involving your data retention requirements and your search volume and search range needs. Typically, this option is used only in larger environments.
  
  If you implement warm tier with mixed role nodes, you don't need to set node_roles. All necessary roles are assigned when you set the path_repo property.
Restart all nodes for the changes to take effect.
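
As a rough illustration of the file system steps above, the datanode.conf on a dedicated warm tier node might contain entries like the following. This is a sketch under stated assumptions, not a complete configuration: the mount point /mnt/graylog-warm is a hypothetical path chosen for this example, so substitute the shared location used in your environment.

```
# Hypothetical shared mount (for example, an NFS share) that every Data Node can reach.
path_repo = /mnt/graylog-warm

# Only needed with dedicated warm tier nodes; omit this line on mixed role nodes.
node_roles = search
```

On hot tier nodes in the same cluster, path_repo would point to the same shared location, with node_roles set to the hot tier value shown earlier.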

Create a Repository

After you complete all the prerequisite steps, you must create at least one storage repository to store snapshots. Remember, we recommend that you locate your warm tier data in an S3 or GCS bucket, but you can also choose to store this data in a local file system repository.

Create a warm tier storage repository in the Graylog web interface as follows:

Navigate to System > Indices.
Click Edit for the desired index set.
In the Rotation and Retention section, select Data Tiering.
Click Create new warm storage repository.
Select your Warm Storage Repository Location at the top of the form.

Hint: You must complete backend setup for each type of backend before you can create a backend of that type. Only backend types for which you have completed the prerequisite steps are available to select when you create a warm tier backend.
For Amazon S3:
1. Give your repository a unique name.
2. Enter the name of the S3 bucket in which logs will be stored.
3. Enter the required base path where the archives should be stored within the S3 bucket.
  
You can use a single bucket for multiple purposes. For instance, you could use the same bucket for a Data Lake backend and a warm tier snapshot backend. However, if you do, it is important to use a different subfolder for each use. The base path you set here determines the subfolder structure for this backend.
4. Click Create.
For Google Cloud Storage:
1. Give your repository a unique name.
2. Enter the name of the GCS bucket in which logs will be stored.
3. Enter the required base path where the archives should be stored within the GCS bucket.
  
You can use a single bucket for multiple purposes. For instance, you could use the same bucket for a Data Lake backend and a warm tier snapshot backend. However, if you do, it is important to use a different subfolder for each use. The base path you set here determines the subfolder structure for this backend.
4. Click Create.
For File System:
1. Give your repository a unique name.
2. Select your file system location from the dropdown.
3. Click Create.

Set Up a Warm Tier

When your environment is ready for Data Tiering, you can enable the warm tier for both new and existing index sets.

Enable the Warm Tier for a New Index Set

You can enable the warm tier for a new index set when you create it. You can base a new index set on the built-in index set templates provided by Graylog, or you can create custom templates for your environment.

For complete information about creating index sets and index set templates, including how to enable the warm tier, see Index Set Templates.

Enable the Warm Tier for an Existing Index Set

To choose warm tier storage for an existing index set, follow these steps:

Navigate to System > Indices.
Click Edit for the desired index set.
In the Rotation and Retention section, select Data Tiering.
Enter the minimum and maximum number of days you want your data to be stored.
Select the Enable warm tier checkbox, then enter the minimum number of days to keep your data in the hot tier. As you make your selections, the visual updates to show how long your data will be kept in each tier.
Select the repository you want your data stored in from the Repository dropdown. The menu includes any repositories you created earlier.
Click Update index set.

View Data Tiering Configuration

After you have created or updated your index set, you can view configuration information as follows:

Navigate to System > Indices.
Select the desired index set to view the index set overview page.

Here you can see warm displayed in the index title after the index prefix.

You can perform searches in the warm tier, and you can verify that your search results include the warm tier by checking the Stored in index section. You will see warm in the index set title.

Warning: If the warm tier is disabled, you may still be able to search data already in the warm tier, but data no longer rolls over from the hot tier to the warm tier. This limitation can cause performance issues.