Backup and Restore Best Practices
Proper backup and restore practices are essential for maintaining system integrity and ensuring business continuity in Graylog. A comprehensive backup strategy should include all critical components—Graylog server, MongoDB, and Data Node—while automating regular backups to minimize data loss. Storing backups in secure, redundant locations helps protect against failures and security threats. Regularly testing the restore process ensures data integrity and system functionality, while a well-defined disaster recovery plan enables quick restoration in case of unexpected disruptions.
In this article, we cover the essential recommended backup and restore procedures for the Graylog stack.
Prerequisites
Before proceeding, ensure that the following prerequisites are met:
- Ensure you have administrative access to the Graylog server and related components (MongoDB, Data Node, and optional components).
- You must have the mongodump and mongorestore tools for MongoDB backups.
- A designated storage location for backups (local disk, network storage, or cloud-based solutions) will be required.
- If TLS is enabled, ensure all necessary certificates and keys are backed up.
Highlights
The following highlights provide a summary of the key takeaways from this article:
- Back up Graylog configuration, MongoDB database configuration and metadata, and log data to ensure full system recovery.
- Choose from minimal, full, incremental, and archive backups based on retention needs and resource availability.
- Schedule backups, use remote storage, encrypt sensitive data, and implement access controls.
- Follow structured steps for restoring Graylog, MongoDB, and Data Node while ensuring compatibility and consistency.
- Regularly test backup integrity, maintain multiple backup versions, and document the recovery processes.
Backup Components
To back up your Graylog deployment effectively, all components of the Graylog stack must be considered.
- Graylog: Consists of configuration files located in /etc/graylog/server/server.conf, any custom configuration files on each node, and optional supporting files such as plugins or custom scripts.
- Data Node: Consists of snapshots of the message data and configuration files. This component forms the core of the backup, as it stores log data, which can be efficiently preserved using incremental snapshots. Losing this data could mean gaps in log history, which can impact security analysis and compliance.
- MongoDB: Consists of a database backup and configuration settings such as user data, dashboards, alerts, and stream definitions. Backing up the MongoDB database is essential for restoring Graylog's operational settings and user configurations.
Optional Backup Components
- Graylog Forwarder: Consists of configuration files and optional supporting files.
- Graylog Sidecar: Consists of configuration files.
Types of Backups
Choosing the right backup strategy is essential for ensuring data protection while balancing storage and recovery needs. There are multiple types of backups that can be conducted, each designed for different scenarios. Whether you need a minimal backup to preserve system settings, a full backup for complete recovery, or incremental and archive backups for efficient data management, understanding these options will help you implement a reliable backup plan.
Minimal
A minimal backup consists of only the Graylog configuration and MongoDB database. While this preserves the system settings, all log data will be lost. This type of backup is suitable for cases where log retention is not critical but the system needs to be restored quickly with the same configurations.
Full Backup
A full backup captures the entire Graylog environment, including configurations, logs, and metadata. This method is recommended for comprehensive recovery but can be resource-intensive for large deployments.
Incremental Backup
Incremental backups only capture data that has changed since the last backup. This method is efficient for ongoing maintenance and minimizes storage requirements while ensuring data protection. This method would be accomplished by capturing point-in-time snapshots of configuration and data and utilizing snapshots for incremental capture of log data.
Archive Backup
Archives can be backed up to provide long-term retention of log data. They can be stored offline and restored on demand to the same cluster from which they were archived.
Best Practices for Backup and Restore
To ensure a reliable backup and restore process, it is essential to follow a few best practices:
- Automate Backups: Use cron jobs or scheduled tasks to automate regular backups.
- Use Remote Storage: Store backups in an off-site or cloud location to prevent loss due to hardware failure.
- Test Restores Periodically: Ensure that the backup process is effective by testing restoration procedures regularly.
- Encrypt and Secure Backups: Protect sensitive log data by encrypting backups and restricting access.
- Maintain Versioning: Keep multiple backup versions to allow recovery from different points in time.
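The automation and versioning practices above can be sketched with a cron entry plus a simple retention prune. The script path, schedule, and backup root below are illustrative assumptions, not Graylog defaults:

```shell
# Illustrative cron entry (install via `crontab -e`): run a hypothetical
# backup script nightly at 02:30.
#   30 2 * * * /usr/local/bin/graylog-backup.sh >> /var/log/graylog-backup.log 2>&1

# Versioning sketch: create a dated backup directory, then keep only the
# seven most recent ones. BACKUP_ROOT is an example path.
BACKUP_ROOT=${BACKUP_ROOT:-/tmp/graylog-backups}
mkdir -p "$BACKUP_ROOT/graylog-$(date +%Y%m%d)"
COUNT=$(ls -1d "$BACKUP_ROOT"/graylog-* | wc -l)
if [ "$COUNT" -gt 7 ]; then
  # Sorted dated names are oldest-first; remove everything but the last 7.
  ls -1d "$BACKUP_ROOT"/graylog-* | sort | head -n $((COUNT - 7)) | xargs rm -rf
fi
echo "backups kept:"
ls "$BACKUP_ROOT"
```

A real script would also run mongodump and copy the Graylog configuration into the dated directory before pruning.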
Backup and Restore MongoDB
In MongoDB, the mongodump tool creates a binary export of a database. Instructions on how to use this tool can be found in the MongoDB documentation.
As part of a Graylog deployment, a database typically named graylog is created and should be backed up; however, you must know whether you have MongoDB deployed as a single host or as a replica set. In a replica set environment, data is distributed across multiple nodes, which introduces challenges related to data consistency and replication lag. For example, when backing up a replica set, you need to ensure that the data is consistent across all nodes, which often involves using additional options like the --oplog flag to capture the operations log for a point-in-time snapshot. To identify your deployment type, check the value of the mongodb_uri parameter in the server.conf configuration file on your Graylog node(s). This distinction shapes backup procedures, especially regarding consistency and replication states. More information can be found in MongoDB's documentation.
In contrast, a single host deployment is more straightforward since there is only one source of data, and consistency issues inherent in replica sets are not a concern. However, even with a single host, you must ensure that the backup process does not disrupt ongoing operations.
In addition to backing up your graylog MongoDB database, back up the configuration files of your MongoDB replica set members (by default, /etc/mongod.conf on Linux package installations). If you have security considerations, also back up related files such as certificates and keystore files.
Backup Procedure for MongoDB
Here is a general backup example using mongodump to back up MongoDB:
- Use the mongodump tool to create a dump of your database.
- Ensure that the backup directory is not on the same disk or volume as the MongoDB data files for better protection against disk failures.
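A minimal sketch of the dump step, assuming a local single-host MongoDB and the mongodb-database-tools package; hostnames, ports, and paths are examples:

```shell
# Dump the graylog database to a directory that should live on a different
# disk or volume than the MongoDB data files (example path shown).
BACKUP_DIR=${BACKUP_DIR:-/tmp/mongodb-backup}
mkdir -p "$BACKUP_DIR"

if command -v mongodump >/dev/null 2>&1; then
  mongodump --host localhost --port 27017 --db graylog --out "$BACKUP_DIR" \
    || echo "mongodump failed (is MongoDB running?)"
else
  echo "mongodump not found; install mongodb-database-tools"
fi

# For a replica set, omit --db and add --oplog so the dump includes the
# operations log for a consistent point-in-time snapshot, e.g.:
#   mongodump --uri "mongodb://mongo1:27017,mongo2:27017/?replicaSet=rs0" \
#     --oplog --out "$BACKUP_DIR"
```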
Restore Procedure for MongoDB
To restore MongoDB:
- Use the mongorestore command to restore the dump previously created.
- Ensure that the database you want to restore does not already exist. If it does, you can either drop it or restore to a different database.
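A corresponding restore sketch under the same assumptions (local MongoDB, example paths). Here --drop is used to handle the existing-database case by removing each collection before restoring it:

```shell
# Restore the graylog database from a previous mongodump output directory
# (example path; adjust to where your dump lives).
DUMP_DIR=${DUMP_DIR:-/tmp/mongodb-backup}

if command -v mongorestore >/dev/null 2>&1; then
  # --drop removes each existing collection before restoring it, which
  # avoids conflicts with a database that already exists.
  mongorestore --host localhost --port 27017 --db graylog --drop "$DUMP_DIR/graylog" \
    && RESULT="restored" \
    || RESULT="mongorestore failed (is MongoDB running and the dump present?)"
else
  RESULT="mongorestore not found; install mongodb-database-tools"
fi
echo "$RESULT"
```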
Backup and Restore Data Node
Backing up and restoring a Graylog Data Node involves managing the data stored within the embedded OpenSearch instance that the Data Node controls. This process ensures that your log data and configurations are preserved and can be restored in case of data loss or system failure.
The Data Node backup procedure primarily relies on the OpenSearch snapshot mechanism for backups. Using the OpenSearch snapshot API, a full snapshot captures the complete state of the cluster at a given point in time, including all indices and cluster metadata. OpenSearch snapshots are designed to be incremental by default. After the initial full snapshot, subsequent snapshots only store data that has changed since the previous snapshot.
Snapshots are stored in a registered snapshot repository. A snapshot repository is simply a storage location such as a shared file system, Amazon Simple Storage Service (Amazon S3), or Azure Storage.
Backup Procedure for Data Node
Here is a Data Node backup example using a shared file system under the following example parameters:
- Data Node uses a warm tier repository of type fs.
- The repo_path configuration is set to /usr/share/fs-repos/r1.
- The repository /usr/share/fs-repos/r1 is registered as fs1.
- The environment has hot indices as well as warm indices.
- Warm indices have their snapshots stored in the fs1 repository.
To back up Data Nodes under these parameters, consider the following example process:
1. Generate a client certificate in the web UI by navigating to System > Data Nodes > Configurations > Generate client certificate. Save the CA certificate, private key, and certificate details generated.
2. Stop the Graylog service:
```
sudo systemctl stop graylog-server
```
3. Record all software versions to ensure you restore the backup to the same versions.
4. Back up the Graylog configuration file (located at /etc/graylog/server/server.conf) and other files, such as certificates and keystore files, as needed.
5. Use mongodump to back up the MongoDB database along with its configuration and supporting files.
6. Create a repository path, or optionally add a second destination to the repo_path configuration for the backup, e.g. /usr/share/fs-repos/r2.
7. Register the backup repository with the following API call to the Data Node (include your CA certificate, private key, and certificate from step 1):
```
curl -XPUT "http://localhost:9200/_snapshot/backup-repo" \
  --key private.key --cert private.cert --cacert ca.cert \
  -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/fs-repos/r2"
  }
}'
```
8. Create a backup snapshot for warm indices if you use data tiering and have warm tier indices enabled:
```
curl -XPUT "http://localhost:9200/_snapshot/backup-repo/warm-indices" \
  --key private.key --cert private.cert --cacert ca.cert \
  -H 'Content-Type: application/json' -d'
{
  "indices": "*_warm_*,-.*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "partial": false
}'
```
Explanation of the index patterns:
- *_warm_* = all indices that contain the _warm_ keyword
- -.* = exclude all indices beginning with a dot
9. Create a backup snapshot for hot indices:
```
curl -XPUT "http://localhost:9200/_snapshot/backup-repo/hot-indices" \
  --key private.key --cert private.cert --cacert ca.cert \
  -H 'Content-Type: application/json' -d'
{
  "indices": "*,-.*,-*_warm_*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "partial": false
}'
```
Explanation of the index patterns:
- * = all indices
- -.* = exclude all indices beginning with a dot
- -*_warm_* = exclude all indices that contain the _warm_ keyword
10. Copy the datanode.conf file. If you have multiple nodes, do this for all nodes.
11. Restart the Graylog service:
```
sudo systemctl start graylog-server
```
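Once both snapshots have been created, it is worth confirming that they completed successfully. A hedged sketch querying the snapshot API, under the same host and certificate-path assumptions as the calls above; it falls back to a notice if the Data Node is not reachable:

```shell
# Show the state of the warm- and hot-index snapshots created above.
SNAP_STATE=$(curl -s -XGET "http://localhost:9200/_snapshot/backup-repo/warm-indices,hot-indices" \
  --key private.key --cert private.cert --cacert ca.cert 2>/dev/null \
  || echo "snapshot API not reachable")
echo "$SNAP_STATE"
```

A completed snapshot reports "state": "SUCCESS" in the returned JSON.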
Restore Procedure for Data Node
To restore a Data Node instance under the same parameters discussed in the previous section, consider the following example process:
1. Install all software at the same versions recorded in step 3 of the backup procedure.
2. Copy all configurations to their appropriate locations.
3. Start MongoDB and run mongorestore.
4. Configure the datanode.conf file to include insecure_startup=true to start the Data Node without the Graylog server by bypassing security requirements.
5. Set the repo_path configuration to /usr/share/fs-repos/r2 on the new Data Node.
6. Register the repository with the following API call:
```
curl -XPUT "http://localhost:9200/_snapshot/fs1" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/usr/share/fs-repos/r2"
  }
}'
```
Hint: It is important to register the repository with the same name as on the previous Data Node because the repository name is saved in the index configuration. Otherwise, the data tiering rollover would break.
7. Restore warm indices if you previously created a warm indices snapshot:
```
curl -XPOST "http://localhost:9200/_snapshot/fs1/warm-indices/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "include_aliases": false,
  "partial": false,
  "storage_type": "remote_snapshot"
}'
```
8. Restore hot indices:
```
curl -XPOST "http://localhost:9200/_snapshot/fs1/hot-indices/_restore" -H 'Content-Type: application/json' -d'
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "include_aliases": false,
  "partial": false,
  "storage_type": "local"
}'
```
9. Edit the datanode.conf file to set insecure_startup=false, then restart the Data Node with certificates provisioned.
10. Restart the Graylog service:
```
sudo systemctl start graylog-server
```
To delete a backup snapshot with warm indices:
```
DELETE /_snapshot/fs1/warm-indices
```
To delete a backup snapshot with hot indices:
```
DELETE /_snapshot/fs1/hot-indices
```
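Before deleting snapshots, it can help to confirm what the repository holds. A hedged sketch using the same snapshot API, with the host, repository name, and certificate paths assumed as in the earlier examples; the call degrades to a notice when the Data Node is unreachable:

```shell
# List every snapshot in the fs1 repository.
SNAPSHOTS=$(curl -s -XGET "http://localhost:9200/_snapshot/fs1/_all" \
  --key private.key --cert private.cert --cacert ca.cert 2>/dev/null \
  || echo "snapshot API not reachable")
echo "$SNAPSHOTS"

# The DELETE calls above can be issued the same way, e.g.:
#   curl -XDELETE "http://localhost:9200/_snapshot/fs1/warm-indices" \
#     --key private.key --cert private.cert --cacert ca.cert
```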
Backup and Restore Graylog
Backing up Graylog overall entails backing up both MongoDB (which stores configurations and user data) and the Data Node (which stores logs and messages). In this section, we focus on backing up essential configuration settings for the Graylog service and not log data.
Your Graylog configuration files are typically found at /etc/graylog/server/server.conf. Similar to MongoDB and Data Node backups, consider backing up other files such as certificates and keystore files if custom certificates are in use.
Backup Procedure for Graylog
The following is a general example of how you might approach backing up important configuration information for Graylog:
1. Back up the Graylog server configuration file located at /etc/graylog/server/server.conf:
```
sudo cp -r /etc/graylog/server /backup/graylog-config
```
This command stores Graylog's configuration files in the /backup/graylog-config directory.
2. Back up the configuration parameters that control the Java Virtual Machine (JVM) environment in which Graylog runs. Here are the default file paths:
- (Debian) /etc/default/graylog-server
- (Red Hat) /etc/sysconfig/graylog-server
3. Back up any additional plugins and customizations you have made to Graylog. The default path to the plugins directory is /usr/share/graylog-server/plugin.
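The locations above can be gathered in one pass. A sketch that copies whichever of the default paths exist on the current platform into a dated backup directory; the destination root is an example:

```shell
# Copy Graylog config, JVM settings, and plugins into a dated backup
# directory. Source paths are the package defaults named above.
DEST="${DEST_ROOT:-/tmp/graylog-config-backup}/$(date +%Y%m%d)"
mkdir -p "$DEST"
for SRC in /etc/graylog/server \
           /etc/default/graylog-server \
           /etc/sysconfig/graylog-server \
           /usr/share/graylog-server/plugin; do
  if [ -e "$SRC" ]; then
    cp -r "$SRC" "$DEST/" && echo "backed up $SRC"
  else
    echo "skipping $SRC (not present on this platform)"
  fi
done
echo "configuration backup written to $DEST"
```

Only one of the Debian and Red Hat JVM settings paths will exist on any given host, so the loop simply skips the other.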
Restore Procedure for Graylog
To restore Graylog, follow these general procedures:
1. Install the Graylog service at the same version as your backup, following your previous method of installation, e.g. operating system packages.
2. Stop the Graylog service:
```
sudo systemctl stop graylog-server
```
3. Restore the Graylog server configuration file:
```
sudo cp -r /backup/graylog-config/* /etc/graylog/server/
```
4. Restore any plugins and customizations.
5. Restore the MongoDB data (from the section on backing up MongoDB):
```
mongorestore --host localhost --port 27017 --db graylog /backup/mongodb/graylog
```
6. Update the configuration for new DNS names if required.
7. Restart the Graylog service:
```
sudo systemctl start graylog-server
```
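After restarting, verify that the restored node is healthy before returning it to service. A hedged sketch against Graylog's load-balancer status endpoint; the host and port are the defaults and may differ in your deployment, and the check degrades to a notice if the API is not reachable:

```shell
# A healthy Graylog node answers the lbstatus endpoint with "ALIVE".
LB_STATUS=$(curl -s "http://localhost:9000/api/system/lbstatus" \
  || echo "Graylog API not reachable")
echo "$LB_STATUS"
```

Beyond this basic liveness check, confirm in the web UI that streams, dashboards, and inputs restored from MongoDB appear as expected.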
Further Reading
Explore the following additional resources and recommended readings to expand your knowledge on related topics: