Preview Logs from a Data Lake
Data Lake Preview in Graylog provides visibility into log data stored in your data lakes. This feature is available for both internal and external data lakes. You can preview and examine logs before retrieving your data. Previewing data does not affect license usage because log data counts against license usage only upon retrieval if it hasn’t previously been sent to your search backend.
This topic explains how to preview log data from Graylog Data Lakes as well as how to retrieve data from previewed results. For complete information about internal and external data lakes, see Data Lakes.
Prerequisites
Before proceeding, ensure that the following prerequisites are met:
-
You must be a Graylog administrator or have the Data Lake User role to preview data and retrieve logs from a data lake.
-
You must have an existing data lake containing log data.
Data Lake Preview vs. Search
Data Lake Preview is not the same as search and search queries in Graylog. Search operates on the indexed data in your search backend, and you are able to customize queries against that data. While the Data Lake Preview page includes controls similar to search, preview has limitations in terms of the fields you can search for and the filters you can use.
To preview data in a data lake, you select a specific stream (internal) or table (external), then you can target desired log data in that source through the use of filters and by adjusting the time range associated with the timestamp of the stored data.
Preview Log Data
To preview data in a data lake:
-
Navigate to Data Lake > Preview.
-
(Optional) Select which data lake to preview data from. This step is required only if you have multiple data lakes defined—for example, an internal and an external data lake or multiple external data lakes. Available data lakes are shown in the Data Lake Preview dropdown.
-
Select the stream (internal) or table (external) from which you want to preview log data.
For external data lakes, you can use the Table List sidebar to navigate the table structure and select the data you require. When you select a table in the sidebar, that table becomes the selected preview source above. Note that the table selection determines which fields and filters are available, because each table can have a different data structure or schema.
-
(Optional) Select the time range from which you want to preview log data. By default, the range is set to preview the past 30 minutes of logs.
-
(Optional) Apply filters to limit the results. Click Filter by fields, select the field name to filter by from the dropdown, and enter the value to filter on. As noted above, the available filters are determined by the backend infrastructure. You cannot create custom filters or queries.
Click Add field to add additional filters. When using multiple filters, select either AND or OR logic. You can include as many filters as needed, but they all use the same join logic. Keep in mind that while preview can apply many filters, a data retrieval operation can use no more than three filters collectively.
Hint: Filtering on the stream or associated_assets fields can result in extended processing times because these operations require intensive in-memory computation and data transfer. For better performance, we recommend using these filters with a narrow time range.
-
Click Perform Search.
Depending on the amount of data your data lake contains, a preview request can take a significant amount of time to complete.
Matching logs from the data lake are displayed in a list view widget. Your most recent preview request is maintained on the Preview page for 24 hours.
You can refine preview results by changing any of your previous selections, then clicking Perform Search again. You can edit the list view widget by clicking the Edit icon at the upper right. You can add or remove columns and order the columns so that you can more easily see your most relevant data. For internal data lakes, you can add custom columns from your data set by entering the field name even if the option is not in the dropdown.
Use the expand icon at the right of any row to open a pop-up window that shows all data for that item, including columns not displayed in the table. You can apply colored highlighting to fields or values in the results with the Highlighting option at the top left of the screen.
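The filter behavior described in the steps above can be modeled in a few lines. The following is a minimal sketch for illustration only, not Graylog code or a Graylog API: the `FieldFilter` class, the `matches` and `preview` functions, and the sample records are all hypothetical, and the sketch assumes each filter is an exact match on a field value. It shows two points from this section: all filters share one join logic (AND or OR), and the default time range covers the past 30 minutes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FieldFilter:
    """One 'field equals value' filter (hypothetical model, not a Graylog type)."""
    field: str
    value: str


def matches(record, filters, logic):
    """Apply every filter with the same join logic: 'AND' or 'OR'."""
    if not filters:
        return True
    results = [record.get(f.field) == f.value for f in filters]
    return all(results) if logic == "AND" else any(results)


def preview(records, filters, logic="AND", now=None, window=timedelta(minutes=30)):
    """Return records inside the time window that satisfy the filters."""
    now = now or datetime.now(timezone.utc)
    start = now - window
    return [
        r for r in records
        if start <= r["timestamp"] <= now and matches(r, filters, logic)
    ]


# Hypothetical sample data: two recent records and one outside the window.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
records = [
    {"timestamp": now - timedelta(minutes=5), "source": "fw01", "action": "deny"},
    {"timestamp": now - timedelta(minutes=10), "source": "fw02", "action": "allow"},
    {"timestamp": now - timedelta(hours=2), "source": "fw01", "action": "deny"},
]
filters = [FieldFilter("source", "fw01"), FieldFilter("action", "allow")]

and_hits = preview(records, filters, "AND", now=now)  # no record meets both filters
or_hits = preview(records, filters, "OR", now=now)    # both recent records meet one filter
```

With the same two filters, AND returns nothing while OR returns both recent records, which is why the choice of join logic matters as much as the filters themselves.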
Retrieve Log Data from Preview
After you have previewed log data, you can start a data retrieval operation from that subset of data if necessary. Several retrieval methods are available.
First, you can click Retrieve all results on the top right side of the preview list to open the log retrieval dialog box. When the form opens, it includes the same filters and other selections that you have applied to your preview. You can adjust any of these settings here to further refine your selection. However, note that the retrieval operation is limited to no more than three filters. Click Retrieve to proceed with retrieval.
Second, you can retrieve specific messages from the preview results. Select the checkbox for any log messages you want to retrieve, then click Bulk actions > Retrieve Logs. Click Retrieve in the confirmation dialog box that shows how many logs you have chosen to retrieve.
To view retrieved log data, navigate to Data Lake > Retrievals to see a list of all completed retrievals. From here, you can select Show messages to view the retrieved log data. The Data Lake Jobs section on this page shows any running, queued, or recently completed retrieval operations. Retrieved data can be added to search queries, added to investigations, surfaced in dashboards and reports, and more.
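Because a preview can carry more filters than a retrieval accepts, it can help to think of the retrieval dialog as applying a simple check to the filter list. The sketch below is an illustration under stated assumptions, not Graylog's implementation: the `filters_for_retrieval` function and the sample field names are hypothetical, and only the limit of three filters comes from this topic.

```python
MAX_RETRIEVAL_FILTERS = 3  # limit stated in this topic


def filters_for_retrieval(preview_filters):
    """Return the filters unchanged if retrieval can accept them, else raise."""
    if len(preview_filters) > MAX_RETRIEVAL_FILTERS:
        raise ValueError(
            f"{len(preview_filters)} filters selected; retrieval accepts at most "
            f"{MAX_RETRIEVAL_FILTERS}. Remove filters before retrieving."
        )
    return list(preview_filters)


# Hypothetical preview with four filters: fine for preview, too many for retrieval.
preview_filters = [
    ("source", "fw01"),
    ("action", "deny"),
    ("user", "alice"),
    ("dst_port", "443"),
]

try:
    filters_for_retrieval(preview_filters)
    too_many = False
except ValueError:
    too_many = True

# Keeping three filters satisfies the limit.
trimmed = filters_for_retrieval(preview_filters[:3])
```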
Create an External Data Lake Input
For external data lakes, you have a third option for data retrieval. You can create a Graylog Input from the Data Lake > Preview page, which regularly ingests log data from the source.
To create an External Data Lake Filtered Input:
-
Select Create Input after you have previewed and filtered data as desired from an external data lake. For information on how to preview data, see the section above.
The Launch new Input dialog is linked to the table that you previewed. It also includes any filters you applied.
-
(Optional) Update the filter list, as necessary.
-
Complete all other required fields, including the polling interval, which determines how frequently Graylog queries the external data lake for data. For detailed information about how to set up and launch an input, see Set Up an Input.
-
Click Launch Input.
The input allows you to regularly ingest filtered data into Graylog. Use this method for data that has immediate, high value to your organization and data that you need to review on a continuous basis.
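To reason about the polling interval, it can be useful to picture each poll as covering the time elapsed since the previous poll. This topic does not describe how Graylog tracks its position in the external data lake, so the following is only a conceptual sketch under that assumption. The `poll_windows` function is hypothetical and is not part of Graylog.

```python
from datetime import datetime, timedelta, timezone


def poll_windows(first_poll, interval, polls):
    """Yield (start, end) time windows, one per poll.

    Assumption: each poll covers exactly the interval that elapsed since
    the previous poll, so the windows are contiguous and do not overlap.
    """
    start = first_poll - interval
    for _ in range(polls):
        end = start + interval
        yield start, end
        start = end


first_poll = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
windows = list(poll_windows(first_poll, timedelta(minutes=5), 3))
```

Under this model, a shorter interval means smaller, more frequent queries against the external data lake, while a longer interval delays the arrival of new data in Graylog.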
Retrieve Logs Into an Investigation
The Preview page also allows you to retrieve log messages directly into an investigation. This is useful because you are often searching the data lake as part of an active investigation. This feature is available for both internal and external data lakes.
To send previewed log data to an investigation, select the log or logs in the preview list. On the Bulk actions menu, choose Select an investigation. You can choose any of your investigations from the list in the dialog box, or add a new one.
Any log data added to investigations this way shows as retrieved data on the Data Lake Retrievals page.
