Select and Ingest High-Value Log Sources

Log ingestion in Graylog starts with knowing which logs will provide the most value for your organization. This guide helps you evaluate the wide range of potential log sources and identify which ones are most relevant to your environment. Not every log type is necessary for every user, so your goal should be to focus on the data that best supports your security, compliance, and operational needs.

This article explains core data categories (endpoint, network, identity, cloud, and messaging) and examines what each offers, how to recognize high-value events, and how to ingest this data into Graylog. By aligning your log collection with your business priorities, you can avoid data overload, reduce noise, and ensure that your initial Graylog deployment captures the information that truly matters.

Warning: The information in this guide is intended as general best-practice guidance to help you evaluate and prioritize potential log sources for ingestion into Graylog. Because every environment and use case is unique, these recommendations should be interpreted as illustrative rather than prescriptive. Users are responsible for determining which log sources, configurations, and monitoring approaches are most appropriate for their specific operational, compliance, and security requirements.

If you have already identified the log sources you plan to collect, see Log Source Ingestion Reference for details on supported integrations, input types, and configuration requirements.

Endpoint Data

Endpoints are prime attack surfaces. Endpoint data includes the logs, events, and telemetry collected from desktops, laptops, servers, and sometimes mobile devices. Many breaches start at an endpoint through phishing, malware, or insider misuse.

Endpoint data enables you to:

Detect malicious activity as soon as it happens.
Investigate how attackers move and persist in your environment.
Contain and remediate infected systems.

What to Monitor on Endpoints

Endpoint data gives visibility into infections, persistence mechanisms, and hands-on-keyboard attacks through the following sources:

Process and execution logs: Tracks every process created and its parent/child relationships. Many attacks use a local system’s scripts or other executables, such as Windows PowerShell, to perform malicious activity. Use these logs to track suspicious or unexpected relationships between processes.
File system activity logs: Monitors file creation, modification, and deletion. This activity can reveal ransomware encryption, malware dropped onto disk, or tampering with critical files. Watch for unexpected or large-scale renaming or encryption of user files.
Registry and configuration change logs (Windows-specific): Watches for changes to registry keys used for persistence (for example, HKLM\Software\Microsoft\Windows\CurrentVersion\Run). Track these changes to find when attackers create auto-start entries to survive reboots.
User login and authentication event logs: Monitors local logins, RDP sessions, and privilege escalations. These logs help spot credential theft, privilege misuse, or lateral movement attempts. Look for repeated failed logins followed by a successful login with elevated privileges.
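
The last pattern in the list above, repeated failed logins followed by a successful login with elevated privileges, can be sketched in Python. This is a minimal illustration rather than a Graylog feature: the event dictionaries, their field names (`user`, `outcome`, `elevated`), and the failure threshold are all hypothetical and would need to be mapped to the fields your endpoint logs actually provide.

```python
from collections import defaultdict


def find_suspicious_logins(events, max_failures=3):
    """Return users with a privileged success after repeated failures.

    `events` is a time-ordered list of dicts with hypothetical keys:
    "user", "outcome" ("failure" or "success"), and "elevated" (bool).
    """
    failures = defaultdict(int)  # consecutive failures per user
    flagged = []
    for event in events:
        user = event["user"]
        if event["outcome"] == "failure":
            failures[user] += 1
            continue
        # A success ends the failure streak. Flag it only if the streak
        # was long enough and the session has elevated privileges.
        if failures[user] >= max_failures and event.get("elevated", False):
            flagged.append(user)
        failures[user] = 0
    return flagged


sample = [
    {"user": "alice", "outcome": "failure"},
    {"user": "alice", "outcome": "failure"},
    {"user": "alice", "outcome": "failure"},
    {"user": "alice", "outcome": "success", "elevated": True},
    {"user": "bob", "outcome": "failure"},
    {"user": "bob", "outcome": "success", "elevated": True},
]
print(find_suspicious_logins(sample))
```

In this sample only `alice` is flagged, because `bob` has a single failure before the successful login. The threshold of three failures is an arbitrary assumption to tune for your environment.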

Map Endpoint Data to Graylog

The table below lists common endpoint data sources and shows how to ingest them into Graylog using the appropriate inputs and Illuminate packs:

Source	Illuminate pack	Input	Notes
Carbon Black	Yes	Syslog	Logs must be sent in CEF format.
CrowdStrike Falcon	Yes	CrowdStrike Input
Microsoft Defender for Endpoint	Yes	Microsoft Defender for Endpoint Input	Ingests Defender alert logs in JSON format.
Native OS logs: Linux audit logs	Yes	Beats
Native OS logs: Linux system logs	Yes	Beats, Syslog
Native OS logs: Windows event logs	Yes	Beats, GELF	Use Winlogbeat for Beats inputs. Use NXLog for GELF inputs. Additional Illuminate packs might apply, depending on the logs you want to capture.
SentinelOne	No	CEF
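
To show what a GELF input receives, the following Python sketch builds a GELF 1.1 message and sends it as a single uncompressed UDP datagram. It is an illustration only: in practice a collector such as NXLog produces these messages for you. The server name, the `event_source` field, and the assumption that a GELF UDP input is listening on port 12201 are hypothetical and depend on how your inputs are configured.

```python
import json
import socket
import time


def build_gelf_message(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload; additional fields get a leading underscore."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }
    for key, value in extra.items():
        message["_" + key] = value
    return message


def send_gelf_udp(message, server, port=12201):
    """Send one uncompressed GELF message as a single UDP datagram."""
    payload = json.dumps(message).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (server, port))


msg = build_gelf_message("workstation-01", "Test event", event_source="sketch")
# Uncomment to send; "graylog.example.com" is a placeholder server name.
# send_gelf_udp(msg, "graylog.example.com")
```

UDP is used here only because it keeps the sketch short. It offers no delivery guarantee, so a TCP or Beats-based collector is usually a better fit for production log shipping.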

Network Data

Network data refers to the traffic-level information that shows how systems communicate within an environment and with the outside world. This data is crucial because:

Nearly every cyberattack communicates on the network at some point for command and control (C2), lateral movement, or data exfiltration.
Even if malware hides on an endpoint, its network activity often exposes it.

Network data monitoring covers what’s moving in and out of the environment and laterally between systems.

What to Monitor in Network Data

Use network data to detect lateral movement, data exfiltration, and command-and-control (C2) communications from sources such as the following:

DNS queries: Many malware families use DNS to contact C2 servers or use Domain Generation Algorithms (DGAs) to create new domains dynamically. Look for DNS requests to known malicious domains or newly registered domains. Also, unusual spikes in DNS lookups can indicate beaconing or tunneling.
Firewall and proxy logs: Shows what’s being allowed or blocked at the edge, and what’s being accessed on the web. Repeated blocks for the same host could indicate probing or a brute-force attack. Look for access to suspicious URLs or download sources as well as traffic bypassing required proxies.
IP traffic and flow logs: Reveals who is talking to whom, how much data is moving, and over which ports. Look for outbound connections to blacklisted IP addresses or IP addresses in unexpected geolocations, which could signify known C2 infrastructure. Sudden large data transfers might indicate potential exfiltration. Also watch for abnormal protocols or ports, such as RDP over non‑standard ports.
Lateral movement traffic: After attackers breach a network, they often move laterally via SMB, RDP, SSH, and so forth. Watch for unusual peer-to-peer traffic between systems that don’t normally communicate as well as sudden surges of authentication attempts across the network.
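
One common heuristic for spotting algorithmically generated or tunneling-style DNS names is to measure the character entropy of the queried label. The Python sketch below assumes that approach; it is not how Graylog or any Illuminate pack evaluates DNS logs. The threshold and minimum length are arbitrary starting points, and checking only the first label of each name is a simplification.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def flag_high_entropy_domains(domains, threshold=3.5, min_length=10):
    """Return domains whose first label is long and high in entropy."""
    flagged = []
    for domain in domains:
        label = domain.split(".")[0].lower()
        if len(label) >= min_length and shannon_entropy(label) >= threshold:
            flagged.append(domain)
    return flagged


queries = ["mail.example.com", "x7f3kq9zb2w1.example.com"]
print(flag_high_entropy_domains(queries))
```

Entropy alone produces false positives (for example, content delivery network hostnames), so treat a flagged name as a lead to correlate with other signals such as domain age or query volume.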

Map Network Data to Graylog

The following table outlines supported network data sources, related Illuminate packs, and recommended Graylog inputs:

Source	Illuminate pack	Input	Notes
Check Point	Yes	Syslog
Cisco ASA	Yes	Raw/Plaintext Syslog	Illuminate includes an additional pack for the Cisco ASA Firepower extension.
Edge Secure Web Gateway (formerly Blue Coat)	No	Syslog	Use the Syslog TCP input and configure a secure connection.
Fortinet FortiGate	Yes	Raw/Plaintext Syslog	Sending FortiGate logs in CEF format is not supported.
Juniper SRX	Yes	Syslog
Packetbeat	Yes	Beats
Palo Alto	Yes	Palo Alto Networks Input
SonicWall Next Generation Firewalls (NGFW)	Yes	Syslog
Zeek	Yes	Beats	Requires Filebeat. Zeek must be configured to log in JSON format.

Identity and Access Data

Identity and access data covers everything related to who is accessing what, when, and how across your organization’s IT environment. This data is generated by Identity and Access Management (IAM) systems, authentication services, and directory platforms such as Microsoft Active Directory, Entra ID, Okta, and others.

Identity and access data is critical for threat detection and response (TDR) because:

Almost every major breach involves stolen or abused credentials.
Insider threats and privilege misuse won’t show up in malware scans, but they do show up in authentication data.

What to Monitor in Identity and Access Data

Insider threats, compromised accounts, and privilege misuse are often invisible without identity-level monitoring of sources such as the following:

Authentication event logs: Logons and logoffs show you who logged in, where, and when. Failed login attempts could signal brute force or password spraying. Look for other suspicious logins, such as a user in an unusual or unexpected geolocation.
Privilege change logs: Watch for new admin accounts created or group membership changes such as adding someone to Domain Admins. Also, look out for role escalations in cloud/SaaS platforms, for instance, Okta or AWS administrator role assignments.
Account lifecycle event logs: Tracks new accounts created, disabled, or deleted. Also watch for dormant accounts suddenly being active again, which could indicate former employee accounts were not removed properly.
MFA and security control activity logs: Reveals denied multifactor authentication (MFA) prompts and push fatigue attacks (multiple MFA prompts in a row). Look for bypass attempts such as disabling MFA on a critical user account.
Access to sensitive resources: Watch for unusual file server or database access as well as SaaS app usage outside normal hours or regions.
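
The dormant-account check described above can be sketched in Python as a gap calculation between consecutive logins. This is an illustrative sketch under stated assumptions: the `(user, datetime)` tuples stand in for parsed authentication events, and the 90-day dormancy window is a hypothetical value, not a Graylog default.

```python
from datetime import datetime, timedelta


def find_reactivated_accounts(logins, dormant_after=timedelta(days=90)):
    """Return (user, gap) pairs where a login follows a long inactive period.

    `logins` is a list of (user, datetime) tuples in any order.
    """
    last_seen = {}
    reactivated = []
    # Process logins chronologically so each gap is measured against
    # the user's immediately preceding login.
    for user, when in sorted(logins, key=lambda item: item[1]):
        previous = last_seen.get(user)
        if previous is not None and when - previous >= dormant_after:
            reactivated.append((user, when - previous))
        last_seen[user] = when
    return reactivated


logins = [
    ("carol", datetime(2024, 1, 10)),
    ("dave", datetime(2024, 1, 12)),
    ("dave", datetime(2024, 1, 20)),
    ("carol", datetime(2024, 6, 1)),
]
for user, gap in find_reactivated_accounts(logins):
    print(user, gap.days)
```

Note that this only catches accounts with at least one earlier login in the data set. Accounts that were never seen before need a separate check against your directory's account list.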

Map Identity and Access Data to Graylog

The table below lists key identity-related log sources and how to configure Graylog to collect them effectively:

Source	Illuminate pack	Input	Notes
Active Directory	No	Beats	Requires a collector such as Filebeat.
AWS IAM	No	AWS CloudTrail Input
Google Workspace	Yes	Google Workspace Input
Microsoft Entra ID (formerly Azure AD)	No	Microsoft Office 365 Input
Okta	Yes	Okta Log Events Input

Application and Cloud Data

Application and cloud data refers to the telemetry, logs, and configuration events from Software as a Service (SaaS) platforms such as Microsoft 365, Google Workspace, Salesforce, and Infrastructure-as-a-Service (IaaS) providers such as Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP).

Because so much business now runs on cloud and SaaS, attackers increasingly target these platforms directly. TDR needs cloud data to detect attacks that don’t even touch on-premises systems.

What to Monitor in Application and Cloud Data

Application and cloud data helps you detect cloud misconfigurations, data theft, and API abuse from sources such as the following:

Admin and configuration change logs: Watch for new admin accounts, privilege escalations, or policy changes. Attackers who compromise an account often grant themselves persistent access by creating new admin users or changing security settings.
API and service account activity logs: Look for API calls from unusual IP addresses, creation of new service accounts, or high-volume API usage. APIs are often exploited for data scraping or automation-based attacks. Many service accounts are over-permissioned, which facilitates such attacks.
Resource provisioning and change logs (IaaS-specific): Watch for creation of new VMs, storage buckets, security groups, or firewall rules. Attackers can spin up resources for cryptomining, create backdoor virtual networks, or open access to sensitive assets.
Data access and sharing event logs: Monitor for file downloads, sharing outside the organization, and mass export of records, particularly from CRM or HR systems. Cloud platforms, especially SaaS, are a prime data exfiltration vector.
Security controls and integration logs: Look for alerts from Cloud Access Security Brokers (CASBs), SIEM integrations, and cloud-native security tools. These signals tell you when security configurations are weakened or bypassed.
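
As an illustration of the "API calls from unusual IP addresses" check above, the following Python sketch compares each call's source address against a list of expected CIDR ranges. The `source_ip` and `action` field names and the example ranges are hypothetical; real cloud audit logs use provider-specific field names that you would map accordingly.

```python
import ipaddress


def is_expected_source(ip, allowed_networks):
    """Return True if `ip` falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


def unusual_api_calls(calls, allowed_networks):
    """Return the calls whose source IP is outside the allowed ranges."""
    return [
        call for call in calls
        if not is_expected_source(call["source_ip"], allowed_networks)
    ]


allowed = ["10.0.0.0/8", "203.0.113.0/24"]  # example ranges only
calls = [
    {"action": "ListBuckets", "source_ip": "10.1.2.3"},
    {"action": "CreateUser", "source_ip": "198.51.100.7"},
]
print(unusual_api_calls(calls, allowed))
```

A static allowlist works best for service accounts that call from fixed infrastructure. For human users with changing addresses, a baseline of previously seen locations is usually more practical.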

Map Application and Cloud Data to Graylog

The table below summarizes supported cloud and application log sources and the corresponding Graylog inputs:

Source	Illuminate pack	Input
AWS CloudTrail	No	AWS CloudTrail Input
AWS CloudWatch	No	AWS Kinesis/CloudWatch Input
GitLab	Yes	Raw HTTP
Google Workspace	Yes	Google Workspace Input
Microsoft Azure	No	Azure Event Hubs Input
Salesforce	No	Salesforce Input

Email and Messaging Data

Email and messaging data includes the logs, metadata, and security alerts generated by email systems, such as Microsoft 365 and Gmail, and messaging platforms, such as Slack, Teams, and Zoom.

This data is vital for TDR because:

Email remains the top attack vector, with most breaches starting with phishing.
Messaging apps are now part of the “new inbox,” and attackers are pivoting to them for scams, malicious links, and social engineering.

What to Monitor in Email and Messaging Data

Email and messaging data provides visibility into phishing campaigns and business email compromise (BEC) through sources such as:

Inbound message telemetry: Phishing, spoofing, and BEC campaigns are often visible in message telemetry first. Monitor sender domain, IP reputation, and the results of standard authentication protocols, such as Sender Policy Framework (SPF) and DomainKeys Identified Mail (DKIM), that verify the authenticity of messages.
Attachment and link analysis: Many attacks start with a malicious attachment or a link to a credential-harvesting site. Monitor for executable attachments, macros in Office docs, PDF payloads, and shortened links.
Mailbox rule and forwarding activity: Attackers who compromise an account often set rules to hide their tracks or exfiltrate mail. Look for new mailbox rules, particularly auto-forwarding or hiding incoming mail.
Authentication and access logs: If a mailbox is compromised, login anomalies often appear in access logs. Watch for failed login attempts and logins from unfamiliar locations.
Messaging app telemetry: Messaging apps are increasingly abused for phishing links, credential prompts, or even malicious file drops. Watch for suspicious file shares in Slack/Teams, unexpected guest users, and abuse of integrations or bots.
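
Many mail systems record SPF and DKIM outcomes in an `Authentication-Results` message header. Assuming your email logs expose that header value, the Python sketch below extracts the results and lists the checks that did not pass. The sample header and function names are hypothetical, and the regular expression is a simplification that does not handle every form the header can take.

```python
import re

# Matches method=result pairs such as "spf=pass" or "dkim=fail".
RESULT_PATTERN = re.compile(r"\b(spf|dkim)=(\w+)", re.IGNORECASE)


def parse_auth_results(header_value):
    """Return a dict such as {"spf": "pass", "dkim": "fail"}."""
    results = {}
    for method, result in RESULT_PATTERN.findall(header_value):
        # Keep the first result reported for each method.
        results.setdefault(method.lower(), result.lower())
    return results


def failed_checks(header_value):
    """Return the sorted list of methods whose result is not "pass"."""
    parsed = parse_auth_results(header_value)
    return sorted(m for m, r in parsed.items() if r != "pass")


header = (
    "mx.example.com; spf=pass smtp.mailfrom=sender.example; "
    "dkim=fail header.d=sender.example"
)
print(parse_auth_results(header))
print(failed_checks(header))
```

A message can carry several DKIM signatures, so keeping only the first result per method is a deliberate shortcut for this sketch.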

Sources of Email and Messaging Data

Email Security Gateways: Proofpoint, Mimecast, Barracuda, Cisco Email Security.
Cloud Email Logs: Microsoft 365 Security & Compliance Center, Google Workspace Admin Logs.
SIEM & XDR Integrations: Alerts from native APIs into Splunk, Sentinel, or Graylog.
Messaging Platforms:
- Slack: Audit logs, app installation logs.
- Teams/Zoom: Admin activity logs, message retention logs.

Map Email and Messaging Data to Graylog

The following table lists common email and messaging data sources, with details on inputs, Illuminate packs, and configuration notes for Graylog integration:

Source	Illuminate pack	Input	Notes
Gmail	Yes	Google Workspace Input	You select specific log types during input setup. Be sure to select `Gmail` to collect these logs.
Microsoft 365	Yes	Microsoft Office 365 Input
Mimecast	Yes	Mimecast Input