JSON Format

You don’t have to be a big-data expert to load data into your Graylog API Security database. This section covers all the details of our open JSON format, which can be supplied by nearly any data source.

Why Use JSON?

Nearly every modern programming language and data processing system provides JSON support without any extra libraries or special dependencies. In many cases building and parsing JSON is actually done via native routines, which are efficient and fast.

Certainly there are binary formats, like protobuf and BSON, that can perform better. But these are harder to consume, especially for humans. None of these alternatives is as universally available as JSON, and some bring dependencies that can conflict with your app's existing dependencies. Given these factors, we think JSON strikes the right balance between good efficiency and excellent ease of use.

JSON Grammar

This grammar defines the data structures that are specific to logging API requests and responses.

  • Each API call (with request and response details) is a single message.
  • Each message is an array of one or more message details.
  • Each detail associates a key string with a value string.
  • All key strings must be formatted properly based on the type of key.

Here's the geekier way of saying all of that:

message
    [ message-details ]

message-details
    message-detail
    message-detail, message-details

message-detail
    [ "key", "value" ]
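
As a rough illustration of the grammar above, the following Python sketch checks that a piece of text parses as a message: a non-empty array whose items are two-element arrays of strings. The function name `is_valid_message` is hypothetical and not part of any Graylog API Security library; it checks structure only, not the key strings.

```python
import json


def is_valid_message(text):
    """Return True if text is a non-empty JSON array of [key, value] string pairs."""
    try:
        message = json.loads(text)
    except json.JSONDecodeError:
        return False
    # A message is an array holding one or more details
    if not isinstance(message, list) or not message:
        return False
    # Each detail is an array of exactly two strings
    return all(
        isinstance(detail, list)
        and len(detail) == 2
        and all(isinstance(part, str) for part in detail)
        for detail in message
    )
```

Note that values are always strings under this grammar, so a numeric value such as `["response_code", 200]` would be rejected by this check.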

Key Strings

All key strings are formatted based on the type of key, and whether the key includes an identifying name. Keys with names may appear multiple times in a message, but keys without names appear only once in the message.

Key String              Count   Description
---------------------   -----   ------------------------
custom_field:<name>      0..n   Named custom detail
host                        1   Host identifier
interval                    1   Elapsed service time
now                         1   Response unix timestamp
request_body             0..1   Body content as text
request_header:<name>    0..n   Named header
request_method              1   HTTP method
request_param:<name>     0..n   Param from URL or body
request_url                 1   HTTP url
response_body            0..1   Body content as text
response_code               1   HTTP return code
response_header:<name>   0..n   Named header
session_field:<name>     0..n   Named session detail

By convention, key strings are always all lowercase (including the name portion). This is convenient when using this format and for writing logging rules.
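
The table above can be turned into a simple check. This is a minimal sketch that assumes the table is the complete list of key types; `check_keys` and the three constant sets are hypothetical names used only for illustration. It takes an already-parsed message (a list of key/value pairs) and returns a list of problems, treating the lowercase convention as something worth reporting.

```python
# Key types taken from the table above
NAMED_PREFIXES = {"custom_field", "request_header", "request_param",
                  "response_header", "session_field"}
REQUIRED_KEYS = {"host", "interval", "now", "request_method",
                 "request_url", "response_code"}
OPTIONAL_KEYS = {"request_body", "response_body"}


def check_keys(message):
    """Return a list of problems found in the key strings of a parsed message."""
    problems = []
    seen = {}
    for key, _value in message:
        seen[key] = seen.get(key, 0) + 1
        prefix, sep, name = key.partition(":")
        if sep:
            # Named keys need a known prefix and a non-empty name
            if prefix not in NAMED_PREFIXES or not name:
                problems.append(f"unknown or unnamed key: {key}")
        elif key not in REQUIRED_KEYS | OPTIONAL_KEYS:
            problems.append(f"unknown key: {key}")
        if key != key.lower():
            problems.append(f"key is not lowercase: {key}")
    for key in sorted(REQUIRED_KEYS):
        if key not in seen:
            problems.append(f"missing required key: {key}")
    # Keys without names appear at most once
    for key in sorted(REQUIRED_KEYS | OPTIONAL_KEYS):
        if seen.get(key, 0) > 1:
            problems.append(f"key appears more than once: {key}")
    return problems
```

An empty list means the message's keys are consistent with the table. Named keys are allowed to repeat, so they are never flagged as duplicates.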

JSON Examples

Use Case

This first example shows the minimum number of details to expect for each HTTP request and response. This has URL and timing information but not much else.

[
["request_method","GET"],
["request_url","http://myurl"],
["response_code","200"],
["host","web.1"],
["interval","1.29318200"],
["now","1619848800000"]
]
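
A message like this one can be produced with any standard JSON library. The Python sketch below builds the same six details; `build_message` is a hypothetical helper, not part of any official logger. It converts every value to a string, as the grammar requires, and leaves the units of `interval` and `now` up to the caller.

```python
import json


def build_message(method, url, status, host, interval, now):
    """Serialize the minimum set of details as a single-line JSON message."""
    details = [
        ["request_method", method],
        ["request_url", url],
        ["response_code", str(status)],   # values are always strings
        ["host", host],
        ["interval", str(interval)],
        ["now", str(now)],
    ]
    # Compact separators keep the message on one line with no extra spaces
    return json.dumps(details, separators=(",", ":"))


line = build_message("GET", "http://myurl", 200, "web.1",
                     "1.29318200", 1619848800000)
```

Here `interval` is passed as a string to preserve its trailing zeros exactly as in the example above.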

Extended Use Case

This second example shows a larger set of key/value details. (Logging rules are used to control how many details are kept and how many are discarded.)

[
["request_method", "POST"],
["request_url","http://localhost:5000/?action=new"],
["request_body", "{ \"customerID\" : \"1234\" }"],
["request_header:version","HTTP/1.1"],
["request_header:host","localhost:5000"],
["request_header:connection","keep-alive"],
["request_header:cache-control","max-age=0"],
["request_header:upgrade-insecure-requests","1"],
["request_header:user-agent","Mozilla/5.0..."],
["request_header:accept","text/html,application/xhtml+xml,application/xml"],
["request_header:accept-encoding","gzip, deflate, br"],
["request_header:accept-language","en-US,en;q=0.9"],
["request_header:cookie","_ruby-getting-started_session=MTFxM0tmZG"],
["request_header:if-none-match","W/\"70bd4196dfa68808be58606609ed8357\""],
["request_param:action","new"],
["response_code","200"],
["response_header:x-frame-options","SAMEORIGIN"],
["response_header:x-xss-protection","1; mode=block"],
["response_header:x-content-type-options","nosniff"],
["response_header:content-type","text/html; charset=utf-8"],
["response_header:etag","W/\"1467037e1e8\""],
["response_header:cache-control","max-age=0, private, must-revalidate"],
["response_header:set-cookie","_ruby_session=WHZtbllOcU...; path=/; HttpOnly"],
["response_header:x-request-id","2209f8b1-ed2f-420c-9941-9625d7308583"],
["response_header:x-runtime","0.314384"],
["response_header:content-length","8803"],
["response_body","\n\n\n \n\n\n...\n\n"],
["session_field:session_id","8687e4ba9"],
["session_field:_csrf_token","nMI/JGb4GB"],
["host","web.1"],
["interval","1.29318200"],
["now","1619848800000"]
]

Batching with NDJSON

The JSON format described so far serializes a single message. When you import logger messages into your Graylog API Security database or export them from it, this is done using NDJSON format, which is an easy way to serialize a long list of messages.

With this format, each line in the file is a valid JSON document. The NDJSON file as a whole is not valid JSON, because it is not formatted as a proper array of comma-separated items. But if you read the file one line at a time, each line is a valid JSON document (here, one message array) that can be parsed on its own.

This might seem a little strange to newcomers at first, but this is nicely efficient in cases (like this one) where each message is parsed separately and processed in linear fashion.
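
To make the line-at-a-time behavior concrete, here is a small Python sketch. The two messages are abbreviated for readability (they omit most required keys), and `parse_ndjson` and `is_single_json_document` are hypothetical helper names. Skipping blank lines is an assumption of this sketch, not a documented rule.

```python
import json

# Two abbreviated messages, one per line
ndjson_text = (
    '[["now","1619848800001"],["host","web.1"]]\n'
    '[["now","1619848800002"],["host","web.2"]]\n'
)


def parse_ndjson(text):
    """Parse each non-empty line as its own JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def is_single_json_document(text):
    """Return True if the whole text parses as one JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


messages = parse_ndjson(ndjson_text)                # two separate messages
whole_ok = is_single_json_document(ndjson_text)     # False: file is not one document
```

Parsing the whole text at once fails because a second document follows the first, while each individual line parses cleanly.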

The NDJSON files that API Security imports and exports are always gzipped by convention. These files typically have a high compression ratio, and this greatly improves import and export performance, especially when working with remote databases.
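
A gzipped NDJSON file can also be written directly from code using only a standard library. This Python sketch assumes messages are already in the list-of-pairs form described earlier; `write_batch` is a hypothetical helper name and the output path is chosen by the caller.

```python
import gzip
import json


def write_batch(path, messages):
    """Write messages to a gzipped NDJSON file, one message per line."""
    # "wt" opens the gzip stream in text mode so strings can be written directly
    with gzip.open(path, "wt", encoding="utf-8") as out:
        for message in messages:
            out.write(json.dumps(message, separators=(",", ":")))
            out.write("\n")
```

The resulting file can be read back with `gzip.open(path, "rt", encoding="utf-8")` and parsed one line at a time.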

Here's an example of posting an NDJSON batch:

echo '[["now","1619848800001"],["request_method","GET"],["request_url","http://myurl1"],["response_code","200"],["host","web.1"],["interval","1.29318200"]]' > batch.ndjson

echo '[["now","1619848800002"],["request_method","GET"],["request_url","http://myurl2"],["response_code","200"],["host","web.2"],["interval","2.42931820"]]' >> batch.ndjson

gzip batch.ndjson

curl -F "uploaded_file=@$PWD/batch.ndjson.gz" http://localhost:7701/upload