Grok Debugger

The Kibana Grok Debugger is a tool that helps us construct grok patterns. We can use grok patterns to parse unstructured data such as logs, which can be any web server log, such as Apache or Nginx, any database log, such as MySQL, or syslog. These logs are written so that we can see whether there is an issue in the system and find out what is causing a problem.

Grok expressions are basically a pattern-matching syntax that can be used to parse arbitrary text and convert it into a structured format.
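Each grok expression is built from one or more %{SYNTAX:SEMANTIC} tokens, where SYNTAX is the name of a predefined (or custom) pattern, such as IP, WORD, or NUMBER, and SEMANTIC is the field name under which the matched text will be stored. As a minimal sketch, with arbitrary placeholder field names, the expression:

%{IP:client_ip} %{WORD:http_method} %{NUMBER:status}

would match a line such as 10.0.0.5 GET 200 and produce the fields client_ip, http_method, and status.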

However, the unstructured format of logs makes it difficult to fetch any details easily. By creating a grok pattern and applying it in Logstash, we can convert unstructured data into structured data, which can then be searched and analyzed. But it is never easy to write a grok pattern and test it each time by executing the Logstash configuration just to check whether we are getting the intended results.

The Kibana Grok Debugger makes this process quite easy, as we can simulate the grok pattern execution and see the result on the same page. To create a grok pattern in Kibana, we need to click on the Grok Debugger tab on the Dev Tools page, which will open the following screen:

The preceding screenshot shows the Grok Debugger page, where we have three sections, Sample Data, Grok Pattern, and Structured Data, which are described as follows:

  • Sample Data: Here, we type the unstructured sample data from any log, such as Apache, MySQL, or syslog, or any other arbitrary data. The grok pattern is written against this data to produce the structured data.
  • Grok Pattern: Here, we write the actual grok pattern that matches the sample data. Once the pattern is written, we can click on the Simulate button to see the result.
  • Structured Data: Once the grok pattern is written and we click on the Simulate button, the result can be seen under the Structured Data text area. In case of any issue, an error message is displayed instead, and nothing appears under the Structured Data text area.

Now, look at this example, where we have a log entry with the following structure: an IP address, a request method, a URL, the total bytes, and the time taken to serve the page:

127.0.0.1 GET /mytestpage.php 11314 0.011

In the log file, these entries appear line by line, which makes it quite difficult to search them and find anomalies; however, if we push this log into Elasticsearch, we can leverage its full-text search capability and drill down to any minute detail in the log file. But, again, because it is unstructured data, how can we search whether the method is GET or POST, or whether the URL is abc.html? For that, we need to extract fields from this unstructured data and convert it into structured data.

For this conversion, we use grok patterns, and by applying the grok filter plugin in Logstash, we can push the structured data into Elasticsearch or any other output. Now, let's take the example of a grok pattern that we can use to extract the preceding log entry:

%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}

In the preceding expression, we are matching the log entry against the grok pattern. When we simulate this pattern against the sample data, it gives us the following result:

{
  "duration": "0.011",
  "request": "/mytestpage.php",
  "method": "GET",
  "bytes": "11314",
  "client": "127.0.0.1"
}

In the preceding result, we can see that the unstructured log entry has been converted into structured, key-value JSON. In this way, we can simulate a grok pattern against sample data in the Kibana Grok Debugger, and, once it is working, we can use the pattern in our Logstash configuration to process the actual data.
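For reference, a minimal Logstash pipeline that applies this pattern through the grok filter might look like the following sketch; the log file path and index name here are assumptions made purely for illustration:

input {
  file {
    path => "/var/log/myapp/access.log"    # assumed location of the log file
    start_position => "beginning"
  }
}

filter {
  grok {
    # The same pattern we verified in the Grok Debugger
    match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:bytes} %{NUMBER:duration}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]             # assumed Elasticsearch address
    index => "access-logs"                  # assumed index name
  }
}

Once the events are indexed with these fields, a simple query such as method:GET in Kibana will return only the GET requests.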

Now, let's take slightly more complex data, such as Catalina logs from a Java application:

2019-02-19 13:53:53.080 WARN 27408 --- [Executor-870113] o.s.web.client.AsyncRestTemplate : Async POST request for "https://ohionewstore.com" resulted in 401

The preceding sample data has been taken from the catalina.out log file, which captures the Java application logs. This is a little more complex than the previous example, as there are different segments that we need to extract and match against field names. See the following pattern, which we have written to match the preceding sample Catalina log entry:

%{TIMESTAMP_ISO8601:timestamp}%{SPACE} %{LOGLEVEL:level} %{NOTSPACE:sessionid} --- %{NOTSPACE:thread} %{NOTSPACE:source} %{SPACE}: %{GREEDYDATA:message}

In the preceding pattern expression, we are handling the field mappings along with the spacing and the dashes (---) in order to extract the unstructured data into a structured form. After simulating the pattern against the sample Catalina log entry, we get the following result:

{
  "level": "WARN",
  "sessionid": "27408",
  "thread": "[Executor-870113]",
  "source": "o.s.web.client.AsyncRestTemplate",
  "message": "Async POST request for \"https://ohionewstore.com\" resulted in 401",
  "timestamp": "2019-02-19 13:53:53.080"
}
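As with the earlier example, once this pattern simulates correctly, it can be dropped into a Logstash grok filter. A minimal filter block might look like the following sketch (the input and output sections are omitted here and would depend on where catalina.out lives); note that the overwrite option is used because the pattern captures a field named message, which would otherwise clash with Logstash's original message field:

filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE} %{LOGLEVEL:level} %{NOTSPACE:sessionid} --- %{NOTSPACE:thread} %{NOTSPACE:source} %{SPACE}: %{GREEDYDATA:message}" }
    # Replace the original log line with only the message portion captured above
    overwrite => ["message"]
  }
}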

This way, we can create our own grok patterns here, and once a pattern runs successfully, we can apply it to the complete set of data to convert it into a structured form. The following screenshot shows an example of Catalina log extraction using the Grok Debugger:

In the preceding screenshot, we can see the Catalina log sample data in the Sample Data textbox and the pattern in the Grok Pattern textbox. In the Structured Data text area, we can see the structured data, which is generated after clicking on the Simulate button. In this way, using Dev Tools, we can accomplish some very important tasks quite easily.

Next, we are going to cover Timelion.
