Preparing data

As Kibana is all about gaining insights from data, before we can start exploring with it, we need to have data ingested into Elasticsearch, which is Kibana's primary data source. Kibana comes with predefined options that let you load data into Elasticsearch with a few clicks, so you can start exploring right away. When you open Kibana in your browser for the first time at http://localhost:5601, you will see the following screen:

You can click on the Try our sample data button to get started quickly by loading predefined data, or you can click on the Explore on my own button to configure indexes that already exist in Elasticsearch and analyze your own data.

Clicking on the Try our sample data button will take you to the following screen:

Clicking on Add data on any of the three widgets/panels will add some default data to Elasticsearch, as well as sample visualizations and dashboards that you can readily explore. Don't worry about what visualizations and dashboards are for now; we will cover them in detail in subsequent sections.

Go ahead and click on Add data for Sample eCommerce orders. It should load data, visualizations, and dashboards in the background. Once ready, you can click on View data, which will take you to the eCommerce dashboard: 

The following screenshot shows the dashboard:

The actual data powering these dashboards/visualizations can be verified in Elasticsearch by executing the following command. As the output shows, the sample data is loaded into the kibana_sample_data_ecommerce index, which holds 4,675 documents:

C:\>curl localhost:9200/_cat/indices/kibana_sample*?v
health status index                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   kibana_sample_data_ecommerce 4fjYoAkMTOSF8MrzMObaXg   1   0       4675            0      4.8mb          4.8mb
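
If you are curious what an individual order document looks like, you can fetch one with a simple search request (a quick spot check, not a step the tutorial requires):

C:\>curl "localhost:9200/kibana_sample_data_ecommerce/_search?size=1&pretty"

The response contains a single document under hits.hits, showing the fields that the sample visualizations are built on.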
If you click on the Remove button, all the dashboards and data will be deleted. Similarly, you can click on the Add data button for the other two widgets if you want to explore sample flight and sample logs data.

If you want to navigate back to the home page, you can always click on the Kibana icon at the top-left corner. This takes you to the home screen, which will also be the default screen the next time you load Kibana in the browser. It is the same screen you would have been taken to if you had clicked on the Explore on my own button when Kibana was loaded for the first time:

Clicking on the link in section 1 will take you to the Sample data page we just saw. Similarly, if you want to configure Kibana against your own index and use it for data exploration and visualization, you can click on the link in section 2 of the previous screenshot. In earlier chapters, you may have read briefly about Beats, which are used to easily ingest file or metric data into Elasticsearch. Clicking on the buttons in section 3 will take you to screens that provide step-by-step instructions on how to ingest various types of data using Beats. We will cover Beats in more detail in subsequent chapters.

In this chapter, rather than relying on the sample data shipped out of the box, we will load custom data that we will use throughout the tutorial. One of the most common use cases is log analysis. For this tutorial, we will load Apache server logs into Elasticsearch using Logstash, and then use them in Kibana for analysis and for building visualizations.

The https://github.com/elastic/elk-index-size-tests repository hosts a dump of Apache server logs collected for the www.logstash.net site between May 2014 and June 2014. It contains 300,000 log events.
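
Each line in the dump is in the Apache combined log format, which the COMBINEDAPACHELOG grok pattern used later in this tutorial is designed to parse. A hypothetical line in this format (illustrative only, not taken from the dataset) looks like the following:

192.168.1.10 - - [25/May/2014:08:30:15 +0000] "GET /docs/index.html HTTP/1.1" 200 4523 "http://www.example.com/start" "Mozilla/5.0 (Windows NT 6.1; rv:29.0)"

It captures the client IP, timestamp, request line, response code, response size in bytes, referrer, and user agent, which is exactly the set of fields our Logstash filters will work with.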

Navigate to https://github.com/elastic/elk-index-size-tests/blob/master/logs.gz and click the Download button. Unzip the logs.gz file and place it in a folder (for example, C:\packt\data).
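
Alternatively, if you have curl and gzip available on your path, you can download and unzip the file from the command line; the raw URL below is an assumption based on GitHub's standard raw-file layout for this repository:

curl -L -o logs.gz https://github.com/elastic/elk-index-size-tests/raw/master/logs.gz
gzip -d logs.gz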

Make sure you have Logstash version 7.0 or above installed. Create a config file named apache.conf in the $LOGSTASH_HOME\bin folder, as shown in the following code block:

input
{
  file {
    # Path to the unzipped Apache log file; adjust if you placed it elsewhere
    path => ["C:/packt/data/logs"]
    # Read the file from the beginning rather than tailing it
    start_position => "beginning"
    # "NUL" disables sincedb tracking on Windows (use "/dev/null" on Linux/macOS)
    sincedb_path => "NUL"
  }
}

filter
{
  # Parse each line using the Apache combined log format
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
  }
  # Convert the bytes field from a string to an integer
  mutate {
    convert => { "bytes" => "integer" }
  }
  # Use the log's own timestamp as the event's @timestamp, then drop the original field
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    locale => "en"
    remove_field => "timestamp"
  }
  # Enrich events with geographical information derived from the client IP
  geoip {
    source => "clientip"
  }
  # Parse the user agent string into structured fields
  useragent {
    source => "agent"
    target => "useragent"
  }
}

output
{
  # Print one dot per processed event so you can watch progress
  stdout {
    codec => dots
  }
  # Index events into Elasticsearch at the default localhost:9200
  elasticsearch { }
}
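
Before running the pipeline, you can optionally validate the configuration. The --config.test_and_exit flag asks Logstash to check the config file for syntax errors and exit without processing any data:

$LOGSTASH_HOME\bin>logstash -f apache.conf --config.test_and_exit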

Start Logstash as follows so that it can begin processing the logs and indexing them into Elasticsearch. Logstash will take a while to start; you should then see a series of dots (one dot per processed log line):

$LOGSTASH_HOME\bin>logstash -f apache.conf

Let's verify the total number of documents (log events) indexed into Elasticsearch:

curl -X GET http://localhost:9200/logstash-*/_count

In the response, you should see a count of 300,000.
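
You can also retrieve a single document to spot-check that the grok, geoip, and useragent filters produced the expected fields (a minimal check, assuming Logstash's default logstash-* index naming):

curl "http://localhost:9200/logstash-*/_search?size=1&pretty"

In the returned document, you should see parsed fields such as clientip, response, and bytes, alongside the geoip and useragent objects added by the enrichment filters.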
