Chapter 6. Real-Time Twitter Data Analysis

Now that we understand the various components of Kibana, let's explore in detail how to use Kibana to analyze and visualize data in real-world scenarios. In this chapter, we will walk through an end-to-end workflow that fetches Twitter data and stores it in Elasticsearch. We will then build visualizations in Kibana to examine various scenarios.

The two possible ways of fetching Twitter data directly into Elasticsearch are by using:

  • Elasticsearch Twitter river
  • Logstash Twitter input

    Note

    The Twitter river is available as an Elasticsearch plugin, so it can fetch tweets using only Elasticsearch and Kibana. The Twitter input requires Logstash in addition to Elasticsearch and Kibana. Both approaches make it easy to fetch Twitter data.

We will use the Logstash Twitter input because rivers, which run as plugins inside Elasticsearch, have been deprecated and will be removed in future versions of Elasticsearch.

Before we move further, let's understand Logstash in brief.

Logstash is an open source tool created by Jordan Sissel, who later joined Elasticsearch (the company since renamed Elastic). Logstash is a data collection tool aimed at fetching events for processing. An event is simply a piece of data that carries a timestamp field. Logstash processes events by reading from various input sources and writing to various output destinations. It helps combine data from multiple sources and parses it by applying filters that modify the incoming data.

The main purposes of using Logstash are to read event data from different kinds of input sources (such as a file, HTTP, GitHub, Elasticsearch, and so on), apply filters to transform and process the incoming events (such as parsing, JSON encoding, aggregation, and so on), and send the processed events to a destination (such as CSV, a file, CloudWatch, Elasticsearch, and so on).

Logstash can be described in brief as:

Input -------> Filter -------> Output
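
As a rough illustration of these three stages, the following shell sketch writes a minimal configuration file with one plugin per stage. It is not the Twitter configuration used later in the chapter. The file name pipeline-sketch.conf and the field name pipeline_stage are placeholders chosen for this example; stdin, mutate, and stdout are standard Logstash plugins.

```shell
# Write a minimal three-stage Logstash configuration (a sketch only).
# The file name and the added field name are placeholders for illustration.
cat > pipeline-sketch.conf <<'EOF'
input {
  # Input stage: read events typed on the console
  stdin { }
}
filter {
  # Filter stage: add a field to every incoming event
  mutate {
    add_field => { "pipeline_stage" => "filtered" }
  }
}
output {
  # Output stage: pretty-print each processed event to the console
  stdout { codec => rubydebug }
}
EOF
```

Once Logstash is installed, passing this file to it with the -f option and typing a line of text should print an event that contains the typed text in its message field along with an @timestamp field.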

The various input plugins available in Logstash are shown here:

[Image: Logstash input plugins]

The various filter plugins available in Logstash are as follows:

[Image: Logstash filter plugins]

Finally, the various output plugins available in Logstash are shown in the following image:

[Image: Logstash output plugins]

In this chapter, we are going to take a look at the following topics:

  • The installation of Logstash
  • The workflow for real-time Twitter data analysis
  • Creating a Twitter developer account
  • Creating a Logstash configuration file
  • Creating visualizations for scenarios

The installation of Logstash

In this section, we will install Logstash 1.5.4. The installation is covered separately for Ubuntu and Windows.

The installation of Logstash on Ubuntu 14.04

To install Logstash on Ubuntu, perform the following steps:

  1. Download Logstash 1.5.4 as a tar file using the following command in the terminal:
    curl -L -O http://download.elastic.co/logstash/logstash/logstash-1.5.4.tar.gz
    
  2. Extract the downloaded .tar.gz file using the following command:
    tar -xvzf logstash-1.5.4.tar.gz
    

    This will extract the files and folders into the current working directory.

  3. Navigate to the bin directory within the logstash-1.5.4 directory:
    cd logstash-1.5.4/bin
    
  4. To check whether Logstash has been installed successfully, type the following command in the terminal after navigating to the bin folder:
    ./logstash --version
    

    This will print the Logstash version installed.
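
The preceding steps can also be collected into a small script. This is only a sketch under a few assumptions: the variable and function names are invented for illustration, the download URL is the one given in step 1 and may no longer be served, and the download runs only when the script is called with the hypothetical --run argument.

```shell
#!/usr/bin/env bash
# Sketch: download, extract, and verify Logstash in one go.
# Variable and function names are illustrative, not part of Logstash.

LOGSTASH_VERSION="1.5.4"
ARCHIVE="logstash-${LOGSTASH_VERSION}.tar.gz"
URL="http://download.elastic.co/logstash/logstash/${ARCHIVE}"

install_logstash() {
  # Each step runs only if the previous one succeeded.
  # -f makes curl fail on an HTTP error instead of saving an error page.
  curl -f -L -O "$URL" &&
    tar -xzf "$ARCHIVE" &&
    "./logstash-${LOGSTASH_VERSION}/bin/logstash" --version
}

# Print what would be fetched; download only when asked explicitly.
echo "Logstash archive URL: $URL"
if [ "${1:-}" = "--run" ]; then
  install_logstash
fi
```

Saved under a name such as install-logstash.sh (again, a placeholder), the script would be invoked as bash install-logstash.sh --run. Without the argument it only prints the URL, which makes it easy to check the version before downloading anything.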

The installation of Logstash on Windows

To install Logstash on Windows, perform the following steps:

  1. Download Logstash 1.5.4 as a .zip file from the Elastic site using the following command in Git Bash:
    curl -L -O http://download.elastic.co/logstash/logstash/logstash-1.5.4.zip
    
  2. Extract the downloaded .zip package either with a tool such as WinRAR or 7-Zip (if you don't have one of these, download one), or with the following command in Git Bash:
    unzip logstash-1.5.4.zip
    

    This will extract the files and folders into the current working directory.

  3. Open the extracted logstash-1.5.4 folder and navigate to the bin folder within it.
  4. To check whether Logstash has been installed successfully, type the following command in the command prompt after navigating to the bin folder:
    logstash --version
    

    This will print the Logstash version installed.
