Chapter 3. Meeting Hunk Features

Big data analytics is a very popular trend, and most business users want to explore their big data with intuitive, user-friendly tools, because working directly with data stored in Hadoop or NoSQL data stores is a challenging task. Fortunately, Hunk removes much of the complexity that stands between analysts or business users and their data. Moreover, it offers features that let us handle big data in just a few mouse clicks. This is possible with Hunk knowledge objects.

In the previous chapter, we created virtual indexes based on the web logs of the international fashion retailer Unicorn Fashion. We wrote queries and reports using the Search Processing Language (SPL), built a web operations dashboard, and learnt how to create alerts.

In this chapter, we will explore Hunk knowledge objects, which help us achieve better results with less effort. We will also become familiar with data models and pivots, so that we can work with Hunk in the manner of a traditional Business Intelligence (BI) tool.

Knowledge objects

Hunk has the same capabilities as Splunk, so we can create various knowledge objects that help us explore big data and make it more accessible.

Tip

A knowledge object is a configuration within Hunk that is governed by permissions through the Hunk access control layer. Knowledge objects can be scoped to specific applications, and read/write permissions for them are granted to roles.

To work with knowledge objects, go to the KNOWLEDGE menu under Settings:

Knowledge objects

There are various knowledge objects available in Hunk. We encountered SPL, reports, dashboards, and alerts in the previous chapter. Let's expand our knowledge of Hunk and explore additional knowledge objects.

Tip

For more information about knowledge objects, see: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/WhatisSplunkknowledge.

Field aliases

Field aliases help us normalize data across several sources. We can create multiple aliases for one field.

Tip

Field aliases are applied after field extraction but before lookups. In addition, we can apply field aliases to lookup fields.

Let's create a new alias using the following steps:

  1. Go to Settings | Fields | Field aliases.
  2. Click on Add new.
  3. Enter the Name as Web Browser, set the sourcetype to access_combined, create the alias useragent = web_browser under Field aliases, and click Save, as shown in the following screenshot:
    Field aliases
  4. Change sharing permissions to This app only (search).
  5. Go to search and run: index="digital_analytics".
  6. Then look at the fields. There is a new field—web_browser:
    Field aliases

Moreover, we can create the same web_browser alias for any other data source. For example, other logs might contain the field agent instead of useragent; in that case, we create another alias that maps agent to web_browser. As a result, a single alias name covers two different fields from different data sources.
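
Under the hood, an alias created through the UI is stored as a FIELDALIAS setting in props.conf. A minimal sketch of what our alias might look like there (assuming it was saved in the search app's local props.conf) is:

    [access_combined]
    FIELDALIAS-web_browser = useragent AS web_browser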

Calculated fields

A calculated field acts as a shortcut for performing repetitive, long, or complex transformations using the eval command.

Tip

Calculated fields must be based upon an extracted or discovered field. It is impossible to create a calculated field based on another calculated field.

For example, say we want to monitor bandwidth usage in megabytes but we have all our data in bytes. Let's create a new field to convert bytes to megabytes:

  1. Go to Settings | Fields | Calculated fields.
  2. Click on Add new.
  3. Type the sourcetype as access_combined, the Name as bandwidth, and Eval expression as bytes/1024/1024. Click on Save:
    Calculated fields
  4. Change the sharing permissions for the new field to This app only (search).
  5. Go to search and run the new query in order to measure the bandwidth across countries and find the most popular countries:
    index="digital_analytics"  | iplocation clientip |stats  sum(bandwidth) by Country | sort – sum(bandwidth)
    
    Calculated fields

As a result, we get the top countries with their bandwidth in megabytes, and we used the new calculated field in the search just like any other extracted field.
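
Behind the scenes, a calculated field is stored as an EVAL setting in props.conf, and at search time it behaves as if we had piped every search through the eval command ourselves. A rough sketch of the equivalent configuration (again assuming the search app's props.conf) is:

    [access_combined]
    # Equivalent to adding: ... | eval bandwidth = bytes/1024/1024 to every search
    EVAL-bandwidth = bytes/1024/1024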

Field extractions

Field extraction is a utility that helps us create custom fields; it generates a regular expression that pulls those fields from similar events. Using the Interactive Field Extractor (IFX), we can extract fields that are static and often needed in searches. It is a very useful tool that:

  • Has a graphical UI
  • Generates regex for us
  • Ensures extracted fields persist as a knowledge object
  • Is reusable in multiple searches

Let's try to extract new fields from our digital data set:

  1. Run a search using the following query:
    index="digital_analytics"
    
  2. Select Extract Fields from the Event actions menu as shown in the following screenshot:
    Field extractions
  3. A new window will appear, where we can highlight one or more values in the sample event to create fields. In our case, we want to extract just the name of the browser, without the version or any other information. Highlight the browser name, give the field a name, and click on Add Extraction:
    Field extractions
  4. The next step is validation. We can take a quick look at how Hunk extracted the browser name from other events:
    Field extractions
  5. Click Next, change the sharing permissions to the app level as usual, and click on Finish.
  6. We can check the result; just run a new query in the search app:
    index="digital_analytics" | stats count by browser_name
    

Moreover, there is another way to extract fields at search time: the rex and erex commands.
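
For instance, a hedged sketch of a rex extraction that grabs the leading word of the useragent value as browser_name (the regular expression is an assumption about how the useragent values in our logs begin) would be:

    index="digital_analytics" | rex field=useragent "(?<browser_name>\w+)" | stats count by browser_name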

Tags

Tags are like nicknames that you create for related field/value pairs. They can make our data more understandable and less ambiguous. It is possible to create several tags for any field/value combination.

Note

Tags are case-sensitive.

Let's create tags for our data set:

  1. Run a search with the following query:
    index="digital_analytics"
    
  2. Click on the arrow to expand the event details. Then, click on the down arrow under Actions for the action field and select Edit Tags:
    Tags
  3. Name the tag Checkout and click on Save.
  4. Let's check our new tag. Run a new query:
    index="digital_analytics" tag="Checkout"
    

We get the following result with our new tag:

Tags
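
For reference, tags are persisted in tags.conf. A minimal sketch of our new tag (this assumes the tag was attached to the field/value pair action=purchase; substitute the value you actually tagged) is:

    # Assumption: the tag was created on action=purchase
    [action=purchase]
    Checkout = enabled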

Event type

An event type is a method of categorizing events based on a search; in other words, we can create a group of events that share common values. Let's look at the following example in order to better understand how this works. We can create a new event type to:

  • Categorize specific key/value pairs
  • Classify search strings
  • Tag event types to organize data in categories
  • Identify fields we can report on

For example, say the sales team wants to track monthly online sales. They want to easily identify purchases that are categorized by item. Let's create a new event type for coats:

  1. Run a search using the following query:
    index="digital_analytics" action=purchase productName=COATS
    
  2. Click on Save As | Event Type:
    Event type
  3. Type the name as Purchase Coats. In addition, we can create a new tag and choose the color and priority. Then click on Save.
  4. Go to Settings | Event Type and change permissions for our event type as usual.
  5. We can check this by running a new query:
    index="digital_analytics" action=purchase.
    
  6. There will be a new and interesting field: eventtype. As a result, we can organize our events into custom groups using tags and event types.
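
Like other knowledge objects, event types are stored in a configuration file, eventtypes.conf. A minimal sketch of what our Purchase Coats event type might look like there (the stanza name simply mirrors the name typed in the UI) is:

    [Purchase Coats]
    search = index="digital_analytics" action=purchase productName=COATS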

Workflow actions

Workflow actions launch from fields and events in our search results in order to interact with external resources or narrow our search. The possible actions are:

  • GET: This is used to pass information to an external web resource
  • POST: This is used to send field values to an external resource
  • Search: This uses field values to perform a secondary search

For example, organizations often need to investigate external sources that repeatedly try to log in with invalid credentials. We can use a GET workflow action that opens a new browser window with information about the source IP address, as sketched below.
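
A hedged sketch of such a GET workflow action, as it might be defined in workflow_actions.conf, follows; the stanza name and the whois URL are illustrative assumptions, and $clientip$ is replaced with the value from the selected event:

    # Illustrative example: the stanza name and whois URL are assumptions
    [whois_clientip]
    type = link
    link.method = get
    link.target = blank
    link.uri = http://whois.domaintools.com/$clientip$
    fields = clientip
    label = Run whois on $clientip$
    display_location = both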

Tip

For more information about workflow actions, with detailed explanations and examples, see the Splunk documentation: http://docs.splunk.com/Documentation/Splunk/latest/Knowledge/CreateworkflowactionsinSplunkWeb.

Macros

Macros are useful when we frequently run searches with similar syntax. A macro can be a full search string or a portion of a search that is reused in multiple places. In addition, macros allow us to define one or more arguments within the search segment.

Let's create a macro with arguments:

  1. Go to Settings | Advanced search | Search macros.
  2. Click Add new and type the name as activitybycategory(2); the (2) indicates that the macro takes two arguments.
  3. Enter the search string:
    index="digital_analytics" action=$action1$  AND productName=$Name1$ | stats count by product_name
    
  4. In the Arguments field type these arguments: action1, Name1. We should get the following:
    Macros
  5. Let's try to run our macro. Macros are invoked by enclosing the macro name and its arguments in backticks. Type the following search and run it:
    `activitybycategory(purchase,COATS)`
    
  6. The result is the same as running the equivalent ordinary search:
    index="digital_analytics" action=purchase  AND productName=COATS | stats count by productName
    

Data model

A data model is a hierarchically structured data set that generates searches and drives a pivot. (A pivot is an interface in which we can create reports based on data models. Soon we will explore pivots more closely.) In other words, data models provide a more meaningful representation of underlying raw machine data.

Data models are designed to make it easy to share and reuse domain knowledge. The idea is that admins or power users create data models for non-technical users, who interact with the data via a user-friendly pivot UI.

Let's create a data model for our digital data set.

  1. Go to Settings | Data Models.
  2. Click on New Data Model.
  3. In the Title type Unicorn Fashion Digital Analytics and click on Create. A new data model will be created.
  4. Click Add Object and choose Root Event. There are four types of objects:
    • Event objects: a set of events
    • Transaction objects: transactions and groups of events
    • Search objects: the result of an arbitrary search
    • Child objects: a subset of the dataset represented by their parent object
  5. Type in the Object Name as Digital Data and Constraints as index=digital_analytics sourcetype=access_combined. Click on Save:
    Data model
  6. Moreover, we can add child objects in order to create predefined subsets of events with additional constraints.

Note

A constraint is a search that defines the dataset an object represents. Root event objects and child objects all use constraints to define the datasets they represent. Child objects inherit the constraints of their parent objects and add a new constraint of their own.
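
For example, if we later added a child object named Purchases under Digital Data with the constraint action=purchase (a hypothetical child used only for illustration), the dataset it represents would be defined by the parent constraint combined with its own, roughly equivalent to this search:

    index=digital_analytics sourcetype=access_combined action=purchase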

Add auto-extracting fields

We have successfully added a root event object; now we can add fields that Hunk extracts automatically. Let's do it:

  1. Select the Digital Data object.
  2. Click Add Attribute and select Auto-Extracted. A new window will come up displaying all auto-extracted fields. Check the checkbox to the left of the Field column header in order to select all extracted fields. In addition, we can easily rename some fields.
  3. Change the clientip type to IPV4.
  4. Click on Save:
    Add auto-extracting fields

There are four types of attribute in Hunk:

  • String: Field values are treated as alphanumeric.
  • Number: Field values are treated as numeric.
  • Boolean: Field values are treated as true/false or 1/0.
  • IPV4: Field values are treated as IP addresses. This type is useful for geo data because Hunk can easily derive location information from an IP address.

There are also four attribute flags:

  • Required: Only events that contain this field are returned in the pivot
  • Optional: This field doesn't have to appear in every event
  • Hidden: This field is not displayed to pivot users when they select an object in the pivot
  • Hidden & Required: only events that contain this field are returned, and the fields are hidden from use in the pivot

Adding GeoIP attributes

In order to add GeoIP attributes, we need either a latitude/longitude lookup table or fields that GeoIP can map, such as an IP address field.

Let's add GeoIP attributes:

  1. Click on Add Attribute and choose GeoIP.
  2. Rename the lon and lat fields as longitude and latitude respectively. Click on Save:
    Adding GeoIP attributes

As a result, new fields will be added to our data model.
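
Conceptually, these GeoIP attributes give pivot users the same kind of enrichment that the iplocation command gives us at search time; a rough search-time equivalent (assuming clientip holds the IP address) is:

    index="digital_analytics" | iplocation clientip | rename lon AS longitude, lat AS latitude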

Other ways to add attributes

Hunk offers us other methods of adding attributes, such as eval expressions, lookups, and regular expressions.
