Starting the project

Let's get going.

In Watson Analytics Refine, it is easy to test our data quality column by column: click Refine, select our file, and then click the Data Metrics icon, which is shown as follows:

It would seem that plenty of columns have very high data quality scores (98 or above), for example, Advertisement ID, Age Range, and Ethnic. This not only gives us confidence in whatever predictions Watson may come up with, but also lets us refine our data further to focus on the specific needs or interests we may have.
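To make the idea of a per-column score concrete, here is a minimal Python sketch that checks each column separately. It is not how Watson Analytics computes its data quality score; this excerpt does not describe that calculation. As a stand-in, the sketch measures only completeness (the share of rows with a value), and the sample rows and the helper name completeness_score are invented for illustration.

```python
# Hypothetical sample rows: the column names come from the text,
# the values are invented for illustration.
sample_rows = [
    {"Advertisement ID": "A1", "Age Range": "65-74", "Ethnic": "Caucasian"},
    {"Advertisement ID": "A2", "Age Range": "", "Ethnic": "Caucasian"},
    {"Advertisement ID": "A3", "Age Range": "25-34", "Ethnic": "Hispanic"},
    {"Advertisement ID": "A4", "Age Range": "75+", "Ethnic": "Caucasian"},
]


def completeness_score(rows, column):
    """Return the percentage (0-100) of rows with a non-empty value in `column`.

    This is only a crude proxy for data quality, not Watson's metric.
    """
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return 100.0 * filled / len(rows)


# Score every column, one at a time, as the Data Metrics view does.
column_scores = {
    column: completeness_score(sample_rows, column)
    for column in sample_rows[0]
}
```

A column that scores well on a check like this is a safer candidate for filtering, because few rows are lost to missing values.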

For example, suppose we want to establish personal recommendations based upon the behavior of users in a certain age range, or perhaps of certain ethnicities.

Initially, we are going to target those users who fall into the following categories:

  • Retirees (65-74 years old, as well as those older than 75)
  • Caucasian (White and European Americans)
  • Making a purchase for the first time (First Time Buyer)

To narrow down our file to contain only activity for this user group, we can use Watson Analytics Refine filters. The following describes the steps to create filters on our dataset:

  1. In Refine, we start by clicking the Actions icon:

  2. Next, we can click on the appropriate column heading in the dataset. If the column contains only numeric values, you can use the provided sliders to specify the range of values that you want; alternatively, you can enter those values directly. If the column contains only text or date values, you can click on the specific values to include.
  3. In the following screenshot, I have added a filter on the data column named Age Range and included only those age groups that would indicate that the user is most likely retired:

  4. Once you have added a filter to a data column, Watson Analytics adds a blue line or blue dashes to the column name to indicate that the data in that column is filtered. A brief description of the filter also appears on the column name. In the following screenshot, you can see the blue dashes, as well as the text 3 of 9, which indicates that I have chosen only 3 of the 9 available values for this data column:

  5. We continue filtering our dataset by adding a filter to the Ethnic data column, shown as follows:

  6. We also do the same for the First Time Buyer data column.
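
The three filters can also be expressed outside the Refine interface. The following Python sketch is only an illustration of the same logic: the sample rows, the value labels (such as "65-74", "75+", "Caucasian", and "Yes"), and the helper names apply_filters and filter_summary are assumptions, since the actual values in the dataset are not listed here.

```python
# Hypothetical sample rows: the column names come from the text,
# the values and labels are assumed for illustration.
activity_rows = [
    {"Age Range": "65-74", "Ethnic": "Caucasian", "First Time Buyer": "Yes"},
    {"Age Range": "75+", "Ethnic": "Caucasian", "First Time Buyer": "No"},
    {"Age Range": "25-34", "Ethnic": "Caucasian", "First Time Buyer": "Yes"},
    {"Age Range": "75+", "Ethnic": "Hispanic", "First Time Buyer": "Yes"},
    {"Age Range": "75+", "Ethnic": "Caucasian", "First Time Buyer": "Yes"},
]

# One entry per filtered column: the set of values to include.
target_filters = {
    "Age Range": {"65-74", "75+"},   # assumed labels for retirees
    "Ethnic": {"Caucasian"},         # assumed label
    "First Time Buyer": {"Yes"},     # assumed label
}


def apply_filters(rows, filters):
    """Keep only rows whose value in every filtered column is allowed."""
    return [
        row for row in rows
        if all(row.get(column) in allowed for column, allowed in filters.items())
    ]


def filter_summary(rows, column, allowed):
    """Describe a filter as 'selected of available' distinct values."""
    available = {row.get(column) for row in rows}
    return f"{len(available & allowed)} of {len(available)}"


target_rows = apply_filters(activity_rows, target_filters)
age_summary = filter_summary(activity_rows, "Age Range", target_filters["Age Range"])
```

With these invented rows, only the first and last records pass all three filters, and filter_summary reports the Age Range selection in the same "selected of available" form that Refine shows on a filtered column name.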
