Understanding how to perform clustering

Often we need to quickly locate distinct, well-separated groups in our data: for example, customers who have the same buying patterns, or patients with similar symptoms. More often than not, this can be done using the grouping functionality that we saw in previous chapters.

However, manual grouping becomes very difficult when the dataset is complex and widely distributed, with no obvious patterns to pick out by inspection.

The new clustering functionality in Tableau automatically groups similar data points together. It finds patterns in the data using a K-means algorithm, helping the user explore groupings that would be tough to pick out otherwise.
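To make the idea of K-means concrete, here is a minimal sketch in Python on a single measure. It is an illustration of the general algorithm only, not Tableau's implementation: the function name `kmeans_1d`, the starting-center rule, and the sample values are all assumptions made for this example.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd-style k-means on a list of numbers; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic start (an assumption of this sketch): spread the
    # initial centers evenly across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # assignments are stable, so the algorithm has converged
        centers = new_centers
    return centers, labels


# Hypothetical Internet usage values (share of population), not from the dataset.
usage = [0.02, 0.05, 0.04, 0.30, 0.35, 0.33, 0.60, 0.62, 0.90, 0.93]
centers, labels = kmeans_1d(usage, k=4)
print(centers)
print(labels)
```

The two alternating steps, assigning points to the nearest center and then moving each center to the mean of its points, are what let the algorithm find groups without anyone defining the boundaries by hand.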

Let us explore the clustering functionality in more detail in the following recipe.

Getting ready

We will use a new dataset for the following recipe. The dataset is a .tde file, which has been uploaded to the following link:

https://1drv.ms/u/s!Av5QCoyLTBpnhks3n2mxItiI7-tb.

The file is called World Indicators.tde. We will download this extract file and save it to the Tableau Cookbook data folder in Documents | My Tableau Repository | Datasources. We will continue working in our existing Tableau workbook.

Let's get started.

How to do it…

  1. We will create a new sheet and rename it Clustering.
  2. Let us then press Ctrl+D to connect to the data, select the More… option from the To a File section, and browse to the .tde file in the Documents | My Tableau Repository | Datasources | Tableau Cookbook data folder. Refer to the following screenshot:
  3. Once we do that, our view will update as shown in the following screenshot:
  4. We will go ahead with the default Live option and click on the Clustering sheet to view the Dimensions and Measures for this new data source. Refer to the following screenshot:
  5. Next, let us drag the Country field from the Dimensions pane and drop it into the Rows shelf, followed by dragging the Internet Usage field from the Measures pane and dropping it into the Columns shelf. This will create a bar chart as shown in the following screenshot:
  6. We will then drag the Internet Usage field again from the Measures pane and drop it into the Label shelf. Next, we will sort the bar chart in descending order of Internet Usage using the Sort option on the toolbar. Refer to the following screenshot:
  7. Next, we will click on the Analytics tab and simply double-click on the Cluster option from the Model pane. This will update our view as shown in the following screenshot:
  8. Our view will now update as shown in the following screenshot:
  9. Next, let us quickly convert this into a map by selecting the symbol maps option from Show Me!. This will update our view as shown in the following screenshot:
  10. Finally, drag the Clusters field from the Rows shelf and drop it into the Color shelf. This will update our view as shown in the following screenshot:

How it works…

If we now take a look at the bar chart we created earlier, we will see that each color corresponds to a cluster: light blue represents Cluster 4, where Internet usage is very high; red represents Cluster 3, where Internet usage is high; orange represents Cluster 2, where Internet usage is moderate; dark blue represents Cluster 1, where Internet usage is low; and green represents Not Clustered, which indicates that no Internet usage data is available for those countries.
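The reading of the clusters above amounts to ranking them by their average Internet usage and treating rows with no data as Not Clustered. A minimal Python sketch of that interpretation follows. The country names, usage values, and the helper `describe_levels` are hypothetical and chosen only for illustration; the sketch also assumes at most four clusters, as in this recipe.

```python
# Hypothetical sample rows: (country, internet usage, cluster id).
# None marks a country with no Internet usage data.
rows = [
    ("A", 0.92, 4), ("B", 0.88, 4),
    ("C", 0.61, 3), ("D", 0.35, 2),
    ("E", 0.05, 1), ("F", None, None),
]


def describe_levels(rows, names=("low", "moderate", "high", "very high")):
    """Rank clusters by mean usage and attach a readable level to each country."""
    by_cluster = {}
    for _, usage, cluster in rows:
        if usage is not None and cluster is not None:
            by_cluster.setdefault(cluster, []).append(usage)
    # Order cluster ids from lowest to highest average usage.
    ranked = sorted(by_cluster, key=lambda c: sum(by_cluster[c]) / len(by_cluster[c]))
    level = {cluster: names[i] for i, cluster in enumerate(ranked)}
    # Countries without a cluster fall back to "Not Clustered".
    return {country: level.get(cluster, "Not Clustered") for country, _, cluster in rows}


levels = describe_levels(rows)
print(levels)
```

Ranking by the cluster mean, rather than by the cluster number, keeps the labels meaningful even if the cluster ids happen to be assigned in a different order.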

As mentioned earlier, Tableau uses the K-means clustering algorithm; to understand this in a little more detail, we can right-click on the Clusters field, which is currently placed in the Color shelf, and select Describe clusters…. Refer to the following screenshot:


We will now get the following view:

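A description of a K-means result usually rests on how much of the variation in the data lies between the clusters rather than inside them. The following Python sketch computes the standard sums of squares for a single measure. It assumes these are the kind of diagnostics one would look for when describing clusters; the function name and the sample numbers are hypothetical and are not taken from Tableau's output.

```python
def sum_of_squares_summary(values, labels):
    """Return total, within-group and between-group sums of squares."""
    overall_mean = sum(values) / len(values)
    total = sum((v - overall_mean) ** 2 for v in values)

    # Collect the members of each cluster.
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)

    within = 0.0
    between = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        # Spread of the points around their own cluster mean.
        within += sum((v - group_mean) ** 2 for v in members)
        # Distance of the cluster mean from the overall mean, weighted by size.
        between += len(members) * (group_mean - overall_mean) ** 2
    return {"total": total, "within": within, "between": between}


# Hypothetical values forming two tight, well-separated groups.
summary = sum_of_squares_summary([1, 2, 3, 11, 12, 13], [0, 0, 0, 1, 1, 1])
print(summary)
```

A small within-group value relative to the between-group value suggests that the clusters are compact and well separated, which is the situation clustering is meant to uncover.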

Note

To learn more about clustering in Tableau, refer to the following link:

https://onlinehelp.tableau.com/current/pro/desktop/en-us/clustering.html
