Performing cluster analysis

Borrowing from the Wikipedia definition:

"cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters)."

In this recipe, we will try out the clustering feature introduced in Tableau 10 to group athletes based on their endorsements and salary/winnings.

Getting ready

To follow this recipe, open B05527_06 – STARTER.twbx. Use the worksheet called Cluster and connect to the Top Athlete Salaries (Global Sport Finances) data source.

How to do it...

Here are the steps to create the scatter plot with clusters:

  1. Click on the dropdown in the Marks card and change the mark to Circle.
  2. From Measures, drag Salary/Winnings $ to Columns.
  3. From Measures, drag Endorsements $ to Rows.
  4. From Dimensions, drag Athlete to Detail.
  5. On the side bar, click on the Analytics tab to activate it.
  6. Under Model, drag Cluster onto your view.
  7. In the Clusters window, set the number of clusters to 3.
  8. Close the Clusters window when done.
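
To see roughly what happens when you drop Cluster onto the view, the steps above can be mimicked outside Tableau. The following Python sketch assumes a k-means style algorithm, which is how Tableau's documentation describes the feature. The athlete names and dollar figures are made up, and the farthest-point seeding is a simplification chosen to keep the run deterministic, not a claim about how Tableau initializes its clusters:

```python
import math

# Hypothetical (salary/winnings, endorsements) values in millions of dollars.
# Made up for illustration; not taken from the workbook.
athletes = {
    "Athlete A": (60.0, 5.0),
    "Athlete B": (55.0, 8.0),
    "Athlete C": (20.0, 45.0),
    "Athlete D": (25.0, 50.0),
    "Athlete E": (10.0, 3.0),
    "Athlete F": (12.0, 5.0),
}


def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point that is farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Basic k-means loop; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


points = list(athletes.values())
centroids, labels = kmeans(points, k=3)
for name, label in zip(athletes, labels):
    print(f"{name}: cluster {label + 1}")
```

With these sample values, the two salary-heavy athletes, the two endorsement-heavy athletes, and the two lower earners end up in three separate clusters.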

How it works...

Tableau 10 makes it easier to do cluster analysis. The Analytics pane in the side bar has a new option to add clusters to your view. You simply drag Cluster onto your view, and the clusters are computed automatically. You are also presented with a window that asks which variables to consider when clustering, and how many clusters to create.

If you need to consider additional measures and/or dimensions when computing the clusters, you can drag them into the list of variables in the Clusters window.
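
When variables with very different units are clustered together, the one with the largest range tends to dominate the distance calculation unless the values are rescaled first. The following sketch assumes min-max normalization as the scaling method, which is an assumption for illustration rather than a confirmed description of Tableau's internals. The rows and the extra age column are hypothetical:

```python
def min_max_scale(rows):
    """Rescale each column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple(
            # A constant column has no spread, so map it to 0.0.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]


# Hypothetical rows: (salary/winnings in $M, endorsements in $M, age in years).
rows = [
    (60.0, 5.0, 31),
    (20.0, 45.0, 28),
    (10.0, 3.0, 24),
]
scaled = min_max_scale(rows)
for original, rescaled in zip(rows, scaled):
    print(original, "->", rescaled)
```

After scaling, every column runs from 0 to 1, so a variable measured in years is no longer outweighed by one measured in millions of dollars.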

By default, the cluster pill is placed on Color in the Marks card. When you right-click on this pill, you will see new options to edit or describe the clusters.

When you choose Describe clusters..., you can view the Summary values computed for each cluster.
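
The per-cluster values in such a summary can be approximated by hand. This sketch assumes the summary reports an item count and a center (the mean of each variable) for every cluster. The points and cluster labels are hypothetical:

```python
from collections import defaultdict


def summarize(points, labels):
    """Return {label: (item_count, center)} for each cluster label."""
    members = defaultdict(list)
    for point, label in zip(points, labels):
        members[label].append(point)
    return {
        label: (
            len(group),
            # The center is the mean of each variable across the members.
            tuple(sum(axis) / len(group) for axis in zip(*group)),
        )
        for label, group in sorted(members.items())
    }


# Hypothetical (salary/winnings, endorsements) points in $M and their clusters.
points = [(60.0, 5.0), (55.0, 8.0), (20.0, 45.0), (25.0, 50.0)]
labels = [0, 0, 1, 1]
summary = summarize(points, labels)
for label, (count, center) in summary.items():
    print(f"Cluster {label + 1}: {count} items, center {center}")
```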

You can also view the analysis of variance (ANOVA) statistics computed for the clustering model.
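
An analysis of variance asks, for each variable, how much of its spread lies between the clusters rather than within them. The sketch below computes a standard one-way ANOVA F-statistic for a single variable. It is offered as a way to interpret the numbers, not as Tableau's exact computation, and the values and labels are hypothetical:

```python
def anova_f(values, labels):
    """One-way ANOVA for one variable grouped by cluster label.

    Returns (between_ss, within_ss, f_statistic).
    Assumes at least two clusters and a nonzero within-cluster spread.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    grand_mean = sum(values) / n

    # Spread of the cluster means around the overall mean.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Spread of the individual values around their own cluster mean.
    within_ss = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    f_statistic = (between_ss / (k - 1)) / (within_ss / (n - k))
    return between_ss, within_ss, f_statistic


# Hypothetical salary/winnings values ($M) and their cluster labels.
salary = [60.0, 55.0, 20.0, 25.0, 10.0, 12.0]
labels = [0, 0, 1, 1, 2, 2]
between_ss, within_ss, f_statistic = anova_f(salary, labels)
print(f"between SS = {between_ss:.2f}")
print(f"within SS  = {within_ss:.2f}")
print(f"F          = {f_statistic:.2f}")
```

A large F-statistic means the variable separates the clusters well, since most of its variation lies between clusters rather than inside them.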

There's more...

Clusters can also be saved back to your Tableau data source. When you drag the cluster pill from Color in the Marks card back to the Data pane in the side bar, the clusters are saved as a group.

If you would rather run your clustering calculations in existing statistical software such as R, SPSS, or SAS, you can connect Tableau to data from these tools. Tableau also supports R integration, which allows you to write R code from within Tableau.

See also

Please refer to the Performing linear regression with R recipe in this chapter.
