Chapter 3. Connecting to Data

Upon opening Tableau for the first time, you’re presented with the Connect pane on the left side of the screen, where you can connect to various types of data sources. The data connections are split into four areas:

Search for Data
Allows you to connect to data sources shared to Tableau Server or Tableau Online as published data sources (discussed in Chapter 14).
To a File
Allows you to connect to flat data files such as Excel workbooks, text files (including comma-separated files), and JavaScript Object Notation (JSON) files.
To a Server
Allows you to connect to data hosted on a server such as Microsoft SQL Server, Oracle, Amazon Redshift, and many more—which you can access by clicking the More option. 
Saved Data Sources
These data sources are shared to your Tableau repository for easy access. By default, this section contains two sample data sources that you can use for Tableau practice and for following along with this book.

As of this writing, Tableau Desktop has 83 native data connections in the Microsoft Windows version of the software—plus the ability to connect to third-party web data connectors, Java Database Connectivity (JDBC), and Open Database Connectivity (ODBC). Slightly fewer connections are available in the Mac version of Tableau Desktop. To get started analyzing a data source, click its connection type from the Connect pane, as shown in Figure 3-1.

To follow the examples in this book, you can click Sample – Superstore near the bottom of the Connect pane. Saved data sources are unique in that they immediately bring you into the Authoring interface. We’ll discuss this interface in more detail in coming chapters, but to see what typically happens when you access a data source for the first time, let’s connect to an unsaved version of the Sample – Superstore data source.

To connect to the unsaved version from the Connect pane, click Microsoft Excel (under To a File) and then navigate to Documents > My Tableau Repository > Datasources, and choose the version number you’re using. From here, click the region folder relevant to your location (such as en_US-US) and you will see the Sample – Superstore Excel file. Double-click the file to access the Data Source interface (Figure 3-2), where you can prepare the file for analysis.

Figure 3-1. Tableau’s Connect pane with native data connections
Figure 3-2. Tableau’s Data Source interface

Data Models

Tableau interprets this Excel workbook as a database, and the three tabs within the workbook as database tables. I point this out because Tableau interprets server-based data sources the same way, so you’ll see something similar when connecting to the data sources listed under To a Server on the Connect pane.

Tip

Don’t be confused by seeing two occurrences of the Orders, People, and Returns options in this interface! The first three are the full tables, while the second set are Excel named ranges. In the following examples, we’ll always connect to the Orders and Returns tables.

Tableau provides three types of table connections:

Single tables
These are the simplest and allow you to begin analyzing a single table by left-clicking and dragging the table name from the left pane to the interface label “Drag tables here.” This option may be all you need, particularly if you’ve prepared a data source in a tool such as Tableau Prep before connecting here in Tableau Desktop.
Multiple tables (using Tableau’s data model)
Tableau’s data model introduces what Tableau calls a logical layer, which combines tables by using relationship lines known as noodles. This is the default way to connect multiple tables, as it combines the data in more intuitive ways and makes data preparation easier. It also helps solve data challenges automatically, such as the need to deduplicate joined rows (you can read more about this in Chapter 50 of my book Innovative Tableau [O’Reilly, 2020]).
Multiple tables (using joins and unions)
Prior to Tableau Desktop 2020.2, joins and unions were the default tactics for combining data in Tableau. As such, dragging a second table into the view would automatically create a join. Now, to access what Tableau calls the physical layer, you must double-click the primary table.
Tip

To learn more about Tableau’s data model and how it differs from joins, see “The Tableau Data Model” on the Tableau website.

To begin an analysis, I will left-click and drag the Orders table from the Connections pane to the “Drag tables here” area. To add context to our analyses, I will also bring the Returns table into the data model by dragging it from the Connections pane, next to the Orders table. This automatically creates a relationship between the Orders and Returns tables in the logical layer on the Order ID field, as you can see in Figure 3-3.

In this case, Tableau was able to automatically create a relationship because both tables have a field with the same name. If Tableau does not automatically recognize a relationship, you can define one or more relationships in the Edit Relationship dialog that appears.
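To see why duplicated rows from a join can be a problem, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are invented for illustration (they are not the actual Sample – Superstore rows), and the code only mimics the general idea of querying each table at its own level of detail; it is not how Tableau implements relationships.

```python
import sqlite3

# In-memory database with two small, hypothetical tables keyed on order_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id TEXT, product TEXT, sales REAL);
    CREATE TABLE returns (order_id TEXT, reason TEXT);
    INSERT INTO orders VALUES ('A-1', 'Chair', 100.0), ('A-2', 'Desk', 250.0);
    INSERT INTO returns VALUES ('A-1', 'Damaged'), ('A-1', 'Wrong color');
""")

# A physical join emits one row per matching pair, so order A-1 (which has
# two return records here) is counted twice when sales are summed.
joined_sales = conn.execute("""
    SELECT SUM(o.sales)
    FROM orders AS o
    LEFT JOIN returns AS r ON o.order_id = r.order_id
""").fetchone()[0]

# Aggregating the orders table on its own avoids the duplication.
native_sales = conn.execute("SELECT SUM(sales) FROM orders").fetchone()[0]

print(joined_sales)  # 450.0 (inflated by the duplicated A-1 row)
print(native_sales)  # 350.0
conn.close()
```

With a join, you would need to deduplicate before summing; this is the kind of data challenge that the logical layer is meant to handle for you.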

Figure 3-3. Tableau’s data model creating a relationship between the Orders and Returns tables from the Sample – Superstore data source

Live Data Connections Versus Data Extracts

As seen in the top-right corner of the Data Source interface (Figure 3-4), we can connect to a data source in two ways: Live or Extract.

Figure 3-4. The Live and Extract radio buttons on Tableau’s Data Source interface

Live data connections, the default, are exactly what they sound like: live connections to the underlying data source. This is the most secure option, as you are not creating copies of the data source or moving data around between systems; you are querying and visualizing the data from its hosted location. The drawback to this option is performance related. Since you are querying live, response time depends on factors including the size of the data source, the type of hardware, and the number of users sharing resources.

Extracts create a snapshot of the data by using Tableau’s own Hyper data engine. These files, which end with the extension .hyper, are optimized for Tableau and will almost always perform faster than a live data connection. There are two drawbacks. First, this option is less secure, as you’re creating copies of a data source that can be distributed outside company servers. Second, because an extract is a snapshot of a data source at a given point in time, you must refresh it to bring new data into the data source.
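The trade-off between the two connection types can be sketched in a few lines of Python. Everything here is hypothetical (the class name, the rows, and the use of an in-memory list as the "source"); it illustrates only the snapshot-and-refresh idea, not the Hyper engine itself.

```python
import copy

# Hypothetical source table; in practice this would live in a database or file.
source_rows = [{"order_id": "A-1", "sales": 100.0}]


def live_total(rows):
    """Query the source directly, so the result reflects its current state."""
    return sum(row["sales"] for row in rows)


class SnapshotExtract:
    """Holds a point-in-time copy of the source until refresh() is called."""

    def __init__(self, source):
        self._source = source
        self.rows = []
        self.refresh()

    def refresh(self):
        # Copy the rows so later changes to the source are not visible here.
        self.rows = copy.deepcopy(self._source)

    def total(self):
        return sum(row["sales"] for row in self.rows)


extract = SnapshotExtract(source_rows)

# New data arrives in the source after the extract was taken.
source_rows.append({"order_id": "A-2", "sales": 250.0})

live_after_insert = live_total(source_rows)  # sees the new row
stale_extract_total = extract.total()        # still the old snapshot
extract.refresh()
refreshed_extract_total = extract.total()    # now matches the source

print(live_after_insert, stale_extract_total, refreshed_extract_total)
# 350.0 100.0 350.0
```

The live query always sees the new row, while the snapshot stays at its original total until it is refreshed.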

Data Source Versus Extract Filters

An optional preparation step you can do in the Data Source interface is to add a filter by clicking the Add button in the top-right corner, under Filters (Figure 3-5).

Figure 3-5. Add button for creating a Data Source filter

If you’re using a live connection, the filters you add in this section create a data source filter.

If you’re creating an extract, you’ll see an Edit button appear next to the selected Extract radio button. If you click the Edit button and add filters in the dialog that appears, you’re creating an extract filter.

These are the highest-level filters you can add in Tableau Desktop, and they are processed first in Tableau’s order of operations (discussed in more detail in Chapter 10).
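As a rough illustration of the difference, the sketch below assumes both kinds of filter are in play: the extract filter decides which rows are copied into the snapshot, and the data source filter then limits what is visible from that snapshot. The field names and filter conditions are made up for the example, and plain Python lists stand in for the data source.

```python
# Hypothetical source rows.
source_rows = [
    {"region": "West", "year": 2020, "sales": 100.0},
    {"region": "West", "year": 2021, "sales": 150.0},
    {"region": "East", "year": 2021, "sales": 200.0},
]


def extract_filter(row):
    """Assumed extract filter: only West rows are stored in the extract."""
    return row["region"] == "West"


def data_source_filter(row):
    """Assumed data source filter: only 2021 and later is exposed."""
    return row["year"] >= 2021


# Step 1: the extract filter limits which rows are copied into the snapshot.
extract_rows = [dict(row) for row in source_rows if extract_filter(row)]

# Step 2: the data source filter limits what is visible from that snapshot.
visible_rows = [row for row in extract_rows if data_source_filter(row)]

rows_in_extract = len(extract_rows)
rows_visible = len(visible_rows)
visible_sales = sum(row["sales"] for row in visible_rows)

print(rows_in_extract, rows_visible, visible_sales)  # 2 1 150.0
```

Rows excluded by the extract filter never reach the snapshot at all, which is why these filters sit at the top of the order of operations.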

Data Types

One more item you can update on this screen is the data type for each field in your dataset. In the top-left corner of each column, you’ll see a blue or green icon (Figure 3-6) indicating the data type Tableau has assigned to each field.

Figure 3-6. Data types in Tableau Desktop

It’s important to understand data types because they often determine how data sources can be combined, which fields can be used within calculated fields, and what kind of chart types you can make. For example, you can’t add an integer to a string in a calculation or make a map out of dates. The seven data types used in Tableau are as follows:

  • Number (decimal)

  • Number (whole)

  • Date & Time

  • Date

  • String (i.e., text)

  • Boolean (true or false)

  • Geographic Role (e.g., latitude and longitude)

These classifications are correct most of the time, but it is worth checking the icons to confirm that each field is typed the way your analyses need. If you ever need to change a data type classification, click the data type icon and make a different selection (Figure 3-7).

Figure 3-7. Changing a data type classification in Tableau Desktop
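The earlier point that you can’t add an integer to a string has a close analogy in most programming languages. The Python sketch below uses made-up values to show the failure and the usual fix, which is an explicit conversion; in a Tableau calculated field, the STR() function plays the same role as Python’s str() here.

```python
quantity = 3          # a whole number
category = "Chairs"   # a string

# Mixing the two types directly fails, much like an invalid calculation.
try:
    label = quantity + category
except TypeError:
    label = None

# Converting the number to a string first makes the combination valid.
fixed_label = str(quantity) + " " + category

print(label)        # None
print(fixed_label)  # 3 Chairs
```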

You can do a few additional data preparation tasks on this screen, but they are beyond the scope of this book. You can access them by clicking the down arrow that appears in the top-right corner of a column upon hovering.

Tip

For more information on preparing a data source for use with Tableau, I suggest reading Chapter 3 of my book Practical Tableau (O’Reilly, 2018) or, for a thorough deep dive, Tableau Prep: Up & Running by Carl Allchin (O’Reilly, 2020).

Once you’re ready to move to the Authoring interface and begin analyzing a data source, click the orange tab with the Go to Worksheet annotation, at the bottom of the screen (labeled Sheet1 in Figure 3-8).

Figure 3-8. Go to Worksheet caption at the bottom of the Data Source interface

Clicking this tab takes you to the primary development interface, which is called the Authoring interface. If you ever need to return to the Data Source interface to make updates, such as removing or editing data source or extract filters, simply click the Data Source tab in the bottom-left corner of the screen.
