Connecting to data

Connecting to data in Tableau Prep is very similar to connecting to data in Tableau Desktop. From the home screen, you may click either Connect to Data or the + button on the expanded Connections pane:

As with Tableau Desktop, for file-based data sources, you may drag the file from Windows Explorer or Finder onto the Tableau Prep window to quickly create a connection.

Tableau Prep supports dozens of file types and databases, and the list continues to grow. You'll recognize many of the same types of connection possibilities that exist in Tableau Desktop. However, at the time of writing this book, Tableau Prep does not support all of the connections that are available in Tableau Desktop.

You may create as many connections as you like. The Connections pane lists each connection separately, along with any associated files, tables, views, stored procedures, or other options that apply to that data source. You can use any combination of data sources in a flow.

For now, let's start our example with the following steps:

  1. Click Connect to Data.
  2. From the expanded list of possible connections that appears, select Microsoft Excel.
  3. You'll see a main table called Employee Flights and a subtable named Employee Flights Table 1. Drag the Employee Flights table to the flow pane. An input step will be created, giving you a preview of the data and other options.
  4. The input step displays a grid of fields and options for those fields. You'll notice that many of the fields in the Employee Flights table are named F2, F3, F4, and so on. This is due to the format of the Excel file, which has merged cells and a summary subtable. Check the Use Data Interpreter option on the Connections pane and Tableau Prep will correctly parse the file. It should look something like this:

When you select an input step, Tableau Prep will display a grid of fields in the data. You may use the grid to uncheck any fields you do not wish to include, edit the Type of data by clicking the associated symbol (for example, change a string to a date), and edit the Field Name itself by double-clicking the field name value.

If Tableau Prep Builder detects that the data source contains a large number of records, it may turn on data sampling. Data sampling uses a smaller subset of records to give rapid feedback and profiling in design mode. However, the full set of data is used when you run the entire flow in batch mode. You can control the data sampling options by clicking Data Sample on the input pane. An indicator appears wherever data sampling occurs in the flow.
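As a rough mental model only, the difference between design-mode sampling and a full run can be sketched in Python. This is not how Tableau Prep is implemented, and its actual sampling logic may differ; the `read_rows` helper, its `sample_limit` parameter, and the inline data are all hypothetical.

```python
import csv
import io
from itertools import islice


def read_rows(source, sample_limit=None):
    """Return rows from a CSV file object as dictionaries.

    Hypothetical helper, not part of Tableau Prep.
    sample_limit=None reads every row (comparable to running the full flow);
    an integer reads only the first N rows (comparable to design-time sampling).
    """
    reader = csv.DictReader(source)
    if sample_limit is None:
        return list(reader)
    return list(islice(reader, sample_limit))


# Made-up data for illustration only.
data = "Passenger,Ticket Price\nA,100\nB,250\nC,175\nD,90\n"

design_rows = read_rows(io.StringIO(data), sample_limit=2)  # quick preview
full_rows = read_rows(io.StringIO(data))                    # complete run
print(len(design_rows), len(full_rows))
```

The point of the sketch is that both calls read the same source; only the number of rows considered changes, which is why profiling done on a sample may not reflect every value in the full data.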

Now, we'll continue to explore the data and fix some issues along the way.

  1. Click the + button that appears when you hover over the Employee Flights input step. This will extend the flow by adding a clean step called Clean 1.
  2. Take a moment to explore the data using the Profile pane. Observe how selecting individual values for fields in the Profile pane highlights portions of related values for other fields. This can give you great insight into your data, such as seeing the different price ranges based on Ticket Type:

Selecting a field value highlights the related bar segments across the other fields in the Profile pane. This highlighting is called brushing. You can also take action on selected values via the toolbar at the top of the Profile pane or by right-clicking a field value. These actions include filtering, editing values, or replacing with null. However, before making any changes or cleaning any of the data, let's connect to some additional data.
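To make the idea of brushing concrete, here is a minimal Python sketch of the counting behind it: for a selected value in one field, how many records of each value in another field are related to it. The `brush` function, the Price Range field, and the sample rows are assumptions made for illustration; they do not come from the Employee Flights data or from Tableau Prep itself.

```python
from collections import Counter

# Made-up rows for illustration; the field values are assumptions.
rows = [
    {"Ticket Type": "Economy", "Price Range": "Low"},
    {"Ticket Type": "Economy", "Price Range": "Low"},
    {"Ticket Type": "Economy", "Price Range": "Mid"},
    {"Ticket Type": "First Class", "Price Range": "High"},
    {"Ticket Type": "First Class", "Price Range": "Mid"},
]


def brush(rows, selected_field, selected_value, other_field):
    """For each value of other_field, return (highlighted, total) counts.

    "highlighted" counts the rows where selected_field equals selected_value,
    which corresponds to the highlighted portion of each bar.
    """
    totals = Counter(row[other_field] for row in rows)
    highlighted = Counter(
        row[other_field]
        for row in rows
        if row[selected_field] == selected_value
    )
    # Counter returns 0 for values that have no highlighted rows.
    return {value: (highlighted[value], total) for value, total in totals.items()}


result = brush(rows, "Ticket Type", "Economy", "Price Range")
for value, (part, total) in result.items():
    print(f"{value}: {part} of {total} highlighted")
```

In this made-up data, selecting Economy would highlight all of the Low bar, half of the Mid bar, and none of the High bar.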

It turns out that most of the airline ticket booking data is in one database that's represented by the Excel file, but another airline's booking data is stored in files that are periodically added to a directory. These files are in the Learning Tableau\Chapter 10 directory. The files are named with the convention Southwest YYYY.csv (where YYYY represents the year).

We'll connect to all of the existing files and ensure that we are prepared for additional future files:

  1. Click the + icon on the Connections pane to add a new connection to a Text File.
  2. Navigate to the Learning Tableau\Chapter 10 directory and select any of the Southwest YYYY.csv files to start the connection. Looking at the Input settings, you should see that Tableau Prep correctly identifies the field separators, field names, and types:

  3. In the Input pane, select the Multiple Files tab and switch from Single table to Wildcard union. Set the Matching Pattern to Southwest* and click Apply. This tells Tableau Prep to union all of the text files in the directory whose names begin with Southwest:

  4. Use the + icon on the Southwest input step in the flow pane to add a new step. This step will be named Clean 2 by default. Once again, explore the data, but don't take any action until you've brought the two sources together in the flow. You may notice a new field in the Clean 2 step called File Paths, which labels each record with the name of the applicable file from the wildcard union.
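For readers who think in code, the effect of the wildcard union can be approximated with a short Python sketch. It is only an analogy for what the steps above configure, not Tableau Prep's implementation. The `wildcard_union` function, the temporary directory, the file years, and the sample fields are all hypothetical; only the Southwest* pattern and the File Paths field name come from the steps above.

```python
import csv
import tempfile
from pathlib import Path


def wildcard_union(directory, pattern="Southwest*"):
    """Union the rows of every file in `directory` whose name matches `pattern`.

    Each row gains a "File Paths" entry holding the name of its source file,
    loosely mirroring the field described in the steps above.
    """
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["File Paths"] = path.name
                rows.append(row)
    return rows


# Build a throwaway directory with made-up files to demonstrate the union.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Southwest 2017.csv").write_text(
        "Passenger,Fare\nAnn,120\n", encoding="utf-8"
    )
    Path(tmp, "Southwest 2018.csv").write_text(
        "Passenger,Fare\nBob,95\n", encoding="utf-8"
    )
    Path(tmp, "Other.csv").write_text(
        "Passenger,Fare\nCy,80\n", encoding="utf-8"
    )
    unioned = wildcard_union(tmp)

for row in unioned:
    print(row)
```

Because the pattern is evaluated each time the function runs, a file added later with a matching name would be picked up automatically, which is the same reason the wildcard union prepares the flow for future files.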