Union files together

Often, you may have multiple individual files or tables that, together, represent the entire set of data. For example, you might have a process that creates a new monthly data dump as a new text file in a certain directory. Or, you might have an Excel file where data for each department is contained in a separate sheet.

A union is a concatenation of data tables which brings together rows of each table into a single data source. For example, consider the following three tables of data:

Originals:

Name

Occupation

Bank account balance

Luke

Farmer

$2,000

Leia

Princess

$50,000

Han

Smuggler

-$20,000

 

Prequels:

Name

Occupation

Bank account balance

Watto

Junk Dealer

$9,000

Darth Maul

Face Painter

$10,000

Jar Jar

Sith Lord

-$100,000

 

Sequels:

Name

Occupation

Bank account balance

Rey

Scavenger

$600

Poe

Pilot

$30,000

Kylo

Unemployed

$0

 

A union of these tables would give a single table containing the rows of each individual table:

Name

Occupation

Bank account balance

Luke

Farmer

$2,000

Leia

Princess

$50,000

Han

Smuggler

-$20,000

Watto

Junk Dealer

$9,000

Darth Maul

Face Painter

$10,000

Jar Jar

Sith Lord

-$100,000

Rey

Scavenger

$600

Poe

Pilot

$30,000

Kylo

Unemployed

$0

Tableau allows you to union together tables from file-based data sources, including the following:

  • Text files (.csv, .txt, and other text file formats)
  • Sheets (tabs) within Excel documents
  • Subtables within an Excel sheet
  • Multiple Excel documents
  • Google Sheets
  • Relational database tables
Use the Data Interpreter feature to find subtables in Excel or Google Sheets. They will show up as additional tables of data in the left sidebar.

To create a union in Tableau, follow these steps:

  1. Create a new data source from the menu, toolbar, or Data Source screen, starting with one of the files you wish to be part of the union. Then, drag any additional files into the Drag table to union drop zone just beneath the existing table in the designer:

  1. Once you've created a union, you can use the drop-down menu on the table in the designer to configure options for the union. Alternatively, you can drag the New Union object from the left sidebar into the designer to replace the existing table. This will reveal options for creating and configuring the union:

The Specific (manual) tab allows you to drag tables into and out of the union. The Wildcard (automatic) tab allows you to specify wildcards for filenames and sheets (for Excel and Google Sheets) that will automatically include files and sheets in the union based on a wildcard match.

Use the Wildcard (automatic) feature if you anticipate additional files being added in the future. For example, if you have a specific directory where data files are dumped on a periodic basis, the wildcard feature will ensure that you don't have to manually edit the connection.

  1. Once you have defined the union, you may use the resulting data source to visualize the data. Additionally, a union table may be joined with other tables in the designer window, giving you a lot of flexibility in working with data:
In a union, Tableau will match columns between tables by name. Columns that exist in one file/table but not in others will appear as part of the union table, but values will be NULL in files/tables where the column did not exist. For example, if one of the files contained a column named Job instead of Occupation, the final union table would contain a column named Job and another named Occupation, with NULL values where the column did not exist. You can merge the mismatched columns by selecting the columns and using the drop-down menu. This will coalesce (keep the first non-null of) the values per row of data in a single new column.

When you create a union, Tableau will include one or more new fields in your data source that help you identify the file, sheet, and table where the data originated. Path will contain the file path (including filename), Sheet will contain the sheet name (for Excel or Google Sheets), and Table Name will contain the subtable or text filename. You can use these fields to help you identify data issues and also to extend your dataset as needed. For example, if you had a directory of monthly data dump files named 2018-01.txt, 2018-02.txt, 2018-03.txt, and so on, but no actual date field in the files, you could obtain the date using a calculated filed with code such as the following:

DATEPARSE('yyyy-MM', [Table Name] )  
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
13.59.140.238