Overview of advanced fixes for data problems

In addition to the techniques that we mentioned previously in this chapter, there are some additional possibilities for dealing with data structure issues. It is outside the scope of this book to develop these concepts fully. However, with some familiarity of these approaches, you can broaden your ability to deal with challenges as they arise:

  • Custom SQL can be used in the data connection to resolve some data problems. Beyond giving a field for a cross database join, as we saw previously, custom SQL can be used to radically reshape the data that's retrieved from the source. Custom SQL is not an option for all data sources, but is for many relational databases. Consider a custom SQL script that takes the wide table of country populations we mentioned earlier in this chapter and restructures it into a tall table:
SELECT [Country Name],[1960] AS Population, 1960 AS Year 
FROM Countries 
 
UNION ALL 
 
SELECT [Country Name],[1961] AS Population, 1961 AS Year 
FROM Countries 
 
UNION ALL
 
SELECT [Country Name],[1962] AS Population, 1962 AS Year 
FROM Countries 
... 
... 
 

And so on. It might be a little tedious to set up, but it will make the data much easier to work with in Tableau! However, many data sources using complex custom SQL will need to be extracted for performance reasons.

  • Unions: Tableau's ability to union data sources such as Excel, text files, and Google sheets can be used in a manner similar to the Custom SQL example to reshape data into tall datasets. You can even union the same file or sheet to itself. This can be useful in cases where you need multiple records for certain visualizations. For example, visualizing a path from a source to a destination is difficult (or impossible) with a single record that has a source and destination column. However, unironing the dataset to itself yields two rows: one that can be visualized as the source and the other as a destination.
  • Table Calculations: Table calculations can be used to solve a number of data challenges from finding and eliminating duplicate records to working with multiple levels of detail. Since table calculations can work within partitions at higher levels of detail, you can use multiple table calculations and aggregate calculations together to mix levels of detail in a single view. A simple example of this is the Percent of Total table calculation, which compares an aggregate calculation at the level of detail in the view with a total at a higher level of detail.
  • Data Blending: Data blending can be used to solve numerous data structure issues. Because you can define the linking fields that's used, you control the level of detail of the blend. For example, the apartment rental data problem we looked at could be solved with a secondary source that has a single record per apartment with the square feet. Blending at the apartment level would allow you to achieve the desired results.
  • Data Scaffolding: Data scaffolding extends the concept of data blending. With this approach, you construct a scaffold of various dimensional values to use as a primary source and then blend to one or more secondary sources. In this way, you can control the structure and granularity of the primary source while still being able to leverage data that's contained in secondary sources.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.227.183.153