Aggregation options

Aggregation options define how your data values are combined or aggregated. Simple aggregation works at the level of visual granularity.

What this means is the data is aggregated at the level of the dimensions in the visual. Without even studying it, we have used this logic in the previous chapters when we created visuals. Whenever we wanted to see the total SUM(Sales), we would only bring the SUM(Sales) measure to our visual window. However, if we wanted to see the sales per segment, we would add the Segment dimension to our window and our sales would split up, as can be seen in the following screenshot:

This same principle works with all the aggregations. To see the aggregation levels available, right-click on the measured field, select Measure (aggregation level) and it will show you what other options are available. For example, by right-clicking on the SUM(Sales) measure field, the following aggregation levels are available:

Similarly, if you want to use a dimension field to represent a measure value, you can bring it into the visual and right-click and select Measure, and then it will show the aggregation options available for the dimension.

Assuming a column in our dataset looks like 1, 1, 1, 2, 2, 3, 4, the various aggregation options and their results are as follows:

Sum

This adds the measure values, calculated by adding values = 1+1+1+2+2+3+4 = 14.

Average

This is the average of the measure values, calculated by adding values divided by the count of values = 14/7 = 2.

Median

This is the central value = 2.

Count

This simply counts the number of elements = 7.

Count (Distinct)

This is the count of distinct values, so no same value is counted more than once = 4.

Minimum

This the least value = 1.

Maximum

This is the highest value = 4.

Percentile

This shows what percentage of the values are less than this value. It has further options to look at the 5, 10, 25, 50, 75, 90, and 95 percentiles. Here, the 25th percentile would be 1, the 75th percentile would be 3, and so on.

Standard Deviation, Standard Deviation(Pop)

Standard deviation defines how much of the members are differing from the mean value. If you have an entire dataset, use the standard deviation population; if it is a sample, then use standard deviation. Standard deviation (Pop) = 1.07

Variance, Variance (Pop)

Variance is the square of standard deviation, so depending on population or sample, we use variance = 1.14 for this case.

 

The aggregations calculated in Tableau are sent back to the databases for querying. So, the general rule of thumb is if the database doesn't support a particular aggregation, Tableau will not be able to support it either. However, if you create extracts of the same data, then the aggregation will be supported.

As an exercise, look at all of the aggregation options for the Profit measure and the Customer Name dimension.

By simply using a combination of the preceding aggregation options and operators, many important calculations can be done. For example, Cost can be determined as [Sales]-[Profit]. Then, the average cost can be found with AVG(Cost).

We can also calculate the Profit to Sales ratio as SUM(Profit)/SUM(Sales). This will show up in decimals. To see it as a percentage, one way would be to multiply the calculation by 100. However, this will not show the % sign. To show the percentage sign, we need to format the results as a percentage. To do this, right-click on Measure and select Format. A side window will appear. Here, under Default, click on Numbers and then Percentage.

You can also include other things such as a suffix/prefix by using custom options. Once you are happy, you can close the side window.

Beyond the aggregations and operator use, we can look at other number functions available. They can be logically grouped into the following categories:

Trigonometric functions: These include the following:

PI

This returns the numeric constant value for pi, as 3.141592…., up to 16 decimal places. PI() is generally used within other trigonometric functions to specify the angles in radians.

SIN

This returns the sine of an angle. The angle must be specified in radians, for example, SIN(-PI()/4) =-0.7071.

ASIN

This returns the arc sine of a number in radians. This is the inverse of SIN, for example, ASIN(-0.7071)=-0.7854 (which is -pi/4).

ATAN2

This takes the two coordinate positions y and x as input and returns the arc tangent of two numbers in radians, for example, ATAN2(1,1) = 0.7854 i.e pi/4

DEGREE

This converts radians into degrees, so DEGREE(0.7854) =45..

 

** The COS, ACOS, TAN, ATAN, and COT functions are similar to SIN and ASIN.

Logarithmic and exponential functionsThese include the following:

EXP

This returns e raised to the power of a number, for example, EXP(2) = e^2.

LN

This returns the natural logarithm of the number. The value is null for all zero and negative values, for example, LN(2) = 0.69 or LN(EXP(2))=2.

LOG

This returns a logarithmic value to the base of another number. If no base is specified, then it returns a logarithmic value to the base of 100, for example, LOG(1000) = 3 and LOG( 4,2) = 2.

POWER

This returns a value to the power of another, for example, POWER (2,3) =2^3= 8.

 

Other mathematical functionsThese include the following:

SQUARE

This returns the square of a number and is equivalent to Power (number, 2), for example, SQUARE(5) = 25.

SQRT

This returns the square root of a number and is equivalent to Power (number, 0.5), for example, SQRT(25) =5.

ABS

This returns the absolute value of a number, for example, ABS(-7) =7. It's very useful when you work with quantities or deviations where the actual values are more important than the sign of the numbers.

CEILING

This rounds a number to the nearest integer greater or equal to the number. This is useful when doing capacity planning kind of problems, for example, if you had to assign resources to a project to get work completed and the result was 3.2. If you simply round the number, your result would be 3. But you actually have more work, so it makes more sense to have four resources than three and plan based on that, for example, CEILING(3.2) = 4.

FLOOR

FLOOR does the reverse of CEILING. It returns the greatest integer smaller or equal to the given number, for example, FLOOR(3.2) =3.

ROUND

This rounds the number to the nearest integer or the specified decimal values, for example, ROUND (3.2)=3, ROUND(3.7) = 4, and ROUND(3.1421,2) = 3.14.

SIGN

This returns the sign of the number, 1 if positive, 0 if zero, and -1 if negative, for example, SIGN(-3)=-1 and SIGN(4) =1.

ZN

This returns the expression if it is not null; otherwise, it will return 0. This is useful if we want to calculate averages, without ignoring NULL or to display 0s instead of nulls. 

DIV

This returns the quotient of a division operation, for example, DIV(7,2) =3.

HEXBINX, HEXBINY

Both of these functions are used to map an x, y coordinate to the nearest hexagonal bin. HENBINX maps to the nearest coordinate while HEXBINY maps the nearest coordinate. The bins have side length 1, so the inputs need to be scaled accordingly. 

 

Now that we are familiar with the various aggregation options and number functions that can be used to create numeric calculations, let's move on to the next section where we will learn about string calculations.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.63.5