In the previous chapter, we saw functions such as SUMIFS and COUNTIFS performing aggregations dependent upon criteria. They are incredible functions, but as with everything in life, they have their limitations. You may need to perform aggregations outside of the scope of those functions’ capabilities.
This chapter covers two next-level aggregation functions – SUMPRODUCT and AGGREGATE. These functions are amazing, and each one has specific abilities that make them extremely useful.
We will begin with the SUMPRODUCT function. This function has achieved infamy with many advanced Excel users who label it as their favorite function. It’s ability to handle arrays is possibly its greatest strength, and this makes it very valuable to users of Excel versions prior to Excel 365.
We then move on to the AGGREGATE function. This function is a powerful toolset of 19 different aggregations including sum, large, median, and mode. But it does not stop there. It has additional abilities such as to ignore errors and hidden values in a range when aggregating.
SUMPRODUCT Function
Availability: All versions
sumproduct.xlsx
Each array can be a range of values or an array returned by a function or a logical expression. Each array must be of the same height or width.
The SUMPRODUCT function multiplies the values of the corresponding arrays and then sums the results.
Let’s start with a simple example of SUMPRODUCT being used in its conventional way. And then explore more powerful examples including handling multiple conditions and working with arrays.
Classic SUMPRODUCT
The conventional use of the SUMPRODUCT function can remove the need for performing intermediary row calculations, therefore reducing calculation time and improving spreadsheet performance.
This is fine, and the row totals may be useful. However, if there were thousands of rows in this dataset, this may be an unnecessary volume of calculations and strain on the spreadsheet.
Only two arrays were used in this example, and SUMPRODUCT can handle up to 255 arrays . Let’s move on and see further examples of the extreme power and versatility of this function.
Sum Values That Meet Multiple Criteria
The SUMPRODUCT function is fantastic for summing and counting values that meet multiple criteria. It can handle both AND and OR logic, so the criteria can be complex.
It is much more efficient at handling different logical tests than the IFS family of functions (SUMIFS, COUNTIFS, etc.). We saw a technique to use OR logic with the SUMIFS function in the previous chapter, as natively it cannot apply OR logic.
Let’s start by using the SUMPRODUCT function to sum values that meet a single condition. We will then increase the complexity of the criteria by adding more conditions and explain how it all works.
This formula is quite different to others that we have seen so far in this book. Two things that stand out are that the condition is enclosed within its own set of brackets and that the results of that condition are multiplied by the values in the [Total] column.
Let’s break down how this all works.
A semicolon (;) is used for a break in row, while a comma (,) is used for a break in column.
Now that we understand how this is working, let’s add a second condition to our SUMPRODUCT formula.
In this formula, each condition is enclosed within its own brackets. Each condition will return an array of TRUE and FALSE values dependent on the result of that logical test. The asterisk (*) is used to multiply the corresponding values of the two arrays. This applies AND logic between the two conditions.
The asterisk, or multiply symbol, is used for AND logic because you only get a non-zero value if all logical tests return TRUE. For example, 1*1=1 but 1*0=0 and 0*1=0.
All numbers except zero evaluate to TRUE including both negative and positive values, while zero evaluates to FALSE.
To perform OR logic between conditions with SUMPRODUCT, the plus symbol (+) is used. Let’s see an example.
These values are then multiplied by the corresponding value from the [Total] column and summed by SUMPRODUCT.
By adding the corresponding values returned by all conditions, as long as one of the logical tests returns TRUE, then the overall result is TRUE. This is perfect OR logic.
If we tested multiple columns, and more than one column returned TRUE, this is fine. Any non-zero value evaluates to TRUE. So, (TRUE + TRUE) = TRUE.
Let’s see one final example, where we will use a combination of both AND and OR logic in a SUMPRODUCT function. These conditional tests can, of course, get more complex than we will take them in this chapter.
In this formula, the OR segment is enclosed in its own set of brackets. This is to follow the mathematical order of precedence and ensure that the OR segment evaluates before applying the AND logical expression to the selected store.
Let’s break this down.
Conditions are entered in the same manner in the FILTER function, discussed in Chapter 13. So, what we are learning here is directly transferable to criteria in the FILTER function.
Count Values That Meet Multiple Criteria
To count values instead of summing them, we simply need to remove the array from the formula that contains the values to be summed.
All the conditions used in the examples so far have tested text values. So, let’s finish with an example that tests a numeric value.
Aggregating Multiple Columns
An area that showcases the flexibility of the SUMPRODUCT function well is its ability to handle criteria ranges of different dimensions.
Figure 9-10 shows a matrix of student names and their results to five different questions. “C” stands for correct and “I” for incorrect. A blank cell represents no answer to the question.
We want to return the number of correct questions answered for the student specified in cell B2. Cell B2 has been named [rngStudent].
This task is no problem for SUMPRODUCT. It can easily handle criteria ranges of irregular dimensions.
Distinct Count Formula
A function that is unfortunately missing from the library of functions in Excel is one that counts only the distinct items in a range.
This list is formatted as a table named [tblWebinars]. We will achieve the distinct count with a formula that combines the SUMPRODUCT and COUNTIFS functions together.
Let’s break down how this formula works.
SUMPRODUCT then sums the values in this array.
This is a nice technique to produce a distinct count. It takes advantage of SUMPRODUCT’s ability to work with arrays and the fabulously useful COUNTIFS function.
A distinct count formula is even easier to create with the new UNIQUE function in Excel. We will cover this technique in Chapter 10.
Weighted Average
Excel also does not have a built-in function to calculate a weighted average. Fortunately, SUMPRODUCT with its ability to handle arrays natively makes this task simple.
In Figure 9-14, we have a table named [tblAssessments]. The arithmetic mean has been calculated using the AVERAGE function on the [Score] column and returns 78.2.
Sum Every Nth Cell
You may come across a scenario where the values you want to sum are not in consecutive cells. Figure 9-16 shows a range that includes quarterly totals in every fourth column (every quarter).
Two is subtracted from each of these column numbers. This is because the range starts in column C, the third column. So, this offsets columns A and B. If the range used started from column A, nothing would need to be subtracted.
This formula could easily be adapted to sum every Nth row. The main change would be to swap the COLUMN function for the ROW function.
Sum Values for a Specific Weekday
The SUMPRODUCT function is a built-in array formula. Its ability to work with arrays is one of its great strengths. This makes the function incredibly valuable to users of older versions of Excel that cannot natively handle arrays (discussed in Chapter 10).
We have seen examples of SUMPRODUCT working with arrays already in the last two examples. However, this example is here to demonstrate a common scenario that avoids the need for creating helper columns.
A common solution is to insert a column and use the WEEKDAY function to return the weekday numbers in all cells of that column. Then use a separate function such as SUMIFS to sum the [Total] values using the weekday number column for the criteria range. Functions like SUMIFS cannot handle the WEEKDAY function directly within its criteria.
With SUMPRODUCT, it can store the weekday numbers returned by WEEKDAY in an array. No need to create an additional physical column. This array of weekday numbers is then tested and calculated all in one formula.
In Figure 9-19, the following formula uses the WEEKDAY function to return the weekday numbers for each date. The 2 within the WEEKDAY function specifies that the weekday numbers should begin with Monday as 1 and end with Sunday as 7.
Without the use of an array function, you would not be able to provide an entire column of values to a function such as WEEKDAY. This also applies to functions such as LEFT, RIGHT, MONTH, etc. By using SUMPRODUCT, we do not have to force an array formula by pressing Ctrl + Shift + Enter.
In Chapter 10, we will discuss dynamic arrays in Excel for Microsoft 365, Excel for the web, and Excel 2021. With this modern development, even the standard SUM function could handle this array task.
Sum the Top Five Values Only
For the final SUMPRODUCT examples, let’s see it being used to sum the top five values in a range only. Once again, this is a showcase of SUMPRODUCT handling arrays.
To do this, we will combine SUMPRODUCT with the LARGE function. The LARGE function returns the kth largest value in a range, for example, the second, third, or fifth largest value.
Array: The range, table column, or array from which to return the kth largest value
K: The position of the value in the array (from the largest) that you want to return
Note There is also a SMALL function in Excel to return the kth smallest value in a range.
We want to return the top five values for SUMPRODUCT to sum, not just a single value. So, for the k argument, we will feed the LARGE function an array of constants containing all the values that we want.
The LARGE function could be replaced by the SMALL function to return the smallest values to be summed.
Let’s now take this a step further and add criteria to the example. Say, we want to sum the top five values for a specific store only.
The importance of SUMPRODUCT in older versions of Excel is indisputable, from their general purpose of calculating the products of arrays followed by a sum to complex criteria sums and counts and finally its ability to natively handle arrays.
AGGREGATE Function
Availability: Excel 2010, Excel 2013, Excel 2016, Excel 2019, Excel 2021, Excel for Microsoft 365, Excel for the Web, Excel 2011 for Mac, Excel 2016 for Mac, Excel 2019 for Mac, Excel 2021 for Mac, Excel for Microsoft 365 for Mac
aggregate.xlsx
The AGGREGATE function was introduced in Excel 2010. This terrific function can apply a plethora of different aggregate functions (19 in fact) to a dataset with the option to ignore hidden rows, error values, and/or subtotals.
The AGGREGATE function is the successor to the SUBTOTAL function in Excel. SUBTOTAL remains in Excel primarily for compatibility reasons. SUBTOTAL is also used for total row calculations with Excel Tables.
- Function num : A number from 1 to 19 that specifies which aggregate function to use. The following list details all 19 functions:
1 – AVERAGE
2 – COUNT
3 – COUNTA
4 – MAX
5 – MIN
6 – PRODUCT
7 – STDEV.S
8 – STDEV.P
9 – SUM
10 – VAR.S
11 – VAR.P
12 – MEDIAN
13 – MODE.SNGL
14 – LARGE
15 – SMALL
16 – PERCENTILE.INC
17 – QUARTILE.INC
18 – PERCENTILE.EXC
19 – QUARTILE.EXC
Options: A number between 0 and 7 that specifies which values to ignore, if any, when performing the aggregation function.
Array, ref1: The array formula, array of values, or range of cells for the function to be applied.
Ref2: Additional ranges of values for which to apply the aggregation function. Up to 253 can be added. This is an optional argument.
- K: The position in the array to return. It is used by the functions in the following list only – kth largest, kth smallest, kth percentile, kth quartile:
LARGE
SMALL
PERCENTILE.INC
QUARTILE.INC
PERCENTILE.EXC
QUARTILE.EXC
The vast array of aggregate functions it offers
Its ability to handle arrays (great for non-dynamic array–enabled versions of Excel)
Its special skill at ignoring hidden rows, errors, and other subtotals
It is not easy to understand. As the function and options arguments return the selection as an index number, it is confusing to the user reading the formula.
It was designed to aggregate vertical ranges of data. If it is used to aggregate data in rows, aggregate cannot ignore hidden columns. The aggregate result will include the hidden values.
Criteria can be added to the AGGREGATE function just like with SUMPRODUCT (an example is shown); however, when adding criteria, the options such as ignoring hidden rows are not applied. This is because once criteria are applied, the range is converted to an array, and arrays cannot have hidden values.
Ignoring Hidden Rows
When using functions such as SUM and AVERAGE on a range that contains hidden rows, the hidden values are still included by the function.
The AGGREGATE function can be requested to perform calculations on the visible cells only, ignoring any hidden rows, which are probably the result of filtering a list.
Let’s now look at an example where we want to return the second largest value in the [tblSales] table while ignoring hidden rows. When using the LARGE function, we will be required to use the K argument.
As the AGGREGATE formula in Figure 9-28 is requested to ignore hidden rows, it returns the correct result of 424.30.
Ignoring Errors in a Range
Functions such as SUM, LARGE, and MEDIAN will not work if a table column or range contains error values. However, the AGGREGATE function has a nifty option to ignore error values.
These examples aggregate a single column of values; however, the AGGREGATE function can handle multiple value ranges and ranges that are multiple columns wide.
The AGGREGATE function provides seven different options. These include the individual options to ignore hidden rows, ignore error values, and ignore other totals. But there are also combined options.
A filter is applied to the table to hide the sales of “Hot Chocolate” and “Wine” in the [Product Name] column.
Adding Criteria to AGGREGATE
The AGGREGATE function, like SUMPRODUCT, can process arrays. This makes it a very useful function for those on older versions of Excel, as it negates the requirement to press Ctrl + Shift + Enter when entering an array formula.
In Excel for Microsoft 365, Excel 2021, and Excel for the Web, all formulas can handle arrays (covered in Chapter 10), so this benefit of the AGGREGATE function is not as important.
This is a fantastic functionality; however, options such as ignore hidden rows cannot be used when adding criteria to the AGGREGATE function.
This is a shame as it removes one of the key strengths of AGGREGATE, but it can still be a very useful technique.
A criterion has been added within a set of brackets and multiplied by the [Total] column values, just like we did in SUMPRODUCT.
Function number 14 is used to specify the LARGE function, and option 4 is used to ignore nothing. Remember, when adding criteria in an array, options such as hidden rows will not be applied anyhow.
Criteria can only be added like this, for the six functions that can handle arrays, as stated earlier in the chapter.
Function Specified by a Cell Value
For the final examples, let’s use a drop-down list to make it easy for the user to select the function they would like to apply. The user can simply select the aggregate function from a list, instead of typing those awkward index numbers in.
Figure 9-32 shows a table named [tblFunctions]. The [Function] column contains five functions that are being used as the source for a drop-down list. The [Index] column contains the index number associated with each of these functions.
Figure 9-33 shows the drop-down list of functions in cell E2 above [tblSales].
We will use the VLOOKUP function (other lookup functions are covered in Chapter 11) to look up the selected function in cell E3 within [tblFunctions] and return the index number for the AGGREGATE function to use.
The VLOOKUP function is nested within the function num argument of AGGREGATE. It returns the function number from [tblFunctions] for the selected function in E2.
With the 19 different functions on offer by AGGREGATE, it makes it simple to create this interactivity and enable a user to choose the function from a nice drop-down list.
Summary
In this chapter, we learned the SUMPRODUCT and AGGREGATE functions in Excel – two powerful functions that offer various benefits over other aggregation functions such as SUMIFS and AVERAGE.
Different examples were shown to demonstrate the full capability of these functions and detail of their limitations.
In the next chapter, we will dive into dynamic array formulas in Excel. One of the greatest updates in Excel history, the ability to handle arrays natively, and have them dynamically update in size, has changed how we write formulas.
We will explain what dynamic arrays are and how to use them effectively with Excel functions and features such as charts. We will also detail, with practical examples, a bunch of dynamic array functions including SORTBY, SEQUENCE, and SUM (OK, that one isn’t very new, but this is SUM 2.0, the dynamic array version).