© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
A. MurrayAdvanced Excel Formulashttps://doi.org/10.1007/978-1-4842-7125-4_10

10. Dynamic Array Formulas

Alan Murray1  
(1)
IPSWICH, UK
 

The introduction of dynamic array formulas in Excel is one of the greatest updates in Excel history. It completely changes the way that we think and write our formulas in Excel.

This fundamental change in how formulas calculate in Excel affects all formulas. So, this change can affect how you, or somebody else, write a classic Excel function such as SUM, IF, or VLOOKUP.

In addition to the introduction of the dynamic array formula engine, Microsoft released six new functions. These functions are commonly referred to as the dynamic array functions, though that is not an official function category.

Dynamic array formulas are only available in the Excel for Microsoft 365, Excel 2021, and Excel for the Web versions. When collaborating with Excel users from external organizations, you should exercise some caution, as they may have a different version of Excel to you.

In this chapter, we will begin by getting to know dynamic arrays. How do you use them, how would you recognize them on a spreadsheet, and what are their limitations?

We then go into detail on five of the six new dynamic array functions – UNIQUE, SORT, SORTBY, SEQUENCE, and RANDARRAY. The sixth function, FILTER, is covered on its own in Chapter 13.

Finally, we explore some examples of how the introduction of dynamic arrays has improved how we use existing Excel formulas. We focus on the FREQUENCY, TRANSPOSE, and SUM functions in this chapter, but the improvements go way beyond just those functions.

Getting to Know Dynamic Array Formulas

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

dynamic-arrays.xlsx

The formulas that we know and love in Excel have always been one cell, one formula. A formula would be filled, probably by clicking and dragging the fill handle, to any other cells in the range requiring that formula.

Dynamic arrays change all that. With the dynamic array formula engine, a formula can return multiple results. If a formula returns more than one value, the additional values are spilled to the adjacent cells. The range of cells that contain the formula is known as the spill range.

In Figure 10-1, the following formula has been entered in cell D2. It returns all the values for the range A1:B6.
=A1:B6

Notice the blue border surrounding the spill range. This appears when you click any cell within the spill range, making dynamic array formulas easy to identify on a spreadsheet.

Now, this is a simple example that is not very practical. However, it demonstrates the concept of dynamic arrays nicely. It also shows that a dynamic array formula can spill across rows and columns.
Figure 10-1

Dynamic array returning multiple values

To achieve this in an Excel version that doesn’t support dynamic arrays, you would need to
  1. 1.

    Select the output range D2:E7.

     
  2. 2.

    Write the formula: =A1:B6.

     
  3. 3.

    Press Ctrl + Shift + Enter.

     
The formula would be entered into all the cells in range D2:E7 and would be enclosed in curly braces, as shown in the following. You would never type these curly braces; they were automatically input by Excel on pressing Ctrl + Shift + Enter.
{=A1:B6}
Figure 10-2 shows an array formula as it would appear when entered in an Excel version that does not support dynamic arrays. Cell E4 is active, and the formula is shown in the Formula Bar. All 12 cells contain the same formula.
Figure 10-2

Array formula in an Excel version without dynamic arrays

Note

Array formulas were often referred to as CSE formulas due to the requirement to press Ctrl + Shift + Enter on running them.

The dynamic array formula back in Figure 10-1 only exists in the origin cell and spills the results (not the formula) to the other cells in the spill range.

In Figure 10-3, cell D3 is active. In the Formula Bar, the formula is shown in a light gray font. This is a visual so that we can understand how the result was achieved; the formula is not actually existing in the cell. If you were to click in the Formula Bar, nothing is shown.
Figure 10-3

Formula shown in spill range but is not present

Arrays vs. Dynamic Arrays

This example demonstrates nicely that although the use of arrays is not new in Excel, array formulas that output multiple results were awkward. You would need to know the cells that the formula would output to, which was a massive constraint to their use. They were static arrays.

So, dynamic arrays really are a combination of two key developments – dynamic + arrays.

These formulas are entered in the same way as any other formula and will handle arrays without the need to press Ctrl + Shift + Enter. And they are dynamic. You do not need to know the range of cells to output to. The spill range will dynamically grow and shrink depending on the number of values being returned by the formula.

Now, not all array formulas return multiple values, so single cell array formulas were easier to apply in the Excel of the past. You just needed to recall pressing Ctrl + Shift + Enter, and not just Enter.

However, did your colleagues know to press Ctrl + Shift + Enter, and what were the curly braces around the formula? Did you need to type them?

These limitations meant that, in the past, writing array formulas ended up being for advanced uses only, and the SUMPRODUCT function gained a lot of love as it could handle the arrays natively.

With dynamic array formulas, this is all a thing of the past.

Dynamic Arrays with Tables

In Figure 10-4, the following formula returns all the values from the table in range A1:B6 named [tblData]. If the [#All] was omitted, the formula would only return the table data excluding the headers.
=tblData[#All]
Figure 10-4

Dynamic array formula referencing all values in a table

Using dynamic array formulas with table data is a dream combination. The first example that referenced range A1:B6 works great. However, it is not a fully dynamic solution as using the range as the source is not dynamic. So, you are losing 50% of the brilliant dynamic + array capabilities.

Tables are dynamic and grow and shrink with data automatically. So, using table data as the source for a dynamic array formula ensures a completely dynamic solution.

Figure 10-5 shows a new row added to the table. The table expands, and the dynamic array formula expands in sync.
Figure 10-5

Dynamic array formula and table provide a fully dynamic solution

Dynamic Array Formula Example

Let’s look at a more practical case for a dynamic array formula. Figure 10-6 shows a table named [tblScores]. We will use an AVERAGEIFS function to calculate the mean average score for each region.
Figure 10-6

Table of scores

The common method to achieve this would be to write an AVERAGEIFS function like the following and fill it down to cell F5, as shown in Figure 10-7:
=AVERAGEIFS(tblScores[Score],tblScores[Region],E3)
The formula references the scalar value in cell E3 for the criteria. The formula is then filled down to cells F4 and F5, which reference the values in cells E4 and E5 consecutively. This is classic Excel.
Figure 10-7

Classic AVERAGEIFS being filled across other cells

With dynamic array formulas, we could reference the range of criteria values, instead of a single criteria value at a time.

Figure 10-8 shows the following formula as an alternative to the previous approach. Notice the blue border around the results identifying the perimeter of the spill range.
=AVERAGEIFS(tblScores[Score],tblScores[Region],E3:E5)

Instead of producing three formulas – one formula per cell – the dynamic array formula is one formula that returns three results.

Once again, this example is not dynamically complete as the criteria is not based on table data or a DA spill range. It, therefore, is not maximizing the full potential of dynamic arrays. We will soon be constructing our dynamic array formulas on table data and spill ranges only.
Figure 10-8

AVERAGEIFS returning a spill range

The last formula example could also be applied in non-dynamic array–enabled versions of Excel. However
  • The output range of F3:F5 would need to be selected before writing the formula. No “spilling” in older versions.

  • The formula would exist in each cell, while dynamic arrays are one cell, multiple results.

  • The formula range would need to be in line with the range being referenced as implicit intersection would be applied. In this example, F3:F5 is in line with E3:E5 so it would work. The formula can be run as an array formula by pressing Ctrl + Shift + Enter to override the implicit intersection.

With dynamic arrays, none of this needs to be considered. The formula is written just like any other formula and wherever you want.

Note

Implicit intersection is denoted by the @ symbol. In modern Excel, this is mainly seen in formulas in a table that reference other cells in the same row of a table, for example, =[@Country].

Welcome the # Operator

You may be wondering how one would reference the spill range of a dynamic array formula. You may want to use the results of a dynamic array formula in another formula or maybe in a chart.

If the spill range is dynamic, you wouldn’t necessarily know the final cell of the range.

Well, the answer, as I’m sure you have concluded from the title of this section, is the # symbol (typically referred to as the hash or pound sign).

To refer to all the cells in the spill range, you reference the origin cell that contains the dynamic array formula followed by the # symbol, for example, A2#.

In Figure 10-9, the following COUNT function is being entered and refers to the spill range that exists in cell F3. Notice the range F3:F5 changes color as it is recognized as the spill range on entering the # symbol.
=COUNT(F3#)
Note

The # symbol references the entire spill range including a spill range that contains multiple columns and rows. In Chapter 11, we will cover how the INDEX function can be used to reference specific columns or rows of a spill range only.

Figure 10-9

Spill operator accesses all values in a spill range

The #SPILL! Error

The #SPILL! error is created when the results of a dynamic array formula cannot be returned to the grid. This is typically caused by other content blocking the spill range.

In Figure 10-10, some text is causing the #SPILL! error to be returned. The DA formula needs blank cells in the spill range to return the results to. To remove the error, this text needs to be moved or deleted.
Figure 10-10

#SPILL! error caused by content blocking the spill range

Note

This is a similar behavior to what you may have encountered with PivotTables. They return an error when they are refreshed and need to expand in size but do not have the space to expand into.

The #SPILL! error is also returned if there are merged cells in the range that the formula needs to spill to or if a dynamic array formula is used within a table.

Figure 10-11 shows the #SPILL! error caused by merged cells in the spill range. The cells in range F5:G5 have been merged in this example.

You would need to unmerge the cells to remove the #SPILL! error.
Figure 10-11

Merged cells causing a #SPILL! error

Merged cells should only be applied in very specific cases in Excel. Very few actions can attract the ire of an Excel user more than the sight of merged cells.

DA Formulas Cannot Be Used Within Tables

The #SPILL! error is also caused by a dynamic array formula being used within a table. An example of this is shown in Figure 10-12. Note the blue corner in the bottom right indicating that the range is formatted as a table.

Using tables as the source data for dynamic arrays is an ideal relationship as it provides a fully dynamic solution. However, dynamic arrays are their own range and cannot be used within tables.
Figure 10-12

#SPILL! error caused by using a DA within a table

Let’s now look at the dynamic array functions that were released to take advantage of this new formula behavior.

UNIQUE Function

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

unique.xlsx

The UNIQUE function is a tremendous addition to Excel. It is incredibly helpful, and I find myself using it in most spreadsheets that I create.

Creating a distinct list of values is a very common Excel task. The Remove Duplicates button, the Advanced Filter, and PivotTables are popular tools to generate a distinct list of values in Excel. This distinct list may be used as the source for a Data Validation list or to create labels for a report.

This function is often used as a formula alternative for creating a distinct list of values. Being a dynamic array function, this approach would ensure a dynamic list that updates when values are added, removed, or changed in the source data.

In addition to returning a distinct list of values, this function can also return a unique list of values (as its name implies). We will cover the distinction (pun fully intended) between the often misunderstood terms of distinct and unique shortly.

The syntax for the UNIQUE function is as follows:
=UNIQUE(array, [by_col], [exactly_once])
  • Array: The range or array from which to return the values.

  • [By col]: To compare the values in rows or columns. Entered as a logical value (True/False). The default is FALSE to compare values in rows. Enter TRUE to compare the values in columns.

  • [Exactly once]: A logical value. The default is FALSE to return a distinct list of values. Enter TRUE to return a list of the unique values (those that only occur once).

Returning Distinct and Unique Values

The UNIQUE function, despite its name, actually returns the distinct values by default. A list of the distinct values is a list with the duplicate values removed. This is the most common use of this function.

A list of the unique values is a list of values that occur only once in the list. This is specified by entering TRUE for the final argument of the UNIQUE function.

There is a lot of confusion over the terms distinct and unique. Users often refer to a list of distinct values as a list of unique values. So be prepared for that when conversing with other Excel users.

It does not help that Microsoft named the function UNIQUE, yet it returns a distinct list by default. I believe it was named UNIQUE as that is the more common term, and Excel users would be more familiar with it. The final argument does however coincide with the function name. Enter TRUE for a unique list, or enter FALSE or omit the response for a distinct list.

In Figure 10-13, the following formula returns the distinct values from the [Customer] column. Five names are returned as there are three duplicates in the column.
=UNIQUE(tblAttendances[Customer])
Figure 10-13

Distinct list of values returned by UNIQUE

Note

Remember that these dynamic array functions automatically update as the source data changes. This formula has a table source for a dynamically complete formula.

In Figure 10-14, the following formula returns the unique names from the [Customer] column. The second argument is ignored, and TRUE is entered for the exactly once argument. There are three names that occur exactly once.
=UNIQUE(tblAttendances[Customer],,TRUE)
Figure 10-14

Returning unique values with the UNIQUE function

UNIQUE with Multiple Columns

The examples so far have demonstrated the UNIQUE function with a single column. However, UNIQUE can handle multicolumn arrays or even entire tables as the array from which to return the distinct or unique results.

In Figure 10-15, the following formula returns the distinct values from the [Name] and [Office] columns of the [tblPoints] table:
=UNIQUE(tblPoints[[Name]:[Office]])
The results show that an “Elizabeth Brown” has been returned twice. This is because the offices are different (London and York), so they must be two distinct Elizabeth Browns. However, there is a duplicate of “Maria Larsson” in [tblPoints], so only the first of those records was returned.
Figure 10-15

UNIQUE function using a multiple column array

In this multiple column array example, the columns were adjacent. In fact, the formula used the range tblPoints[[Name]:[Office]].

Let’s see how the UNIQUE function can be used with a multiple column array where the columns are non-adjacent.

In Figure 10-16, the following formula returns the distinct values from the [Name] and [Office] columns again, but in the [tblPoints2] table, they are separated by the [Points] column. The CHOOSE function (covered in Chapter 11) is used to supply UNIQUE with the [Name] and [Office] columns only.
=UNIQUE(
CHOOSE({1,2},tblPoints2[Name],tblPoints2[Office])
)
This is a technique that can be used to choose which columns you want to extract from any table or array. The CHOOSE function is given the array of constants {1,2}. It is then told to return the [Name] column as column 1 and the [Office] column as column 2. This is passed to UNIQUE to work with.
Figure 10-16

Returning non-adjacent columns with UNIQUE

You can also specify the order that the columns should be returned. We could have returned the [Office] column to the left of the [Name] column by using the following formula:
=UNIQUE(
CHOOSE({1,2},tblPoints2[Office],tblPoints2[Name])
)

UNIQUE to Compare Columns

The UNIQUE function compares the values in rows by default. This is the more common practice as we handle columnar data usually.

It is no problem for UNIQUE to work with data stored in rows though, if required. The second argument of the UNIQUE function allows us to switch to comparing values in columns. This is done by entering TRUE for the second argument named By col.

In Figure 10-17, the following formula returns the distinct values from range B4:G4:
=UNIQUE(B4:G4,TRUE)
Figure 10-17

UNIQUE function comparing columns

The results are returned along a row. The TRANSPOSE function could be used to convert the results into a column. This example is shown when we cover TRANSPOSE later in the chapter.

Distinct Labels for SUMIFS

We covered an example earlier in the chapter that used the AVERAGEIFS function on a range of criteria values. This generated a spill range with the results for each criterion in that range (Figure 10-8).

That example was a static report as the criteria values were simply typed into the cells. Let’s explore a dynamic solution where the UNIQUE function will return the distinct values from a table. A SUMIFS function will then use the spill range generated by UNIQUE for its criteria values.

Figure 10-18 shows the first few rows of a table named [tblSales] that we will use for this example. Our goal is to create a small report showing the sum of the [Total] column for each product in the [Product Name] column.
Figure 10-18

Table of sales data

In Figure 10-19, the summary report is created. The following UNIQUE formula has been used in cell B3 to generate the distinct values from the [Product Name] column of [tblSales]:
=UNIQUE(tblSales[Product Name])
The following SUMIFS function is entered in cell C3 to sum the [Total] column for each of the product names in the spill range of cell B3:
=SUMIFS(tblSales[Total],tblSales[Product Name],B3#)
This report is completely dynamic. If new rows are added to the [tblSales] table, the report will update. And if new product names are added, the report will automatically expand with the new product and its total.
Figure 10-19

Table showing sales by product using UNIQUE and SUMIFS

Distinct Count Formula

The examples so far have all returned values to the grid. The UNIQUE function can, of course, also be nested within other functions. A natural example of this would be to count the number of distinct values returned.

In Chapter 9, we created a distinct count formula using SUMPRODUCT and COUNTIFS. With dynamic arrays and the UNIQUE function, this is much simpler.

In Figure 10-20, the following formula is entered in cell D7. It uses the UNIQUE function with COUNTA to count the distinct values only. In this example, that returns the number of different webinars that occurred.
=COUNTA(UNIQUE(tblWebinars[Webinar]))
The following formula is entered in cell E7 to return the number of webinars that only occurred once in the table. The result is two, the “Pro Formatting in Excel” and the “Power BI for Beginners” webinars .
=COUNTA(UNIQUE(tblWebinars[Webinar],,TRUE))
Figure 10-20

Count distinct and count unique formulas

SORT Function

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

sort.xlsx

Sorting the data in a table is one of the most common everyday tasks that an Excel user performs. The introduction of the SORT function to automate the sorting of the data is a revelation for how we create in Excel.

In the past, PivotTables, VBA, or an overly complex formula would be used to automate the sorting of data in Excel reports and models. With the SORT function, this is now very easy to do.

This is the syntax of the SORT function:
=SORT(array, [sort_index], [sort_order], [by_col])
  • Array: The range or array of data that you want to sort and return.

  • [Sort index]: An index number that represents the column or row of the array to sort by. If omitted, the first column or row of the array is used.

  • [Sort order]: The order that the column or row should be sorted. Enter 1 to sort in ascending order and –1 to sort in descending order. If omitted, it will sort in ascending order.

  • [By col]: A logical value that specifies whether to sort by row or by column. The default value is FALSE, which sorts by row (vertical sort). Enter TRUE to sort by column (horizontal sort).

Simple SORT Example

Let’s start with a simple example of the SORT function being used to sort a single column of data.

In Figure 10-21, the following formula sorts the list of country names in ascending order.

As the table is a single column of data, there is no need to specify a column to sort by (sort index). The SORT function also defaults to sorting in an ascending order, which is what we want. So, no other arguments outside of the array to sort are given in this example. Very simple!
=SORT(tblCountries[Name])
Figure 10-21

Simple example of the SORT function

Sort the Distinct Values

When discussing the UNIQUE function previously, we created a small report showing the total sales by product from data in [tblSales] (Figure 10-18). This report is shown in Figure 10-22.
Figure 10-22

Total sales by product in no particular order

The results in this report are in no particular order. They actually appear in the order that UNIQUE finds them in [tblSales]. This is not useful. Let’s use the SORT function to sort the distinct list of product names in ascending order.

In Figure 10-23, the SORT function is added to sort the values returned by UNIQUE.
=SORT(UNIQUE(tblSales[Product Name]))
Once again, no other arguments are required because we are sorting a single column, and we require the argument defaults of ascending order and sort by row (vertical sort).
Figure 10-23

Sorting the values returned by UNIQUE

Sort by Specified Column

Let’s move on and see examples of the SORT function returning an array with multiple columns. For these examples, we will use the table named [tblProductSales] (Figure 10-24).
Figure 10-24

Table of product sales

For our first example in Figure 10-25, the following formula is using the entire [tblProductSales] table for its array. None of the optional arguments are specified, so the SORT function has applied an ascending sort to the first column in the array.
=SORT(tblProductSales)
So, in this example, the results are in ascending order by the product name.
Figure 10-25

SORT function returning a multiple column array

Let’s specify a different column for the results to be sorted by. We want the results to be in a descending order by the [Total] column.

We need to provide the SORT function with the sort index. So, for the [Total] column, this will be column 4.

In Figure 10-26, the following SORT formula has been used. –1 has been entered for the sort order to sort the values in a descending order.
=SORT(tblProductSales,4,-1)
Figure 10-26

SORT results by column 4 in descending order

Sort by Multiple Columns

It is possible to request the SORT function to sort using multiple columns. To do this, we will enter the multiple sort indexes and sort orders within arrays.

For this example, we will sort the [tblProductSales] data by two levels: first by the [Category] column in ascending order and then by the [Total] column in descending order.

In Figure 10-27, the following SORT formula is entered in cell B3:
=SORT(tblProductSales,{2,4},{1,-1})

The array {2,4} is used in the sort index argument to specify that the data should be sorted by column 2 and then by column 4.

The array {1,-1} is used in the sort order argument to state that column 2 should be sorted in an ascending order and column 4 should be sorted in a descending order.

This formula could be extended to sort by more columns if required.
Figure 10-27

Sorting an array by multiple columns with SORT

SORT Function Returning Specific Columns from a Table

For the final SORT function example, we will return specific columns from [tblProductSales].

Dynamic array functions in isolation can only return columns that are adjacent to each other. In this example, we will return the [Product Name], [Units], and [Total] columns only.

These columns are split by the [Category] column in the table, so SORT will need some assistance to work with these columns as they are non-adjacent.

To do this, we will nest the CHOOSE function within the SORT function to specify the columns to work with. We saw the CHOOSE function assuming this role previously in this chapter within the UNIQUE function.

Note

In Chapter 11, we will see the INDEX and MATCH functions together to create a more dynamic method for returning non-adjacent columns in dynamic arrays.

In Figure 10-28, the following SORT formula returns the three required columns and sorts the data in descending order by the [Total] column. Column 3 is specified as the sort index as the [Total] column is the third column of the array returned by CHOOSE.
=SORT(
CHOOSE({1,2,3},
tblProductSales[Product Name],tblProductSales[Units],tblProductSales[Total]),
3,-1)
Figure 10-28

SORT function returning specific columns from a table

When returning only three columns, this works quite well and is simple to deploy. In more complex examples, this CHOOSE technique can be bulky, and a more advanced technique would involve the INDEX function. We will see examples of this technique in Chapters 11 and 13 of the book.

Note

No examples are shown in this chapter of sorting data by column (sorting horizontally) due to the rarity of this behavior. Excel data is better in a columnar layout. However, TRUE can be entered for the By col argument to sort horizontally. An example of the By col argument was shown with the UNIQUE function previously.

SORTBY Function

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

sortby.xlsx

If the SORT function is not enough for you, Microsoft also gave us SORTBY. There are benefits for using each of these two sort functions.

The SORT function is perfect for very simple single column sorting. It is also better for sorting based on dynamic columns, as the columns are specified using index numbers.

The SORTBY function has two key advantages over SORT:
  • The column to sort by is specified using the range or array. This absolute reference to the column to sort by makes it fantastic for use with table data.

  • With the SORTBY function, you can sort an array by a column that is outside of the array being returned.

This is the syntax for the SORTBY function:
=SORTBY(array, by_array1, [sort_order1], [by_array2], [sort_order2])
  • Array: The range or array of data that you want to sort and return.

  • By array1: The range or array to sort by. The array must be one column wide or one row high.

  • [Sort order1]: The order that the column or row should be sorted. Enter 1 to sort in ascending order and –1 to sort in descending order. If omitted, it will sort in ascending order.

  • [By array2], [sort order2]: The corresponding arrays to sort by and sort orders for any additional sort levels you require.

    Note The SORTBY function can sort data that is arranged in rows or columns. It does not have a special argument for this functionality like UNIQUE and SORT do. SORTBY will automatically understand the direction of the sort and spill range.

The following examples of the SORTBY function will use the table shown in Figure 10-29 named [tblProductSales].
Figure 10-29

Table of product sales

Simple SORTBY Example

Let’s begin with a simple example that returns all columns from the [tblProductSales] table sorted in descending order by the [Total] column.

In Figure 10-30, the following formula is entered in cell B3. This SORTBY formula is more meaningful than its SORT equivalent (Figure 10-26) as it uses the structured reference of tblProductSales[Total] instead of the sort index of 4. The –1 specifies the descending order.
=SORTBY(tblProductSales,tblProductSales[Total],-1)
Figure 10-30

Simple SORTBY example sorting the array by the [Total] column

Sort by Multiple Columns with SORTBY

The SORTBY function can sort an array by multiple columns, and, once again, it is an easier task than the equivalent formula with the SORT function (Figure 10-27).

SORTBY has arguments for us to keep specifying the additional columns and sort orders that we want to apply.

In Figure 10-31, the following formula sorts the data from the [tblProductSales] table by [Category] in ascending order and then by [Total] in descending order:
=SORTBY(tblProductSales,
tblProductSales[Category],1,tblProductSales[Total],-1)
A 1 is entered in the sort order 1 argument to specify ascending order for the [Category] column. This could have been omitted, and the formula would have worked fine, as ascending order is the default. However, it is a more meaningful formula, I believe, when this sort order is stated.
Figure 10-31

Sort by multiple columns with SORTBY

Sort by Column Outside of the Returned Array

A great advantage that the SORTBY function has over SORT is the ability to sort an array using a column, or row, that is not included in the returned array.

In Figure 10-32, the [Product Name] and [Total] columns are returned with some help from the CHOOSE function. This array has been sorted in a descending order by the [Units] column, which is not included in the returned array. Pretty cool!
=SORTBY(
CHOOSE({1,2},tblProductSales[Product Name],tblProductSales[Total]),
tblProductSales[Units],-1)
Figure 10-32

SORTBY sorting by a column not included in the returned array

This can be a very useful technique. Let’s see another example.

Sort Product Name by Sales Totals

Let’s return to the sales by product report that we have used in a few examples of this chapter so far. Figure 10-33 shows this report with the total sales by product and the results in an ascending order by the product name.
Figure 10-33

Sales by product sorted by the product name

We will change this to sort the results by the total sales in a descending order. It will be more practical to order the products by the sales total rather than their name. And it’s a great example of how useful SORTBY is.

We will start by using the SORTBY function to sort the product names by the sales totals. The SUMIFS function will be used to sum the [Total] column for each product. With the product names ordered correctly, the SUMIFS function will then be used again to create the column of total sales values.

In Figure 10-34, the following formula sorts the distinct list of product names returned by UNIQUE in a descending order by the totals returned by SUMIFS:
=SORTBY(UNIQUE(tblSales[Product Name]),
SUMIFS(tblSales[Total],tblSales[Product Name],UNIQUE(tblSales[Product Name])),
-1)
Figure 10-34

Sort product names by sales totals in descending order

This is another nice example of using SORTBY to sort by an array outside of the one being ordered. We now need to return the totals.

In Figure 10-35, the following formula uses the SUMIFS function to sum the values in the [Total] column for each product in the spill range B3#:
=SUMIFS(tblSales[Total],tblSales[Product Name],B3#)
We will use these spill ranges in a bar chart later in this chapter. This dynamic sorting by sales totals will ensure that our bar chart orders the bars from largest to smallest.
Figure 10-35

SUMIFS to sum the totals for each product

SEQUENCE Function

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

sequence.xlsx

The first time that you see it, it may not immediately be apparent how brilliant the SEQUENCE function is. Certainly not in the way that functions such as SORT and UNIQUE will instantly earn your affection. But believe me, SEQUENCE is an awesome function.

The SEQUENCE function returns a series of numbers. It is the formula equivalent of the Fill Series feature in Excel. As a function, we can nest it within other functions to create some real formula magic.

It accepts four arguments, but only the first argument (rows) is required:
=SEQUENCE(rows, [columns], [start], [step])
  • Rows: The number of rows to return.

  • [Columns]: The number of columns to return.

  • [Start]: The first number in the sequence. If omitted, the sequence begins from one.

  • [step]: The amount to increment each subsequent value in the sequence. If omitted, it will step one at a time.

Stepping into the SEQUENCE Function

Let’s start with some basic examples to get a feel for how to operate the SEQUENCE function. This will prepare us for the more practical examples to come in this chapter and the next.

In Figure 10-36, a simple SEQUENCE function is entered in cell B2 to generate a series of numbers from one to eight. Only the rows argument is entered. A start and step value of one is used as these arguments were omitted.
=SEQUENCE(8)
Figure 10-36

Number series across rows with SEQUENCE

In Figure 10-37, the columns argument is included in the SEQUENCE function. The formula generates a series of numbers starting from one that is eight rows high and three columns wide.
=SEQUENCE(8,3)
The order of the number sequence goes row by row. Shortly, we will see how this order can be changed to column by column.
Figure 10-37

Using the rows and columns arguments of SEQUENCE

Let’s now add the start and step arguments to the SEQUENCE function. In Figure 10-38, all arguments are supplied.
=SEQUENCE(6,4,5,5)
The formula returns a series of numbers that is six rows high and four columns wide. It begins from number five and steps every fifth number.
Figure 10-38

SEQUENCE function with all arguments specified

A negative step can be used to reverse the sequence of values returned. In Figure 10-39, the following SEQUENCE function returns a series that is five rows high, starts from five, and steps negatively:
=SEQUENCE(5,,5,-1)
Figure 10-39

Reversing a sequence with a negative step

In this example, the columns argument is skipped. The default one column wide is used.

Changing the Row/Column Order

When returning a two-dimensional array, we saw that the numbering of the series goes across columns, before going to the next row (Figure 10-38).

To switch the ordering of the numbers in the series, the TRANSPOSE function could be wrapped around SEQUENCE.

In Figure 10-40, the following formula switches the direction of the series of the formula shown in Figure 10-38 to down across rows before moving to the next column:
=TRANSPOSE(SEQUENCE(4,6,5,5)
The TRANSPOSE function has been added, and the number of rows and columns has been reversed. SEQUENCE returns an array of four rows and six columns, and this is transposed. Awesome!
Figure 10-40

Changing the row/column order with TRANSPOSE

Let’s see some examples now that really showcase the flexibility and potential of the SEQUENCE function.

Sum the Top N Values

Performing a calculation on the top N values is a common task for reporting in Excel. For this example, we will sum the top N values, but a calculation such as average could also be applied.

We saw a method to sum the top five values in Chapter 9 with the SUMPRODUCT function. This was great as it works for all versions of Excel.

If you have a dynamic array–enabled version of Excel though, there is a better way. It is simpler and more dynamic. And of course, SEQUENCE plays an important role.

The LARGE function will also be used in our formula. This function has been covered already in this book, but let’s briefly remind ourselves of the purpose and syntax of the LARGE function.

The LARGE function returns the kth largest value in a range, for example, the second, third, or fifth largest value. Here is its syntax:
=LARGE(array, k)
  • Array: The range, table column, or array from which to return the kth largest value

  • K: The position of the value in the array (from the largest) that you want to return

In Figure 10-41, the following formula sums the top N values where N is specified by the value in cell D3:
=SUM(LARGE(tblProductSales[Total],SEQUENCE(D3)))

The SEQUENCE function is used for the K argument of the LARGE function. It returns an array of numbers determined by the value in cell D3. In this example, cell D3 contains the number 3, so the SEQUENCE function returns {1,2,3}.

This replaces the need to enter an array of constants like we did in the SUMPRODUCT example in Chapter 9. This also keeps it dynamic.

The LARGE function returns the first, second, and third largest values from the [Total] column, and these are summed by SUM.
Figure 10-41

Summing the top N values

Sum the Last N Days

The SEQUENCE function is great for creating a series of dates that occur at specific intervals. The activity on these dates can then be analyzed with other formulas or charted.

For this example, we will use the SEQUENCE function to return the last N days. We will find the last date in a range and then return each date for the last N days. The number of days will be specified by a cell value.

Figure 10-42 shows the first few rows of a table named [tblSales]. This table will be used as the data source for the next two examples. It contains sales data with daily transactions. The table has been sorted by [Sales Rep] to show the different dates in the [Date] column.
Figure 10-42

tblSales data with daily transactions

In Figure 10-43, the following formula uses SEQUENCE to return the dates for the number of rows entered in cell C2. The MAX function has been used to return the latest date in the [Date] column. A negative one has been used for the step argument to reverse the series of dates and return the last seven dates from the latest date.
=SEQUENCE(C2,,MAX(tblSales[Date]),-1)
The following SUMIFS function returns the total for each date in the spill range returned by SEQUENCE. You can see the B5# reference for the criteria argument.
=SUMIFS(tblSales[Sales],tblSales[Date],B5#)

This is a fully dynamic solution.

If new dates are added to [tblSales], the MAX function will pick it up, and SEQUENCE will adjust to the new data. And if a different value is entered into cell C2, the SEQUENCE function will return that number of rows. The SUMIFS function references the spill range, so this will also update.
Figure 10-43

SEQUENCE returning a series of dates for analysis

Sum the Last N Months

Let’s expand on this example and get the SEQUENCE function to return the last N months from the last month in the [tblSales] table.

In Figure 10-44, the following formula returns the last seven months from the latest date in the [Date] column. The number of months is specified by the value in cell F2.
=DATE(2021,
SEQUENCE(F2,,MONTH(MAX(tblSales[Date])),-1),
1)

The DATE function is used to construct a date value for the results. These values could be formatted to show month names using the custom number formatting feature of Excel. I have not bothered with this in the example, as I’m happy to show the first date of each month.

The MAX function is used again to return the latest date in the [Date] column of [tblSales]. The MONTH function extracts the month number of the latest date.
Figure 10-44

DATE and SEQUENCE functions to return last N months

In Figure 10-45, the following SUMIFS function totals the values from the [Sales] column that occurred between the first and last day of the month in context:
=SUMIFS(tblSales[Sales],
tblSales[Date],">="&E5#,
tblSales[Date],"<="&EOMONTH(+E5#,0)
)

The spill range is accessed using E5# for the first date of the month.

The EOMONTH function is used to return the last date of a month a specified number of months in the future or past. A zero is entered for the months argument to return the last date of the current month.

You may notice the + sign before the spill reference in the EOMONTH function. The EOMONTH function is one of a few functions in Excel that does not spill. An error is returned if the spill reference E5# is used. A trick to work around this issue is simply to add the + before the spill reference.
Figure 10-45

Total values between first and last day of a month

Note

The DATE and EOMONTH functions were covered in detail in Chapter 6.

Last N Months by Country – Pivot Style Report

The last two examples have used the SEQUENCE function to spill the date values across rows. But, of course, SEQUENCE can spill across columns also. This feature can be used to create awesome pivot style reports with time-based analysis.

In Figure 10-46, the following formula is used in cell C4 to spill the last N months across columns. The number of months spilled is based on the value in cell D2 .
=DATE(2021,
SEQUENCE(,D2,MONTH(MAX(tblSales[Date]))-D2+1,1),
1)

There are some differences in this formula compared to the previous one that spilled across rows.

Notably, the rows argument is omitted. The other key difference is that the dates are ordered smallest to largest. In the previous example, they were largest to smallest. To do this, the value in cell D2 was subtracted from the month of the maximum date. This takes us a month too far, so one is added. The SEQUENCE function steps one month at a time from that date .
Figure 10-46

SEQUENCE for last N months spilled across columns

The following SORT and UNIQUE function combination is used to generate the list of countries:
=SORT(UNIQUE(tblSales[Region]))
The following SUMIFS formula is used to sum the values in the [Sales] column for the country entered in the B5# spill range and month in the C4# spill range. The + is entered before the spill range for the EOMONTH function again to get this function to work with the dynamic array.
=SUMIFS(tblSales[Sales],
tblSales[Region],B5#,
tblSales[Date],">="&C4#,
tblSales[Date],"<="&EOMONTH(+C4#,0)
)

Sum Values from Alternate Rows/Columns

There are many scenarios in Excel where we can take advantage of the SEQUENCE function and its ability to return a sequence of values. Another of these scenarios is to perform a calculation on the values in every Nth cell. In the first of these examples, we will sum the values in alternate rows.

Figure 10-47 shows the following formula being used to sum the “Income” values only:
=SUM(
INDEX(tblAccounts[Amount],
SEQUENCE(ROUNDUP(ROWS(tblAccounts[Amount])/2,0),,,2))
)
Figure 10-47

Sum alternate rows with SEQUENCE

The SEQUENCE function returns the following array of numbers. These numbers are the row numbers of the table for which we want to sum the values.
{1;3;5;7}

Four numbers are returned, as the ROWS function returns the total number of rows in the table (eight), and this is divided by two as we want every other row, that is, half the rows.

The ROUNDUP function is included in the scenario that we have an odd number of rows. In this example, it is not relevant, as we are dealing with income and expenses, so the total rows will be an even number. However, it has been added for completeness. If there were 9 rows in total, half would produce 4.5 and ROUNDUP would make this 5. The rows {1;3;5;7;9} would be returned in that scenario.

The columns and start arguments are omitted, and the SEQUENCE function is asked to step by two.

The INDEX function is used to return the values from the [Amount] column for each of the rows. It returns the following array for the SUM function:
{4726;4227;4835;3946}

With the SEQUENCE function, the values that we step through can represent rows or columns, and we can specify any Nth value. For a second example, we will return to a task that we completed with the SUMPRODUCT function along with others in Chapter 9.

In Figure 10-48, the following formula is entered in column B and sums the values in every fourth column (the two quarter totals):
=SUM(
INDEX(C2:J2,,
SEQUENCE(,COLUMNS(C2:J2)/4,4,4))
)
Figure 10-48

Summing the value in every Nth column

This time, the rows argument is omitted, and the COLUMNS function is used to return the total columns in the range C2:J2 (eight columns). This value is then divided by four (the number of columns we want to step) to return the result of two columns.

SEQUENCE is then told to start from four and step every four. The following array is returned:
{4,8}

The INDEX function then returns the values from columns 4 and 8 in the range C2:J2 to be summed. Notice that the rows argument of INDEX is omitted.

Note

We will see more of the INDEX and SEQUENCE functions working together when we discuss the INDEX function in detail in Chapter 11.

RANDARRAY Function

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

randarray.xlsx

For years, Excel has had two functions that are used to return random values – RAND and RANDBETWEEN. There is now a third musketeer – the RANDARRAY function. This function merges the characteristics of the two existing functions along with further advantages.

RAND and RANDBETWEEN

Availability: All versions

Let’s begin with a quick look at the RAND and RANDBETWEEN functions. They are still great to use, and understanding them will help us appreciate the role that RANDARRAY plays.

The RAND function returns a random number between 0 and 1. For example, RAND will generate a number like 0.657233. Its syntax is very simple as it has no arguments:
=RAND()
Figure 10-49 shows the RAND function entered in cell B2 and filled down to cell B6 to return five random numbers between 0 and 1. Five formulas are used here as we are using a non-DA formula. RANDARRAY will be able to achieve this in one formula.
Figure 10-49

RAND function generating random numbers between 0 and 1

The RANDBETWEEN function returns a random integer number from within a given range. The bottom and top values of this range are specified by us. The RANDBETWEEN function requires only two arguments:
=RANDBETWEEN(bottom, top)
  • Bottom: The smallest integer number that RANDBETWEEN can return

  • Top: The largest integer number that RANDBETWEEN can return

Figure 10-50 shows the RANDBETWEEN function returning random integer numbers between 50 and 250.
=RANDBETWEEN(50,250)

It has been entered in cell B2 and filled down five rows and across three columns to cell D6. There are 15 formulas entered to produce these random values. It is probably no surprise that RANDARRAY can handle this task also with just one formula that spills.

So, the two key differences between RAND and RANDBETWEEN are that RAND returns decimal values while RANDBETWEEN returns integers and that RANDBETWEEN allows us to specify the range of values from which to return.
Figure 10-50

RANDBETWEEN returning random integer numbers between 50 and 250

Both functions generate a new random number every time the worksheet calculates.

Note

Functions that behave in this manner are known as volatile functions.

Introduction to RANDARRAY

The RANDARRAY function returns an array of random numbers. All the finest qualities of RAND and RANDBETWEEN are included, with the added ability to return a dynamic array that spills.

You can specify the number of rows and columns to spill, whether you want decimal values or integers, and you can specify the bottom and top values from which to return the random number(s).

The syntax of the RANDARRAY function is as follows. Interestingly, all the arguments are optional:
=RANDARRAY([rows], [columns], [min], [max], [integer])
  • [Rows]: The number of rows to return.

  • [Columns]: The number of columns to return.

  • [Min]: The smallest number that the RANDARRAY function can return.

  • [Max]: The largest number that the RANDARRAY function can return.

  • [Integer]: Would you like an integer value returned? Type TRUE to return an integer value or FALSE for a decimal value. If omitted, FALSE is applied and decimal values are returned.

Let’s look at a few examples that show different applications of these arguments. Then we will see two practical examples of RANDARRAY.

Figure 10-51 shows the RANDARRAY function being used with no arguments. It behaves like the RAND function and returns a random number between 0 and 1 to a single cell. Remember, the default of RANDARRAY is to return a decimal value.
Figure 10-51

RANDARRAY with no arguments entered

And in Figure 10-52, the RANDARRAY function returns a random integer number between 5 and 250. It is behaving like RANDBETWEEN in this example.

The rows and columns arguments have been omitted, and TRUE is specified to return an integer.
=RANDARRAY(,,5,250,TRUE)
Figure 10-52

RANDARRAY being applied like RANDBETWEEN

Now, these examples show us nothing that we could not have achieved with the existing RAND and RANDBETWEEN functions. They are only shown to provide an extensive understanding of using RANDARRAY and to show that the characteristics of both existing random number–generating functions exist within RANDARRAY.

A key advantage of RANDARRAY is clearly its ability to return an array and therefore be used within our dynamic array formulas.

In Figure 10-53, the formula returns five rows of values that are between 1 and 10. The columns argument is omitted, so a single column of values is returned. The integer argument is also omitted, so RANDARRAY returns decimal values.
=RANDARRAY(5,,1,10)
Figure 10-53

RANDARRAY returning five rows of decimal values between 1 and 10

This example not only demonstrates an array being returned but also RANDARRAY returning decimal values larger than 1. This is something that the RAND and RANDBETWEEN functions alone do not offer. RANDBETWEEN returns integers only, and RAND returns decimal values, but only between 0 and 1.

Finally, in Figure 10-54, all arguments of RANDARRAY are completed. An array of five rows and five columns is returned with integer numbers between 1 and 500.
=RANDARRAY(5,5,1,500,TRUE)
Figure 10-54

RANDARRAY with all arguments completed

Pick a Name from a List at Random

A practical example of using the RANDARRAY function is to return a value at random from a list. In this example, we want to return a name at random from a table named [tblNames].

In Figure 10-55, the following formula is entered in cell D3. A single value is being returned, so the rows and columns arguments are omitted. The ROWS function is used to return the total count of names in the table. This is used as the maximum number to be returned. Integer values are specified.
=INDEX(tblNames,
RANDARRAY(,,1,ROWS(tblNames),TRUE)
)

The random value returned by RANDARRAY is used as the row number to return the name from. This is the row of the table, not the row of the spreadsheet. The INDEX function is used to return the name from that row of the [tblNames] table.

In this example, as a single value is being returned, the RANDBETWEEN function works just as well. The following formula shows the RANDBETWEEN function being used instead of RANDARRAY:
=INDEX(tblNames,
RANDBETWEEN(1,ROWS(tblNames))
)
Figure 10-55

Returning a name at random from a list

A different name is returned every time the worksheet calculates. The F9 key is pressed to run calculations on the worksheet.

Shuffle a List of Names

In this example, we will shuffle a list of names, changing the order every time the sheet is calculated. This is a solid example of dynamic arrays in action, as we are now returning an array that is dynamic.

In Figure 10-56, the following formula is entered in cell F3 to return a randomized list of the names in the [tblNamesList] table in columns B and C:
=SORTBY(tblNamesList[Names],
RANDARRAY(ROWS(tblNamesList))
)

The RANDARRAY function has been inserted into the by array argument of the SORTBY function. So, the names are sorted by randomly generated numbers.

The number of rows to return has been calculated by using the ROWS function on [tblNamesList]. No other arguments in RANDARRAY are used, so the function is returning decimal values between 0 and 1.
Figure 10-56

Using RANDARRAY and SORTBY to shuffle a list

Dynamic arrays make tasks such as this easy. The following SEQUENCE formula is used in cell E3 to generate the position numbers. So, this solution is fully dynamic.
=SEQUENCE(ROWS(tblNamesList))
Note

An example of the RANDARRAY function is shown with the CHOOSE function in the next chapter to randomly generate some sample data. This is useful for testing and practice.

FREQUENCY and TRANSPOSE Functions

Availability: All versions

File

frequency-and-transpose.xlsx

Dynamic array formulas are not all about the fabulous new functions. Existing Excel functions also benefit from the array engine. Two functions that really saw a new lease of life were the FREQUENCY and TRANSPOSE functions.

These two functions are array functions. So, in older versions of Excel, they were very awkward to use. The output range had to be selected before writing the formula, and then the user would have to press Ctrl + Alt + Enter to run the function.

With the dynamic array formula engine, these functions are just like any other Excel function. They have essentially been reborn.

FREQUENCY Function

The FREQUENCY function calculates how often values occur within a range of values (known as bins). The results are always returned as a vertical array.

The syntax of the FREQUENCY function is as follows:
=FREQUENCY(data_array, bins_array)
  • Data array: The range, array, or table column that contains the values for which you want to count the frequencies.

  • Bins array: The range or array that contains the intervals for which to group the values.

In Figure 10-57, the following FREQUENCY function returns the frequencies that the values in the [Score] column occur within the intervals specified in range E4:E7:
=FREQUENCY(tblScores[Score],E4:E7)

The score must be greater than the interval value to be grouped in that range. So, the bins array can be read as 1–50, 51–65, 66–80, 81–90, and >90.

The FREQUENCY function always returns one extra value to the number of intervals in the bins array. This extra value is the count of values greater than the largest interval.

To cater for this behavior, the formula was entered into cell F3, one cell before the range specified for the bins array.
Figure 10-57

FREQUENCY function returning the frequencies of scores

Note

In Chapter 8, we saw the COUNTIFS function used to create a frequency distribution table. This is my preferred approach due to the extra flexibility it provides. However, FREQUENCY is great and was built for this purpose.

TRANSPOSE Function

Not many functions benefitted from the introduction of the dynamic array engine more than TRANSPOSE. The TRANSPOSE function was a sleeping giant shackled by Excel’s inability to natively handle arrays in older versions.

The shackles have now been broken for this function to show its value. Welcome to TRANSPOSE 2.0.

The TRANSPOSE function rotates data that is arranged horizontally to being arranged vertically and vice versa. It requires only the array to be transposed.
=TRANSPOSE(array)

Many people are familiar with the transpose functionality that is available when you copy and paste data. This is very useful. With the TRANSPOSE function, you can automatically transpose the results of a formula or feed a formula with a transposed array.

In Figure 10-58, the following formula has been entered in cell B6 to transpose the data in range B2:H4:
=TRANSPOSE(B2:H4)
The data in range B2:H4 has been rotated from being arranged along rows to being arranged down columns.
Figure 10-58

Switching horizontal data to vertical with TRANSPOSE

In versions of Excel that do not have dynamic arrays, the TRANSPOSE function is a horrible experience.

To achieve the same result for this example, you would need to select the output cells for the TRANSPOSE function. The range to be transposed is seven columns wide and three rows high. So, you would need to select an output range that is three columns wide and seven rows high and then type the TRANSPOSE function followed by pressing Ctrl + Shift + Enter.

This is not a feasible way to work. It is awkward, not dynamic, and required Ctrl + Shift + Enter to operate.

Thankfully, it now works like any formula – one cell, one formula that spills.

Let’s see a quick second example.

Earlier in this chapter, we used the UNIQUE function as shown in Figure 10-59. It is entered in cell B2 to compare columns in the range B4:G4 and return the distinct values.
Figure 10-59

Data containing duplicate values in range B4:G4

We may need this array to be rotated into a vertical array. This could be to feed a Data Validation list or maybe to feed an Excel function that only works with vertical arrays (there are a few).

In Figure 10-60, the following formula is entered into cell I2:
=TRANSPOSE(UNIQUE(B4:G4,TRUE))
Figure 10-60

Transposing a horizontal array returned by UNIQUE

Note

The TRANSPOSE function was shown with the SEQUENCE function earlier in this chapter to change the column-row order that SEQUENCE uses when producing two-dimensional arrays.

PivotTable Style Report Using Formulas

A fantastic use of the TRANSPOSE function is how it can be used to produce labels for a PivotTable style report.

We saw an example earlier in the chapter of the SEQUENCE function generating time-series data across columns for a PivotTable style report. This is simple with SEQUENCE as it has a columns argument, and this is great for numeric values. But to get text values dynamically displayed across columns, TRANSPOSE is superb.

Figure 10-61 shows a table named [tblSales]. This will be used as the source data for our pivot style report.
Figure 10-61

Sales data to be used as the source for a pivot style report

We will create a report that displays the [Product Name] values as row labels and the [Store] values as column labels. To show the [Store] values across columns, they will need to be transposed. The values in the [Total] column will be summed for each product and store.

In Figure 10-62, the following formula is used in cell C2 to create the column labels of the report. The distinct values from the [Store] column are sorted and then transposed.
=TRANSPOSE(SORT(UNIQUE(tblSales[Store])))
Figure 10-62

TRANSPOSE for column labels in a pivot style report

This is the reverse of the previous two examples where the TRANSPOSE function was used to switch data that was arranged horizontally to vertically.

The following formula is entered in cell B3 to generate the row labels:
=SORT(UNIQUE(tblSales[Product Name]))
And finally, the following SUMIFS formula is entered in cell C3 to produce the totals for each product and store. The two spill ranges are used for the criteria.
=SUMIFS(tblSales[Total],tblSales[Product Name],B3#,tblSales[Store],C2#)

By using dynamic array formulas based on a table, we have a completely dynamic solution here that updates automatically, unlike PivotTables. It also provides greater flexibility than a built-in tool like PivotTables would allow.

SUM v2.0 – The DA Version

Availability: Excel for Microsoft 365, Excel for the Web, Excel 2021, Excel for Microsoft 365 for Mac, Excel 2021 for Mac

File

sum.xlsx

Excel’s number one function, the SUM function, has also had a new lease of life thanks to the array engine available in Excel 365, Online, and 2021.

In modern versions of Excel, a simple SUM can perform the tasks that we would previously rely on SUMPRODUCT for. It can handle complex arrays, multiple conditional sums, and multiple columns of values (unlike SUMIFS).

Let’s welcome SUM 2.0.

Sum Based on Complex Criteria

Let’s dive straight in and see the SUM function performing a sum that is based on complex criteria. We need to perform both AND and OR logic in the criteria of this formula.

Figure 10-63 shows the first few rows of a table named [tblSales]. This is the source data that our formula will be using.
Figure 10-63

Table of sales data

In Figure 10-64, the following formula sums the values in the [Total] column for sales where the [Product Name] was equal to the value in cell B3 or C3 and the [Store] was equal to the value in cell D3:
=SUM(
((tblSales[Product Name]=B3)+(tblSales[Product Name]=C3))*
(tblSales[Store]=D3)*
tblSales[Total]
)
The following formula returns the count of orders for the same criteria:
=SUM(
((tblSales[Product Name]=B3)+(tblSales[Product Name]=C3))*
(tblSales[Store]=D3)
)
Figure 10-64

SUM based on complex criteria including AND and OR logic

When performing formulas such as this, the “+” operator is used to stipulate OR logic between conditions, and the “*” specifies AND logic. An extra set of brackets surround the OR logic part of the formula to force that operation to calculate before the AND operation.

We saw this example in Chapter 9 when we covered the SUMPRODUCT function in detail. In modern Excel, SUMPRODUCT is no longer required for this work. For a detailed breakdown of how this formula works, visit the “SUMPRODUCT Function” section of Chapter 9.

Note

In Chapter 8, we saw a neat trick to combine the SUM function with the SUMIFS and COUNTIFS functions for a more dynamic version of a formula like this.

Sum with Arrays

As formulas in Excel can now handle arrays, this reduces the requirement for using intermediary formulas that store their results in columns.

In Figure 10-65, we have daily sales transactions in a table named [tblDailySales]. We would like to sum the values that occurred at a weekend only. In this example, the weekend is classified as Friday to Sunday.
Figure 10-65

Table with daily transaction

In non-dynamic array–enabled versions of Excel, the typical approach would be to use the WEEKDAY function in a column to return the number that identifies the day of the week. A function such as SUMIFS can then be used that tests the values from the weekday column to sum the required values.

In Figure 10-66, the following formula is entered in cell E4 and sums the values that occurred on a Friday-Sunday all in one formula:
=SUM(
(WEEKDAY(tblDailySales[Date],2)>4)*
tblDailySales[Total]
)

The WEEKDAY function returns a number that identifies the day of the week. The week starts from a Monday as 1 and ends with Sunday as 7. The formula tests if the value returned by the WEEKDAY function is greater than 4.

The results of this logical expression are multiplied by the values in the [Total] column and then summed by the SUM function.

This formula can be achieved in non-DA versions of Excel also, but it requires the user to press Ctrl + Shift + Enter on completion to specify an array formula.
Figure 10-66

Summing the sales from Friday to Sunday

Sum Based on Multiple Columns

The SUM function is extremely versatile and can sum values of any array dimensions and include any criteria.

Figure 10-67 shows a matrix of sales data in range D1:I16. It contains product names in range D2:D16 and location names in range F1:I1.
Figure 10-67

Sales matrix by product and location

In Figure 10-68, the following SUM function sums the values from range F2:I16, but only for sales of the product stated in cell A2:
=SUM((D2:D16=A2)*(F2:I16))
You would not be able to sum values from multiple columns (F to I) like this with the SUMIFS function. So, this demonstrates the versatility of SUM nicely.
Figure 10-68

Summing values from multiple columns

Taking it a step further, we can set both row and column criteria, creating a two-way SUM.

In Figure 10-69, the following SUM function is summing the values at the intersection of the rows for the product stated in cell A2 and the column for the location stated in cell B1. How cool is that?
=SUM((D2:D16=A2)*(F1:I1=B1)*(F2:I16))
Figure 10-69

Two-way SUM with row and column criteria

Note

Although this section of the chapter is dedicated to SUM, it is worth noting that all aggregation functions benefit from the array engine. We have seen other examples in this book that showcase this, for example, the conditional median function using MEDIAN and IF shown in Chapter 8.

Dynamic Array Formulas with Other Excel Features

Dynamic array formulas have a mixed relationship with other Excel features. Some features work well with DA formulas, although we may need to reference them indirectly, while others do not recognize DA formulas at all.

Let’s look at how DA formulas can be used in combination with Conditional Formatting rules, Data Validation rules, and charts in Excel.

DA Formulas with Conditional Formatting

Unfortunately, Conditional Formatting does not recognize a spill range. You cannot reference a spill range directly in a Conditional Formatting rule. And if you select a spill range and apply a Conditional Formatting rule to it, it will not update dynamically with the spill range.

Figure 10-70 shows a dynamic report with two spill ranges – one in range B3 and another in range C3. The spill range in cell C3 returns the total sales for each product.
Figure 10-70

Spilled range in range C3

We will apply a Conditional Formatting rule to change the cell color for all values that are greater than or equal to 1700.

Because the Conditional Formatting feature will not accept a reference to a spill range, you cannot enter the following formula into the formatting rule:
=C3#>=1700

The approach we will take to make the best of an unfortunate situation is to select more cells than is required. The dynamic array formula may expand the spill range in the future, and we want the rule to apply to the additional cells.

As mentioned, Conditional Formatting cannot work directly with a spill range, so we must resort to selecting cells on the grid and plan a little for future changes.

In Figure 10-71, two additional cells are selected beyond the spill range to cater for the spill range expanding. The Conditional Formatting rule is then created as usual.
Figure 10-71

Selecting additional cells beyond the spill range

  1. 1.

    Click HomeConditional FormattingNew Rule.

     
  2. 2.

    Click Format only cells that contain.

     
  3. 3.

    Select greater than or equal to from the list of logical operations and type “1700” into the box provided (Figure 10-72).

     
  4. 4.

    Click Format and specify the formatting you want to apply.

     
  5. 5.

    Click OK.

     
Figure 10-72

Format cells greater than or equal to 1700

The Conditional Formatting rule is applied to the selected range as shown in Figure 10-73. If the spill range expanded within the additional two selected cells (we could have selected more), the formatting rule would automatically be applied.
Figure 10-73

Conditional Formatting rule applied

DA Formulas with Data Validation

The good news is that Data Validation works with dynamic array formulas. You cannot enter DA formulas directly in the Data Validation window, but you can reference spill ranges on the grid. When the spill range updates, the Data Validation rule will update with it.

For this example, we will use the UNIQUE and SORT functions together to create a source for a Data Validation list. Figure 10-74 shows a table named [tblSubscribers].
Figure 10-74

Table with subscriber data

We will create a dynamic Data Validation list from the countries in the [Country] column. If subscribers are added from new countries, the Data Validation list will automatically update to include them.

In cell B3, the following formula is used to create a distinct list of the countries and is sorted in ascending order (Figure 10-75):
=SORT(UNIQUE(tblSubscribers[Country]))
Figure 10-75

Spill range to be used as the source for a DV list

To create the Data Validation list
  1. 1.

    Click the cell(s) you want the Data Validation list in.

     
  2. 2.

    Click DataData Validation.

     
  3. 3.

    From the Settings tab, click the Allow list and click List.

     
  4. 4.

    Click in the Source box, click the cell that contains the spill range, and type the # operator after the reference (Figure 10-76). The following reference is used in this example. Click OK.

     
=$B$3#
Note

A defined name could be created for this reference, such as lstCountries, and then the defined name used for the source of the Data Validation rule. This is not required but is a nice technique for a more meaningful reference than $B$3#, especially if the spill range was on another sheet.

Figure 10-76

Data Validation list referencing a spill range

Figure 10-77 shows the Data Validation list. The spill range in cell B3 will update with the data in the [tblSubscribers] table, and the Data Validation list will automatically update as it references the spill range.
Figure 10-77

A dynamic Data Validation list of countries

DA Formulas with Charts

It is further good news for charts in Excel, as they also can be used with dynamic arrays. When the spill range of a dynamic array updates, the chart will update with it.

However, the spill reference cannot be used directly in charts. Also, if a chart is created by selecting the spill range on the grid, it does not update with the spill range.

We need to define a name for each spill range and then use the defined name in the appropriate areas of the chart.

Figure 10-78 shows two spill ranges – one in range B3 and the other in range C3. The dynamic array formulas are summing the total values for each product and sorting the product names by their total sales values.
Figure 10-78

Spill ranges for products sorted by sales totals

This is a cool technique, and we demonstrated how to create this report earlier in the chapter with the SORTBY function.

We will insert a column chart that is connected to these spill ranges, so that the chart always sorts the product names by their sales totals in descending order.

First, we need to define names for each spill range:
  1. 1.

    Click cell B3.

     
  2. 2.

    Click FormulasDefine Name.

     
  3. 3.

    Type a Name for the defined name. In Figure 10-79, “rngProducts” is used.

     
  4. 4.

    The reference to cell B3 should automatically appear in the Refers to box, as we clicked the cell before defining the name. Add the # operator to the end of the reference. Click OK.

     
=Charts!$B$3#
Figure 10-79

Define a name for the product name’s spill range

This defined name will be used for the axis labels. We need to define another name for the sales total spill range in range C3#.
  1. 5.

    Click cell C3 and repeat the previous steps using the following reference. In this example, the name “rngSalesTotals” is used for the defined name.

     
=Charts!$C$3#
The next step is to insert the chart and then connect it to our defined names.
  1. 6.

    Click InsertInsert Column or Bar ChartClustered Column.

     
  2. 7.

    With the chart selected, click Chart DesignSelect Data.

     
  3. 8.

    Click the Add button in the Legend Entries (Series) area on the left.

     
  4. 9.

    Type “Product Sales” for the Series name and type the following reference in the Series values box (Figure 10-80). Click OK.

     
Figure 10-80

Editing the series for a chart to use the defined name

=Charts!rngSalesTotals

Even though the defined name has workbook scope, it is essential that the references in charts include the sheet name.

Note

Instead of typing the defined name into the field, you can also press F3 to open the Paste Name window and select it from there.

  1. 10.

    Click the Edit button in the Horizontal (Category) Axis Labels area on the right.

     
  2. 11.

    Type the following reference into the Axis label range box and click OK (Figure 10-81).

     
Figure 10-81

Editing the axis labels to use the defined name

=Charts!rngProducts
  1. 12.

    Click OK to close the Select Data Source window.

     
Figure 10-82 shows the completed column chart based indirectly on the spill ranges. This chart will automatically update to reflect any changes in the spill ranges.
Figure 10-82

Completed chart using dynamic array sources

Summary

In this chapter, we learned all about dynamic arrays in modern Excel. The key takeaways from this chapter were
  • Understanding what exactly a dynamic array is and how to use them effectively in Excel.

  • Learning a few of the dynamic array functions in Excel including SORTBY, UNIQUE, and SEQUENCE. Others such as TEXTSPLIT, FILTER, and STOCKHISTORY are covered in other chapters of the book.

  • How existing functions in Excel such as SUM, TRANSPOSE, EOMONTH, and SUMIFS work with arrays. Functions such as SUM are better than ever, while EOMONTH requires some tricks to work effectively.

  • How dynamic array are used with existing features of Excel such as charts and Data Validation.

In the next chapter, we will dive deeper into the lookup functions of Excel. We have covered VLOOKUP in this book already, but there are many more (and better) lookup functions in Excel than VLOOKUP.

We will cover functions including INDEX, CHOOSE, OFFSET, VSTACK, and INDIRECT. It is the largest chapter of this book, containing many examples and pro tips.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.202.4