In This Chapter
One of Excel’s most interesting (and most powerful) features is its ability to work with arrays in a formula. When you understand this concept, you can create elegant formulas that appear to perform magic. This chapter introduces the concept of arrays and is required reading for anyone who wants to become a master of Excel formulas. Chapter 15, “Performing Magic with Array Formulas,” continues with lots of useful examples.
If you do any computer programming, you’ve probably been exposed to the concept of an array, which is a collection of items operated on collectively or individually. In Excel, an array can be one dimensional or two dimensional. These dimensions correspond to rows and columns. For example, a one-dimensional array can be stored in a range that consists of one row (a horizontal array) or one column (a vertical array). A two-dimensional array can be stored in a rectangular range of cells. Excel doesn’t support three-dimensional arrays (although its VBA programming language does).
As you’ll see, though, arrays need not be stored in cells. You can also work with arrays that exist only in Excel’s memory. You can then use an array formula to manipulate this information and return a result. Excel supports two types of array formulas:
This section presents two array formula examples: an array formula that occupies multiple cells and another array formula that occupies only one cell.
Figure 14.1 shows a simple worksheet set up to calculate product sales. Normally, you would calculate the value in column D (total sales per product) with a formula such as the one that follows and then copy this formula down the column:
=B2*C2
After you copy the formula, the worksheet contains six formulas in column D.
Another alternative uses a single formula (an array formula) to calculate all six values in D2:D7. This single formula occupies six cells and returns an array of six values.
To create a single array formula to perform the calculations, follow these steps:
Select a range to hold the results.
In this example, the range is D2:D7.
Enter the following formula, either by typing it or by pointing to the ranges:
=B2:B7*C2:C7
The formula is entered into all six selected cells. If you examine the Formula bar, you see the following:
{=B2:B7*C2:C7}
Excel places curly brackets around the formula to indicate that it’s an array formula.
This formula performs its calculations and returns a six-item array. The array formula actually works with two other arrays, both of which happen to be stored in ranges. The values for the first array are stored in B2:B7, and the values for the second array are stored in C2:C7.
Because displaying more than one value in a single cell is not possible, six cells are required to display the resulting array. That explains why you selected six cells before you entered the array formula.
This array formula, of course, returns the same values as these six normal formulas entered into individual cells in D2:D7:
=B2*C2 =B3*C3 =B4*C4 =B5*C5 =B6*C6 =B7*C7
Using a single array formula rather than individual formulas does offer a few advantages:
As you see later, multicell array formulas can be more useful than this trivial introductory example.
Now take a look at a single-cell array formula in Figure 14.2. The following array formula is in cell C9:
{=SUM(B2:B7*C2:C7)}
You can enter this formula into any cell. Remember: when you enter this formula, make sure you press Ctrl+Shift+Enter (and don’t type the curly brackets).
This formula works with two arrays, both of which are stored in cells. The first array is stored in B2:B7, and the second array is stored in C2:C7. The formula multiplies the corresponding values in these two arrays and creates a new array (which exists only in memory). The new array consists of six values, which can be represented like this. (The reason for using semicolons is explained a bit later.)
{150;1000;100;90;180;200}
The SUM function then operates on this new array and returns the sum of its values.
The examples in the previous section used arrays stored in worksheet ranges. The examples in this section demonstrate an important concept: An array does not have to be stored in a range of cells. This type of array, which is stored in memory, is referred to as an array constant.
You create an array constant by listing its items and surrounding them with curly brackets. Here’s an example of a five-item horizontal array constant:
{1,0,1,0,1}
The following formula uses the SUM function, with the preceding array constant as its argument. The formula returns the sum of the values in the array (which is 3). Notice that this formula uses an array, but it is not an array formula. Therefore, you do not use Ctrl+Shift+Enter to enter the formula.
=SUM({1,0,1,0,1})
At this point, you probably don’t see any advantage to using an array constant. The formula that follows, for example, returns the same result as the previous formula:
=SUM(1,0,1,0,1)
Keep reading, and the advantages will become apparent.
Following is a formula that uses two array constants:
=SUM({1,2,3,4}*{5,6,7,8})
This formula creates a new array (in memory) that consists of the product of the corresponding elements in the two arrays. The new array is as follows:
{5,12,21,32}
This new array is then used as an argument for the SUM function, which returns the result (70). The formula is equivalent to the following formula, which doesn’t use arrays:
=SUM(1*5,2*6,3*7,4*8)
A formula can work with both an array constant and an array stored in a range. The following formula, for example, returns the sum of the values in A1:D1, each multiplied by the corresponding element in the array constant:
{=SUM(A1:D1*{1,2,3,4})}
Because one of our arrays in this formula is a range, the formula is entered with Ctrl+Shift+Enter to let Excel know we want to convert any ranges to array. The formula is equivalent to this:
=SUM(A1*1,B1*2,C1*3,D1*4)
An array constant can contain numbers, text, logical values (TRUE or FALSE), and even error values such as #N/A. Numbers can be in integer, decimal, or scientific format. You must enclose text in double quotation marks (for example, “Tuesday”). You can use different types of values in the same array constant, as in this example:
{1,2,3,TRUE,FALSE,TRUE,"Moe","Larry","Curly"}
An array constant cannot contain formulas, functions, or other arrays. Numeric values cannot contain dollar signs, commas, parentheses, or percent signs. For example, the following is an invalid array constant:
{SQRT(32),$56.32,12.5%}
As stated previously, an array can be either one dimensional or two dimensional. A one-dimensional array’s orientation can be either vertical or horizontal.
The elements in a one-dimensional horizontal array are separated by commas. The following example is a one-dimensional horizontal array constant:
{1,2,3,4,5}
Displaying this array in a range requires five consecutive cells in a single row. To enter this array into a range, select a range of cells that consists of one row and five columns. Then enter ={1,2,3,4,5} and press Ctrl+Shift+Enter.
The following example is another horizontal array; it has seven elements and is made up of text strings:
{"Sun","Mon","Tue","Wed","Thu","Fri","Sat"}
To enter this array, select seven cells in one row and then type the following (followed by pressing Ctrl+Shift+Enter):
={"Sun","Mon","Tue","Wed","Thu","Fri","Sat"}
The elements in a one-dimensional vertical array are separated by semicolons. The following is a six-element vertical array constant:
{10;20;30;40;50;60}
Displaying this array in a range requires six cells in a single column. To enter this array into a range, select a range of cells that consists of six rows and one column. Then enter the following formula and press Ctrl+Shift+Enter:
={10;20;30;40;50;60}
The following is another example of a vertical array; this one has four elements:
{"Widgets";"Sprockets";"Do-Dads";"Thing-A-Majigs"}
To enter this array into a range, select four cells in a column, enter the following formula, and then press Ctrl+Shift+Enter:
={"Widgets";"Sprockets";"Do-Dads";"Thing-A-Majigs"}
A two-dimensional array uses commas to separate its horizontal elements and semicolons to separate its vertical elements.
The following example shows a 3 × 4 array constant:
{1,2,3,4;5,6,7,8;9,10,11,12}
Displaying this array in a range requires 12 cells. To enter this array into a range, select a range of cells that consists of three rows and four columns. Then type the following formula and press Ctrl+Shift+Enter:
={1,2,3,4;5,6,7,8;9,10,11,12}
Figure 14.3 shows how this array appears when entered into a range (in this case, B3:E5).
If you enter an array into a range that has more cells than array elements, Excel displays #N/A in the extra cells. Figure 14.4 shows a 3 × 4 array entered into a 10 × 5 cell range.
Each row of a two-dimensional array must contain the same number of items. The array that follows, for example, is not valid because the third row contains only three items:
{1,2,3,4;5,6,7,8;9,10,11}
Excel does not allow you to enter a formula that contains an invalid array.
You can use #N/A as a placeholder for a missing element in an array. For example, the following array is missing the element in the third row of the first column:
={1,2,3,4;5,6,7,8;#N/A,10,11,12}
You can create an array constant, give it a name, and then use this named array in a formula. Technically, a named array is a named formula.
To create a named constant array, use the New Name dialog box (choose Formulas ➜ Defined Names ➜ Define Name). In Figure 14.5, the name of the array is DayNames, and it refers to the following array constant:
{"Sun","Mon","Tue","Wed","Thu","Fri","Sat"}
Notice that in the New Name dialog box, the array is defined by using a leading equal sign (=). Without this equal sign, the array is interpreted as a text string rather than an array. Also, you must type the curly brackets when defining a named array constant; Excel does not enter them for you.
After creating this named array, you can use it in a formula. Figure 14.6 shows a worksheet that contains a single array formula entered into the range B2:H2. The formula follows:
{=DayNames}
To enter this formula, select seven cells in a row, type =DayNames, and press Ctrl+Shift+Enter.
Because commas separate the array elements, the array has a horizontal orientation. Use semicolons to create a vertical array. Or you can use Excel’s TRANSPOSE function to insert a horizontal array into a vertical range of cells. (See the “Transposing an array” section later in this chapter.) The following array formula, which is entered into a seven-cell vertical range, uses the TRANSPOSE function:
{=TRANSPOSE(DayNames)}
You also can access individual elements from the array by using Excel’s INDEX function. The following formula, for example, returns Wed, the fourth item in the DayNames array:
=INDEX(DayNames,4)
This section deals with the mechanics of selecting cells that contain arrays, as well as entering and editing array formulas. These procedures differ a bit from working with ordinary ranges and formulas.
When you enter an array formula into a cell or range, you must follow a special procedure so Excel knows that you want an array formula rather than a normal formula. You enter a normal formula into a cell by pressing Enter. You enter an array formula into one or more cells by pressing Ctrl+Shift+Enter.
You can easily identify an array formula because the formula is enclosed in curly brackets in the Formula bar. The following formula, for example, is an array formula:
{=SUM(LEN(A1:A5))}
Don’t enter the curly brackets when you create an array formula; Excel inserts them for you after you press Ctrl+Shift+Enter. If the result of an array formula consists of more than one value, you must select all the cells in the results range before you enter the formula. If you fail to do this, only the first element of the result is returned.
You can select the cells that contain a multicell array formula manually by using the normal cell selection procedures. Alternatively, you can use either of the following methods:
If an array formula occupies multiple cells, you must edit the entire range as though it were a single cell. The key point to remember is that you can’t change just one element of an array formula. If you attempt to do so, Excel displays the warning message shown in Figure 14.7. Click OK and press Esc to exit edit mode; then select the entire range and try again.
The following rules apply to multicell array formulas. If you try to do any of these things, Excel lets you know about it:
To edit an array formula, select all the cells in the array range and activate the Formula bar as usual (click it or press F2). Excel removes the brackets from the formula while you edit it. Edit the formula and then press Ctrl+Shift+Enter to enter the changes. Excel adds the curly brackets, and all the cells in the array now reflect your editing changes.
Although you can’t change any individual cell that makes up a multicell array formula, you can apply formatting to the entire array or to only parts of it.
Often, you may need to expand a multicell array formula (to include more cells) or contract it (to include fewer cells). Doing so requires a few steps:
Select the entire range that contains the array formula.
You can use Ctrl+/ to automatically select the cells in an array that includes the active cell.
Press Ctrl+Enter.
This step enters an identical (nonarray) formula into each selected cell.
This section contains examples that demonstrate additional features of multicell array formulas. A multicell array formula is a single formula that’s entered into a range of cells. These features include creating arrays from values, performing operations, using functions, transposing arrays, and generating consecutive integers.
The following array formula creates an array from a range of cells. Figure 14.8 shows a workbook with some data entered into A1:C4. The range D8:F11 contains a single array formula:
{=A1:C4}
The array in D8:F11 is linked to the range A1:C4. Change any value in A1:C4, and the corresponding cell in D8:F11 reflects that change.
In the previous example, the array formula in D8:F11 essentially created a link to the cells in A1:C4. It’s possible to sever this link and create an array constant made up of the values in A1:C4.
To do so, select the cells that contain the array formula (the range D8:F11, in this example). Press F2 to edit the array formula and then press F9 to convert the cell references to values. Press Ctrl+Shift+Enter to reenter the array formula (which now uses an array constant). The array constant is as follows:
{1,"dog",3;4,5,"cat";7,FALSE,9;"monkey",8,12}
Figure 14.9 shows how this looks after pressing F9 to convert the cell references.
So far, most of the examples in this chapter simply entered arrays into ranges. The following array formula creates a rectangular array and multiplies each array element by 2:
{={1,2,3,4;5,6,7,8;9,10,11,12}*2}
Figure 14.10 shows the result when you enter this formula into a range:
The following array formula multiplies each array element by itself:
{={1,2,3,4;5,6,7,8;9,10,11,12}*{1,2,3,4;5,6,7,8;9,10,11,12}}
The following array formula is a simpler way of obtaining the same result:
{={1,2,3,4;5,6,7,8;9,10,11,12}^2}
Figure 14.11 shows the result when you enter this formula into a range (B8:E10).
If the array is stored in a range (such as A1:D3), the array formula returns the square of each value in the range, as follows:
{=A1:D3^2}
As you might expect, you also can use functions with an array. The following array formula, which you can enter into a ten-cell vertical range, calculates the square root of each array element in the array constant:
{=SQRT({1;2;3;4;5;6;7;8;9;10})}
If the array is stored in a range, a multicell array formula such as the one that follows returns the square root of each value in the range:
{=SQRT(A1:A10)}
When you transpose an array, you essentially convert rows to columns and columns to rows. In other words, you can convert a horizontal array to a vertical array and vice versa. Use Excel’s TRANSPOSE function to transpose an array.
Consider the following one-dimensional horizontal array constant:
{1,2,3,4,5}
You can enter this array into a vertical range of cells by using the TRANSPOSE function. To do so, select a range of five cells that occupy five rows and one column. Then enter the following formula and press Ctrl+Shift+Enter:
=TRANSPOSE({1,2,3,4,5})
The horizontal array is transposed, and the array elements appear in the vertical range.
Transposing a two-dimensional array works in a similar manner. Figure 14.12 shows a two-dimensional array entered into a range normally and entered into a range using the TRANSPOSE function. The formula in A1:D3 follows:
{={1,2,3,4;5,6,7,8;9,10,11,12}}
Here’s the formula in A6:C9:
{=TRANSPOSE({1,2,3,4;5,6,7,8;9,10,11,12})}
You can, of course, use the TRANSPOSE function to transpose an array stored in a range. The following formula, for example, uses an array stored in A1:C4 (four rows, three columns). You can enter this array formula into a range that consists of three rows and four columns:
{=TRANSPOSE(A1:C4)}
As you will see in Chapter 15, it’s often useful to generate an array of consecutive integers for use in an array formula. Excel’s ROW function, which returns a row number, is ideal for this. Consider the array formula shown here, entered into a vertical range of 12 cells:
{=ROW(1:12)}
This formula generates a 12-element array that contains integers from 1 to 12. To demonstrate, select a range that consists of 12 rows and 1 column, and then enter the array formula into the range. The range is filled with 12 consecutive integers (see Figure 14.13).
If you want to generate an array of consecutive integers, a formula like the one shown previously is good—but not perfect. To see the problem, insert a new row above the range that contains the array formula. Excel adjusts the row references so the array formula now reads like this:
{=ROW(2:13)}
The formula that originally generated integers from 1 to 12 now generates integers from 2 to 13.
For a better solution, use this formula:
{=ROW(INDIRECT("1:12"))}
This formula uses the INDIRECT function, which takes a text string as its argument. Excel does not adjust the references contained in the argument for the INDIRECT function. Therefore, this array formula always returns integers from 1 to 12.
All the examples in the previous section used a multicell array formula—a single array formula entered into a range of cells. The real power of using arrays becomes apparent when you use single-cell array formulas. This section contains examples of array formulas that occupy a single cell.
Suppose you have a range of cells that contains text entries (see Figure 14.14). If you need to get a count of the total number of characters in that range, the traditional method involves creating a formula like the one that follows and copying it down the column:
=LEN(A1)
Then you use a SUM formula to calculate the sum of the values that the intermediate formulas return.
The following array formula does the job without using intermediate formulas:
{=SUM(LEN(A1:A14))}
The array formula uses the LEN function to create a new array (in memory) that consists of the number of characters in each cell of the range. In this case, the new array follows:
{10,9,8,5,6,5,5,10,11,14,6,8,8,7}
The array formula is then reduced to the following:
=SUM({10,9,8,5,6,5,5,10,11,14,6,8,8,7})
If you have values in a range named Data, you can determine the smallest value by using the SMALL function:
=SMALL(Data,1)
You can determine the second smallest and third smallest values by using these formulas:
=SMALL(Data,2) =SMALL(Data,3)
To add the three smallest values, you can use a formula like this:
=SUM(SMALL(Data,1), SMALL(Data,2), SMALL(Data,3)
This formula works fine, but using an array formula is more efficient. The following array formula returns the sum of the three smallest values in a range named Data:
{=SUM(SMALL(Data,{1,2,3}))}
The formula uses an array constant as the second argument for the SMALL function. This generates a new array, which consists of the three smallest values in the range. This array is then passed to the SUM function, which returns the sum of the values in the new array.
Figure 14.15 shows an example that sums the three smallest values in the range A1:A10. The SMALL function is evaluated three times, each with a different second argument. The first time, the SMALL function has a second argument of 1, and it returns –5. The second time, the second argument for the SMALL function is 2 and it returns 0 (the second-smallest value in the range). The third time, the SMALL function has a second argument of 3 and returns the third-smallest value of 2.
Therefore, the array that’s passed to the SUM function is
{–5,0,2}
The formula returns the sum of the array (–3).
Suppose that you need to count the number of text cells in a range. The COUNTIF function seems like it might be useful for this task, but it’s not. COUNTIF is useful only if you need to count values in a range that meet some criterion (for example, values greater than 12).
To count the number of text cells in a range, you need an array formula. The following array formula uses the IF function to examine each cell in a range. It then creates a new array (of the same size and dimensions as the original range) that consists of 1s and 0s, depending on whether the cell contains text. This new array is then passed to the SUM function, which returns the sum of the items in the array. The result is a count of the number of text cells in the range.
{=SUM(IF(ISTEXT(A1:D5),1,0))}
Figure 14.16 shows an example of the preceding formula in cell C7. The array created by the IF function is as follows:
{0,1,1,1;1,0,0,0;1,0,0,0;1,0,0,0;1,0,0,0}
Notice that this array contains four rows of three elements (the same dimensions as the range).
A variation on this formula follows:
{=SUM(ISTEXT(A1:D5)*1)}
This formula eliminates the need for the IF function and takes advantage of the fact that
TRUE * 1 = 1
and
FALSE * 1 = 0
One of the main benefits of using an array formula is that you can eliminate intermediate formulas in your worksheet. This makes your worksheet more compact and eliminates the need to display irrelevant calculations. Figure 14.17 shows a worksheet that contains pretest and posttest scores for students. Column D contains formulas that calculate the changes between the pretest and the posttest scores. Cell D17 contains the following formula, which calculates the average of the values in column D:
=AVERAGE(D2:D15)
With an array formula, you can eliminate column D. The following array formula calculates the average of the changes but does not require the formulas in column D:
{=AVERAGE(C2:C15–B2:B15)}
How does it work? The formula uses two arrays, the values of which are stored in two ranges (B2:B15 and C2:C15). The formula creates a new array that consists of the differences between each corresponding element in the other arrays. This new array is stored in Excel’s memory, not in a range. The AVERAGE function then uses this new array as its argument and returns the result.
The new array consists of the following elements:
{11,15,–6,1,19,2,0,7,15,1,8,23,21,–11}
The formula, therefore, is reduced to the following:
=AVERAGE({11,15,–6,1,19,2,0,7,15,1,8,23,21,–11})
Excel evaluates the function and displays the result: 7.57.
You can use additional array formulas to calculate other measures for the data in this example. For instance, the following array formula returns the largest change (that is, the greatest improvement). This formula returns 23, which represents Linda’s test scores:
{=MAX(C2:C15–B2:B15)}
The following array formula returns the smallest change (that is, the least improvement). This formula returns –11, which represents Nancy’s test scores:
{=MIN(C2:C15–B2:B15)}
If your formula uses a function that requires a range reference, you may be able to replace that range reference with an array constant. This is useful when the values in the referenced range do not change.
Figure 14.18 shows a worksheet that uses a lookup table to display a word that corresponds to an integer. For example, looking up a value of 9 returns Nine from the lookup table in D1:E10. Here’s the formula in cell C1:
=VLOOKUP(B1,D1:E10,2,FALSE)
You can use a two-dimensional array in place of the lookup range. The following formula returns the same result as the previous formula, but it does not require the lookup range in D1:E10:
=VLOOKUP(B1,{1,"One";2,"Two";3,"Three";4,"Four";5,"Five"; 6,"Six";7,"Seven";8,"Eight";9,"Nine";10,"Ten"},2,FALSE)
This chapter introduced arrays. Chapter 15 explores the topic further and provides many additional examples.
3.133.134.58