3. Dynamic Arrays: Time for Some Heavy Lifting

The Spill Is Gone

True, the term “spill” – the name Excel has conferred upon its new multi-cell formula capability – suggests an episode of untidiness, like the kind of mishap I routinely inflict upon my shirt at dinner. But as we’ve seen in the realm of spreadsheets, spilling is a good thing indeed. Empowering one formula to dispense thousands of results to as many cells is a hugely efficient and praiseworthy feature. But there are two principal circumstances under which Excel won’t allow a spill to happen.

The first: when the spill runs out of room. We can portray the problem by revisiting the test-grading illustration we framed in the previous chapter and tossing in a small complication: a bit of data adjoining the student answer to question 5 (Figure 3-1).

Figure 3-1
Unwanted intrusion: text alongside answer 5

Now if we replicate the dynamic array formula we wrote in the last chapter to cell D4,

=IF(B4:B13=C4:C13,1,0)

Figure 3-2 is what we see.

Figure 3-2
All bottled up: the formula won’t spill

You’ve probably surmised what’s happened, and why. The spill range’s values can’t assume their rightful place because the “text” entry in D8 obstructs the spill process. Spilling is an all-or-nothing proposition; if even one of the spilled values is denied entry to its cell, they all are.

If you click back on D4, and next click what Excel calls the error floatie signaled by the exclamation point, you’ll see as in Figure 3-3.

Figure 3-3
Let the data flow with the Select Obstructing Cells option

First, the range earmarked for the spill – in our case, D4:D13 – is surrounded by a dotted border, as the floatie unrolls the Select Obstructing Cells option. Click it, and the cell pointer shoots to cell D8. Now that the offending entry has been pinpointed, just delete it, and the spill range now does its thing.

Of course, you could have identified the obstructing cell on your own; but remember that a single dynamic array formula can unfurl its data across multiple rows and columns at the same time, and so tracing the in-the-way cell(s) might not always be quite so straightforward and evident.

You Can’t Spill on This Table

Another can’t-spill-here dilemma will beset the worksheet if you write, or attempt to write, a dynamic array formula in a table. Sticking with our test-grade example, if I remake the grades into a table via the Ctrl-T or Insert > Table command and once again enter

=IF(B4:B13=C4:C13,1,0)

in D4, Figure 3-4 is what will happen.

Figure 3-4
Whole lot of spilling going on

Note

Depending on how you write the formula here, the table might render it as =IF([Answer Key]=[@Student],1,0), but that variation won’t matter for our purposes.

The spill sure isn’t working here – but why?

The answer to that question had bothered me for some time, and my initial surmise was to write off a table’s inability to support dynamic arrays as some design gremlin eating away at Excel’s code. But then I experienced a mini-eureka moment and convinced myself that in fact, with one near-theoretical exception, tables can’t work with dynamic array formulas.

They can’t because tables and dynamic arrays operate at cross-purposes. A standard-issue, garden-variety dataset – a collection of records organized by fields – will allow the user to enter any sort of extraneous data in an adjacent column, data that needn’t have anything to do with the dataset. Nothing would prevent me from entering our Beatles range in the column to the immediate right of our grades, in cell D4 as in Figure 3-5.

But once we reinvent the data into a table, all of its rows become rigorously defined as records, and if you write an ordinary formula to the table, it’s immediately copied to all the records. But remember that our Beatles range formula spills four cells, and so if the table features 100 records, and the Beatle formula and its ensuing spill range were to be copied down a table column, we’d have to contend with 400 results – and 400 cells keyed to 100 records literally doesn’t compute. Because dynamic array formulas spill, their multi-cell output can’t be expected to shoehorn themselves into the records of tables.

Note

There are a few other, far more obscure scenarios in which spills also won’t work, that needn’t concern us here.

But at the same time keep in mind that a dynamic array formula can refer to a table; it just can’t be entered into one – apart from that one unlikely exception.

And that exception to the dynamic array no-table interdiction is, as noted earlier, all but theoretical. If you write a dynamic array formula that yields exactly one record, a table will accommodate it just fine – because the formula has nothing to spill. But given that the dynamic array’s claim to fame is its multi-cell potency, you’re not likely to write, or want to write, such a formula (thanks to Mark of Excel Off the Grid for this pointer).

The # Sign: Weighing In on Its Virtues

Among the novel features figuring in the dynamic array project is the novel way in which dynamic arrays can be referenced. Returning to the grade-assessing formula we dropped into cell D4: if we need to refer to that formula elsewhere in the worksheet, we could simply write

=D4#

The pound-sign (or hash mark, depending on the side of the Atlantic on which you live) denotes the cell in which the dynamic array formula is actually written, and writing such a reference will duplicate the formula written to that cell, with all its spilled results.

In this case, the entry will reproduce the spill range’s 1’s and 0’s; and this expression

=SUM(D4#)

will naturally yield 6.

Now of course =SUM(D4:D13) could drum up the same result, but D4# boasts a few advantages: it’s easier and shorter, and more to the point, the pound-sign reference will automatically register any changes to the spill range wrought by the formula.

For example, and by way of presaging the next chapter: if we ask the new dynamic array SEQUENCE function to write

=SEQUENCE(10)

in cell A3, the formula will spill a range consisting of values 1 through 10 in A3:A12. If I enter A3# somewhere else in the worksheet, the 1–10 sequence will be duplicated. If, however, I edit A3 to read

=SEQUENCE(30)

now, values 1 through 30 will zoom down cells A3:A32. But that A3#, wherever it’s been written, won’t miss a beat; it too will display numbers 1 through 30, accommodating the additional results spilling from the rewritten SEQUENCE formula. And

=COUNT(A3#)

will output 30. But here, even though the COUNT formula is of the one-result type, it can’t be written in previous Excel versions – because those versions have neither the SEQUENCE function nor the pound sign option.

Raising Some More Points About Lifting

To review, the term lifting denotes Excel formulas’ current multi-result potential, now elevated to a veritable default for most of its functions. But there are few more facts about lifting about which you need to know.

Here’s a pair of ranges in C3:D6 (File: Lifting-Additional Features) whose values we want to subject to a row-by-row multiplication, as in Figure 3-6.

Figure 3-6
Times after times – values to be multiplied

In E3 we could write

=C3:C6*D3:D6

And we see in Figure 3-7 that expression results in the following.

We’ve again encountered an instance of pairwise lifting, that mode of lifting in which pairs of values are multiplied down their columns. Remember that our initial meet-up with pairwise lifting compared student test responses to an answer key, under the steam of an IF statement. Here, we’ve run through a set of multiplications with the pairs.

On the other hand, this sort of lifting need not literally be confined to pairs; it could be stretched across three or more columns of values, once you’ve made sure you’ve gotten the syntax right, for example, in Figure 3-8.

Figure 3-8
Thrice is nice: multiplying trios of values

Now that all works. But if we return to our original pairs of values and edit its formula, we’ll see that this rendition in Figure 3-9 won’t work, at least not completely.

You see why. The two ranges don’t line up properly – because one bears four cells, the other three. The entry in C6 multiplies itself by a phantom partner, for example, the missing D6, and its formula has no choice but to surrender to an #NA.

Now for a New Angle

But what about this variation, captured in Figure 3-10? Note the new discrepancy: the reference preceding the asterisk comprises four cells, while the one following it offers only three.

Write that formula in D7 and Figure 3-11 shows you what you’ll see as follows.

Surprise. The second range – which, remember, comprises fewer values than the first one – now stands in a perpendicular relation to the first, and along with that 90-degree shift comes a wholesale rewording of the formula’s marching orders. Now, each value in the first range is multiplied by every value in the second range. For example, the first value (the upper-left cell) in this brand-new spill – 1242 – is nothing else but the product of the first cells of the respective ranges, 23 and 54. Proceeding horizontally, the 736 signifies the product of 23 x 32, and you can probably take it from there.

What we see here is an instance of what’s called broadcasting, a property of dynamic arrays by which the resulting set of values assumes the number of rows of the lengthier range, and the number of columns of the longer range – here, 4 and 3, respectively. The bottom line is that every permutation is realized – every value is multiplied by every value. But remember, if you need to reference the formula above, all that’s required is

=D7#

Of course, broadcasting isn’t just specific to multiplication. This formula in Figure 3-12, driven by addition,

Figure 3-12
Broadcasting: the addition edition

works too.

But what about the following (Figure 3-13)?

Figure 3-13
Going it alone: a one-cell broadcast

Yes, Figure 3-14 shows us that it works.

Figure 3-14
E unum pluribus: out of one (cell), many

But how does it work? My theory is that Excel treats the single value as a narrowest-possible-row, broadcasting it across the four values lining the column.

But terminology aside, you’ll want to understand how broadcasting works, because as we’ll see, it can work for you.

PMT Permutations

Case in point: For a subtler and more practical lesson, let’s see how broadcasting can furnish a series of answers to a question posed by the PMT function. PMT calculates the payments owed on a loan or a mortgage, provided that the user feeds it three bits of information, (1) the loan amount, (2) the number of pay periods spanned by the term of the loan, and (3) the loan’s operative interest rate. PMT looks like this:

=PMT(interest rate divided by number of annual payments,number of payments,loan amount)

Thus, for example, if we wanted to learn the required payment for a $10,000 loan (which we’ll enter in say, cell A1) at a 3% interest rate (A2) spread across 2 year’s worth of monthly payments (in A3 – the number 24), PMT would look like this:

=PMT(A2/12,A3,A1)

To explain it: the interest in A2 must be divided by 12, in order to reflect the number of payments per year issued against the 3% rate. A3 simply records the total number of loan payments (12 payments over each of the 2 years, or 24), and the A1 recalls the $10,000 loan sum. The answer: $-429.81, expressed as a negative number simply because the loan debits the borrower’s account. As such, the A1 is commonly entered -A1 in order to restore a positive sign to the result; but that decision is merely presentational.

No, we haven’t seen any broadcasting yet, but your patience will be rewarded: we want to determine, via a single PMT formula, the sum we’d owe across a variety of payment and interest possibilities, represented by matrix you see when you open the PMT permutations practice file, and Figure 3-15.

(The user would need to enter the row and column values you see – after all, it’s the user who decides on the values that would be applied to the loan.) That sure sounds like a job for a dynamic array formula. Here, the loan figure appears in G4, the rates sweep across G7:K7, and the number of prospective payments plunge down F8:F13; and the formula, which we’ll write in G8, looks like this:

=PMT(G7:K7/12,F8:F13,G4)

When the dust settles, we get as follows, per Figure 3-16.

Figure 3-16
For your interest: a loan payment matrix, courtesy of one dynamic array formula

Our PMT formula simultaneously inspects every interest rate with the G7:K7/12 argument, and each and every payment frequency via F8:F13. It thus breaks with a conventional PMT by citing ranges instead of single cell references – in other words, it’s a dynamic array formula. The informative upshot, then, all the payment possibilities unfold before the borrower, thanks to that one dynamic array formula.

And whether you realize it or not, our exercise emulates Excel’s ancient Data Table tool, an early array-driven feature with an assortment of moving parts (some of which are rather odd) that, between you and me, I’ve never quite understood (for a Microsoft tutorial on Data Tables look here (https://support.microsoft.com/en-us/office/calculate-multiple-results-by-using-a-data-table-e95e2487-6ca6-4413-ad12-77542a5ea50b)). But never mind; our PMT does the same thing with one formula.

More Lifting – but This Time, Inside a Cell

Now we need to consider one more property of lifting, and it’s an important one. Here’s the demo example, one we can pursue on a blank worksheet. Every Excel character possesses a code number that’s registered by the CODE function. For example, =CODE(“b”) yields 98, while the upper-case variant =CODE(“B”) evaluates to 66. Now suppose that, for whatever reason, we need to learn the code of each of the characters in the word Thursday, which we’ll enter in cell A6. Our challenge is to write one formula that’ll do the job, and in order to achieve that end we’ll need to somehow separate, or extract, each character from Thursday, so that we can peg the proper code to the proper character.

To embark on our mission we’re going to turn to another gray-haired, trusted Excel function – MID. A first cousin of LEFT and RIGHT, MID enables the user to extract text of a specified number of characters from a starting position somewhere inside the cell. Thus, this expression

=MID(A6,4,2)

will trot out the result rs for the word Thursday.

MID’s three elements (or “arguments,” as they’re known in the rule book) (1) name the cell on which MID operates, (2) identify the position number of the character from which MID starts its work, and (3) declare the number of characters to be peeled from the cell. Because it’s the r that sits in the fourth position in our Thursday text string, MID commences there and uproots both the r and the s – the 2 characters cited in the last function argument.

But we want something different from MID: we want it to isolate every one of Thursday’s eight letters, so that each one can be evaluated by the CODE function. Here’s our formula, lodged in A8:

=MID(A6,{1,2,3,4,5,6,7,8},1)

Of course it’s the bracketed element that calls for some explanation. Once again we’ve written an array, one that’s positioned as MID’s second argument – the one that signals the character location from which to begin extracting characters. Here we’re instructing MID to extract one character (referenced by the third function argument – the 1) from the text string’s first position, and then from its second position, and then its third position, and so on, until all eight characters have been skimmed from the source cell. Thus far, then, the formula’s spilled results look like this, as portrayed by Figure 3-17.

Figure 3-17
Confined to their cells; MID extracts each character.

Our MID-based array formula spills its eight results across a horizontal range. MID has in effect applied itself eight times, performing eight character extractions in sequence across its target word, a remarkably nimble feat. And unlike the previous instances of lifting we’ve surveyed to date, the multi-cell output orchestrated here springs from a single cell, and that’s a capability you very much need to keep in mind, and one we need to reexamine.

Now that we’ve lifted each character from Thursday into a cell all its own, we can see in Figure 3-18 how the CODE function wraps itself around MID.

Once MID separates the letters, one per cell, the CODE function examines each one and applies its particular code. And all the work has been powered by one formula. That’s real dynamic array power. And remember: neither MID nor CODE is a “new” dynamic array function – but they work dynamically, just the same.

Field Notes: Field and Dataset Names

Now that we’ve battened down our understanding of spills, #s, and the gotta-know essentials of lifting, we can begin to extend a specific welcome to the newest arrivals to the Excel function family – the ones bearing the dynamic array pedigree.

But before we kick off the reviews, a few words about the field and dataset names used here will help clarify the discussions. In the interests of simplicity and training our focus on dynamic arrays, fields will be named after the entries in their header rows unless others indicated. If an exercise requires a reference to the entire dataset, that set will be called All, if it’s static – that is, if the data are to remain as you see them, and not receive any new records. All field names in static datasets – including the global All itself – will work with ranges whose coordinates begin with the row immediately beneath the headers.

Datasets that are prepared to take on new records will have been configured as tables via the standard Ctrl-T keyboard sequence, and they’ll be named Table1, Table2, etc. – but their fields will again simply carry the names of their headers. That latter qualification is necessary, because tables generate default field names exhibiting what Excel calls structured references.

Thus for this table in Figure 3-19,

a standard, by-the-book table formula that would count the number of entries in the Salesperson field would read, per Excel’s default nomenclature,

=COUNTA(Table1[Salesperson])

But in the interests of consistency and intelligibility, we’ll amend the formula to simply read

=COUNTA(Salesperson)

Thus, the above field-name type will appear in both kinds of datasets – the static and the table variants.

Thus if you too want, in the interests of following and clicking along, to simplify table field names per the above, do the following before you transform the dataset into a table:

Click anywhere among the data.
Click Ctrl-A, thus selecting all the dataset cells.
Then click the Formulas tab > Create from Selection command in the Defined Names button group. You’ll see as in Figure 3-20.

Figure 3-20
An oldie but goodie: the Create from Selection dialog box

By clicking OK, this ancient Excel option will proceed to name each field in the dataset after its header (you can also untick the Left column box if you wish).

Then click Ctrl-T to refurbish the dataset into a table.

Now you’ve gifted yourself with the best of both worlds – you’ve fashioned a table whose fields will automatically register any new records in a formula, and you’ve also allowed yourself to work with conventional, header-based field names.

A final point. From here on, in addition to depicting formulas via the FORMULATEXT function, screenshots will color the cells containing formulas yellow, enhancing clarity.

Next Up

Now that we’ve addressed those necessary preliminaries, it’s time to dive deeply into those dynamic array functions. First up: a function we’ve already glimpsed in passing – SEQUENCE.

Table of Contents for 3. Dynamic Arrays: Time for Some Heavy Lifting

Create new playlist

Sign In

Sign Up