When the dynamic array formulas were covered in Chapter 10, the FILTER function was omitted. FILTER was introduced in Excel at the same time as the other dynamic array formulas – SORT, UNIQUE, SEQUENCE, etc. This function will make you wonder, how did I work before the FILTER function? And for this reason, it has its own chapter.
A common question I receive in my training sessions and comments on my YouTube channel is how to write a lookup formula to return all matching results. Functions such as VLOOKUP, the INDEX-MATCH combo, and XLOOKUP cannot do this. This is where the FILTER function steps up.
Now we did see the INDEX and XLOOKUP functions returning ranges, and therefore multiple values, earlier in this book. However, that is not truly returning all results for the value you are looking up. The FILTER function will return the values for each matching item found and not only the first or last matching value.
The FILTER function is quite simply a formula alternative to the popular filter tool in Excel. The filter tool is amazing but is limited in that it requires a user to reapply the filter every time they want to update the results.
FILTER will automate this task of filtering a table of data. It also gives us the capability of combining it with other formulas and Excel features for an even richer Excel experience. FILTER is *insert your chosen superlative here*!
In this chapter, we will begin with the basics of FILTER and learn how to structure the logical tests that determine the results to return. We will then start using FILTER with other functions such as SORT, TEXTJOIN, and NETWORKDAYS.INTL to become familiar with how FILTER can be applied in different scenarios.
The chapter is concluded with examples of using FILTER with other Excel features such as Data Validation and charts.
filter.xlsx
Introduction to the FILTER Function
Availability : Excel 2021, Excel for Microsoft 365, Excel for the Web, Excel 2021 for Mac, Excel for Microsoft 365 for Mac
The FILTER function filters a range or array based on specified criteria. It returns a spill range with the results of the filter.
This function is terrific for creating dynamic reports and models where the values to return are dependent upon the value of an input cell.
Array: The range or array that you want to filter and return
Include: The filter criterion that determines the rows or columns to return
[If empty]: The action to take, or value to return, if no results are returned by FILTER
Simple FILTER Example
Let’s jump straight into a simple example of the FILTER function in action.
The [Review By] column contains a formula to ensure that the sample file, and therefore FILTER results, behaves the same for you as it does for me in these examples.
In our first FILTER example, we will return a range containing all three columns from [tblReviewDates] for the rows where the review due date has passed.
FILTER with Multiple Conditions
Often, there may be more than one condition that determines the results to return. Let’s progress from our first example and see examples that include multiple conditions and handle both AND and OR logic.
In the first of these examples, we will create a report that returns those who have a review due in the next x number of days. The number of days will be determined by a cell value.
The include argument of FILTER in this example contains two conditions. When writing multiple conditions in the FILTER function, each condition must be enclosed in its own set of brackets.
In this example, we need to apply AND logic between the two conditions. They are that the [Review By] date must be between today’s date and the date in the number of days specified by the value in cell B2.
To apply AND logic, the asterisk (*) operator is used between the two conditions. If OR logic was required, the plus (+) operator would be used (we will see an example of this shortly).
How the FILTER Function Logic Works
Let’s briefly recap on why the * and + operators are used to apply AND and OR logic, respectively.
A semicolon separates rows in an array, and a comma separates columns in an array.
The asterisk is the multiplication operator and ensures that the corresponding values in each array are multiplied. This converts the TRUE and FALSE values to 1 and 0, respectively, and performs the operation between each corresponding value.
The rows of the array that are equal to 1 are then returned. So, rows 1 and 5 are returned in this example.
This same application of logical expressions with arrays was seen with the SUMPRODUCT and SUM functions earlier in this book.
Further Examples
Now that we understand why the * and + operators are used to perform the different logical operations, let’s proceed and see further examples with the FILTER function.
Two AND logical tests are applied between the three conditions. Each condition is enclosed in its own set of brackets. Let’s now add some OR logic into the formula.
Sort the FILTER Results
The results of the FILTER function will appear in the same order that they appear in the source data. In this example, that is the [tblReviewDates] table. This may not be the order that you require. Also, if someone sorts the data in [tblReviewDates], this change will propagate through to the FILTER results.
To specify the order for our FILTER results, we can add the SORT function.
Number 1 is entered for the sort order argument of the SORT function to specify the ascending order. This was not necessary as ascending is the default order, but I like to be explicit in a formula and specify the order.
#CALC! Error with FILTER
The #CALC! error is returned by the FILTER function when there are no results for the FILTER function to return.
In the previous examples, the text “No review dates are due” was entered in the if empty argument of FILTER. This text would be returned instead of the #CALC! error if there were no results to return.
In this scenario, it would be good to return an alternative response. The cause of the #CALC! error is different, and the response shown in Figure 13-10 is not relevant. There are definitely review dates due within the next 14 days; the cause is the empty region cell.
This formula can be enhanced further to handle cell B2 being left empty and other considerations. However, you cannot plan for every eventuality, and the #CALC! error is good for telling the user that they have done something wrong, such as omitting required information. Not all errors are bad.
Adding a Calculated Column to the Returned Array
You may need to add calculated columns to your report that use the results returned by the FILTER function. This will require being able to access specific columns, and maybe rows, of the returned array.
In this example, we will add a calculated column to return the number of days remaining until the review date. This provides a puzzle to us, because the FILTER function in this example returns an array that is three columns wide. Yet, we want to use only the third column [Review Date] in our calculation.
In Chapter 11, we covered the INDEX function and saw how it is the ultimate function for working with arrays. Let’s use it here to isolate the third column of this multicolumn array for our calculated column.
By using the TODAY() function in this formula, the result in cell E5 is returned in a date format. It will need reformatting to be presented correctly.
The CHOOSECOLS function provides a neat alternative to INDEX for this task. This is exactly the type of task that the CHOOSECOLS function was developed for.
Having the calculated column separate from the FILTER function results will have its advantages. It will be easier to use in other formulas if required or for values in a chart.
But maybe we want to include this column in a combined array with the FILTER results. For this, we can use the HSTACK function.
This formula uses the LET function which is covered in Chapter 15 of this book. Using LET ensures that the FILTER function is only calculated once. The FILTER function is assigned to the name reviews. CHOOSECOLS then references reviews for its array. Skip to Chapter 15 to read more detail on how to use LET and why the LET function is important.
The CHOOSECOLS and HSTACK functions are available in the Excel for Microsoft 365 version for Windows and Mac and Excel for the Web version only.
FILTER and Return Non-adjacent Columns
The FILTER function cannot independently return specific columns from a table or array. It requires the columns to be together or adjacent as Excel will typically describe it.
For these examples, we will return the [Name] and [Review By] columns from the [tblReviewDates] table only. These are columns 1 and 3.
Fortunately, there are numerous ways to accomplish this, and they are nice and simple. Let’s start with the CHOOSECOLS function.
Entering the column numbers to return is not tough when there are only two out of three to return. However, you may be looking for a more dynamic approach.
The CHOOSECOLS function will accept an array of constants to specify the columns to return, for example, =CHOOSECOLS(tblReviewDates,{1,3}). Because of this, the MATCH function could be used to return the columns stated in range B4:C4 on the worksheet. This will prevent the need to type in the column numbers manually, as the columns will be dynamically returned based on the cell values.
If your preference is to specify the columns to return in an absolute manner, instead of entering column numbers or referencing cell values, then the CHOOSE function is a great alternative.
In Figure 13-17, the following formula uses the CHOOSE function to specify that we want the [Name] and [Review By] columns returned only. This approach is ideal when working with data formatted as a table.
UNIQUE Ignoring Blanks
The FILTER function is great for removing blanks from a range. A typical scenario for this technique would be to prepare a range of values for use in a drop-down list or labels of a report.
FILTER with NETWORKDAYS.INTL
The FILTER function can be used to provide other functions with a filtered array to be used. This provides a lot of potential and fun when you think of the different ways that this can be utilized.
In this example, we will use the FILTER function with NETWORKDAYS.INTL to return a range of holidays dates that are dependent on criteria.
This example could be applied to the NETWORKDAYS function just as well. However, the NETWORKDAYS.INTL function (covered in Chapter 6) is a superior function.
In Figure 13-20, the following formula is entered in cell E2 to calculate the number of working days for the different tasks of a project.
The formula =NETWORKDAYS.INTL(B2,C2) is entered in column F to help demonstrate that the FILTER function successfully returns the correct dates.
FILTER and Combine Results in One Cell
The FILTER results can be combined in a single cell using the TEXTJOIN function. This can offer a neat alternative to spilling the results into the adjacent cells on the sheet.
Let’s imagine that we want to list the different product categories in a column and, in the adjacent column, list all products assigned to that category combined into a single cell.
The FILTER function returns the array of products that match the category entered in the corresponding cell of column B. The values in this array are sorted in ascending order by SORT.
Compare Lists with FILTER
The FILTER function combined with COUNTIFS enables us to easily compare two lists and return the results we require.
This could be that we want all values that occur in both lists, those that occur in one list and not the other, or even the values that occur in either list but not both.
Names That Occur in Both Lists
For this first example, we want to return the [Number] and [Name] columns only for the names that occur in both lists.
It uses the COUNTIFS function to count the occurrences of the ID numbers from [tblFirst] within [tblSecond]. The resulting array is passed to the FILTER function. Remember, any non-zero value evaluates to TRUE. So, if the ID number is found, then FILTER will return the record. The returned results are sorted by the [Number] column in ascending order.
Names That Occur in One List Only
Let’s now return the names that occur in either the first or second table only.
The only adaptation that has been made from the previous formula is the addition of the NOT function to reverse the results of the COUNTIFS. It now returns the names in [tblFirst] that do not appear in [tblSecond].
Names That Occur in Either List but Not Both
We now want to return the names that occur in either [tblFirst] or [tblSecond] only. This is really a table with the appended results of the two formulas from the previous example.
Return All Names from Both Lists
Finally, let’s use an Excel formula to return a distinct list of all names that appear in both tables.
This example does not require the FILTER function; however, it has been included for a thorough coverage of comparing two lists.
Aggregate FILTER Results
We have seen a few aggregation functions in this book. These include SUMIFS, SUMPRODUCT, and AGGREGATE, to name a few. These amazing functions all have small disadvantages which help you decide which one is best for your chosen task.
Well, let’s add another to the list in terms of the FILTER function. It is an extremely well-rounded function that does not suffer from some of the ailments that the other aggregation functions do.
The FILTER function can be used to perform the logical expressions required and return an array of values for an aggregation function to work on. As we know from this chapter, FILTER can handle AND and OR logic, multiple column and row arrays, and arrays provided by other functions. This makes it a fantastic provider for an aggregation function.
FILTER with Other Excel Features
There are some fantastic ways that the FILTER function can be used with other Excel features. Let’s see some examples of FILTER being used to provide a filtered array for Data Validation and for charts.
Shrinking Drop-Down List
In this first example, we will create a shrinking drop-down list. This is a list that decreases in size each time an item in the list is used. Once an item is used, it is removed from the list to prevent it from being used again.
- 1.
Write the required formula and check that it works.
- 2.
Define a name for the spill range generated by FILTER.
- 3.
Reference the defined name as the source of the Data Validation list.
The COUNTIFS function checks if the names occur in [tblTasks]. The NOT function then reverses the logic to show TRUE if the names do not occur and FALSE if they do. FILTER then returns the unused names.
Please refer to Chapter 3 on defined names or Chapter 10 on dynamic arrays if you are unsure on how to define a name for a range.
- 1.
Select the [Assigned] column of [tblTasks].
- 2.
Click Data ➤ Data Validation.
- 3.
On the Settings tab, select List in the Allow field and then enter “=lstNames” in the Source field (Figure 13-34).
- 4.
Click OK.
Dependent Drop-Down List
For a second Data Validation example, we will see how the FILTER function offers a neat method of creating dependent drop-down lists.
A dependent drop-down list is when the items shown in a drop-down list are dependent on the item specified in a previous drop-down list.
Figure 13-36 shows a table named [tblProducts]. It contains information about products that we sell. For this example, we are interested in just the [Product] and [Category] columns of this table.
We want to create a drop-down list for the three different categories of product. And then create a second drop-down list for the product names. The second list of product names will be dependent on the selection made in the first list. Only products relating to the specified category will be shown.
If no selection is made in the previous list, then the text “No category selected” is returned (Figure 13-40). This text was entered for the if empty argument of the FILTER function.
Interactive Chart
The FILTER function can be used to easily add an interactive element to a chart in Excel. In this example, we will use a drop-down list for a user to make a selection, and that selection determines the data to be charted.
On a separate sheet, we have a drop-down list of the three product categories. The cell containing the drop-down list is named [rngChosenCat]. When a user selects a category, only the products relating to that category will be charted and will be sorted in descending order by their sales total.
Figure 13-42 shows two different formulas producing spill ranges in cells B2 and C2.
These formulas are entered on a sheet named [Chart Data]. A name has been defined for each spill range so they can be referenced in the chart.
- 1.
On the [Dynamic Chart] sheet, click Insert ➤ Insert Column or Bar Chart ➤ Clustered Column.
- 2.
With the chart selected, click Chart Design ➤ Select Data.
- 3.
In the Legend Entries (Series) area, click Add.
- 4.
Type “Sales Totals” for the Series name (this will become the chart title and can be changed later for something more meaningful).
- 5.
Click in the Series values field, remove the content, click a cell on the [Chart Data] sheet, remove the cell reference, and type “rngSalesTotals” after the sheet name (Figure 13-44). Charts require an explicit reference that includes the sheet name. Click OK.
- 6.
Click Edit in the Horizontal (Category) Axis Labels area.
- 7.
Click in the Axis label range field, click a cell on the [Chart Data] sheet, remove the cell reference, and type “rngProductNames” after the sheet reference. Click OK.
Excel will actually edit the data series and axis label references in the chart to the workbook name followed by the defined name. But we can let Excel handle that.
Figure 13-46 shows the finished chart. Improvements can and should be made to the chart to take it further.
Summary
In this chapter, we learned the excellent FILTER function of Excel – a very powerful and flexible function. We saw many examples that supported this and complemented the strengths of FILTER compared to other lookup functions. It has changed the game in how we perform certain tasks in modern Excel.
In the next chapter, we will look at using formulas with data types in Excel. Data types are a recent development that is still evolving and finding its place in the Excel jigsaw. They are a rich data type that enables us to store many columns of values in a single cell.
In the chapter, we will discuss in detail the different data types available and how to access their data with some of the most popular functions in Excel, plus a couple of new functions – FIELDVALUE and STOCKHISTORY.