In this chapter
What Is Possible with Pivot Tables 192
Preparing Your Data for Pivot Table Analysis 192
Creating Your First Pivot Table 200
Changing the Default Layout of a Pivot Table 202
Adding New Fields to a Pivot Table 208
Eliminating Blank Cells from a Pivot Table 209
Using Pivot Table Legacy Mode 210
For 12 years, pivot tables have been the most powerful feature in Excel. However, Microsoft estimates that fewer than 10% of people use pivot tables. To make this helpful feature less intimidating in Excel 2007, Microsoft rewrote the pivot table interface to make it vastly simpler to use.
Note
Although I loved the drag-and-drop functionality in pivot tables, it was a source of frustration for people new to pivot tables. With the drag-and-drop interface, it is possible to accidentally drop a field in the wrong place, essentially destroying the pivot table. When this happened to a pivot table rookie, the person was often frustrated enough to quit using pivot tables.
At the end of this chapter, I show pivot table veterans how to get back to the old interface.
Say that you have 400,000 records of transactional data. It is really easy for some people to figure out that this represents $x million. But to really learn some things about the data, you need to do some more analysis to spot trends in the data. A pivot table lets you analyze trends in data without having to worry about formulas. Your focus is more on finding trends than on worrying about writing formulas in Excel.
By using a pivot table, it is possible to create a number of views of your data, including the following:
Of course, these are just examples. You can use pivot tables to slice and dice your data in almost any imaginable way.
Pivot tables are best created from transactional data—that is, raw data files directly from your company’s IS department. You don’t want any totals in the data. You don’t want blank lines, blank columns, or formatting.
The data shown in Figure 10.1 is perfect for pivot tables. Every row in the table represents the sale of one product to one customer on one date. A pivot table could summarize one or more of the numeric fields in this data.
The data shown in Figure 10.2 is a typical Excel worksheet, but it has a number of problems that make it unsuitable for use in a pivot table:
If you have a dataset like the one shown in Figure 10.2, go back to the source of the data. If you are in a corporate environment, you can explore with your IT department where this summary came from. If the summary came from Quickbooks or another software package, try running a trial balance report that is at the detail level. Someone must have started with transactional data in order to create this summary. If it will take six months for IT to get to your project, you might want to follow the steps outlined in the following section to make the data suitable for use in a pivot table.
As noted in the preceding section, there are problems with the dataset shown in Figure 10.2. Many common datasets from software packages like Peachtree or Quicken or even from Oracle or SAP will have similar problems to this dataset. By using your Excel skills, you can convert the data into a suitable format. It isn’t easy. It isn’t something you should do everyday. But it is possible. Here’s what you do:
Tip From
Instead of repeating step 13 for each month, you could set up a set of formulas once. To do so, you put the cell pointer in D2. Using Name a Range on the Formulas ribbon, define UpRight
as =!E1
. In Cell D2, enter the formula =UpRight
. In Cell D3, enter the formula =D2
. Put the cell pointer in Cell D3 and double-click the fill handle to copy this formula down the rest of the column. Note that when you use this method, you can skip step 12 for each month, but you have to change the paste operation to a paste values operation in step 13. Caution: Do not use this trick if you have VBA code in your workbook! Calculations caused by the VBA code will return a value from the active sheet at the time of the calculation.
You can complete the process described here in less than 15 minutes. It is not a particularly pleasant process. But if the data came from an inflexible software package, or your IT department tells you that you cannot have the new dataview for the next month or year, you know you can rely on this process.
To create pivot tables, follow these rules:
When you have your data in the correct format, creating and manipulating a pivot table is very easy.
Let’s say that your manager has asked you to summarize some data to show revenue by region and product. To create a pivot table to do this, you follow these steps:
You are now just two clicks away from the answer you need. But first, let’s take a quick look around the new look of pivot tables, shown in Figure 10.12:
To create a pivot table, you simply use the PivotTable Field List box to check which fields to include in the table. Excel makes fairly intelligent guesses based on the field type. In the current example, Excel builds a passable table using the default guesses. You have to make one adjustment to perfect the table.
In the PivotTable Field List box, you choose Region, Product, and Revenue. When you choose the Revenue field, Excel adds a new field called Sum of Revenue to the Σ Values section of the PivotTable Field List box. Excel decides that this belongs in the Σ Values section because the field is basically numeric. When you choose Region and Product, Excel moves those to the Row Labels section of the PivotTable Field List box.
At this point, the default pivot table looks as shown in Figure 10.13. In Cell B4, you can see that the central region sold $4.67 million. In Cells B5:B15, you can see the revenue for each product in the central region.
The annoying thing about the table in Figure 10.13 is that it would be easier to read the table if the Region field went across Columns B, C, and D, with a total for the three regions in Column E.
When you use the checkmark method for building a pivot table, Excel can’t read your mind about which fields would look better when used as column labels. Luckily, it is very easy to move fields in a pivot table, as described in the following sections.
In Figure 10.13, the PivotTable Field List box is composed of a list of fields at the top and then four distinct areas below (Report Filter, Column Labels, Row Labels, and Σ Values).
The drop-down at the top offers five different views of the Field List box. As shown in Figure 10.14, the name for the default view is Fields and Drop Zones Stacked. This view could have shortcomings if you had more than 16 fields in the field list. You can also see in Figure 10.14 that there is not quite room for the text Sum of Revenue to appear in the Σ Values section.
Figure 10.15 shows the Fields and Drop Zones Side by Side view. This view would allow up to 28 fields to be visible in the Field List box. It still has the problem that you can’t see the entire name Sum of Revenue in the Σ Values section.
Figure 10.16 shows the Field List box in Fields Only view. You can use this view if you can trust Excel to put the fields in the right place.
Figure 10.17 shows Drop Zones Only (2 by 2) view, with the layout fields arranged in a 2×2 grid. Figure 10.18 shows Drop Zones Only (1 by 4) view, with the sections arranged in a 1×4 grid.
The drop zone sections of the PivotTable Field List Box are as follows:
Four of the five views of the PivotTable Field List box include the four drop zones described in the preceding section. These areas are the keys for rearranging the look of a pivot table.
The Row Labels section from our earlier example has Region and Product fields. In your pivot table, you will have different field names. Each of these fields has a drop-down. If you choose the Product drop-down, you see the list of options shown in Figure 10.19.
From the drop-down list in 10.19, if you choose Move to Column Labels, the data is arranged in what is called a crosstab analysis, as shown in Figure 10.20.
In this example, you used the drop-down shown in Figure 10.19 to move a field from one section to another. You can instead drag a field within the Field List box drop zones. To do this, in Figure 10.19, you click the Region field and drop it in the Column Labels section.
Once you have arranged your data in the report, you will want to consider formatting the numeric fields. For example, in Figure 10.21, it would be helpful if the numbers were formatted with commas as thousands separators.
There is a temptation to format a pivot table just like you format any other range in a worksheet. However, as you will see later in this chapter, a pivot table is very fluid. Although the numbers in the figure currently occupy Cells B5:E16, with a couple mouse clicks, they could soon occupy Cells B4:B48 or even Cells B4:B539. Because any pivot table might be changing shape, it is best to do all formatting through the pivot table interface.
If you tell Excel that a particular field should always have the format $#,##0, then no matter how you change the pivot table, Excel will remember the format.
For example, in the PivotTable Field List box, you should click the Sum of Revenue drop-down in the Σ Values section. Be careful. Revenue appears twice in the PivotTable Field List box. There is a Revenue field with a checkbox in the Fields section. There is a Sum of Revenue button in the Σ Values section. Both Revenue and Sum of Revenue have drop-down arrows when selected. You are specifically looking for the Sum of Revenue button in the Σ Values layout section of the PivotTable Field List box.
When you choose the Sum of Revenue drop-down arrow in the Σ Values section, you should choose Field Settings, as shown in Figure 10.21.
The Summarize By tab of the Data Field Settings dialog allows you to change the summary function from Sum to Count, Average, Min, Max, etc. In the lower-left corner of this dialog, click the Number Format button, as shown in Figure 10.22. You can then choose a numeric format from a special version of the Format Cells dialog.
When you have a pivot table, it is easy to further customize it. For example, in the Below the Field List box, you can drag Product from the Row Labels section to the Column Labels section. Then you can choose Customer, which by default moves to the Row Labels section. In two more clicks, you have created a completely different summary of the data.
The first pivot table you created in this chapter gives a great view of sales by product by region. The fantastic thing about pivot tables is that they allow you to drill in and get more detail from your data. Once you have created your first pivot table, think about ways that you can add more data to the report in order to further explain the values you see in the summary.
As an example, say that you want to add customer data to your product/region summary report. This would be a good report to produce for a product line manager. It is incredibly easy to transform your first pivot table into a report that shows such customer detail. You have two choices. In the PivotTable Field List box, you can drag the Customer field over to be the second field in the Row Labels section. Alternatively, you can simply choose the check box next to Customer in the field list. By default, this adds the field as the last field in the Row Labels area. In your dataset, follow the same step to add the new field to the layout.
After you make either of these changes, within a second, Excel redraws the pivot table to show the customers who purchased each product, as shown in Figure 10.23.
Once you produce a report with two or more fields, you might be frustrated by a problem common to most pivot tables. Look at Row 9 in Figure 10.23. This customer made purchases from the central and north regions of your company but did not make any purchases from the south region. By default, a pivot table shows a blank in this cell to indicate that there were no records matching for this customer in that particular region.
Many people would rather see a zero in this cell than see it blank. To override the blank setting, on the PivotTable Tools - Options ribbon, in the PivotTable Options group, choose the Options icon. As shown in Figure 10.24, there are five tabs in the PivotTable Options dialog that appears. The first tab, Layout & Format, has a setting called For Empty Cells Show. You should change this setting from a blank to a zero.
If you are an expert in creating pivot tables in Excel 2003, you might be distraught that the pivot table interface has changed considerably. While the new interface will allow people new to pivot tables to create pivot tables flawlessly, you might wish to use the legacy pivot table functionality to create your pivot tables.
The Excel 95 - Excel 2003 pivot table interface allowed you to drag and drop fields right onto the pivot table. Many people did not notice the subtle visual clues that allowed you to know where the dropped field would appear. Many people would try to drop a new column field between the column area and the data area, resulting in disaster if Excel interpreted the field as a data field. Microsoft changed the interface to protect these new people from themselves.
However, if you have previously mastered the drag and drop method, you might want a way back to that mode. Microsoft provides a way, but it is fairly hidden.
Pivot tables are the greatest invention in spreadsheets, but they do have a few limitations. However, as described in the following sections, many of their limitations have been significantly improved in Excel 2007.
As shown in Table 10.1, the capacity limitations for pivot tables have been greatly improved in Excel 2007.
There are some limitations when it comes to combining features within pivot tables:
Most people are shocked to learn that changes to underlying data do not appear in a pivot table. After all, you change a cell in Excel, and all the formulas derived from the cell automatically change. You would think that the same should hold true for pivot tables, but it does not.
Pivot tables are fast because the data from the worksheet is loaded into a special cache in memory. If you build a pivot table and then change the underlying data, you must click the Refresh icon in the Data group of the PivotTable Tools - Options ribbon in order to have the change appear in the pivot table.
Sometimes the PivotTable Field List box seems to disappear randomly. It is not actually randomly, but it seems like it. Say you have data on Sheet1. You build a new pivot table on Sheet2. My argument is that as long as you are on Sheet2, you are clearly working with or looking at the pivot table. Microsoft disagrees with me. Microsoft’s rule is that as soon as you click in any cell outside the confines of the pivot table, Excel should put away the PivotTable Field List box and switch to a ribbon other than one of the pivot table ribbons. To solve this problem, you can select a cell within the pivot table again to redisplay the PivotTable Field List box.
The pivot table ribbons exhibit strange behavior. If you are on the PivotTable Tools - Options ribbon, you can click one cell outside the pivot table and then immediately select a cell back inside the pivot table, and Excel redisplays the PivotTable Tools - Options ribbon. However, if you are at the right edge of the pivot table and click the right-arrow key twice and then the left-arrow key twice, you have now touched two cells outside the pivot table, so the PivotTable ribbon does not redisplay until you click on the Options tab in the ribbon.
Say your dataset has thousands of rows of data. For any reason, if one of the revenue cells happens to be blank, this completely confuses Excel. There can be 999,999 cells with numbers and 1 blank cell, but Excel will no longer realize that the Revenue column is a numeric column.
Two things tip you off to this problem. First, if you attempt to simply check the box for Revenue in the field list, Excel moves it to the Row Labels section instead of the Σ Values section. Second, if you are in the habit of dragging the Revenue field to the Σ Values section, you see numbers that are too low. If you are an $11 million company and your pivot table shows your revenue as $1,124, you can assume that you probably have one or more blank cells in the underlying revenue data. In Figure 10.27, the blank Cell F4 causes Excel to assume that Revenue is a label field. If you build the pivot table anyway, Excel counts the revenue records instead of summing them.
You need to be aware of this limitation. To correct the problem, you have two choices. The better solution is to fill in the underlying blank cells with zeros. But the easier solution is to double-click the Sum of Revenue heading and then change Data Field Settings to Sum from Count.
3.145.167.176