Chapter 2
In This Chapter
Exporting data from other programs
Importing data into Excel
Running a web query
Importing a database table
Querying an external database
In many cases, the data that you want to analyze with Excel resides in an external database or in a database application, such as a corporate accounting system. Thus, often your very first step and very first true challenge are to get that data into an Excel workbook and in the form of an Excel table.
You can use two basic approaches to grab the external data that you want to analyze. You can export data from another program and then import that data into Excel, or you can query a database directly from Excel. I describe both approaches in this chapter.
You can usually easily export data from popular database programs and accounting systems. Excel is the dominant data analysis tool available to business. Because of this, most database programs and most management information systems export data in a format that makes it simple to import the data into Excel later.
Your first step when grabbing data from one of these external sources, assuming that you want to later import the data, is to first use the other application program — such as an accounting program — to export the to-be-analyzed data to a file.
You have two basic approaches available for exporting data from another application: direct exporting and exporting to a text file.
Direct exporting is available in many accounting programs because accountants love to use Excel to analyze data. For example, the most popular small business accounting program in the world is QuickBooks from Intuit. When you produce an accounting report in QuickBooks, the report document window includes a button labeled Excel or Export. Click this button, and QuickBooks displays the Send Report to Excel dialog box, as shown in Figure 2-1.
The Send Report to Excel dialog box provides radio buttons with which you indicate whether you want to send the report to a comma-separated-values file, to a new Excel spreadsheet, or to an existing Excel spreadsheet.
To send (export) the report to an existing Excel spreadsheet, you need to identify that workbook by entering the workbook pathname and filename into the text box provided. Or, click the Browse button and use the Open Microsoft Excel File dialog box that appears (not shown) to identify the folder and workbook file.
In Figure 2-3, you can see how the QuickBooks report looks after it has been directly exported to Excel.
Therefore, when exporting data from some other program, your first step is to do a little bit of digging and research to see whether there’s a way to easily and automatically export data to Excel. This fact-finding shouldn't take much time if you use the online Help system.
When you need to export data first to a text file because the other application won't automatically export your data to an Excel workbook, you need to go to a little more effort. Fortunately, the process is still pretty darn straightforward.
The trick is that you send the report to a text file rather than sending this report to a printer. This way, the report gets stored on disk as text rather than printed. Later, Excel can easily import these text files.
See how this works in more concrete terms by following how the process works in QuickBooks. Suppose, for the sake of illustration, that you really did want to print a list of items that you sell. The first step is to produce a report that shows this list. In QuickBooks, you produce this report by choosing the appropriate command from the Reports menu. Figure 2-4 shows such a report.
The next step is to print this report to a text file. In QuickBooks, you click the Print button or choose File⇒Print Report. Using either approach, QuickBooks displays the Print Reports dialog box, as shown in Figure 2-5.
If you want to later import the information on the report, you should print the report to a file. In the case of QuickBooks, this means that you select the File radio button. (Refer to Figure 2-5.)
The other thing that you need to do — if you’re given a choice — is to use a delimiter. In Figure 2-5, the File drop-down list shows ASCII text file as the type of file that QuickBooks will print. Often, though, applications — including QuickBooks — let you print delimited text files.
Delimited text files use standard characters, called delimiters, to separate fields of information in the report. You can still import a straight ASCII text file, but importing a delimited text file is easier. Therefore, if your other program gives you the option of creating delimited text files, do so. In QuickBooks, you can create both comma-delimited files and tab-delimited files.
To print the report as a file, you simply click the Print button of the Print Reports dialog box. Typically, the application (QuickBooks, in this example) prompts you for a pathname, like in the Create Disk File dialog box shown in Figure 2-6. The pathname includes both the drive and folder location of the text file as well as the name of the file. You provide this information, and then the application produces the text file … or hopefully, the delimited text file. And that's that.
When you don’t or can’t export directly to Excel, you need to take the second step of importing the ASCII text file that you created with the other program. (To read more about exporting to a text file, see the preceding section.)
To import the ASCII text file, first open the text file itself from within Excel. When you open the text file, Excel starts the Text Import Wizard. This wizard walks you through the steps to describe how information in a text file should be formatted and rearranged as it’s placed in an Excel workbook.
One minor wrinkle in this importing business is that the process works differently depending on whether you’re importing straight (ASCII) text or delimited text.
Here are the steps that you take to import a straight text file:
Excel displays the Open dialog box, shown in Figure 2-7, if you choose the Open command. Excel displays a nearly identical Import Text File dialog box if you choose the Data tab’s Get External Data from Text command.
You should see the text file listed in the Open dialog box.
Excel starts the Text Import Wizard, as shown in Figure 2-8.
This tells Excel that the fields in the text file are arranged in evenly spaced columns.
In general, ASCII text files use the first several rows of the file to show report header information. For this reason, you typically won't want to start importing at row 1; you’ll want to start importing at row 10 or 20 or 5.
Don't get too tense about this business of telling the Text Import Wizard which row is the first one that should be imported. If you import too many rows, you can easily delete the extraneous rows later in Excel.
You can preview the to-be-imported report shown on the bottom section of the Text Import Wizard dialog box.
Excel displays the second step dialog box of the Text Import Wizard, as shown in Figure 2-9. You use this second Text Import Wizard dialog box to break the rows of the text files into columns.
You might not need to do much work identifying where rows should be broken into columns. Excel, after looking carefully at the data in the to-be-imported text file, suggests where columns should be broken and draws vertical lines at the breaks.
Excel displays the third step dialog box of the Text Import Wizard, as shown in Figure 2-10.
You can pick default formatting from the third Text Import Wizard dialog box for the columns of the new workbook.
If you don't want to import a column, select a column in the Data Preview box and then select the Do Not Import Column (Skip) radio button.
You can click the Advanced button (on the third Text Import Wizard dialog box) to display the Advanced Text Import Settings dialog box, as shown in Figure 2-11. The Advanced Text Import Settings dialog box provides text boxes that you can use to specify in more detail or with more precision how the data in the text file is arranged.
Click OK after you make choices here; you return to the third wizard dialog box.
Excel imports the text file according to your specifications and places it into a new Excel workbook, as shown in Figure 2-12. The data probably won't be perfectly laid out. Still, when you have very large data sets, you'll find importing a tremendous timesaver. In general, you won't find it terribly difficult to clean up the new workbook. You only need to delete a few rows or perhaps columns or maybe do a bit of additional formatting or row and column resizing.
Here are the steps that you take to import a delimited text file:
Excel displays the Open dialog box (refer to Figure 2-7) or the Import Text File dialog box (which is nearly identical to Figure 2-7).
The drop-down list box you use to choose the type of file you want appears to the right of the File Name text box.
You should see the text file listed in the Open dialog box.
Excel may be able to guess how the delimited data file organizes its data and just open the file in the Excel workbook window. Or Excel may start the Text Import Wizard, as shown in Figure 2-13.
This tells Excel that the fields in the text file are separated by (one or more) delimiters.
In general, ASCII text files use the first several rows of the file to show report header information. For this reason, you typically want to start importing at row 10 or 20 or 5.
Don't get too tense about this business of telling the Text Import Wizard which row is the first one that should be imported. You can easily delete the extraneous rows later in Excel.
You can preview the to-be-imported report shown on the bottom section of the Text Import Wizard dialog box.
Excel displays the second dialog box of the Text Import Wizard, as shown in Figure 2-14. You use this second Text Import Wizard dialog box to identify the character or characters used as the delimiter to break the text into columns. For example, if the file that's being imported is a tab-delimited file, select the Tab check box in the Delimiters area.
The third Text Import Wizard dialog box appears, as shown in Figure 2-15.
The Data Preview box on the second Text Import Wizard dialog box shows how the file will look after it’s imported based on the delimiters that you identified. Experiment a bit to make sure that you import the data in a clean format.
If you don't want to import a column, select the column and then select the Do Not Import Column (Skip) radio button.
Click the Advanced command button of the third Text Import Wizard dialog box to display the Advanced Text Import Settings dialog box. (Refer to Figure 2-11.) Here, you can specify in more detail how the data in the text file is arranged.
Click OK to return to the third Text Import Wizard dialog box.
Excel imports the delimited text file according to your specifications. As with a straight text file, the data probably won't be perfectly laid out. But you won't find it difficult to clean up the new workbook. A few deletions, a little resizing, and pretty soon the workbook will look the way you want.
Another approach to collecting data that you want to analyze is to extract data from a web page or from an external database. Excel provides three very neat ways to grab this sort of external data:
All three approaches for grabbing external data are described in the paragraphs that follow.
One of the neatest ways to grab external data is through a web query. As you know if you’ve wasted any time surfing the web, Internet websites provide huge volumes of interesting data. Often, you’d like to grab this data and analyze it in some way. And fortunately, Excel provides an easy way to move such data from a web page into Excel.
With the Excel Web Query tool, as long as the data that you want to grab or analyze is stored in something that looks like a table — that is, in something that uses rows and columns to organize the information — you can grab the information and place it into an Excel workbook.
To perform a web query, follow these steps:
You need to place query results into a blank worksheet. Therefore, your first step might need to be to open a workbook with a blank worksheet.
If you need to insert a blank worksheet into an existing workbook, click the Insert Worksheet button. This button appears on the bottom edge of the worksheet next to the sheet tabs: Sheet1, Sheet2, Sheet3, and so on.
Excel displays the New Web Query dialog box, as shown in Figure 2-16.
In Figure 2-16, I show a page from the United States Bureau of Labor Statistics. The Bureau of Labor Statistics website provides tons of tabular information, so if you want to play along, go ahead and visit the website at www.bls.gov and poke around until you find a page that shows a table.
Excel places this small yellow right-arrow button next to any tables that it sees in the open web page. All you need to do is to click one of the buttons to grab the data that the arrow points to.
Excel replaces the yellow arrow button with a green check button.
Excel displays the Import Data dialog box, as shown in Figure 2-17.
Select the Existing Worksheet radio button to place the table data into the existing, open, empty worksheet. Alternatively, select the New Worksheet radio button to have Excel place the table data into a newly inserted blank sheet.
Excel places the table data into the specified location. But I should tell you that sometimes grabbing the table data might take a few moments. Excel goes to some work to grab and arrange the table information. Figure 2-18 shows worksheet data retrieved from a web page table.
Another powerful method for retrieving data from an external data source, such as a database, is to retrieve the information directly from one of a database’s tables. In relational databases, as in Excel, information gets stored in tables.
Excel displays the Select Data Source dialog box, as shown in Figure 2-19.
Select the drive and folder where the database is stored using the list boxes provided by the Select Data Source dialog box.
If Excel displays the Select Table dialog box, continue to Step 4.
If Excel doesn’t display the Select Table dialog box but instead displays the Import Data dialog box (see Figure 2-20), skip ahead to Step 5.
Excel displays the Import Data dialog box, as shown in Figure 2-20.
If you want to place the data in an existing worksheet, use the Existing Worksheet text box to specify the top-left cell that should be filled with data. In other words, specify the first cell into which data should be placed.
Excel retrieves information from the table and places it at the specified worksheet location. Figure 2-21 shows an Excel worksheet with data retrieved from a database table in the manner just described.
Excel provides one other powerful method for retrieving information from external databases. You aren't limited to simply grabbing all the information from a specified table. You can, alternatively, query a database. By querying a database, you retrieve only information from a table that matches your criteria. You can also use a query to combine information from two or more tables. Therefore, use a query to massage and filter the data before it's actually placed in your Excel workbook.
Querying is often the best approach when you want to combine data before importing it and when you need to filter the data before importing it. For example, if you were querying a very large database or very large table — one with hundreds of thousands of records — you would need to run a query in order to reduce the amount of information actually imported into Excel.
To run a database query and import query results, follow these steps:
Excel displays the Choose Data Source dialog box, as shown in Figure 2-22.
For example, to query a Microsoft Access database, click the MS Access Database entry and then OK.
You can query the results of a query by clicking the Queries tab and then selecting one of the items listed there.
You can also query an OLAP cube and grab information from that. If you want to query a query or an OLAP cube, consult with the database administrator. The database administrator can tell you what query or OLAP cube you want to grab data from.
Excel displays the Select Database dialog box, as shown in Figure 2-23. Use this dialog box to identify both the location and the name of the database that you want to query.
Excel displays the Query Wizard - Choose Columns dialog box, as shown in Figure 2-24.
You use the Query Wizard - Choose Columns dialog box to pick which tables and which table fields you want to appear in your query results. In the Available Tables and Columns box, Excel lists tables and fields. Initially, this list shows only tables, but you can see the fields within a table by clicking the + symbol next to the table.
To add all the fields in a table to your list, click the table name and then click the right-facing arrow button that points to the Columns in Your Query list box.
To remove a field, select the field in the Columns in Your Query list box and then click the left-facing arrow button that points to the Available Tables and Columns list box.
This all sounds very complicated, but it really isn’t. Essentially, all you do is to identify the columns of information that you want in your Excel list. Figure 2-25 shows how the Query Wizard - Choose Columns dialog box looks if you want to build a data list that includes information such as the type of plan, the state, and the contract year. (The actual database field names are cryptic, of course.)
Excel displays the Query Wizard - Filter Data dialog box, as shown in Figure 2-26.
You can filter the data returned as part of your query by using the Only Include Rows Where text boxes. For example, to include only rows in which the state abbreviation field shows CA, click the state_abbrev field in the Column to Filter list box. Then select the Equals filtering operation from the first drop-down list and enter or select the value CA into the second drop-down list; see how this looks in Figure 2-26.
The Query Wizard - Filter Data dialog box performs the same sorts of filtering that you can perform with the AutoFilter command and the Advanced Filter command. Because I discuss these tools in Chapter 1, I won't repeat that discussion here. However, note that you can perform quite sophisticated filtering as part of your query.
Excel displays the Query Wizard - Sort Order dialog box, as shown in Figure 2-27.
Select the field or column that you want to use for sorting from the Sort By drop-down list. By selecting either the Ascending or Descending radio button, choose whether the field should be arranged in an ascending or descending order, respectively.
You can also use additional sort keys by selecting fields in the first and second Then By drop-down lists.
You sort query results the same way that you sort rows in an Excel table. If you have more questions about how to sort rows, refer to Chapter 1. Sorting works the same whether you’re talking about query results or rows in a list.
Excel displays the Query Wizard - Finish dialog box, as shown in Figure 2-28.
This dialog box provides radio buttons, from which you choose where you want to place your query result data: in Excel, in a Microsoft Query window that you can then review, or in an OLAP cube. Typically (and this is what I assume here in this book), you simply want to return the data to Microsoft Excel and place the data in a workbook. To make this choice, select the Return Data to Microsoft Office Excel radio button.
After you click the Finish button to complete the Query Wizard, Excel displays the Import Data dialog box; refer to Figure 2-20.
Use this dialog box to specify where query result data should be placed.
Excel places the data at the location that you chose.
By using the instructions that I describe in this chapter to retrieve data from some external source, you can probably get the data rather quickly into an Excel workbook. But it’s possible that you've also found that the data is pretty raw. And so you are saying to yourself (or at least if I were in your shoes, I would be saying this), “Wow, this stuff is pretty raw.”
But don't worry: You are where you need to be. It’s okay for your information to be raw at this point. In Chapter 3, I discuss how you clean up the workbook by eliminating rows and columns and information that’s not part of your data. I also cover how you scrub and rearrange the actual data in your workbook so that it appears in a format and structure that’s useful to you in your upcoming analysis.
The bottom line is this: Don't worry that your data seems pretty raw right now. Getting your data into a workbook accomplishes an important step. All you need to do now is spend a little time on your housekeeping. Read through the next chapter for the lowdown on how to do that.
By the way, if the process of importing data from some external source has resulted in very clean and pristine data — and this might be the case if you've grabbed data from a well-designed database or with help from the corporate database administrator — that's great. You can jump right into the data analysis techniques that I start describing in Chapter 4.
18.217.107.229