External data is data that resides outside of Excel in a file, database, server, or Web site. You can import external data into Excel either directly into a PivotTable or into a worksheet for additional types of data analysis.
A vast amount of data exists in the world, and most of it resides in some kind of nonworkbook format. Some data exists in simple text files, perhaps as comma-separated lists of items. Other data resides in tables, either in Word documents or, more likely, in Access databases. There is also an increasing amount of data that resides in Web pages and in XML files.
By definition, all this data is not directly available to you via Excel. However, Excel offers a number of tools that enable you to import external data into the program. Depending on your needs and on the type of data, you can either import the data directly into a PivotTable report, or you can store the data on a worksheet and then build your PivotTable from the resulting worksheet range. In most cases, Excel also enables you to refresh the data so that you are always working with the most up-to-date version of the data.
Excel can access a wide variety of external data types. However, in this appendix you only learn about six of them: data source files, Access tables, Word tables, text files, Web pages, and XML files.
In Appendix B, you learned about ODBC data sources, which give you access to data residing in databases such as Access or dBase or on servers such as SQL Server and Oracle; see the Appendix B section "Understanding Microsoft Query." However, there are many other data source types, including data connection files — which connect to specific objects in a data source, such as an Access table — Web queries, OLAP cubes, query files — saved via Microsoft Query; see Appendix B — Web-based data retrieval services, and XML files. See also the next task, "Import Data from a Data Source."
Microsoft Access is the Office suite's relational database management system, so it is often used to store and manage the bulk of the data used by a person, team, department, or company. You can connect to Access tables either via Microsoft Query or by importing table data directly into Excel. See the task "Import Data from an Access Table," later in this appendix.
Simple nuggets of nonrelational data are often stored in a table embedded in a Word document. You can only perform so much analysis on that data within Word, so it is often useful to import the data from the Word table into an Excel worksheet. See the task "Import Data from a Word Table," later in this appendix.
Text files often contain useful data. If that data is formatted properly — for example, each line has the same number of items, all separated by spaces, commas, or tabs — then it is possible to import that data into Excel for further analysis. See the task "Import Data from a Text File," later in this appendix.
People and companies are storing useful data on Web pages that reside either on the Internet or the company's intranet. This data is often a combination of text and tables, but you cannot analyze Web-based data in any meaningful way in your Web browser. Fortunately, Excel enables you to create a Web query that lets you import text and/or tables from a Web page. See the task "Import Data from a Web Page," later in this appendix.
XML — eXtensible Markup Language — is redefining how data is stored. This is reflected in the large number of tools that Excel now has for dealing with XML data, particularly tools for importing XML data into Excel. See the task "Import Data from an XML File," later in this appendix.
To use external data, you must have access to it. This usually means knowing at least one of the following: the location of the data or the login information required to authorize your use of the data.
By definition, external data resides somewhere other than in an Excel worksheet on your system. Therefore, to access external data, you must at least know where it is located. Here are some of the possibilities:
On your computer — The data may reside in a file in your hard disk, on a CD or DVD disc, or on a memory card or other removable storage medium.
On your network — The data may reside in a folder on a computer that is part of your local or wide area network. If that folder has been shared with the network, and if you have the appropriate permissions to view files in that folder, then you can access the data within the files.
On a server — Some data is part of a large, server-based database management system, such as SQL Server or Oracle. In this case, you need to know the name or network address of the server.
On a Web page — If the data resides on a Web page, either as text or as a table, you need to know the address of the Web page.
On a Web server — Some data resides on special Web servers that run data retrieval services such as Windows SharePoint Services. In this case, you need to know the address of the server and the location of the data on that server.
Knowing where the data is located is probably all that is required if you are dealing with a local file or database or, usually, a Web page. However, after you start accessing data remotely — on a network, database server, or Web server — authorization will usually also be required to secure that access. See the administrator of the resource to obtain a username or login ID as well as a password.
After you have access to the data, your next step is to import it into Excel for analysis and manipulation. You have two choices:
If you are building a PivotTable using the external data as the source, then in most cases Excel enables you to import the data directly into a PivotTable. The advantage here is that Excel does not have to store two copies of the data: one on a worksheet and another in the pivot cache. The disadvantage is that you can only analyze the data using the PivotTable report. Other types of data analysis that require direct access to worksheet data are not possible.
In all cases, you can also import the data directly into an Excel worksheet. Depending on the amount of data, this can make your worksheet quite large. However, having direct access to the data gives you maximum flexibility when it comes to analyzing the data. Not only can you create a PivotTable from the worksheet data, but you can also use Excel with other data analysis tools: lists, database functions, scenarios, and what-if analysis.
You can quickly import data into just about any format by importing the data from a defined data source file.
You learned how to create data source files in several places in this book. For OLAP queries, see the Chapter 11 task, "Create an OLAP Cube Data Source;" for OLAP cube files, see the Chapter 11 task, "Create an Offline OLAP Cube;" for ODBC data sources, see the Appendix B task, "Define a Data Source;" for Microsoft Query files, see Appendix B, the tip in the task "Return the Query Results." You can also create data connection files that point to specific objects in a database, such as an Access table. Excel also considers file types such as Access databases and projects, dBase files, Web pages, text files, and Lotus 1-2-3 spreadsheets to be data sources.
In this task, you learn how to import data from a data connection file, which uses the .odc extension. This is a data source that connects you to a wide variety of data, including ODBC, SQL Server, SQL Server OLAP Services, Oracle, and Web-based data retrieval services. See the tip on the following page to learn how to create a data connection file. Note, however, that not all data connection file types support direct import into a PivotTable. For example, if you use a data retrieval service such as MSN MoneyCentral, you cannot import that data directly to a PivotTable.
If you want to use Excel to analyze data from a table within an Access database, you can import the table to an Excel worksheet.
In Appendix B, you learned how to use Microsoft Query to create a database query to extract records from a database, filter and sort the records, and return the results to Excel. You learned that you can create a database query for any ODBC data source, including an Access database.
If you simply want the raw data from an Access table, you can still use Microsoft Query. That is, you add the table to the query, add the asterisk (*) "field" — representing all the table's fields — to the data grid, and then return the results without adding any criteria to filter the records.
However, Excel gives you an easier way to do this: you can import the table directly from the Access database. To make this technique even easier, Excel automatically creates a data connection file for the database and table that you import. Therefore, you can import the same table in the future simply by opening the data connection file.
Note, too, that you can also use the steps in this task to import data from any query that is already defined in the Access database.
This task uses the Northwind.mdb database that comes with Microsoft Access, or you can create your own sample database.
The Select Data Source dialog box appears.
If another user has the database open, you may see the Data Link Properties dialog box. If so, make sure the login information is correct and then click Test Connection until you are able to connect successfully. Then click OK.
Microsoft Excel displays the Select Table dialog box.
Microsoft Excel displays the Import Data dialog box.
Excel imports the data to the worksheet.
You can improve your analysis of Word table data by importing the table into an Excel worksheet.
Word tables are collections of rows and columns and cells, which means they look something like Excel ranges. Moreover, you can insert fields into Word table cells to perform calculations. In fact, Word fields support cell references such as B1 — the cell in the second column and first row of the table — and you can use cell references, built-in functions such as SUM and AVERAGE, and operators such as addition (+), multiplication (*), and greater than (>), to build formulas that calculate results based on the table data.
However, even the most powerful Word field formula is a far cry from what you can do in Excel, which offers far more sophisticated data analysis tools. Therefore, to analyze your Word table data properly, you should import the table into an Excel worksheet.
This task uses the 2006Budget.doc Word file, available at www.wiley.com/go/pivotablesvb
, or you can create your own sample Word table.
A quick way to select a table in Word is to click any cell in the table and then press Alt+Shift+5.
You can also click the Copy button (
Word copies the table to the Clipboard.
You can also click the Paste button (
Excel pastes the Word table data.
You can analyze the data contained in certain text files by importing some or all the data into an Excel worksheet.
Nowadays, most data resides in some kind of special format: database object, XML file, Excel workbook, and so on. However, it is still relatively common to come across data stored in simple text files because text is a universal format that users can work with on any system and in a wide variety of programs, including Excel.
Note, however, that you cannot import just any text files into Excel. Some or all the files must use one of these two structures:
Delimited — This is a text structure in which each item on a line of text is separated by a character, called a delimiter. The most common text delimiter is the comma (,), and there is even a special text format called Comma Separated Values (CSV) that uses the comma delimiter. A delimited text file is imported into Excel by placing each line of text on a separate row and each item between the delimiter in a separate cell.
Fixed width — This is a text structure in which all the items on a line of text use up a set amount of space — say, 10 characters or 20 characters — and these fixed widths are the same on every line of text. For example, the first item on every line might use 5 characters, the second item on every line might use 15 characters, and so on. A fixed-width text file is imported into Excel by placing each line of text on a separate row and each fixed-width item in a separate cell.
The importing of text files into Excel is handled by the Text Import Wizard, the steps for which vary depending on whether you are importing a delimited or fixed-width text file.
This task uses the StockPrices.csv text file, available at www.wiley.com/go/pivottablesvb
, or you can create your own sample text file.
The Select Data Source dialog box appears.
The Text Import Wizard appears.
For delimited text, continue with "Import Delimited Data." For fixed-width text, skip to "Import Fixed-Width Data."
The second Text Import Wizard dialog box appears.
The third Text Import Wizard dialog box appears.
To complete this task, see "Finish the Text Import Wizard."
If you are importing data that uses the fixed-width structure, then you need to tell Excel where the separation between each field occurs.
In a fixed-width text file, each column of data is a constant width. The Text Import Wizard is usually quite good at determining the width of each column of data, and in most cases the wizard automatically sets up column break lines, which are vertical lines that separate one field from the next. However, titles or introductory text at the beginning of the file can throw off the wizard's calculations, so you should check carefully that the proposed break lines are accurate. In the second Text Import Wizard dialog box, you can scroll through all the data to see if any break line is improperly positioned for the data in a particular field. If you find a break line in the wrong position, you can move it to the correct position before importing the text.
Note, too, that in some cases the Text Import Wizard adds an extra break line. For example, if the text file has three columns of data, the wizard may suggest three break lines, which divide the data into four columns. In this case, you can delete the extra break line.
You need to have performed the steps in "Start the Text Import Wizard" before continuing with this section.
The second Text Import Wizard dialog box appears.
To create a break line, click the ruler at the point where you want the break to appear.
To delete a break line, double-click it.
The third Text Import Wizard dialog box appears.
You can analyze Web page data by importing it into Excel using a Web Query.
To make data more readily available to a wide variety of users, many people are placing data on Web pages that are accessible via the Internet or a corporate intranet — a local network that uses Web servers and similar technologies to implement Web sites and pages that are accessible only to network users. Although this data is often text, most Web page data comes in one of two formats:
Table — This is a rectangular array of rows and columns, with data values in the cells created by the intersection of the rows and columns.
Preformatted text — This is text that has been structured with predefined spacing. In many cases, this spacing is used to organize data into columns with fixed widths.
Both types of data are suitable for import into Excel, which enables you to perform more extensive data analysis using Excel's tools. To import Web page data into Excel, you must create a Web query — a data request that specifies the page address and the table or preformatted text that you want to import.
You can analyze data that currently resides in XML format by importing that data into Excel and then manipulating and analyzing the resulting XML list.
XML — eXtensible Markup Language — is a standard that enables the management and sharing of structured data using simple text files. These XML files organize data using tags, among other elements, that specify the equivalent of a table name and field names. Here is a simple XML example that constitutes a single record in a table named "Products":
<Products> <ProductName>Chai</ProductName> <CompanyName>Exotic Liquids</CompanyName> <ContactName>Charlotte Cooper</ContactName> </Products>
These XML files are readable by a wide variety of database programs and other applications, including Excel 2003. Because the XML is just text, if you want to work with the data, you must import the XML file into another application. If you want to perform data analysis on the XML file, for example, then you must import the XML data into Excel.
Excel usually stores imported XML data in an XML list, a range that looks and operates much like a regular Excel list, except that it has a few XML-specific features.
External data often changes, and you can ensure that you are working with the most up-to-date version of the information by refreshing the imported data.
Refreshing the imported data means retrieving the most current version of the source data. This is a straightforward operation most of the time. However, it is possible to construct a query that accesses confidential information or destroys some or all the external data. Therefore, when you refresh imported data, Excel always lets you know the potential risks and asks if you are sure the query is safe.
Remember, as well, that most external data resides on servers or in remote network locations. Therefore, the refresh may take some time, depending on the amount of data, the load on the server, and the amount of traffic on the network.
Excel asks if you trust the data source.
Excel refreshes the imported data.
The refresh may take a long time. To check the status of the refresh, click the Refresh Status button (
If the refresh is taking too long, click the Cancel Refresh button (
3.136.18.141