4. Building a Bento Library from Your Own Data

This chapter shows you how to organize your data and how to import data from other sources such as spreadsheets.

Getting Started Organizing Your Data

In the first few chapters of this book, you saw the basics of Bento: how to use the Bento window, how to create and delete records, and how to use the Bento data management tools such as finding and sorting. You even saw how to create databases from the Bento templates and started finding out how you can customize the look and feel of your personal databases.

This chapter helps you move in a new direction. You know the Bento basics; now what can you actually—practically—use it for in your own life? The acid test is whether Bento can help you organize all the data you have floating around your computer. Many people have accumulated lots of data over the years, and computer disks have grown bigger and bigger. If you buy a new computer periodically, the hard disk will be larger than the one you are used to, and you might be able to move all your data from the old crowded computer and its hard disk to the new hard disk and have plenty of space left over.

The data can consist of old emails and documents that just might be needed some day, which can be in the form of spreadsheets and even a variety of database formats, large and small. Maybe you have AppleWorks files that you rely on for your daily activities. The format may be old, but the data is valuable, and—most important—Bento allows you to get to it.

This chapter shows you how to go about moving data into Bento, where it can be stored and searched quickly. If you are using old software to manage your data, you may legitimately be worried that at some point the software might break and you may not be able to find a new version, leaving your data inaccessible. (The term for such data and software is legacy, as in legacy software or legacy data.) Although you probably do not want to think about it, the fact is that data stored in old software and unsupported formats is a critical risk to any person or organization that relies on it.

image Importing data into a new library as described in this chapter is simpler than importing data into an existing library. For more details on importing and exporting data and libraries, see Chapter 14, “Importing and Exporting Bento Data and Libraries,” p. 213.

Reviewing Your Legacy Data

Moving data into Bento is not difficult, and it is something that you should do for your legacy data. Start by protecting your most important data and the data for which your applications are old and not easily replaced or updated. Often, the easiest way to do this is to move them into Bento.

Data that you move into Bento is going to wind up as a set of records with fields in them because that is the basic Bento paradigm. Often, you move data from spreadsheets or database tables because those are comparable structures. Sometimes, the data is in word processing documents formatted as a table. Free-format data that is not tabular in nature is not a particularly good candidate for Bento import.

Note

Importing is used to describe the process of reading a file and converting it into one or more Bento records automatically. Though importing data that is in tabular format can be a simple, automated process, even the most free-format data can still be moved into Bento by copying and pasting each data element individually into a Bento record.

Working with Data Formats

Instead of converting your data into records and fields, you can often take the existing file and reference it from Bento. (Referencing a file means just that: the file remains where it is and Bento can reference it to provide access to its data.) You can import references to files in Bento, so your movies, free-form poetry, and other data are accessible and organized by Bento. However, remember that the application that Bento uses to open the file needs to be available. Storing a reference to a file that needs to be opened by a program you do not have or that is no longer available does not do you any good.

Fortunately, many programs allow you to save your documents in alternate formats. These are generally available in Save As or Export commands located in the File menu. Figure 4.1 shows the Microsoft Word Save As menu.

Figure 4.1. Many applications let you save a copy of a file in a new format.

image

These formats are a combination of proprietary formats and formats that are widely available. Many programs support older versions of their native formats for reading and writing. Word, for example, lets you save backward-compatible files going back to Word 4.0. You can also save the basic text in a variety of formats; these files can be opened by any program that can read text. Rich Text Format (RTF) is a standard format that preserves a great deal of style information including fonts and basic paragraph formatting. Saving old word-processing files in RTF or a text-based format makes them accessible by many programs.

The iWork applications Pages and Numbers use an Export command, as shown in Figure 4.2. In iWork, the Save As command is used to save a copy of the file with another name and to control options such as whether movies are saved in the file. These are still Pages and Numbers files; Export is reserved for changing the file format. Also note in Figure 4.2 that you can set security for the exported files. If you are working on your own computer, you can use the defaults, which have no security settings. If you are using a shared computer, consider using security depending on the circumstances of the data and the user environment.

Figure 4.2. iWork applications use the Export command.

image

Notice that you can save Numbers files as native Excel spreadsheets, and you can save Pages files as native Word documents. Plain text and RTF are also supported in Pages, while comma-separated value (CSV) is supported in Numbers. Both support Portable Document Format (PDF).

A standard installation of Mac OS X includes TextEdit, which can read and write both plain text and RTF files. This means that if you convert old files to either of those formats, you are able to store references to them in Bento and open them automatically. (You can also open them from the Finder.)

If you do not have Adobe Reader to read PDF files, you can download it for free from www.adobe.com. This provides you with another standard format to use to convert your legacy files. It also provides you with a safety net in case your legacy application does not support Save As or Export. In the Print dialog that is used by most applications, you can use the PDF button in the lower left to save the printed output as a PDF file, as shown in Figure 4.3.

Figure 4.3. Save printouts as PDF files.

image

In most cases, this means that if you can print it, you can save the image as a PDF file and open that with a double-click that launches Adobe Reader if necessary. (Many applications, including Safari, can open PDF files themselves.)

If you are worried about not being able to open old files in the future, converting them to text, RTF, or PDF should solve the problem.

A related issue of file formats arises with media files. You can use Bento’s media fields to store images, sounds, and movies. Supported file types include PDF, JPG, TIFF, GIF, MP3, PICT, MOV, and PSD (Photoshop).

image In Bento 3, integration with iPhoto makes your photo libraries easily available to Bento. There is more on this in Chapter 10, “Using Built-In Bento Libraries for Address Book and iPhoto,” p. 155.

If you have an image, sound, or movie that is not in a supported format, you can often use the Export or Save As command in the other application to convert it.

Show Me: Media 4.1—A video about using file references in Bento

image

Access this video file through your registered Web Edition at my.safaribooksonline.com/9780131388611/media.

Performing a Basic Data Import

This section walks you through the process of importing data either from your own data (current or legacy) or data that you download from the Web. CSV is generally provided as an exportable format from spreadsheets such as Excel or Numbers and databases such as FileMaker Pro.

The data used in this example is from the U.S. Census Bureau; it consists of population projections from 2004 to 2030, as well as the 2000 census data. The data is arranged by state, age, and sex. It consists of 13,572 records, so it provides a good example of a fairly substantial data import. In addition, the data demonstrates some of the issues you need to watch out for when importing data.

Note

The data file can be downloaded from the U.S. Census Bureau website (www.census.gov) along with a number of other data files. In addition, you can download a ZIP archive of the file from this book’s site as described in the introduction. Despite the massive amount of data in the file, the file itself is actually quite small, just 2.3MB; the ZIP archive is 1MB.

Bento supports a variety of input formats. Comma-separated values (CSV), is a common format for spreadsheets and some databases. In addition to CSV data, you can import data directly from Excel spreadsheets, from Numbers spreadsheets, and from tab-delimited files such as those created by AppleWorks. The basic process is the same as importing CSV data.

CSV data can be numeric or text and looks like this:

red, blue, green
1, 2, 3
hello there, Henry IV, 20,354

As you can see, the commas separate the data values, although in other formats you may see a different separator such as a semi-colon used. The first thing to check in your data is whether it contains commas that are not separators. For example, in the third line of this example, are there three values (the last one being a number—20,354), or are there four values (the last two being two numbers—20 and 354)?

If data contains embedded commas, you can place the entire value inside quotation marks. There is no ambiguity in this line of data, which contains three (not four) values:

Hello there, Henry IV, "20,354"

If the data itself contains quotation marks, double them. For example, the first element of this line of data might be a quote that is written:

"Hello, there"

If you want to preserve the quotes, it should be written as:

""Hello, there""

In most cases, you do not have to worry about this problem. If you are working with an application that can export CSV data (such as FileMaker Pro, Numbers, or Excel), all this is done for you automatically.

The issues of embedded commas and quotes are one of the reasons some people prefer tab-delimited formats. With these, each value is separated from the next one on a line by a tab character. This means that you can easily use commas and quotation marks within the data. However, the problem is not solved if your data itself contains tab characters. Furthermore, if you are manually checking the data, commas or semi-colons are easy to spot: tab characters do not print. Fortunately, for any given set of data, it is usually possible to pick a delimited format that is satisfactory, at least for most of the data. The message to be learned is that relying on any characters to delimit data fields is prone to problems. Get the data into a format where this is not an issue: Bento’s internal format is one such, as are the formats used internally by Excel and Numbers. When Bento reads such a file, it can correctly read commas, semi-colons, and tabs without confusion. And once your data is stored in a format that is unambiguous about its delimiters, you are safe.

Another thing Bento handles for you automatically is labeling the data. If you have labels in your file, the first record consists of the names of the fields being exported. (This may be an export option in the program from which you are exporting data.) If this option is available, use it; it makes your import easier, as you see later in this section.

Tell Me More: Media 4.2—A discussion about checking your data conversion

image

Access this video file through your registered Web Edition at my.safaribooksonline.com/9780131388611/media.

image LET ME TRY IT

Import Data Into a New Bento Library

The simplest way to import data into Bento is to automatically create a new library.

  1. Begin by choosing File > Import. The Import window shown in Figure 4.4 opens.

    Figure 4.4. Begin to import data.

    image

  2. Select the file to import, and, for its destination, select New Library. Provide a name for the library. Bento attempts to open the file and display the first record in the main section of the Import window. In this case, the first record contains field names, and they are displayed. Use the left and right arrows below the data display to move through the file. As you can see from the checkbox beneath the field values, you have an option of whether or not to use the first record as field names.

    Tip

    As shown at the bottom of Figure 4.4, you can step through the data and choose any record to use as the column names. That record is treated as the first record in the file, and the remaining records are imported. If you select record 5 as the column names, the data import starts with column 6.

  3. As you can see, you can use a pop-up menu in Figure 4.4 to choose the delimiter for your data. If you are uncertain which character to use, experiment with the three choices. When you hit on the correct one, the display at the bottom of Figure 4.4 will look better organized than with the other choices.
  4. Step through a few records, as shown in Figure 4.5, to make certain that the data looks correct. Note that the field names picked up from the first (or selected) record in Figure 4.4 are shown next to the data values as you step through each record. If commas or quotation marks are mismatched, you will often see the problem here, and you can correct the import data. Also check the total number of records. If it is off, that, too, is likely a comma or quotation mark issue.

    Figure 4.5. Step through the records.

    image

  5. By default, all fields are imported as text. Change the field types to the actual types of the data. Do this based on the field names as well as on the data. For example, data that is organized into tabular form often contains totals and subtotals. These can pose issues that you need to deal with. In the population data, for example, you can easily see that the Age field contains the age of people. However, that does not mean that it is a numeric field.

    The Age field sometimes contains the word “Total,” as shown in Figure 4.5. In addition, it contains the value 85+ for people 85 or older.

    Figure 4.6 shows how you can change the field types. You can also indicate that a field is not to be imported. Remember that the field names and types apply to all records. Changing them in the Import dialog has to be done once—not separately—for each imported record.

    Figure 4.6. Change field types as needed.

    image

  6. Now you are ready to import the data. Bento is very fast. (This data import takes less than a minute.) Bento creates a library for you, and you can then use the techniques in the preceding chapters to rearrange the fields and customize it, as shown in Figure 4.7.

    Figure 4.7. Customize the library.

    image

  7. Here is another area where you can check the data import. You can show the summary row in table view and total various columns or count the values as shown in Figure 4.8.

    Figure 4.8. Use the summary row to check your data import.

    image

image You will find more about using the summary row in Chapter 6, “Working with Bento Fields and Calculations,” p. 87.

You can do this in conjunction with finding subsets of data. This is one of the reasons you need to know what you are importing. This data set (like many) includes total and subtotal rows. That means that the totals of columns are much too high because they contain basic data as well as totals and subtotals. This is a common occurrence. You may want to use what are called check totals, which are simple arithmetic totals of the numbers in a column, possibly including subtotals and totals. The number is meaningless except to check that the values have been entered or transferred correctly.

In fact, there are quite a few anomalies that you may run into. The next section gives you some tips about cleaning up data.

Show Me: Media 4.3—A video about importing data into a new Bento library

image

Access this video file through your registered Web Edition at my.safaribooksonline.com/9780131388611/media.

Cleaning Up Imported Data

When you are importing data to Bento or any other destination, you often need to do some cleanup work. It is usually easiest to do the cleanup before the import. This is because if you are using the original program, you have all the tools available to manipulate the data. In addition, the cleaner the data, the fewer import errors you have.

The most basic form of cleanup is to verify the data. Are the phone numbers still valid? Are the names correctly spelled? Of course, while it is desirable to clean up the data as much as possible and as early as possible in the process, this is the real world. One way to handle data importing is to add a new field to the Bento library after you have imported the data. Make it a checkbox field called Verified. Leave the data in its basic state, and, as you use it in Bento, check that it is valid. This means that the data you actually use is cleaned up before the rarely used data. Check the Verified checkbox so that you know you have checked individual data records. Gradually, you will clean up most or even all your data. (And remember that you can find data that is not verified using Advanced Find.)

Tip

A variation on this strategy works for data that can change over time. Instead of a Verified checkbox, create a date field called Last Checked. Whenever you check the data (perhaps by actually using the email address or telephone number), choose Insert > Current Date and Time to update that field.

There is another common cleanup process that’s often necessary when importing spreadsheets. Spreadsheets today occupy a place somewhere in the middle of databases (because they support data storage and searching), word processing documents (because you can format them for display and printing), and traditional data tables (rows and columns). The hardest spreadsheets to import are those that are specially formatted to look their best. As you have seen, Bento is happy to import an initial record with field names. But what happens when you have five beautifully formatted rows of text with titles centered (and perhaps with parenthetical comments)?

What happens is that you have to remove all that beautiful formatting and extraneous text so that you have only a single title row (or none at all). Naturally, you should do this in a copy of the original spreadsheet. (To help solve this problem, use the option to select a row other than the literal first row of the table for column names and a location to start importing.)

In the old days, a spreadsheet document consisted of a single sheet. Today, a spreadsheet can have several sheets within it, and formulas can reference all the sheets within a spreadsheet workbook. But old habits die hard, and many spreadsheets have a single set of rows and columns in the upper left and a variety of smaller rows and columns tables (each with its own nicely formatted labels and titles) all over the spreadsheet. The only way to successfully import this data is to split the complex spreadsheet apart into basic rows and columns tables, each with at most a single title row at the top. Or in Bento simply copy these smaller tables and paste them into a table view.

Note

Dealing with these spreadsheets is not really a Bento issue. If you have one of these old spreadsheets, it still functions better as a spreadsheet if you split it apart into separate sheets, each of which is a simple rows and columns table.

Finally, consider cleaning up spreadsheets by removing total and subtotal rows. You can use these values for checking your import, but by stripping the spreadsheet down to its data and removing calculated totals and pretty formatting, you will make your life easier in the long run. (You can also omit them from the import dialog rather than deleting them from the spreadsheet itself.)

Show Me: Media 4.4—A video about cleaning up a spreadsheet to import

image

Access this video file through your registered Web Edition at my.safaribooksonline.com/9780131388611/media.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.16.137.10