Chapter 9. Input and Output

Introduction

Most programs need to interact with the outside world, and one common way of doing so is by reading and writing files. Files are normally on some persistent medium such as a disk drive, and, for the most part, we shall happily ignore the differences between a hard disk (and all the operating system-dependent filesystem types), a floppy or zip drive, a CD-ROM, and others. For now, they’re just files.

Correcting Misconceptions

Java’s approach to input/output is sufficiently different from that of older languages (C, Fortran, Pascal) that people coming from those languages are often critical of Java’s I/O model. I can offer no better defense than that provided in the preface to Elliotte Rusty Harold’s book Java I/O:

Java is the first programming language with a modern, object-oriented approach to input and output. Java’s I/O model is more powerful and more suited to real-world tasks than any other major language used today. Surprisingly, however, I/O in Java has a bad reputation. It is widely believed (falsely) that Java I/O can’t handle basic tasks that are easily accomplished in other languages like C, C++, and Pascal. In particular, it is commonly said that:

-- I/O is too complicated for introductory students; or, more specifically, there’s no good way to read a number from the console.

-- Java can’t handle basic formatting tasks like printing PI with three decimal digits of precision.

[Rusty’s book shows] that not only can Java handle these two tasks with relative ease and grace; it can do anything C and C++ can do, and a whole lot more. Java’s I/O capabilities not only match those of classic languages like C and Pascal, they vastly surpass them.

The most common complaint about Java I/O among students, teachers, authors of textbooks, and posters to comp.lang.java is that there’s no simple way to read a number from the console (System.in). Many otherwise excellent introductory Java books repeat this canard. Some textbooks go to great lengths to reproduce the behavior they’re accustomed to from C or Pascal, apparently so teachers don’t have to significantly rewrite the tired Pascal exercises they’ve been using for the last 20 years. However, new books that aren’t committed to the old ways of doing things generally use command-line interfaces for basic exercises, then rapidly introduce the graphical user interfaces any real [desktop] program is going to use anyway. Apple wisely abandoned the command-line interface back in 1984, and the rest of the world is slowly catching up. Although System.in and System.out are certainly convenient for teaching and debugging, in 1999 no completed, cross-platform program should even assume the existence of a console for either input or output.

The second common complaint about Java I/O is that it can’t handle formatted output; that is, that there’s no equivalent of printf( ) in Java. In a very narrow sense, this is true, because Java does not support the variable length arguments lists a function like printf( ) requires. Nonetheless, a number of misguided souls (your author not least among them) [has] at one time or another embarked on futile efforts to reproduce printf( ) in Java. This may have been necessary in Java 1.0, but as of Java 1.1, it’s no longer needed. The java.text package, described in Chapter 16 [of Rusty’s book, and in Chapter 5 of the present work], provides complete support for formatting numbers. Furthermore, the java.text package goes way beyond the limited capabilities of printf( ). It supports not only different precisions and widths, but also internationalization, currency formats, grouping symbols, and a lot more. It can easily be extended to handle Roman numerals, scientific or exponential notation, or any other number format you may require.

The underlying flaw in most people’s analysis of Java I/O is that they’ve confused input and output with the formatting and interpreting of data. Java is the first major language to cleanly separate the classes that read and write bytes (primarily, various kinds of input streams and output streams) from the classes that interpret this data. You often need to format strings without necessarily writing them on the console. You may also need to write large chunks of data without worrying about what they represent. Traditional languages that connect formatting and interpretation of I/O and hard-wire a few specific formats are extremely difficult to extend to other formats. In essence, you have to give up and start from scratch every time you want to process a new format.

Furthermore, C’s printf(), fprintf(), and sprintf( ) family only really works well on Unix (where, not coincidentally, C was invented). On other platforms the underlying assumption that every target may be treated as a file fails, and these standard library functions must be replaced by other functions from the host API.

Java’s clean separation between formatting and I/O allows you to create new formatting classes without throwing away the I/O classes, and to write new I/O classes while still using the old formatting classes. Formatting and interpreting strings are fundamentally different operations from moving bytes from one device to another. Java is the first major language to recognize and take advantage of this.

To which I can only add, “Well said, Rusty.” What Rusty doesn’t mention is an obvious corollary of this flexibility: it can often take a bit more coding to do some of the command-line, standard-in/standard-out operations. You’ll see most of these in this chapter, and you’ll see throughout the book how flexible Java I/O really is.
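The two tasks in Rusty's list can be sketched in a few lines using only java.io and java.text. This is a minimal illustration rather than a definitive recipe: the class name ConsoleNumberDemo and the helper methods readInt and threeDecimals are hypothetical names chosen for this example, and Locale.US is assumed so that the decimal separator is predictable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
class ConsoleNumberDemo {

    /** Reads one line from the given Reader and parses it as an int. */
    static int readInt(Reader in) throws IOException {
        BufferedReader br = new BufferedReader(in);
        String line = br.readLine();
        if (line == null) {
            throw new IOException("No input available");
        }
        return Integer.parseInt(line.trim());
    }

    /** Formats a value with exactly three digits after the decimal point. */
    static String threeDecimals(double value) {
        // Assumption: a fixed locale, so the output does not vary by machine.
        NumberFormat nf = NumberFormat.getInstance(Locale.US);
        nf.setMinimumFractionDigits(3);
        nf.setMaximumFractionDigits(3);
        return nf.format(value);
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the console here; to read from the
        // real console, pass new InputStreamReader(System.in) instead.
        int n = readInt(new StringReader("42\n"));
        System.out.println("Read: " + n);
        System.out.println("PI = " + threeDecimals(Math.PI));
    }
}
```

Note how the reading is done by a Reader and the formatting by a NumberFormat. Neither class knows about the other, which is the separation Rusty describes, and it is also why the code is a little longer than a single scanf( ) or printf( ) call.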

This chapter covers all the normal input/output operations such as opening/closing and reading/writing files. Files are assumed to reside on some kind of file store or permanent storage. I don’t discuss how such a filesystem or disk I/O system works -- consult a book on operating system design for the general details, or a platform-specific book on system internals or filesystem design for the specifics. Network filesystems such as Sun’s Network File System (NFS, common on Unix and available for Windows through products such as Hummingbird NFS), the Macintosh AppleTalk File System (available for Unix via Netatalk), and SMB (the MS-Windows network filesystem, available for Unix with the freeware Samba program) are assumed to work “just like” disk filesystems, except where noted. And while you could even provide your own network filesystem layer using the material covered in Chapter 16, it is exceedingly difficult to design your own network virtual filesystem, and it is probably better to use one of the existing ones.

Streams and Readers/Writers

Java provides two sets of classes for reading and writing. The Stream section of package java.io (see Figure 9-1) is for reading or writing bytes of data. Older languages tended to assume that a byte (which is a machine-specific collection of bits, usually eight bits on modern computers) is exactly the same thing as a “character” -- a letter, digit, or other linguistic element. However, Java is designed to be used internationally, and eight bits is simply not enough to handle the many different character sets used around the world. Script-based languages such as Arabic and the Indian languages, and pictographic languages such as Chinese, Japanese, and Korean, each have many more than 256 characters, the maximum that can be represented in an eight-bit byte. The unification of these many character code sets is called, not surprisingly, Unicode. Actually, it’s not the first such unification, but it’s the most widely used standard at this time. Both Java and XML use Unicode as their character set, allowing you to read and write text in any of these human languages. But you have to use Readers and Writers, not Streams, for textual data.


Figure 9-1.  java.io classes

You see, Unicode itself doesn’t solve the entire problem. Many of these human languages were used on computers long before Unicode was invented, and they didn’t all pick the same representation as Unicode. And they all have zillions of files encoded in a particular representation that isn’t Unicode. So conversion routines are needed when reading and writing to convert between the Unicode String objects used inside the Java virtual machine and the particular external representation that a user’s files are written in. These converters are packaged inside a powerful set of classes called Readers and Writers. Use Readers/Writers instead of InputStreams/OutputStreams whenever you want to deal with characters instead of bytes. We’ll see more on this conversion, and how to specify which conversion, a little later in this chapter.
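As a rough sketch of the conversion just described, the following wraps byte streams in an OutputStreamWriter and an InputStreamReader, each constructed with a named encoding. The class name EncodingDemo and its encode and decode helpers are hypothetical, and the in-memory byte arrays are an assumption made to keep the example self-contained; in a real program the bytes would normally come from a file or a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

// Hypothetical demo class; the names here are illustrative only.
class EncodingDemo {

    /** Converts Unicode characters to bytes in the named external encoding. */
    static byte[] encode(String text, String encoding) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Writer w = new OutputStreamWriter(bytes, encoding);
        w.write(text);
        w.close(); // closing flushes the converter into the byte stream
        return bytes.toByteArray();
    }

    /** Converts bytes in the named external encoding back to a String. */
    static String decode(byte[] data, String encoding) throws IOException {
        Reader r = new InputStreamReader(new ByteArrayInputStream(data), encoding);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        r.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] latin1 = encode(text, "ISO-8859-1");
        byte[] utf8 = encode(text, "UTF-8");

        // Same four characters, but a different number of bytes per encoding.
        System.out.println(text.length() + " chars, "
                + latin1.length + " bytes in ISO-8859-1, "
                + utf8.length + " bytes in UTF-8");

        // Decoding with the matching converter recovers the original text;
        // decoding with the wrong one does not.
        System.out.println(decode(utf8, "UTF-8").equals(text));
        System.out.println(decode(utf8, "ISO-8859-1").equals(text));
    }
}
```

The last two lines show why the choice of converter matters: the same bytes yield the original string under one encoding and garbage under another, so a program that reads text should know (or be told) which encoding its input uses.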

See Also

One topic not addressed here is the issue of hardcopy printing. Java includes two similar schemes for printing onto paper, both using the same graphics model as is used in AWT, the basic Window System package. For this reason, I defer discussion of printing to Chapter 12.

Another topic not covered here is that of having the read or write occur concurrently with other program activity. This requires the use of threads, or multiple flows of control within a single program. Threaded I/O is a necessity in many programs: those reading from slow devices such as tape drives, those reading from or writing to network connections, and those with a GUI. For this reason the topic is given considerable attention, in the context of multi-threaded applications, in Chapter 24.
