New I/O Package

JDK 1.4 introduced a package called java.nio. “Nio” stands for “new I/O” (so the JDK 1.5 I/O improvements must be “even newer than new I/O”). The java.nio package and subpackages support four important features not previously well-provisioned in Java:

  • A non-blocking I/O facility for writing scalable servers

  • A file interface that supports locks and memory mapping

  • A pattern-matching facility based on Perl-style regular expressions

  • Character-set encoders and decoders

Rather than being built on top of streams or file descriptors, these features are implemented using two new concepts: buffers and channels.

The right way to understand a buffer is to think of it as a big array in memory that holds data. Just like an array, a buffer can only hold things that are all the same type. So you can have a byte buffer, a char buffer, or a short, int, long, float, or double buffer. If your file contains a mixture of floats and ints, you can pull them out of a ByteBuffer, which has methods to get and put all the primitive types except boolean.
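As a rough illustration of pulling mixed types out of a byte buffer, the sketch below packs an int and a float into one ByteBuffer and reads them back in the same order. The class and method names are invented for this example.

```java
import java.nio.ByteBuffer;

class MixedTypesDemo {
    /** Packs an int followed by a float into a byte buffer, then reads both back. */
    static String roundTrip(int i, float f) {
        ByteBuffer bb = ByteBuffer.allocate(8); // 4 bytes for the int + 4 for the float
        bb.putInt(i);
        bb.putFloat(f);
        bb.flip();                              // position = 0, limit = 8: ready to read back
        int gotInt = bb.getInt();
        float gotFloat = bb.getFloat();
        return gotInt + " " + gotFloat;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42, 1.5f)); // prints "42 1.5"
    }
}
```

The program has to know the order and types of the values it wrote; nothing in the bytes themselves records that.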

The idea behind the Buffer class is to have a region of memory which can be accessed from both native code and Java at the same time, and that region has some special I/O characteristics in the native OS. Buffer is a kind of “native” array—native code can access it directly, and Java can use method calls to get its hands on the contents.

Because a buffer is essentially an area of memory, it can do things relating to memory, like clear its contents, support read/write or read-only operations, give you a range of elements, and tell you how many data elements it contains.

The second new concept in java.nio is the channel. A channel is a connection between a buffer and something that can give or receive data, such as a file or a socket. Because a channel connects to an underlying physical device, it can do things relating to an I/O device like support read/writes or provide file locks. There are channel classes specialized for files, for sockets, for pipes, and so on. You may think of a channel as an alternative to a stream. It has fewer fancy features (no elaborate wrapper classes), but it may have higher performance.

The following sections describe some features of java.nio and how to use them.

Multiplexed non-blocking server I/O

Channels have their own package at java.nio.channels. The package contains classes called DatagramChannel, SocketChannel, and ServerSocketChannel, among others. The single, critical feature that channels offer, and that classes DatagramSocket, Socket, ServerSocket do not, is the ability to do I/O without blocking the thread.

Non-blocking

The class java.nio.channels.SelectableChannel supports non-blocking I/O using what is called “selector-based multiplexing.” Blocking I/O means that the I/O method does not return until the data transfer has taken place. If there is no input to read yet (because of network delays or because the user didn't type it yet), the entire thread will be blocked from continuing further. That's bad in terms of resource consumption. Simple non-blocking I/O (sometimes loosely called “asynchronous I/O”) means that the method call transfers whatever data can be moved right away, possibly none, and returns almost immediately instead of waiting. No threads are blocked waiting.

Non-polling

Simple non-blocking I/O is helpful, but you can still burn up a lot of unproductive CPU time polling each descriptor to see if it is ready for more I/O yet. Selector-based multiplexing avoids this waste. It essentially says “monitor all these socket channels, and let me know when the next one of them is ready with data to transfer.” The verb “to multiplex” means “to transmit several messages over the same medium all at the same time.” Many different TV stations are multiplexed onto the cable of cable TV. The multiplexed part of selectors is that several data transfers may be underway at once, and the run-time system scans for pending data on the whole set of channels.

We won't review multiplexed I/O in depth, except to say that it provides the same kind of scalable I/O support as the select() system call available on all server operating systems. When select() is called, it blocks until one of a given set of socket descriptors is ready for reading or writing, or a timeout expires (whichever comes first).

Get a channel from a socket

To use multiplexed I/O in Java, first you get the channel from a socket. Then you register one or more channels with a Selector object, getting back a key. Finally, you do a select() on the Selector object. That waits until it can return a collection of keys that are ready for data transfer. Why is it OK for select() to block, but not other I/O methods? Because select() has a timeout feature. If no I/O becomes ready within so many milliseconds, the call returns. Also, you may be doing a select to get input from twenty sockets, but only one thread is blocked, not twenty.
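The register-then-select sequence just described can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes Java 8 or later (for try-with-resources, bind(), and StandardCharsets), it binds to an OS-chosen port on the loopback interface, and the class name, method name, and timeout values are made up for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorSketch {
    /** Sends message from a loopback client and returns what the server side read. */
    static String receiveOnce(String message) {
        byte[] payload = message.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer in = ByteBuffer.allocate(payload.length);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port (an assumption made for this demo).
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);           // required before register()
            server.register(selector, SelectionKey.OP_ACCEPT);

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(payload));
                long deadline = System.nanoTime() + 5_000_000_000L; // give up after ~5 s
                while (in.hasRemaining() && System.nanoTime() < deadline) {
                    selector.select(500);              // one thread waits on all channels
                    Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                    while (keys.hasNext()) {
                        SelectionKey key = keys.next();
                        keys.remove();                 // mark this key as handled
                        if (key.isAcceptable()) {
                            SocketChannel conn = server.accept();
                            if (conn != null) {
                                conn.configureBlocking(false);
                                conn.register(selector, SelectionKey.OP_READ);
                            }
                        } else if (key.isReadable()) {
                            ((SocketChannel) key.channel()).read(in);
                        }
                    }
                }
            } finally {
                // Closing a selector does not close its channels, so close them here.
                for (SelectionKey key : selector.keys()) {
                    key.channel().close();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(in.array(), 0, in.position(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println("server read: " + receiveOnce("hello"));
    }
}
```

A real server would keep the selector loop running indefinitely and hand each ready key to its own handler; the deadline here exists only so the demo terminates.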

Even if multiplexed I/O sounds involved, it is familiar to those who have used it in libraries for other languages. The additional objects of a key and a selector add a little more flexibility to the design. So now you can write non-blocking I/O in Java. When I/O is no longer a bottleneck in one thread, your web applications become scalable (when the load on the web server increases, you can speed it up by adding more processors).

Recovering from blocked server I/O

Channels support non-blocking I/O; they also make it easier to recover from blocked I/O. Server-based systems need to be reliable and scalable, but their threads can hang because of congestion, file mounting problems, or other remote access issues. Servers need the ability to get rid of threads that are “stuck” in network I/O. Scalability generally means that you assign a new thread to process each incoming request. That thread will be allocated to a socket, and that socket consumes a file descriptor. The system and each process have limited quantities of file descriptors. The idea is that the request comes in on a socket, where a thread serves it with any necessary I/O, database access, etc. The thread returns the answer to the client, then the thread and socket terminate and are reclaimed for further use.

All I/O in Java up to JDK 1.3 was synchronous or “blocking.” When you execute a read or write instruction in Java, the method does not return until the data has been safely passed along. If there is a delay in the user response or the operating system or network, such that the data transfer cannot complete, then that thread hangs (waits indefinitely). When a thread becomes non-responsive for any reason, those file descriptors stay unavailable, eventually dragging down performance by reducing the number of sockets available to handle requests.

Some system designs rely on sockets not being consumed in this way. The designers of Java originally provided a method called interrupt() in class Thread. If some other thread called the interrupt() method of a Thread that was blocked, it was supposed to break out of the hang and get an InterruptedIOException while leaving the stream open for further attempts. Unfortunately, it turns out that the Windows API does not have any way to implement this that is both efficient and reliable. The interrupt method never worked well.

Workaround to blocking I/O

Instead, programmers used the workaround of closing the file descriptor or handle that was not responding to the I/O request. That unwedged the thread at the cost of leaving the I/O in an unknown state. The cost is generally acceptable. The most common reason to interrupt a thread is to ask it to shut down. If you plan to shut the thread down anyway, you might as well discard any socket in an unknown state and recover by opening a new connection.

Channels now officially provide the close-on-interrupt semantics that were in widespread unofficial use before. Any time you need to break threads out of blocking I/O operations, you should use a channel. You can either close the channel of a blocked thread, causing it to receive an AsynchronousCloseException, or you can call the interrupt() method of a thread that is blocked on channel I/O, thus closing the channel and delivering a ClosedByInterruptException to the blocked thread. File channels are always safe for use by multiple concurrent threads. Channels correctly support asynchronous interruption and closing.
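To make the close-on-interrupt behavior concrete, here is a small sketch that blocks a thread in a channel read and then interrupts it. It uses a java.nio.channels.Pipe only because that avoids needing a network peer; the class and method names are invented, and Java 8 lambdas are assumed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;
import java.util.concurrent.atomic.AtomicReference;

class InterruptDemo {
    /** Blocks a thread in a channel read, interrupts it, and reports the result. */
    static String interruptBlockedReader() {
        final AtomicReference<String> outcome =
                new AtomicReference<>("read returned normally");
        try {
            final Pipe pipe = Pipe.open();
            Thread reader = new Thread(() -> {
                try {
                    // Nothing is ever written to the pipe, so this read blocks.
                    pipe.source().read(ByteBuffer.allocate(16));
                } catch (ClosedByInterruptException e) {
                    outcome.set("ClosedByInterruptException, channel open: "
                            + pipe.source().isOpen());
                } catch (IOException e) {
                    outcome.set("unexpected: " + e);
                }
            });
            reader.start();
            Thread.sleep(200);   // let the reader reach the blocking read
            reader.interrupt();  // closes the channel and wakes the reader
            reader.join(5000);
            pipe.sink().close();
            pipe.source().close();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(interruptBlockedReader());
    }
}
```

Note that the channel is closed as a side effect of the interrupt, which matches the trade-off described above: the thread is freed, and the connection is discarded.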

Getting and using a file channel

There are several classes that support channels, but we'll only talk about FileChannels here. The network kinds of Channel (Datagrams and Sockets) are similar, but support slightly fewer operations because it doesn't make sense to lock sockets (they are inherently single-user). You get a FileChannel by calling the getChannel() method upon an instance of one of the classes FileInputStream, FileOutputStream, or RandomAccessFile in the java.io package.

RandomAccessFile raf = new RandomAccessFile("C:\\data.txt", "rw"); // backslash must be escaped in a Java string
FileChannel myFileChannel = raf.getChannel();

The channel you get back is connected to the underlying physical file, and it will be open for read access in the case of FileInputStream, or for write access in the case of FileOutputStream. In the case of RandomAccessFile, the channel will be open for reading, or reading-and-writing, to match the mode that the random access file was instantiated with. In the code fragment above, you'll be able to read from and write into the channel to the file C:\data.txt.

Once you have a file channel, there are several methods to read it into a buffer, or write it out from a buffer. The channel connects to the underlying file, while the buffer provides a place in memory to put the bytes. We'll finish up this summary of channels, then move on to buffers in the next section.

Channels also have methods to transfer data to and from another channel, to apply exclusive access locks to a file, and to map the file into memory. Mapping a file into memory is an advanced OS technique that uses the virtual memory subsystem to bring some or all of a file into the address space of a process. File mapping is an alternative to the read and write system calls used by streams. It is explained with an example a little later.
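A rough sketch of the transfer and locking methods follows. It copies one temporary file to another with transferTo() while holding an exclusive lock on the destination. The file names, class name, and method name are made up, and try-with-resources (Java 7 or later) is assumed.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

class TransferAndLockDemo {
    /** Copies data between two temporary files and returns the destination size. */
    static long copyUnderLock(byte[] data) {
        try {
            File src = File.createTempFile("nio-src", ".bin");
            File dst = File.createTempFile("nio-dst", ".bin");
            try {
                try (FileOutputStream fos = new FileOutputStream(src)) {
                    fos.write(data);
                }
                try (FileChannel in = new FileInputStream(src).getChannel();
                     FileChannel out = new RandomAccessFile(dst, "rw").getChannel();
                     FileLock lock = out.lock()) {   // exclusive lock on the whole file
                    long copied = 0;
                    while (copied < in.size()) {
                        // transferTo may move fewer bytes than requested, so loop.
                        copied += in.transferTo(copied, in.size() - copied, out);
                    }
                    return out.size();
                }                                    // lock released, channels closed
            } finally {
                src.delete();
                dst.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("copied " + copyUnderLock(new byte[5000]) + " bytes");
    }
}
```

File locks are held on behalf of the whole Java virtual machine, so they guard against other processes rather than against other threads in the same program.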

Here is the signature of a FileChannel method that reads from the channel (and hence from the file that the channel is connected to) into a byte buffer:

int read(ByteBuffer dst) throws IOException

You would use it like this:

ByteBuffer myBB = ByteBuffer.allocate(1024);
int bytesRead = myFileChannel.read(myBB);

Channels, maps, and buffers maintain a notion of their “current position”, just as file streams do. Bytes are read starting at the channel's current position, and then the position is updated with the number of bytes actually read. That number could be zero if nothing was read, or -1 to indicate that the channel has reached the end of stream. As with streams that support marking, a buffer lets you “mark()” its current position to remember it, and then invoke reset() to return to that position later and get the same data again (a fairly useless feature). A FileChannel has no mark() method; to reread part of a file, you set the channel's position explicitly with position(long).
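The way the position advances, and the -1 that signals end of file, can be seen in a loop like the one below. This is a sketch only: it creates its own temporary file of zero bytes so that it is self-contained, and the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ReadLoopDemo {
    /** Returns {total bytes read, final channel position} for a temp file of fileSize bytes. */
    static long[] readAll(int fileSize, int bufferSize) {
        try {
            File f = File.createTempFile("nio-read", ".bin");
            try {
                try (FileOutputStream fos = new FileOutputStream(f)) {
                    fos.write(new byte[fileSize]);
                }
                try (FileChannel fc = new FileInputStream(f).getChannel()) {
                    ByteBuffer bb = ByteBuffer.allocate(bufferSize);
                    long total = 0;
                    int n;
                    while ((n = fc.read(bb)) != -1) { // -1 signals end of file
                        total += n;
                        bb.clear(); // make the whole buffer available for the next read
                    }
                    return new long[] { total, fc.position() };
                }
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = readAll(2500, 1024);
        System.out.println("read " + r[0] + " bytes, channel position " + r[1]);
    }
}
```

A real program would process the buffer's contents before calling clear(); here the data is simply discarded.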

Here is the signature of a FileChannel method that writes from a byte buffer into the channel, and hence into the file that the channel is connected to:

int write(ByteBuffer src) throws IOException

It writes a sequence of bytes from the given buffer to the channel. It returns the number of bytes written, which may be zero. You would use it like this:

ByteBuffer myBB = ByteBuffer.allocate(1024);
// operations to put data into the byte buffer
myBB.put( ...   // we'll cover these soon
myBB.flip();   // changes over to writing the buffer
int bytesWritten = myFileChannel.write(myBB);

These lines of code write the bytes from the buffer through the channel into a file. Channels also support “scatter” reads into several buffers one after the other, and “gather” writes from several buffers into one channel. Scatter/gather I/O is convenient for certain protocol exchanges of fixed length messages. It also helps the kernel to use several small buffers instead of one big one.
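As a sketch of scatter/gather I/O, the example below writes a fixed 4-byte header and a body from two separate buffers with a gather write, then reads them back into two buffers with a scatter read. The message layout, the temporary file, and all names are assumptions made for the illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

class ScatterGatherDemo {
    static boolean anyRemaining(ByteBuffer[] bufs) {
        for (ByteBuffer b : bufs) {
            if (b.hasRemaining()) return true;
        }
        return false;
    }

    /** Gather-writes a 4-byte id plus a body, then scatter-reads them back. */
    static String roundTrip(int id, String body) {
        byte[] bodyBytes = body.getBytes(StandardCharsets.US_ASCII);
        try {
            File f = File.createTempFile("nio-sg", ".bin");
            try (FileChannel fc = new RandomAccessFile(f, "rw").getChannel()) {
                ByteBuffer header = ByteBuffer.allocate(4);
                header.putInt(id);
                header.flip();
                ByteBuffer[] out = { header, ByteBuffer.wrap(bodyBytes) };
                while (anyRemaining(out)) {
                    fc.write(out);        // gather: drains the buffers in array order
                }

                fc.position(0);           // go back to the start of the file
                ByteBuffer[] in = {
                    ByteBuffer.allocate(4), ByteBuffer.allocate(bodyBytes.length)
                };
                while (anyRemaining(in) && fc.read(in) != -1) {
                    // scatter: fills in[0] first, then in[1]
                }
                in[0].flip();
                return in[0].getInt() + ":"
                        + new String(in[1].array(), StandardCharsets.US_ASCII);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(7, "hello")); // prints "7:hello"
    }
}
```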

What if you want to do filtering on the contents of a channel by wrapping additional classes, as we saw with the Reader/Writer and Streams classes? You cannot do that directly with a channel, but you can obtain a reader/writer/stream corresponding to a channel and then go on to wrap that in the usual way. The utility class java.nio.channels.Channels has half a dozen static methods that have the effect of converting each way between a channel and a Reader or a Writer or an InputStream or an OutputStream.
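Here is a minimal sketch of that conversion in both directions. It turns an in-memory InputStream into a channel with Channels.newChannel(), then turns the channel into a Reader with Channels.newReader() and wraps it in a BufferedReader in the usual way. The class and method names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;

class ChannelsBridgeDemo {
    /** Returns the first line of text found in the given ASCII bytes. */
    static String firstLine(byte[] data) {
        // Stream -> channel
        ReadableByteChannel ch = Channels.newChannel(new ByteArrayInputStream(data));
        // Channel -> Reader, then wrap the Reader as you would any other
        try (BufferedReader reader =
                 new BufferedReader(Channels.newReader(ch, "US-ASCII"))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] text = "first\nsecond\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(firstLine(text)); // prints "first"
    }
}
```

Closing the outermost wrapper closes the underlying channel as well.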

Buffers

You will create buffers either by an allocate method call, or by wrapping an existing array (or string for a character buffer) to form a buffer with the array contents, or by getting a buffer back from a channel map. You don't use a constructor to get a buffer: it's a hint that there is a lot more going on here than mere object allocation. Here's a sample line that obtains a 1 Kbyte buffer for you:

ByteBuffer bb1 = java.nio.ByteBuffer.allocate(1024);

Here's how you wrap an array to get a buffer that is filled with the contents of the array:

byte [] myByteArray = {0x11,0x22,0x33,0x44,0x55,0x66,0x77};
ByteBuffer bb2 = java.nio.ByteBuffer.wrap( myByteArray );

The wrapping is another example of the wrapper design pattern. A byte buffer can do pretty much all the things an array can do, and a few things of its own. As we saw above, you can write from a buffer into a channel, and thus into a file, pipe, or socket like this:

int count = fc.write(bb2);

This is very powerful. You can write an entire data array in two or three statements with no looping! Here's the entire program to read a file into a buffer:

Read a file using a Channel and a Buffer

import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class MyBuffer {
    public static void main(String[] a) throws Exception {
       // Get a Channel for the file
       File f = new File("email.txt");
       FileInputStream fis = new FileInputStream(f);
       FileChannel fc = fis.getChannel();
       ByteBuffer bb1 = java.nio.ByteBuffer.allocate((int)f.length());

       // Once you have a buffer, you can read a file into it like this:
       int count = fc.read(bb1);
       System.out.println("read "+ count + " bytes from email.txt");
       fc.close();
    }
}

Other Buffer methods

Once your buffer is loaded with data from a channel, how do you access that in your program, and can you change the buffer? There are several kinds of operations upon buffers. They are as follows:

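  • After reading from a channel into a buffer, you can rewind() it to reset the position to the beginning while leaving the limit where it is. If you are about to start writing out or getting from the buffer after a series of channel reads or puts, call the flip() method: it sets the limit to the current position and then resets the position to zero, so that only the data actually placed in the buffer is visible.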

  • The get() and put() methods that read and write the next single data item, e.g.:

    byte b = bb1.get(); // gets the next byte
    double d = bb1.getDouble(); // gets the next 8 bytes into a double
    ByteBuffer result = bb1.putChar('X'); // puts a char into the buffer

Gets and puts are done at the current index position in the buffer, and move the position immediately past what was just transferred. All the putSomething() methods are optional operations. A read-only buffer (for example, one obtained from asReadOnlyBuffer() or from a read-only file mapping) does not support them and throws a ReadOnlyBufferException.

There are get() and put() operations for these types: byte, char, double, float, int, long, and short. Notice that you read bytes into a buffer, but are able to get() and put() larger pieces of data. As always, you have to know the types of data that are stored in your files. A program cannot figure that out from looking at the bits.

  • Absolute get() and put() methods that read and write a datum at a given offset, e.g.,

    bb1.put(319, (byte) 0xF); // puts this byte at offset 319 in the buffer
    bb1.putLong(256, 1234567890L); // puts this long at offset 256 in the buffer

  • Bulk get() methods that transfer a sequence of bytes from this buffer into a byte array, e.g.,

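    bb1.rewind();
    // size the array to what the buffer holds; asking for more than
    // remaining() bytes throws a BufferUnderflowException
    byte[] destination = new byte[bb1.remaining()];
    bb1.get( destination ); // fills the array from the buffer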

You should avoid unnecessary copying of data for performance reasons. Work directly with the buffer where possible.

  • Bulk put() methods that transfer contiguous sequences of bytes from an array into this buffer, e.g.,

    byte[] b2 = { 1,2,3,4,5,6,7,8,9,0xA };
    bb1.put(b2);

    You can also “wrap” a buffer around an existing array. wrap() is a static factory method, so it does not change an existing buffer; it returns a new buffer whose storage is the array itself. Unlike put(), which copies, wrapping means that further changes to either the buffer or the array are reflected in the other.

    byte[] b3 = { 1,2,3,4,5,6,7,8,9,0xA };
    ByteBuffer bb3 = ByteBuffer.wrap(b3);

    The new buffer's capacity matches the length of the wrapped array.

  • Methods for allocating, compacting, duplicating, and extracting a subrange of (“slicing”) a buffer.
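The effect of these operations on a buffer's position, limit, and capacity is easiest to see by printing them. The sketch below walks one small buffer through put, flip, slice, compact, and duplicate; the class name and the trace() helper are made up for the illustration.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BufferOpsDemo {
    static String state(ByteBuffer b) {
        return "pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    /** Runs a fixed sequence of buffer operations and records the state after each. */
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        ByteBuffer bb = ByteBuffer.allocate(8);
        bb.put(new byte[] { 10, 20, 30, 40 });   // bulk put advances the position by 4
        log.add("put:       " + state(bb));
        bb.flip();                               // limit = 4, position = 0
        log.add("flip:      " + state(bb));
        bb.get();                                // consume 10
        bb.get();                                // consume 20
        ByteBuffer rest = bb.slice();            // shares the two unread bytes
        log.add("slice:     " + state(rest));
        rest.put(0, (byte) 99);                  // also changes index 2 of bb
        log.add("shared:    " + bb.get(2));
        bb.compact();                            // moves unread bytes to the front
        log.add("compact:   " + state(bb));
        ByteBuffer dup = bb.duplicate();         // same content, independent position
        dup.position(0);
        log.add("duplicate: " + state(dup) + ", original pos=" + bb.position());
        return log;
    }

    public static void main(String[] args) {
        for (String line : trace()) {
            System.out.println(line);
        }
    }
}
```

After compact(), the buffer is ready for more puts, which is the usual pattern when a channel write drains only part of a buffer.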

View Buffers

Channels can only read from or write into a byte buffer. Even if the underlying file contains ints or longs, a channel cannot write into an int buffer. However, after you have read in the bytes, you can open a view buffer that is a differently typed interpretation of the underlying byte buffer. A view buffer is simply another buffer whose content is backed by the byte buffer. Changes to the byte buffer's content will be visible in the view buffer, and vice versa; the two buffers' current position and sizes are independent.

Here is how you get a view buffer that interprets its data as floats:

bb1.rewind();  // need to move buffer index back to beginning
FloatBuffer myFB = bb1.asFloatBuffer();

There are corresponding asSomethingBuffer() methods for the types char, short, int, long, and double. View buffers have a couple of advantages compared with the type-specific get and put methods described above. A view buffer is indexed in terms of the size of its values, not individual bytes. If you execute myFB.put(8, 3.14159F), it sets the float at index 8 (the ninth float, occupying bytes 32 to 35 of the byte buffer), not bytes 8 to 11, to an approximation of pi. A view buffer also provides bulk get and put methods for its type, as shown in the following complete program.

A bulk transfer from a buffer to an int array

import java.io.*;
import java.nio.*;
import java.nio.channels.*;
public class Buffer2 {
    public static void main(String[] a) throws Exception {
        // Get a Channel for the file
        FileInputStream fis =
                       new FileInputStream("numbers.bin");
        FileChannel fc = fis.getChannel();
        ByteBuffer bb = java.nio.ByteBuffer.allocate(400);
        // fill the buffer from the file
        int count = fc.read(bb);

        bb.rewind(); // need to move buffer index back to start
        IntBuffer ib = bb.asIntBuffer();
        // bulk get into an int array
        int[] myIntArray = new int[50];
        ib.get( myIntArray );
        for (int i=0; i<5; i++) {
           System.out.println( "arr["+i+"]="+ myIntArray[i] );
        }
        fc.close();
    }
}

If you compile and run this program, it prints the first few values in the file numbers.bin. Create the file first and put any junk in there. The contents are brought into the program with a channel that is read into a byte buffer. The byte buffer is rewound and then overlaid with an int view buffer. An int array is then filled with data from the int view buffer in a single get operation. Finally, the first five ints in the array are printed. You should compare them with the values you get by reading numbers.bin with a data input stream. Depending on what is in the file originally, you will see output like this:

arr[0]=2003461731
arr[1]=1751280235
arr[2]=1986164595
arr[3]=1986947691
arr[4]=1646294541
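The indexing rule for view buffers (index by element, not by byte) can be checked with a few lines. This sketch puts a float through a FloatBuffer view at element index 8 and reads it back through the underlying ByteBuffer at byte offset 32; the class and method names are invented.

```java
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

class ViewBufferDemo {
    /** Stores value at float index 8 via a view, then reads it at byte offset 32. */
    static float putViaViewReadViaBytes(float value) {
        ByteBuffer bb = ByteBuffer.allocate(40); // room for ten 4-byte floats
        FloatBuffer fb = bb.asFloatBuffer();     // a view with capacity 10
        fb.put(8, value);                        // element 8 occupies bytes 32 to 35
        return bb.getFloat(32);                  // same storage, seen as bytes
    }

    public static void main(String[] args) {
        float back = putViaViewReadViaBytes(3.14159F);
        System.out.println("read back through the byte buffer: " + back);
    }
}
```

Because the put above is an absolute one, neither buffer's position moves; the view and the byte buffer share content but keep separate positions and limits.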

View buffers and characters

There is one further note about buffers. You may have noticed that the first example, MyBuffer.java in the buffer section, used a file called “email.txt” that obviously contained characters. When we read it in, these characters ended up in Java 8-bit bytes, not in Java 16-bit characters. What if we wanted to move each single ASCII byte in the file into a Java double-byte char? It turns out that this is now simple to do automatically.

Once you have the data from your file in a byte buffer, you can specify a new encoding and decode it from one buffer into another. The last section of this chapter is a lengthy description of character set encodings and the order in which bytes may appear. Here is the code that reads ASCII characters from a file, and ensures that they end up in Java Unicode double-byte chars:

// Get a Channel for the source file
FileInputStream fis = new FileInputStream("email.txt");
FileChannel fc = fis.getChannel();

// Get a Buffer from the source file
MappedByteBuffer bb =
   fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());

Charset cs = Charset.forName("8859_1");
CharsetDecoder cd = cs.newDecoder();
CharBuffer cb = cd.decode(bb);

These lines are part of a complete example presented later.
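In the meantime, the fragment can be fleshed out into something runnable along the following lines. This is a sketch, not the complete example referred to above: it writes its own temporary file so that it is self-contained, uses the charset's canonical name ISO-8859-1, and the class and method names are made up.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

class DecodeDemo {
    /** Writes contents to a temp file, maps it, and decodes it as ISO-8859-1 text. */
    static String decodeFile(byte[] contents) {
        try {
            File f = File.createTempFile("nio-text", ".txt");
            try {
                try (FileOutputStream fos = new FileOutputStream(f)) {
                    fos.write(contents);
                }
                try (FileChannel fc = new FileInputStream(f).getChannel()) {
                    // Map the whole file into memory, read-only
                    MappedByteBuffer bb =
                        fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
                    CharsetDecoder cd = Charset.forName("ISO-8859-1").newDecoder();
                    CharBuffer cb = cd.decode(bb); // one byte in, one 16-bit char out
                    return cb.toString();
                }
            } finally {
                f.delete(); // may fail on some systems while the mapping is still live
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = { 'H', 'i', (byte) 0xE9 };  // 0xE9 is e-acute in ISO-8859-1
        System.out.println(decodeFile(raw).length() + " chars decoded");
    }
}
```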
