Memory Mapped I/O

Let's return to the topic of memory-mapped I/O. We stated that file channels/buffers are an alternative to reads/writes on streams. Memory-mapped I/O is an alternative to both, implemented as a refinement to channels/buffers. The whole point of mapped I/O is faster I/O. When transferring large amounts of data, mapped I/O can be faster because it uses virtual memory to make the file contents appear in your address space. It takes some initial setup and puts more work on the virtual memory subsystem, but mapped I/O avoids the extra copying from buffers in kernel memory into buffers in your process.

When you read with a stream, the OS reads from the disk into a buffer owned by the device driver and then moves the contents from kernel space to your buffer in user space. Memory mapping only needs a couple of bits twiddled in the VM system to say “that disk page is now part of this process address space.” So why doesn't everyone use mapped I/O all the time? Kernel whackers do, and the rest of the world is still hearing about the feature. Also, it's not part of the ANSI C API, which is one of the most widely used I/O APIs.

It came from Multics

Mapped memory is also known as shared memory. As well as offering performance improvements for larger files, it can be used for bulk data transfer between cooperating processes that all map in the same file. In principle, these processes don't even have to be on the same system, as long as the same file is visible to each, although how promptly one machine sees another's changes over a network file system depends on the operating system and is not guaranteed. Mapped files were first used in Multics, the 1960s operating system that was wildly over budget and schedule, but which was the stepfather of Unix (and thus the ancestor of Linux, Mac OS X, and Solaris, too).

When you do a map operation on a FileChannel to map a file into memory, your return value is a mapped byte buffer that is connected to the file. The run-time system is expected to use the operating system features for memory mapping. The result is that when you write in the buffer, that data appears in the file. If you read from the buffer, you get the data that is in the file. Everyone is familiar with the way an operating system can read an executable file and make the instructions appear in the address space of a process. Mapped I/O does essentially the same thing for data files. The signature of FileChannel's map method is:

MappedByteBuffer map(FileChannel.MapMode mode, long position, long size)
     throws IOException;

The position argument is the offset in the file where you want the mapping to start. This will usually be offset zero, to start at the beginning. The size is the number of bytes that you want from the file. This will usually be myFile.length() to get the whole thing.
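To make the position and size arguments concrete, here is a minimal sketch that maps first a whole file and then only a six-byte region starting at offset 10. The class name MapRegionSketch, the helper readRegion, and the temporary file are hypothetical names invented for this illustration, and it opens the channel with FileChannel.open and the java.nio.file helpers rather than through a stream.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapRegionSketch {
    // Hypothetical helper: maps 'size' bytes starting at 'position'
    // and returns them decoded as ASCII text.
    static String readRegion(Path file, long position, long size) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, position, size);
            byte[] bytes = new byte[mbb.remaining()];
            mbb.get(bytes);   // copy the mapped bytes out of the buffer
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("region", ".txt");
        Files.write(file, "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII));
        // position 0 and the full length map the whole file
        System.out.println(readRegion(file, 0, Files.size(file)));
        // position 10, size 6 maps only the last six bytes
        System.out.println(readRegion(file, 10, 6));
    }
}
```

The first call maps the whole file and the second maps only the bytes "ABCDEF". The position does not have to be a multiple of the page size; the run-time system takes care of any alignment the operating system requires.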

The mode argument is one of FileChannel.MapMode.READ_ONLY, FileChannel.MapMode.READ_WRITE, or FileChannel.MapMode.PRIVATE, for read-only, read-and-write, or copy-on-write mapping. Copy-on-write is a variation of read-and-write mapping that says “if any process changes the content of this map it gets its own private copy with its change; everyone else can carry on sharing the unchanged version.” In Java terms, changes made through a PRIVATE mapping are visible in your buffer but are never written back to the file. It's mostly used in systems programming to share data pages of executables, and there was little reason to hide the semantics from Java, even though it's not something used much by applications.
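The difference between the read-and-write and copy-on-write modes can be sketched as follows. The class MapModeSketch and its helper writeThrough are hypothetical names for this illustration. The helper changes the first byte through a mapping, then reads the same byte back through the channel to see whether the change reached the file.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModeSketch {
    // Hypothetical helper: writes 99 over the first byte through a mapping of
    // the given mode. Returns { byte seen in the buffer, byte seen in the file }.
    static byte[] writeThrough(FileChannel.MapMode mode) throws IOException {
        Path file = Files.createTempFile("mode", ".bin");
        Files.write(file, new byte[] {1, 2, 3, 4});
        // both READ_WRITE and PRIVATE need a channel open for reading and writing
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mbb = fc.map(mode, 0, 4);
            mbb.put(0, (byte) 99);   // change the first byte via the mapping
            mbb.force();             // flush to the file; no effect for PRIVATE
            ByteBuffer fromFile = ByteBuffer.allocate(1);
            fc.read(fromFile, 0);    // read byte 0 through the channel instead
            return new byte[] { mbb.get(0), fromFile.get(0) };
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] shared = writeThrough(FileChannel.MapMode.READ_WRITE);
        byte[] priv = writeThrough(FileChannel.MapMode.PRIVATE);
        System.out.println("READ_WRITE buffer=" + shared[0] + " file=" + shared[1]);
        System.out.println("PRIVATE    buffer=" + priv[0] + " file=" + priv[1]);
    }
}
```

With READ_WRITE the new value shows up both in the buffer and in the file. With PRIVATE the buffer holds the new value while the file keeps its original byte. A READ_ONLY mapping would reject the put call with a ReadOnlyBufferException.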

Here's an example of mapping a FileChannel and, hence, the underlying File into memory:

File f = new File("data.txt");
FileInputStream fis = new FileInputStream(f);
FileChannel fc = fis.getChannel();

MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, f.length() );

Note that file lengths are given in a long, and map takes its size argument as a long too, so no cast is needed. However, you can only map an int's worth of memory (2 GByte) in any one map: if the size is greater than Integer.MAX_VALUE, map throws an IllegalArgumentException. If the physical memory available to your JVM cannot hold all the file, the virtual memory subsystem will bring in pieces of it as needed without you doing anything, or even being aware of it.
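One way to work with a file that is larger than a single map allows is to map it a window at a time. The sketch below is only an illustration under that assumption: the class ChunkedMapSketch, the helper sumBytes, and the chunkSize parameter are hypothetical names, and the chunk size is left as a parameter so the idea can be tried on a small file.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedMapSketch {
    // Hypothetical helper: adds up every byte (as an unsigned value) in the
    // file, mapping at most chunkSize bytes at a time.
    static long sumBytes(Path file, long chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkSize must fit in a single map");
        }
        long total = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = fc.size();
            for (long pos = 0; pos < fileSize; pos += chunkSize) {
                // the last window may be shorter than chunkSize
                long len = Math.min(chunkSize, fileSize - pos);
                MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (mbb.hasRemaining()) {
                    total += mbb.get() & 0xFF;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("chunks", ".bin");
        Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10});
        // a tiny chunk size forces three separate mappings of this 10-byte file
        System.out.println(sumBytes(file, 4));
    }
}
```

For a real multi-gigabyte file you would pass a chunk size close to Integer.MAX_VALUE. A value that spans a window boundary needs extra care, which this sketch sidesteps by reading one byte at a time.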

Direct buffers

A MappedByteBuffer is one kind of “direct” buffer. A direct byte buffer may also be created by invoking the allocateDirect factory method of the ByteBuffer class. The buffers returned by this method typically have somewhat higher allocation and deallocation costs than non-direct buffers. So direct buffers should only be used for large, long-lived buffers that are subject to the underlying system's native I/O operations. The code that follows shows a file being written using a channel and buffer. Then the same file is read back in using mapped I/O. The data is compared with what was written, and it had better match.
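Before turning to that example, the direct and non-direct distinction can be checked at run time with the isDirect method of ByteBuffer. The following is a small sketch; the class DirectBufferSketch and its helper mapTempFile are hypothetical names made up for this illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class DirectBufferSketch {
    // Hypothetical helper: maps a small temporary file so that the
    // resulting buffer can be inspected.
    static MappedByteBuffer mapTempFile() throws IOException {
        Path file = Files.createTempFile("direct", ".bin");
        Files.write(file, new byte[16]);
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            // the mapping stays valid after the channel is closed
            return fc.map(FileChannel.MapMode.READ_ONLY, 0, 16);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer heap = ByteBuffer.allocate(16);          // ordinary, non-direct
        ByteBuffer direct = ByteBuffer.allocateDirect(16);  // direct, outside the heap
        ByteBuffer mapped = mapTempFile();                  // mapped, hence direct
        System.out.println("allocate       isDirect: " + heap.isDirect());
        System.out.println("allocateDirect isDirect: " + direct.isDirect());
        System.out.println("map            isDirect: " + mapped.isDirect());
    }
}
```

Only the buffer from allocate reports false. The other two are direct, which is why they suit large, long-lived buffers rather than small temporary ones.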

Mapped I/O example

import java.io.*;
import java.nio.*;
import java.nio.channels.*;

public class MyMap {
   public static void main(String[] args) throws Exception {
      ////////////// write using channel and buffer //////////
      // try-with-resources closes the channel and stream even on error
      try (FileOutputStream fos = new FileOutputStream("ints.bin");
           FileChannel c = fos.getChannel()) {
         ByteBuffer bb = ByteBuffer.allocate(40);
         IntBuffer ib = bb.asIntBuffer();   // this is a view onto bb
         // fill the buffer through the view; bb's own position stays at 0
         for (int i = 0; i < 10; i++)  ib.put(i);
         // write the buffer full of ints to the channel and thus file
         while (bb.hasRemaining())  c.write(bb);
         c.force(true);   // commit to disk
      }

      /////////////////////// read back using mapped I/O //////////
      try (FileInputStream fis = new FileInputStream("ints.bin");
           FileChannel c = fis.getChannel()) {
         // map the 40 bytes (ten ints) that were written above
         MappedByteBuffer mbb = c.map(FileChannel.MapMode.READ_ONLY, 0, 40);
         // int num = c.read(mbb);   // you don't read a mapped buffer!
         System.out.println("byte buff capacity: " + mbb.capacity() );
         System.out.println("byte buff position: " + mbb.position() );
         System.out.println("byte buff    limit: " + mbb.limit() );
         boolean ok = true;
         for (int i = 0; i < 10; i++)  {
            int j = mbb.getInt();
            if (j != i) {
               System.out.println("data mismatch: " + i + "," + j);
               ok = false;
            }
         }
         // only claim success if every value matched
         if (ok) System.out.println("Read the ints back from file ok");
      }
   }
}

Many details of memory-mapped file behavior are inherently dependent upon the underlying operating system, and so they are not specified in Java.
