Learning more about virtual file systems

Tcl, like almost every programming language, allows you to operate on files and directories existing in the underlying file system. In the 'good old days', the Tcl interpreter was able to operate only on native filesystems specific to the operating system it was compiled for. From version 8.4, Tcl incorporates the concept of virtual file systems (VFSs). The idea is to separate normal Tcl commands from the real file system calls. Such separation makes it easy to add support for additional filesystems, because all that is required is to create appropriate drivers. VFS allows us to redirect all FS related calls to the driver responsible for the proper handling of these calls.

Luckily for us, there is already a Tcl extension called tclvfs (http://sourceforge.net/projects/tclvfs/) that offers a number of drivers for various virtual file systems. The word 'virtual' means that the code operates in a unified manner on files / directories located inside something that seems to be a normal, native filesystem, but in reality, the filesystem could be anything - for example an FTP share or the contents of a .zip file. The real power of this solution is that you do not have to use any special commands to handle it once it is mounted.

The VFSs supported by tclvfs are provided as separate packages located in the vfs namespace. Before starting operations on any VFS, you have to mount it. Two types of mount are available: a VFS can be mounted as a directory at a specified path, or, more generally, an entire protocol can be mounted. The second case is referred to as urltype, and it allows file paths starting with ftp://, http://, or file://. See the examples below in the FTP section.

The most important types of supported VFS are:

  • FTP: Allows you to operate on files on an FTP server as though these files were stored in a local directory. To start using it, use the package require vfs::ftp or package require vfs::urltype commands. Consider the following example:
    package require vfs::ftp
    vfs::ftp::Mount ftp://ftp.kernel.org/pub/linux/kernel/v2.6 v2.6
    file copy v2.6/linux-2.6.31.4.tar.bz2 .
    
    

    In this case, we mount the FTP directory containing kernel sources as a virtual v2.6 local directory (so this is the first type of mount), and then copy a file from it using the standard file copy command to the current directory.

    The second type of mount would be like this:

    package require vfs::urltype
    vfs::urltype::Mount ftp
    file copy ftp://ftp.kernel.org/pub/linux/kernel/v2.6/linux-2.6.31.4.tar.bz2 .
    
    

    Here you can assume that after the second line, a new kind of drive named ftp:// was created, and any access to that drive causes the underlying tclvfs library to connect to the FTP server and make it available for use. Effectively, it will also result in a call to package require vfs::ftp, so there is no significant difference between the two types of mounts.

  • HTTP: This is similar to FTP; basically, it is enough to replace ftp with http in the previous examples.
  • ZIP: Allows access to files stored inside a compressed ZIP archive (note that urltype access does not apply here). Consider an archive named test.zip; to list its contents, the following could be used:
    package require vfs::zip
    set test_mnt [vfs::zip::Mount test.zip test.zip]
    puts [glob test.zip/*]
    vfs::zip::Unmount $test_mnt test.zip
    
    

    The example code mounts the archive as a virtual test.zip directory, lists its contents using the glob command, and unmounts the archive at the end. The output from that code is:

    test.zip/README.txt

    As you can see, the archive contains the README.txt file.

    Note that the main drawback of the ZIP driver is that it offers read-only access: reading an archive's contents is easy, but modifying them using only vfs is impossible.

  • MK4: A Metakit database can be used as a container for files and directories, thanks to the mk4vfs driver. Such a database can then be treated as an archive similar to a ZIP file. The driver fully supports both reading and writing files, so a Metakit database is a perfect candidate for holding the files of our yet-to-be-created single-file application.

    The following example shows the creation of an empty archive, copying a file into it, and verifying that the file is there:

    package require vfs::mk4
    vfs::mk4::Mount C:\\myKit.mk myKit
    file copy C:\\README.txt myKit
    puts [glob myKit/*]
    vfs::unmount myKit
    

    The output from that code is:

    myKit/README.txt
    
    

    The first command, package require vfs::mk4, loads the appropriate VFS driver (mk4vfs). Next, we mount the Metakit database file located at C:\myKit.mk (note that in the script the backslash character needs to be escaped with another backslash) so that it is available at the myKit mountpoint. If the file does not exist, it will be created. The README.txt file, containing the text "hello World!", is copied into the myKit mountpoint (and effectively into the .mk file), and then we list the mountpoint's contents to check that the file is really there. The last line unmounts the database, causing all pending operations to be committed. The .mk file created in this example will be the subject of investigation in the next section.
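
As a rough illustration of the point that no special commands are needed once a VFS is mounted, the following sketch reads README.txt back from the myKit mountpoint using plain open and read. It assumes that the myKit.mk database from the MK4 example already exists and contains README.txt. The readFile helper is a hypothetical name introduced here, not part of tclvfs, and the mount is skipped when the vfs::mk4 package cannot be loaded.

```tcl
# Read a whole file with ordinary Tcl commands; nothing here is VFS-specific.
# (readFile is a hypothetical helper, not part of tclvfs.)
proc readFile {path} {
    set fh [open $path r]
    set data [read $fh]
    close $fh
    return $data
}

# Only attempt the mount when the mk4 driver is actually installed.
if {![catch {package require vfs::mk4}]} {
    # Assumes the database created in the example above.
    vfs::mk4::Mount C:\\myKit.mk myKit
    # The same helper works on a path inside the mounted database.
    puts [readFile myKit/README.txt]
    vfs::unmount myKit
}
```

The helper would work unchanged on a native path such as C:\\README.txt, which is the practical benefit of mounting: the calling code does not need to know where the file really lives.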

At the moment, Metakit is the standard choice for storing and accessing files in an efficient and elegant way, but we live in a fast-changing world, and some other, better solution may soon become popular and supplant it.

Getting into the details of VFS and Metakit

The good news is that you do not need to know or remember the internal structure of a Metakit database used with mk4vfs, because at the script level you operate on its contents in the normal way, as you would on any other file structure. If you feel that you do not need to know more about it, you can go directly to the next section.

Why was Metakit chosen as the practical implementation of the VFS container? It is the nested subviews feature that makes it a perfect candidate to reflect directories and files, as you will see in the next example.

So what really happens when you write a file into a directory that is mapped to a Metakit database? An appropriate VFS driver, in this case mk4vfs, captures the request to save files and stores them transparently in the database. Based on the knowledge that you already gained from this chapter, you can easily have a look into this database to get some idea of how the files are stored. The following simple code is one of the possibilities:

package require Mk4tcl
mk::file open myKit C:\\myKit.mk
puts -nonewline "layout of myKit: "
puts [mk::view layout myKit]
puts -nonewline "size of myKit.dirs: "
puts [mk::view size myKit.dirs]
puts "contents of first dirs entry:"
puts [mk::get myKit.dirs!0]
puts "contents of first files entry:"
puts [mk::get myKit.dirs!0.files!0]
mk::file close myKit

The code produces the following output:

layout of myKit: {dirs {name parent:I {files {name size:I date:I contents:B}}}}
size of myKit.dirs: 1
contents of first dirs entry:
name <root> parent -1
contents of first files entry:
name README.txt size 12 date 1256126970 contents {hello World!}

What the code does may be described as reverse-engineering mk4vfs operations. Instead of going through the specification, we inspect the database directly to understand its layout. First, we load the Mk4tcl package and open the myKit.mk database. Next, we want to learn the layout of the database, and this is where mk::view layout myKit comes in handy. Let's analyze its output carefully:

{dirs {name parent:I {files {name size:I date:I contents:B}}}}

We see that myKit.mk contains one view called dirs. This view (table) has three columns: name (a string, which is the default type), parent (an integer), and a nested subview named files. The names are self-explanatory. The dirs view represents the entire directory structure inside this virtual folder: name is the directory name, and parent holds the row number in dirs of the directory one level up. The root directory has no parent, so its parent value is -1.
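
The parent column can be followed upward to rebuild a directory's full path. Here is a minimal sketch, assuming the rows of dirs have already been collected into a Tcl list of {name parent} pairs in row order (for instance with mk::get). The dirPath proc and the sample docs and img rows are hypothetical and are not part of mk4vfs; the example database in this section contains only the root row. The lassign command requires Tcl 8.5 or later.

```tcl
# Resolve the full path of a directory row by following parent indices.
# dirs is a list of {name parent} pairs in row order; the root has parent -1.
proc dirPath {dirs index} {
    set parts {}
    while {$index >= 0} {
        lassign [lindex $dirs $index] name parent
        # Skip the root row itself; it only anchors the tree.
        if {$parent >= 0} {
            set parts [linsert $parts 0 $name]
        }
        set index $parent
    }
    return /[join $parts /]
}

# Hypothetical sample rows: 0 is the root, 1 is /docs, 2 is /docs/img.
set dirs {{<root> -1} {docs 0} {img 1}}
puts [dirPath $dirs 2]   ;# prints /docs/img
puts [dirPath $dirs 0]   ;# prints /
```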

Each directory has a subview called files where all the files belonging to this directory are stored. Each row in this subview consists of:

  • name—keeps the file name (string)
  • size—size of the file (integer)
  • date—creation date (stored as an integer)
  • contents—contents of the file (binary raw data)

Next, the total number of rows is printed using the mk::view size myKit.dirs command. In this case, there is only one entry, corresponding to the top-level root directory. mk::get myKit.dirs!0 is then used to get the first directory entry:

name <root> parent -1

mk::get myKit.dirs!0.files!0 returns the entry for the only file in this directory:

name README.txt size 12 date 1256126970 contents {hello World!}

The filename is README.txt, its size is 12 bytes, and its content is the text 'hello World!'.
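
Because mk::get returns a flat list of property names and values, a row such as the one above can be read like a Tcl dictionary. The following is a small sketch that uses the printed row as literal data rather than a live database, so no Metakit call is made. It assumes Tcl 8.5 or later for the dict command, and it assumes that the date value is a count of seconds since the epoch, which the text above does not state explicitly.

```tcl
# The file row exactly as printed above, kept as a literal list.
set row {name README.txt size 12 date 1256126970 contents {hello World!}}

# Look up individual properties by name.
set name [dict get $row name]
set size [dict get $row size]
set contents [dict get $row contents]

# The size column should equal the length of the stored contents.
set sizeOk [expr {$size == [string length $contents]}]
puts "${name}: $size bytes, size matches contents: $sizeOk"

# Assumption: date is seconds since the epoch, so clock format can render it.
puts "stored date: [clock format [dict get $row date] -gmt 1]"
```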
