Chapter 7. Taking It Offline

The Internet may seem to be always on these days but, let’s be honest, it’s not. There are times and places when even the most modern mobile devices are out of range of the network for one reason or another.

Chapter 4 looked at how to have data stored local to the browser so that it does not require network access to use. However, if the web page on which the application is hosted is not available, having the data handy will be of no use.

With more and more of the modern application infrastructure moving into the browser, being able to access this software at any time has become critically important. The problem is that the standard web application assumes that many components, including JavaScript sources, HTML, images, CSS, and so forth, will be loaded with the web page. Using those resources when the user does not have access to the Internet requires that copies of those files be stored locally and used by the browser when needed. HTML5 lets a programmer give the browser a listing (known as a manifest) of files that should be loaded and saved. The browser will be able to access these files even when there is no network connection to the server.

The files listed in the manifest will also be loaded from the local disk even if the browser is online, thus giving the end user the experience of the ultimate content delivery network.

As long as the browser is online when a page is loaded, it will check the manifest file with the server. If the manifest file has changed, the browser will attempt to redownload all the files listed for download in the manifest. Once all the files in the manifest have been downloaded, the browser will update the file cache to show the new files.

Introduction to the Manifest File

The ability to access files while offline was one of the features introduced by Google in Gears. The user provided a manifest as a JSON file, which then directed the browser to load other required files offline. When the browser next visited that page, the files would be loaded from the local disk instead of from the network. When the version field of the manifest file was updated, Gears would check all the files in the manifest for updates.

The HTML5 manifest is similar in idea but somewhat different in implementation. One nice thing about it is that you can implement a manifest in an application without using any JavaScript code, which Gears did require. To create a manifest, add the manifest attribute containing the name of the manifest file to the document’s <html> tag (see Example 7-1).

Example 7-1. HTML manifest declaration

<!DOCTYPE HTML>
<html manifest="/cache.manifest">
<body>
        ...
</body>
</html>

The manifest file must be served with the MIME type text/cache-manifest. This can be done via the web server configuration files; for web servers other than Apache, consult the server’s documentation. The filename does not matter as long as the file has the correct MIME type, but cache.manifest seems to be a good default choice. For the Apache web server, add the following line to the config file:

AddType text/cache-manifest  .manifest

Structure of the Manifest File

The format of the manifest file is in fact pretty simple. The first line must be just the words CACHE MANIFEST. After that comes a list of files, one per line, to include in the manifest (see Example 7-2). Comments can be marked with the pound (#) character.

The manifest will cache HTTP GET requests, while POST, PUT, and DELETE will still go to the network. If the page has an active manifest file, all GET requests will be directed to local storage. But for some files, offline access does not make sense. These can include various server resources such as Ajax calls, or collections of documents that could get so large as to overflow the cache area. These files can be listed in a NETWORK section of the manifest. Any URLs in the NETWORK section will bypass the cache and load directly from the server. The HTML5 manifest specification requires that any files not included in the cache be explicitly opted out of the manifest in this way.

In other cases, you may wish to provide different content depending on whether the user is offline or online. The manifest provides a FALLBACK section for such resources. The user will be shown different content, depending on whether the browser has a connection to the Internet or not. On each line of the FALLBACK section, the first file is loaded from the server when a connection is available, and the second file is loaded locally when the connection is not available.

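Both the NETWORK and FALLBACK sections list URL prefixes, not just specific files. So it is possible to list entire directories or URL paths here. The NETWORK section also accepts a single * on a line by itself, which sends every URL not otherwise listed to the network. Matching by file type (e.g., *.jpg) is not part of the manifest format.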

Example 7-2. Manifest file

CACHE MANIFEST
# 11 October 2010
/index.php
/js/jquery.js
/css/style.css
/images/logo.png

NETWORK:
/request.php

FALLBACK:
/about.html /offline-about.html

Updates to the Manifest File

The browser will update the files in the manifest whenever the manifest file itself changes. There are several ways to handle this. It is possible to add a version number in a comment in the file. If the project is making use of a version control system like Subversion, you can use the version number tag for this.

The problem with using a version number from a version control system is that it requires a programmer to remember to update that file every time any file in the system changes. It would be much better to create an automated system that updates the manifest file whenever a file listed in it changes, and run that script as part of a deployment procedure.

For instance, you could write a script that checks all the files in the manifest for changes and then changes the manifest file itself when one of the files changes. A simple way to do this is to write a script that loops over all the files in the manifest, computes an MD5 checksum for each one, and then puts a final checksum into the manifest file. This will ensure that any changes will cause the manifest file to update.

This script is probably too slow to run from the web server, as running it hundreds of times a second would be overkill. However, it can be efficiently run in the development environment. One option would be to have it run from an editor when a file is saved. Another option is to run it as part of the check-in process for a version control system.
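
The checksum step on its own might look like the following Node.js sketch, suitable for a save hook or check-in script. The function names (md5, combinedHash, hashComment) are hypothetical, and the "# HASH:" comment format is simply one possible convention.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// MD5 of a string or Buffer, as a hex string.
function md5(data) {
  return crypto.createHash("md5").update(data).digest("hex");
}

// Combine the per-file checksums into a single fingerprint.
function combinedHash(contents) {
  return md5(contents.map(md5).join(""));
}

// Read the listed files and build the comment line for the manifest.
function hashComment(paths) {
  const contents = paths.map((p) => fs.readFileSync(p));
  return "# HASH: " + combinedHash(contents);
}
```

Because the final hash depends on every file's checksum, editing any listed file changes the comment line, which in turn changes the manifest.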

In Example 7-3, we generate a manifest file from a list of files. The program uses the Symfony YAML library to load the list, which is stored in a datafile in the format of Example 7-5. As a bonus, the program first removes any file that has been listed more than once. It also checks that every file exists before listing it, because missing files will break the manifest. By appending a comment containing an MD5 hash built from the MD5 of every listed file, the script makes sure that any updated file will cause a manifest change so that the browser will update its content. Example 7-3 will output a manifest file with the MD5 hash as a comment in the file, as in Example 7-4.

Example 7-3. Automatically updating a manifest file

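<?php
// Assumes the Symfony YAML library (sfYaml) has already been loaded.

header('Content-Type: text/cache-manifest');
echo "CACHE MANIFEST\n";

// sfYaml::load() returns a nested array keyed by section name.
$files  = sfYaml::load('manifest.yml');
$hashes = '';

// Drop duplicate entries, then list each cached file that exists.
foreach (array_unique($files['files']) as $file) {
    if (file_exists($file)) {
        echo $file . "\n";
        $hashes .= md5_file($file);
    }
}

echo "\nNETWORK:\n";
foreach ($files['network'] as $file) {
    echo $file . "\n";
}

echo "\nFALLBACK:\n";
foreach ($files['fallback'] as $file) {
    echo $file . "\n";
}

// A change to any cached file changes this hash, and so the manifest.
echo "\n# HASH: " . md5($hashes) . "\n";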

Example 7-4. Manifest with MD5 hash

CACHE MANIFEST
index.html
css/style.css
js/jquery.js
js/myscript.js

NETWORK:
network/file

FALLBACK:
/avatars/ /offline-avatars/offline.png

# HASH: 090c7e8fe42c16777fba844f835e839b

Example 7-5. The data for Example 7-3

files:
  - index.html
  - css/style.css
  - js/jquery.js
  - js/myscript.js

network:
  - network/file

fallback:
  - /avatars/ /offline-avatars/offline.png

Warning

The manifest is not always very good about updating when you think it should. Even with a new version of a manifest, it can often take some time for the content in the browser to update. Unless you set the cache control headers, the browser may not download the manifest again until several hours after it was last downloaded. Make sure the cache control headers don’t tell the browser to download the file only once every five years, say. Either rely on the ETag header or, better yet, have the server set a no-cache header. Be sure to test well.
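
As one illustration of the header advice above, the following Node.js sketch serves the manifest with both the required MIME type and a no-cache header. The handler name, the /cache.manifest URL, the local filename, and the port are all assumptions chosen for the example; an Apache or other server setup would achieve the same through its own configuration.

```javascript
const http = require("http");
const fs = require("fs");

// Headers for the manifest: correct MIME type, and revalidate every time.
function manifestHeaders() {
  return {
    "Content-Type": "text/cache-manifest",
    "Cache-Control": "no-cache",
  };
}

// Hypothetical request handler; the URL /cache.manifest and the local
// file name cache.manifest are assumptions for illustration.
function handleRequest(req, res) {
  if (req.url.split("?")[0] === "/cache.manifest") {
    res.writeHead(200, manifestHeaders());
    res.end(fs.readFileSync("cache.manifest"));
  } else {
    res.writeHead(404);
    res.end();
  }
}

// To try it locally: http.createServer(handleRequest).listen(8080);
```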

Events

When the browser loads a page with a manifest file, it will fire a checking event on the window.applicationCache object. This event will fire whether or not the page has been visited before.

If the cache has not been seen before, the browser will fire a downloading event, and start to download the files. This event will also fire if the manifest file has changed. If the manifest has not changed the browser will fire a noupdate event.

As the browser downloads the files, it fires a series of progress events. These can be used if you wish to provide some form of feedback to the user to let her know that software is downloading.

Once all the files have downloaded, the cached event is fired.

If anything goes wrong, the browser will fire the error event. This can be caused by a problem in the HTML page, a defective manifest, or a failure to download the manifest or any resource listed in it. Normally, if even a single file listed in the manifest cannot be downloaded, the browser won’t cache any of the files in the manifest. When a manifest changes and ends up including a bad link, the old version of the cache will be retained. If there was no existing cache at the time the erroneous manifest is downloaded, the browser will not create an incomplete offline store, but will continue to rely on the network.

However, it is possible that not all browsers or browser versions will handle erroneous manifests in exactly the way just described. This can be a very hard error to catch because there may be very little visible evidence of what went wrong. Catching the error event in your JavaScript and presenting it to the user is a good idea, as is an automatic test that validates all the URLs in a manifest.
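
The events described in this section can be observed with a few listeners. The sketch below is one possible approach, not a required pattern: watchAppCache and its report callback are hypothetical names, and the code assumes a browser that exposes window.applicationCache (the guard skips it elsewhere).

```javascript
// Event names described in this section.
const APP_CACHE_EVENTS = [
  "checking", "downloading", "noupdate", "progress", "cached", "error",
];

// Attach one listener per event and pass the event name to report().
// cache is expected to be window.applicationCache; report is a
// hypothetical callback supplied by the page.
function watchAppCache(cache, report) {
  for (const name of APP_CACHE_EVENTS) {
    cache.addEventListener(name, (event) => report(name, event), false);
  }
}

// Browser usage, skipped where applicationCache is unavailable.
if (typeof window !== "undefined" && window.applicationCache) {
  watchAppCache(window.applicationCache, (name) => {
    if (name === "error") {
      alert("The offline copy of this application could not be updated.");
    } else {
      console.log("applicationCache: " + name);
    }
  });
}
```

Passing the cache object in as a parameter keeps the function easy to exercise in a test with a stand-in object.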

In Google Chrome, the Developer Tools can show a list of files in the manifest (see Figure 7-1). Under the Storage tab, the Application Cache item will show the status of various items.

Figure 7-1. Chrome Storage Viewer showing Application Cache

Warning

It is a good idea during development to turn off the manifest file, and enable it only when the project is ready to go live. Using the cache can make it very hard to develop the application if changes don’t appear quickly.

Debugging Manifest Files

Manifest files provide a particular debugging challenge. They can be the source of several special classes of bugs.

The first and most obvious bug is to leave a needed file out of the manifest. If a file is used by the page but is not listed in the manifest, it will not be loaded by the page, in the same way that a file missing from the server will not be downloaded.

Many Selenium tests will not explicitly test for correct styles and the presence of images, so it is quite possible that an application missing a CSS file or image will still work to the extent that it is normally tested in Selenium. In an application that includes resources from outside web servers, those must also be whitelisted in the manifest file.
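
A check for this class of bug can be automated. The sketch below is a simplified illustration: missingFromManifest is a hypothetical helper, the list of page resources is assumed to be collected by some other means, and FALLBACK entries are ignored for brevity.

```javascript
// Return the page resources that the manifest does not account for.
// cached: files listed for caching; networkPrefixes: NETWORK entries.
function missingFromManifest(pageResources, cached, networkPrefixes) {
  const cachedSet = new Set(cached);
  return pageResources.filter((url) => {
    if (cachedSet.has(url)) return false;
    // NETWORK entries match by prefix; "*" lets everything through.
    return !networkPrefixes.some((p) => p === "*" || url.startsWith(p));
  });
}
```

For example, given a page that uses /index.php, /css/style.css, and /images/logo.png, with only the first two listed in the manifest, the function would report /images/logo.png as missing.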

A further complication comes in some browsers, including Firefox, that make the manifest an opt-in feature. So a Selenium test may not opt into it, which would make the entire test moot. In order to test this in Firefox, it will be necessary to set up a Firefox profile in which the application cache is on by default. To do this:

  1. Quit Firefox completely.

  2. Start up Firefox from a command line with the -profileManager switch. This will result in a dialog similar to that shown in Figure 7-2. Save the custom profile.

    Figure 7-2. Firefox custom profile dialog

  3. Restart Firefox. Go to the Firefox Options menu, select the Advanced tab and under that the Network tab (see Figure 7-3), and turn off the “Tell me when a website asks to store data for offline use” option.

Figure 7-3. Firefox Options

Now, when starting up the Selenium RC server, use an option like this:

java -jar selenium-server.jar -firefoxProfileTemplate

For full details on Firefox profiles, see http://support.mozilla.com/en-US/kb/Managing+profiles.

A second class of problems can occur when the manifest is updated and the browser does not reflect the update. Normally, it will take a minute or two after loading a page for the browser to update the file cache, and the browser will not check the cache until the page is loaded. So if the server is updated, the browser will not have the new version until the user visits the page. This can cause problems if there has been an update on the server that will cause the application in the browser to fail, such as a change in the format of how data is sent between the client and server.

When the user visits the page (assuming of course that the browser is online), the browser will fetch the manifest file from the server. However, if the manifest file has a cache control header set on it, the browser may not check for a new version of the manifest. For example, if the file has a header that says the browser should check for updates only once a year (as is sometimes common on web servers), the browser will not reload the manifest file. So it is very important to ensure that the manifest file itself is not cached by the browser, or if it is cached it is done only via an ETag.

The developer can always prevent the browser from caching the manifest file by giving its URL with a query string attached, as in cache.manifest?load=1. If the manifest file is a static text file, the server will ignore the query string, but the browser will not know that and will ask the server for a fresh copy.

Different web browsers, and even different versions of a single browser, may update the manifest file somewhat differently. So it is important to test any application that uses a manifest file carefully across different browsers and browser versions.
