JavaScript has, since its inception, run in a single thread. With small applications this was practical, but it runs up against certain limits now, with larger and larger applications being loaded into browsers. As more and more JavaScript is run, the application will start to block, waiting for code to finish.
JavaScript runs code from an event loop that takes events off a queue of all the events that have happened in the browser. Whenever the JavaScript runtime is idle, it takes the first event off the queue and runs the handler that goes with that event (see Figure 8-1). As long as those handlers run quickly, this makes for a responsive user experience.
In the past few years, the competition among browsers has in part revolved around the speed of JavaScript. In Chrome and Firefox, JavaScript can now run as much as 100 times faster than it did back in the days of IE 6. Because of this, it is possible to squeeze more into the event loop.
Thankfully, most of the things JavaScript has to do are fast. They tend to be on the order of manipulating some data and passing it into the DOM or making an Ajax call. So the model in Figure 8-1 works pretty well. For things that would take longer than a fraction of a second to compute, a number of tricks can prevent bottlenecks from affecting the user experience.
The main trick is to break the computation into small steps and run each one as an independent job on the queue. Each step ends by scheduling the next step after a short delay (say, 1/100 of a second). This prevents the task from locking up the event queue. But it’s still fundamentally unsatisfactory, as it puts the work of the task scheduler onto the programmer. Tuning this solution to make it effective is also a demanding effort. If the delays between steps are too small, the computation can still clog up the event queue and cause other tasks to lag behind. So things will still happen, but the user will feel the lag as the system fails to respond right away to clicks and other user-visible activities. On the other hand, if the delays between steps are too large, the computation will take a very long time to complete, causing the user to wait for her results.
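As a minimal sketch of this trick, the function below processes an array in slices and schedules each following slice with setTimeout(). The name runInChunks and all of its parameters are assumptions made for illustration, not part of any library.

```javascript
// Process `items` in slices of `chunkSize`, yielding to the event loop
// between slices so that other queued events get a chance to run.
// (Hypothetical helper; names and parameters are illustrative only.)
function runInChunks(items, chunkSize, processItem, onDone, delayMs) {
    var index = 0;
    function step() {
        var end = Math.min(index + chunkSize, items.length);
        for (; index < end; index += 1) {
            processItem(items[index], index);
        }
        if (index < items.length) {
            // More work left: queue the next slice instead of looping on.
            setTimeout(step, delayMs);
        } else {
            onDone();
        }
    }
    step();
}
```

The two tuning knobs discussed above appear here as chunkSize and delayMs; choosing good values for them is exactly the scheduling work that falls on the programmer.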
Google Gears created the idea of the “worker pool,” which has turned into the HTML5 Web Worker. The interfaces are somewhat different, but the basic ideas are the same. A worker is a separate JavaScript process that can perform computations and pass messages back and forth with the main process and other workers. A Web Worker differs from a thread in Java or Python in one key aspect of design: there is no shared state. The workers and the main JavaScript instance can communicate only by passing messages.
That one difference leads to a number of key programming practices, most simpler than thread programming. Web Workers have no need for mutexes, locks, or synchronization. Deadlocks and race conditions can’t occur. This also means you can use the huge number of JavaScript packages out there without worrying whether they are thread-safe. The only changes to the browser’s JavaScript environment are a few new methods and events.
Each worker (including the main window) maintains an independent event loop. Whenever there is no code running, the JavaScript runtime returns to this event loop and takes the first message out of the queue. If there are no events in the queue, it will wait until an event arrives and then handle it. If some piece of code is running for a long time, no events will be handled until that piece of code is finished. In the main window, this will result in the browser user interface locking up. (Some browsers will offer to let you stop JavaScript at this point.) In a worker, a long task will keep the worker from accepting any new events. However, the main window, and any other workers, will continue to be responsive.
This design choice does, however, place some restrictions on the worker processes themselves. First, workers do not have access to the DOM. This also means a worker can’t use the Firebug console interface, as Firebug communicates with JavaScript by way of the DOM. Finally, JavaScript debuggers cannot access workers, so there is no way to step through code or do any of the other things that would normally be done in the debugger.
The types of applications traditionally run on the Web, and the limitations of the web browser environment, limited the computational needs that would call for a Web Worker. Until recently, most web applications manipulated small amounts of data consisting mostly of text and numbers. In these cases, a Web Worker type of construct is of limited use. Now JavaScript is asked to do a lot more, and many common situations can benefit from spawning new tasks.
The HTML5 <svg> and <canvas> tags allow JavaScript to manipulate images, potentially a computationally heavy task. Although web browsers have been able to display images since the release of the Mosaic browser around 1993, the browsers couldn’t manipulate those images. If a web programmer wanted to distort an image, overlay it transparently, and so forth, it could not be done in the browser. With the <img> tag, all the browser could do was substitute a different image by changing the src attribute, or change the displayed size of the image. The browser had no way of knowing what the image was or accessing the raw data that made up the image.
The recently added <canvas> tag makes it possible to import an existing image into a canvas and export the raw data back into JavaScript for processing, as long as the image was loaded from the same server as the page it is on. It is also possible to export a frame from a video in the HTML5 <video> tag.[1]
Once the data has been extracted from a graphic, you can pass it to a worker for post-processing. This could be useful for doing anything from cleaning up an image to doing a Fourier transform on a scientific data set. Canvas makes it possible to build complex image editing through various filters written in JavaScript, which should often use Web Workers for better performance.
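As a rough illustration of the kind of post-processing a worker could run, the sketch below converts raw pixel data to grayscale. It assumes the data is a flat array of RGBA bytes, which is the layout of the data property on the ImageData object returned by a canvas context's getImageData(). The function name toGrayscale is hypothetical, and the weights are the commonly used BT.601 luma coefficients rather than anything prescribed by this chapter.

```javascript
// Convert RGBA pixel data (4 entries per pixel) to grayscale in place.
// The alpha channel at offset 3 of each pixel is left untouched.
function toGrayscale(pixels) {
    var i, luma;
    for (i = 0; i + 3 < pixels.length; i += 4) {
        luma = Math.round(0.299 * pixels[i] +
                          0.587 * pixels[i + 1] +
                          0.114 * pixels[i + 2]);
        pixels[i] = luma;       // red
        pixels[i + 1] = luma;   // green
        pixels[i + 2] = luma;   // blue
    }
    return pixels;
}
```

Because the function touches only the array it is given, it needs neither the DOM nor the canvas itself, which makes it a natural candidate for running inside a worker.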
In addition to graphics, JavaScript has APIs now for handling map data. Being able to import a map from the Internet and find out the user’s current location via geolocation allows a wide range of web application services.
Suppose you build a route finder into a mobile browser. It would be very nice to be able to take your phone and tell it you wish to go to “#14 King George St, Tel Aviv” and have the browser figure out where you are, direct you to the nearest bus stop, and tell you that you should take the number 82 bus to get there from the Diamond District in Ramat Gan.
An even more complex version of that software might check traffic to tell you that a different bus might take a more roundabout route and leave you a block from your destination, but probably run faster by missing a major traffic snarl.
To start up a Web Worker, create a new Worker object and pass, as the parameter to the constructor, the file that contains the code (see Example 8-1). This will create a worker from the source file.
Example 8-1. Worker example
$(document).ready(function () {
    // Create the worker from its source file.
    var worker = new Worker('worker.js');
    // Log every message the worker sends back.
    worker.onmessage = function (event) {
        console.info(event);
    };
    worker.postMessage("World");
});
The browser will load the worker, run any code that is not in an event handler, and then launch the event loop to wait for events. The main event to be concerned with is the message event, which is how you send data to the worker. The main thread sends the message by calling postMessage() and passing the data as the argument.
The data from the main thread is held in the event.data field. The worker retrieves this data in its onmessage handler, which receives the event as its argument.
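Example 8-1 shows only the main-thread half of the exchange. A minimal sketch of what the matching worker.js might contain follows; the handleMessage name and the reply text are assumptions for illustration, since the original worker file is not shown.

```javascript
// worker.js (hypothetical contents to pair with Example 8-1)
// Build the reply from the data carried in event.data.
function handleMessage(event, reply) {
    reply("Hello " + event.data);
}

// importScripts() exists only inside a worker, so this wiring is
// skipped when the file is loaded anywhere else (for instance in a test).
if (typeof importScripts === "function") {
    self.onmessage = function (event) {
        handleMessage(event, function (text) {
            self.postMessage(text);
        });
    };
}
```

Keeping the logic in a plain function and the worker wiring in a guarded block is one way to exercise worker code outside a worker.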
Web Workers run in a pretty minimal environment. Many of the familiar objects and interfaces of JavaScript in the browser are missing, including the DOM, the document object, and the window object.
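In addition to the standard ECMAScript objects like String, Array, and Date, the objects and interfaces available to the Web Worker include the navigator object, a read-only location object, XMLHttpRequest, and the timer functions setTimeout() and setInterval().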
ECMAScript 5 JSON interfaces can also be used, as they are part of the language, not the browser environment. Furthermore, the worker can import library scripts from the server with the importScripts() method. This method takes a list of one or more files, which are then loaded. This has the same effect as using a <script> tag in the main user interface thread. Unlike most JavaScript methods that load resources, importScripts() is blocking: it will not return until all the listed scripts have been loaded. importScripts() executes the loaded files in the order in which they were specified.
Although localStorage and sessionStorage are not accessible from the Web Worker, IndexedDB databases are (see Chapter 5). In addition, the IndexedDB specification says that the blocking forms of calls can be used in a Web Worker (but not in the main window). So if you want a worker to manipulate data through IndexedDB, it would make sense to load the new data into the database and then send an “updated” message to the main window or other workers so that they can take any needed actions.
The main event that concerns a worker is the message event, which is delivered to the worker when the main JavaScript context calls the postMessage method to pass information. In Firefox, it is possible to pass complex JavaScript objects. However, some versions of Chrome and Safari support only simple data, such as strings, Booleans, and numbers. It is good practice to encode all data into JSON before sending it to a Web Worker.
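One way to follow this practice is to wrap every message in a small JSON envelope. The sketch below is an assumption about how such helpers might look; encodeMessage and decodeMessage are hypothetical names, and the type/data envelope simply mirrors the shape used by the Mandelbrot examples later in this chapter.

```javascript
// Wrap a payload in a { type, data } envelope and serialize it to a string,
// so it can cross postMessage() even where only simple values are supported.
function encodeMessage(type, payload) {
    return JSON.stringify({ type: type, data: payload });
}

// Accept either a JSON string or an already-structured object, so the same
// receiver works whether or not the sender encoded the message.
function decodeMessage(raw) {
    return typeof raw === "string" ? JSON.parse(raw) : raw;
}
```

A sender would call worker.postMessage(encodeMessage('draw', points)), and the receiver would call decodeMessage(event.data) at the top of its onmessage handler.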
The worker can send data back to the main thread via the same postMessage method, and the main thread receives it via the worker.onmessage handler.
The model for worker communication is that the main task creates the worker, after which they pass messages back and forth as shown in Figure 8-2.
Example 8-1 is the “Hello World” of Web Workers. A more complex example is called for. Figure 8-3 shows a visual representation of a Mandelbrot set computed in a Web Worker. Here the worker and the main thread split up the work to draw the fractal. The worker does the actual work of computing the Mandelbrot set, while the frontend script takes that raw data and displays it in the canvas.
The frontend script (see Example 8-2) sets up the canvas element and scales it to fit in the page. Then it creates an object to wrap the worker interface. The wrapper object creates the worker in the wrapper’s run() method, passing to the worker a parameter block that tells it what chunk of the Mandelbrot set to compute.
The draw method takes the data, scales it to fit onto the canvas, sets a color, and then draws the pixel.
The HTML Canvas does not have a “draw pixel” command, so to draw a pixel we must draw a square of size 1 and offset it by half a pixel from the spot where we want it to show up. So to draw a pixel at (20,20) the square should extend from (19.5,19.5) to (20.5,20.5). The locations on the canvas grid are not the pixels on the screen but the points between them.
The onmessage handler then waits for events to be sent from the worker. If the event type is draw, the handler calls the method to draw the new data into the canvas. If the event type is log, it is logged to the JavaScript console via console.info(). This provides a very simple method to log status information from a worker.
The startWorker method aliases this to a local variable named that. This is because this is not lexically scoped like other JavaScript variables. To allow the inner function to have access to the wrapper object, which it will need to draw a pixel, it is necessary to alias it to a lexically scoped variable. By convention that variable is often called that.
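The aliasing idiom can be seen in isolation in the following sketch. The counter object and its method name are hypothetical and exist only to show why the alias is needed.

```javascript
var counter = {
    count: 0,
    incrementLater: function (done) {
        var that = this;    // capture the object in a lexically scoped variable
        setTimeout(function () {
            // Inside this callback `this` no longer refers to counter,
            // but `that` still does.
            that.count += 1;
            done(that.count);
        }, 0);
    }
};
```

The worker's onmessage handler in the frontend script is in the same position as the timeout callback here: it is invoked by the browser, not as a method of the wrapper object.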
Example 8-2. Mandelbrot frontend
var drawMandelSet = function drawMandelSet() {
    var mandelPanel = $('body');
    var width = mandelPanel.innerWidth();
    var height = mandelPanel.innerHeight();
    var range = [{
        x: -2,
        y: -1.4
    }, {
        x: 5,
        y: 1.4
    }];
    $('canvas#fractal').height(height + 100);
    $('canvas#fractal').width(width - 50);
    var left = 0;
    var top = 0;
    var canvas = $("canvas#fractal")[0];
    var ctx = canvas.getContext("2d");
    // Parameter block that tells the worker which region to compute.
    var params = {
        range: range,
        startx: 0.0,
        starty: 0.0,
        width: width,
        height: height
    };
    var y_array = [];
    var worker = {
        params: params,
        // Draw a batch of computed points onto the canvas.
        draw: function draw(data) {
            data.forEach(function d(point) {
                if (this.axis.x[point.drawLoc.x] === undefined) {
                    this.axis.x[point.drawLoc.x] = point.point.x;
                }
                if (this.axis.y[height - point.drawLoc.y] === undefined) {
                    this.axis.y[height - point.drawLoc.y] = point.point.y;
                }
                ctx.fillStyle = pickColor(point.escapeValue);
                ctx.fillRect(point.drawLoc.x + 0.5, height - point.drawLoc.y + 0.5, 1, 1);
            }, this);
        },
        axis: {
            x: [],
            y: [],
            find: function (x, y) {
                return new Complex(this.x[x], this.y[y]);
            },
            reset: function () {
                this.x = [];
                this.y = [];
            }
        },
        myWorker: false,
        run: function startWorker(params) {
            this.myWorker = new Worker("js/worker.js");
            var that = this;
            this.myWorker.postMessage(JSON.stringify(params));
            this.myWorker.onmessage = function (event) {
                var data = JSON.parse(event.data);
                if (data.type === 'draw') {
                    that.draw(JSON.parse(data.data));
                } else if (data.type === 'log') {
                    // Test the decoded message, not the raw JSON string.
                    console.info(event);
                }
            };
        }
    };
    worker.run(params);
    return worker;
};
$(document).ready(drawMandelSet);
Function.prototype.createDelegate = function createDelegate(scope) {
    var fn = this;
    return function () {
        // apply() spreads the arguments; call() would pass them as one object.
        return fn.apply(scope, arguments);
    };
};
function pickColor(escapeValue) {
    if (escapeValue === Complex.prototype.max_iteration) {
        return "black";
    }
    var tone = 255 - escapeValue * 10;
    var colorCss = "rgb({r},{g},{b})".populate({
        r: tone,
        g: tone,
        b: tone
    });
    return colorCss;
}
String.prototype.populate = function populate(params) {
    // Replace each {name} placeholder with the matching entry in params.
    var str = this.replace(/\{\w+\}/g, function stringFormatInner(word) {
        return params[word.substr(1, word.length - 2)];
    });
    return str;
};
The actual worker (see Example 8-3) is very simple. It just loads up a few other files and then waits for a message to be sent from the user interface. When it gets one, it starts the computation.
Example 8-3. Mandelbrot startup
importScripts('function.js', 'json2.js', 'complex.js', 'computeMandelbrot.js', 'buildMaster.js');
onmessage = function (event) {
    // Accept either a JSON string or an already-decoded object.
    var data = typeof event.data === 'string' ? JSON.parse(event.data) : event.data;
    buildMaster(data);
};
The buildMaster() function (see Example 8-4) loops over the grid of points for the Mandelbrot set, computing the escape value for each point (see Example 8-5). After every 200 points, the build function sends the results of its computation back to the main thread for drawing, and then zeros out its internal buffer of computed points. This way, instead of waiting for the entire grid to be drawn at once, the user sees the image build progressively.
Example 8-4. Mandelbrot build
var chunkSize = 200;
function buildMaster(data) {
    var range = data.range;
    var width = data.width;
    var height = data.height;
    var startx = data.startx;
    var starty = data.starty;
    var dx = (range[1].x - range[0].x) / width;
    var dy = (range[1].y - range[0].y) / height;
    function send(line) {
        var lineData = JSON.stringify(line.map(function makeReturnData(point) {
            return {
                drawLoc: point.drawLoc,
                point: point.point,
                escapeValue: point.point.mandelbrot()
            };
        }));
        var json = JSON.stringify({
            type: 'draw',
            data: lineData
        });
        postMessage(json);
    }
    function xIter(x, maxX, drawX) {
        var line = [];
        var drawY = starty;
        var y = range[0].y;
        var maxY = range[1].y;
        while (y < maxY) {
            if (line.length % chunkSize === chunkSize - 1) {
                send(line);
                line = [];
            }
            var pt = {
                point: new Complex(x, y),
                drawLoc: {
                    x: drawX,
                    y: drawY
                }
            };
            line.push(pt);
            y += dy;
            drawY += 1;
        }
        send(line);
        if (x < maxX && drawX < width) {
            xIter.defer(1, this, [x + dx, maxX, drawX + 1]);
        }
    }
    xIter(range[0].x, range[1].x, startx);
}
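The buffering pattern that buildMaster() relies on (collect results, send them once the buffer is full, then start a fresh buffer) can be isolated into a small helper. The sketch below is an illustration under assumed names; makeBatcher is not part of the example application, and the send callback stands in for the worker's postMessage() call.

```javascript
// Collect items and hand them to `send` in batches of `batchSize`.
// Call flush() at the end to deliver any partial final batch.
function makeBatcher(batchSize, send) {
    var buffer = [];
    function flush() {
        if (buffer.length > 0) {
            send(buffer);
            buffer = [];    // start a fresh buffer for the next batch
        }
    }
    function add(item) {
        buffer.push(item);
        if (buffer.length >= batchSize) {
            flush();
        }
    }
    return { add: add, flush: flush };
}
```

A larger batch size means fewer messages but a choppier progressive display; a smaller one means smoother drawing at the cost of more message traffic.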
The final part of this application is the actual mathematical computation of the Mandelbrot set shown in Example 8-5. This function is written as a while loop instead of a recursive function in the functional programming style, because JavaScript engines generally do not optimize tail calls. Doing this as a recursive function would be more elegant, but would risk causing a stack overflow.
Example 8-5. Mandelbrot computation
Complex.prototype.max_iteration = 255 * 2;
Complex.prototype.mandelbrot = function () {
    var x0 = this.x;
    var y0 = this.y;
    var x = x0;
    var y = y0;
    var count;
    var x_, y_;
    var max_iteration = this.max_iteration;
    function inSet(x, y) {
        return x * x + y * y < 4;
    }
    count = 0;
    while (count < max_iteration && inSet(x, y)) {
        x_ = x * x - y * y + x0;
        y_ = 2 * x * y + y0;
        count += 1;
        x = x_;
        y = y_;
    }
    return count;
};
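The loop body in Example 8-5 is the standard Mandelbrot iteration written out in real and imaginary parts. Writing the current value as z = x + iy and the point being tested as c = x_0 + iy_0, the update and the escape test used by inSet() work out as follows. Note that the code starts the iteration at z = c rather than at zero, which is equivalent to skipping the first step.

```latex
\begin{aligned}
z_{n+1} &= z_n^2 + c \\
(x + iy)^2 + (x_0 + i y_0) &= (x^2 - y^2 + x_0) + i\,(2xy + y_0) \\
|z|^2 = x^2 + y^2 &< 4 \quad \text{(equivalently } |z| < 2\text{)}
\end{aligned}
```

The real part of the second line is the expression assigned to x_ and the imaginary part is the expression assigned to y_.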
While the worker is doing the calculation of the Mandelbrot set, its event loop is blocked. So it is not possible for the UI process to send it a new computation task or, more correctly stated, the worker will not accept the new task until the current task is finished.
To interrupt or change a worker’s behavior—for instance, to let the user in the user interface thread select which area of the Mandelbrot set to draw and then request that the worker draw that area—you have a choice among a few methods.
The simplest method would be to kill the worker and create a new one. This has the advantage that the new worker starts off on a clean state and there can be nothing left over from the prior runs. On the other hand, it also means the worker has to load all the scripts and data from scratch. So if the worker has a long startup time, this is probably not the best approach.
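A minimal sketch of this kill-and-recreate approach is shown here. It relies on the worker's terminate() method, which stops a worker immediately; the restartWorker name and the createWorker factory argument are assumptions made for illustration.

```javascript
// Stop the current worker (if any) and return a fresh one.
// `createWorker` is a caller-supplied factory, for example:
//   function () { return new Worker("js/worker.js"); }
function restartWorker(oldWorker, createWorker) {
    if (oldWorker) {
        oldWorker.terminate();    // the old worker stops without finishing its task
    }
    return createWorker();
}
```

Passing the factory in as an argument keeps the helper independent of any particular script path and makes the startup cost of each new worker easy to see at the call site.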
The second method is a little more complex: manage the task queue manually through your program. Have a data structure in the main thread or a worker that keeps a list of blocks of data to compute. When a worker needs a task, it can send a message to that queue object and have a task sent to it. This creates more complexity but has several advantages. First, the worker does not need to be restarted when the application needs it to do something different. Second, it allows the use of multiple workers. Each worker can query the queue manager when it needs the next part of the problem.
You could also have the master task send a large number of events to the worker in sequence. However, this has the problem that there is no way from JavaScript to clear the event queue. So having a job queue that can be managed seems to be the best approach. We’ll explore this solution in the following section.
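To make the managed-queue idea concrete, here is a small sketch of a queue that lives in the main thread and hands out one job at a time. Everything in it is hypothetical: the makeJobQueue name, the 'job' message type, and the convention that a worker asks for more work are assumptions for illustration only.

```javascript
// A job queue held in the main thread. Unlike the browser's event queue,
// it can be inspected, extended, and cleared from JavaScript.
function makeJobQueue(jobs) {
    var pending = jobs.slice();    // copy so the caller's array is untouched
    return {
        // Send the next job to `worker`; returns false when nothing is left.
        dispatch: function (worker) {
            if (pending.length === 0) {
                return false;
            }
            worker.postMessage(JSON.stringify({ type: 'job', data: pending.shift() }));
            return true;
        },
        add: function (job) {
            pending.push(job);
        },
        clear: function () {
            pending = [];
        },
        size: function () {
            return pending.length;
        }
    };
}
```

In use, each worker's onmessage handler in the main thread would call dispatch(worker) whenever that worker reports it has finished its current job, and clear() would discard outstanding jobs when the user selects a different area to draw.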
There is no requirement that an application restrict itself to one Web Worker. JavaScript is quite happy to let you start up a reasonable number of workers. Of course, this makes sense only if the problem can be easily partitioned into several workers, but many problems can be divided that way.
Each worker is an independent construction, so it is possible to create several workers from the same source code, or to create several workers that work independently.
Workers are a fairly heavy construct in JavaScript, so it is probably a bad idea to create more than, say, 10 workers on a given task. However, the optimal number is probably dependent on the user’s browser and hardware as well as the task to be performed.
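One simple way to partition a drawing problem among several workers is to give each one a vertical strip of columns. The helper below is a sketch under assumptions: splitColumns is a hypothetical name, and the startx and width fields only echo the parameter block used earlier. A real partition of the Mandelbrot example would also need to give each strip its own slice of the coordinate range.

```javascript
// Divide `width` columns as evenly as possible among `workerCount` strips.
// Returns an array of { startx, width } objects; empty strips are omitted.
function splitColumns(width, workerCount) {
    var strips = [];
    var base = Math.floor(width / workerCount);
    var extra = width % workerCount;    // the first `extra` strips get one more column
    var start = 0;
    var i, size;
    for (i = 0; i < workerCount; i += 1) {
        size = base + (i < extra ? 1 : 0);
        if (size > 0) {
            strips.push({ startx: start, width: size });
        }
        start += size;
    }
    return strips;
}
```

Each strip would then be sent to its own worker, so the number of strips is the number of workers started for the task.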
Over the past 10 years, the tools for JavaScript debugging have gotten quite good. Firebug and Chrome Developer Tools are both first-rate debugging tools that can be used for testing JavaScript applications. Unfortunately, neither one can access code running in a Web Worker, so you can’t set breakpoints or step through your code in a worker. Nor do workers show up in the list of loaded scripts that appears in the respective script tabs of Firebug and Chrome. Nor can Selenium or QUnit directly test code running in a Web Worker.
Errors in a worker are reported back to the console in Firefox and Chrome. Of course, in many cases, knowing the line and file where the error occurred does not help all that much, as the actual bug was somewhere else.
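The same information can be captured in the main thread through the worker's error event, whose event object carries message, filename, and lineno properties. The formatting helper below is a hypothetical sketch of how those fields might be combined into a single log line.

```javascript
// Turn a worker error event into a single readable line.
function describeWorkerError(event) {
    return event.filename + ":" + event.lineno + " " + event.message;
}

// Assumed wiring in the main thread (not executed here):
//   worker.onerror = function (event) {
//       console.error(describeWorkerError(event));
//   };
```

Routing errors through one function like this also gives a convenient place to add extra context, such as which task the worker was processing when it failed.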
Chrome does provide the programmer a method for debugging Web Workers. The Chrome Developer Tools script panel contains a Web Workers checkbox. This option causes Chrome to simulate a worker using an iframe.
Being able to use Web Workers to pull complex functions out of the user’s browser task offers great power for the programmer. Firefox has supported Web Workers since version 3.5 and Chrome has supported them since version 4. Safari and Opera have also supported them for some time. However, as of this writing, Microsoft Internet Explorer does not support Web Workers (though support may appear in IE version 10), nor does Safari on iOS, so it is not possible to use Web Workers on the iPad/iPod/iPhone platform.
What would be ideal is a library that would enable a programmer to abstract out the code to be run into a function or module and a runner that would use the best available mechanism to run that code in the background: via a Web Worker if available, and otherwise via a setTimeout method. Furthermore, the library would provide a common set of interfaces that could be used for the various interactions, such as posting a message back to the main application.
Such a library should always use feature detection rather than browser detection to figure out which version of the code to run. While a given browser may or may not support Web Workers right now, in the future that will change and a library needs to be able to work with those changes.
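Feature detection for Web Workers can be as simple as checking whether the Worker constructor exists. In the sketch below, the function names and the returned labels are assumptions for illustration; the global object is passed in as a parameter so the check can be tried against any environment.

```javascript
// True when the given global object exposes a Worker constructor.
function supportsWorkers(globalObject) {
    return typeof globalObject.Worker !== "undefined";
}

// Pick a strategy label based on what the environment offers.
function chooseRunner(globalObject) {
    return supportsWorkers(globalObject) ? "worker" : "timeout";
}
```

In a browser the call would be chooseRunner(window). No browser names or version numbers appear anywhere, so the check keeps working as browsers add support.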
The actual function to do the work in this pattern will be called repeatedly with the run state as a parameter. It should do whatever processing it needs to do and return a modified state parameter that will be used to call it again until it finishes its job and calls the stop() method, or is otherwise interrupted. The run function (see Example 8-6) should be treated as a pure function; it should just process its inputs and return a value, but not effect any change in global state, because a different set of interfaces will be available to it depending on whether it is running as a Web Worker or not.
Example 8-6. Run
(function () {
    runner.setup(function (state) {
        this.postMessage({state: state});
        return {
            time: state.time += 1
        };
    }, {
        time: 0
    });
}());
When running in a Web Worker (see Example 8-7), the run function can be run from inside a standard loop. The system is set up via a postMessage call with some initial parameters that are passed as the initial state to the run method. That method will be repeatedly called by the while loop until it calls the stop function, at which point the state will be posted back to the main thread.
Example 8-7. Running a function with a Web Worker
var runner = {
    stopFlag: false,
    postMessage: function (message) {
        self.postMessage(message);
    },
    stop: function () {
        this.stopFlag = true;
    },
    error: function (error) {
        this.stopFlag = true;
    },
    setup: function (run) {
        this.run = run;
        var that = this;
        self.onmessage = function message(event) {
            that.execute(JSON.parse(event.data));
        };
    },
    execute: function (state) {
        var that = this;
        // Keep the initial state sent by the main thread. Later iterations
        // call execute() with no argument and reuse the current state.
        if (state !== undefined) {
            this.state = state;
        }
        setTimeout(function runIterator() {
            that.state = that.run.apply(that, [that.state]);
            if (that.stopFlag) {
                that.postMessage(that.state);
            } else {
                that.execute();
            }
        }, 16);
    }
};
(function () {
    runner.setup(function (state) {
        var newstate = state;
        // modify newstate here
        return newstate;
    }, {
        time: 0
    });
}());
If Web Workers are not available in the browser, the method should be run through a short, repeating timeout instead of in a while loop (see Example 8-8). A while loop would block the message queue, so the main thread could not send messages to the worker. Using a timeout frees up the main thread—the whole goal of this library—and also lets a message change the state of the run function as needed.
Once again, the runner calls the run function with a state parameter, and the function returns the updated state. However, because this is not a Web Worker, the runner will then call the window.setTimeout() method to delay the next iteration by some amount of time and call the function again.
Example 8-8. Running a function without a Web Worker
var runner = {
    stopFlag: false,
    // override this function
    onmessage: function (msg) {
        if (msg.state) {
            var state = msg.state;
            $('#status').html("time: " + state.time);
        }
        if (msg.set) {
            this.state = msg.set;
        }
        return this.state;
    },
    postMessage: function (message) {
        this.onmessage(message);
    },
    stop: function () {
        this.stopFlag = true;
    },
    error: function (error) {
        this.stopFlag = true;
    },
    setup: function (run, state) {
        this.run = run;
        this.state = state;
        this.execute();
    },
    execute: function () {
        var that = this;
        setTimeout(function runIterator() {
            that.state = that.run.apply(that, [that.state]);
            if (that.stopFlag) {
                that.postMessage(that.state);
            } else {
                that.execute();
            }
        }, 250);
    }
};
Communications between the simulated Web Worker and the main body of the code are also somewhat different. Because there is no postMessage() method with a callback, the runner must simulate it by presenting a mechanism to register a callback that can take the same parameters as the Web Worker’s onmessage() handler.
This concept of how to make code portable between a Web Worker and regular JavaScript is presented as a model and not a full solution. It is missing some features, such as loading code. It is also missing a way to call an asynchronous method such as an Ajax call, and resume processing when done. This would be necessary because, although in general Web Workers are designed for processor-intensive work, there will be times when access to an Ajax call or IndexedDB makes sense.
When programming JavaScript in the main thread, programmers use a library such as jQuery to improve the API and to hide differences between browsers. For use with Web Workers, there is a jQuery extension called jQuery Hive that provides much of this functionality. Hive includes the PollenJS library in the main JavaScript thread. The library includes interfaces to create workers.
Hive will also encode and decode messages between the main thread and worker if needed. In some browsers (notably Firefox), complex data can be sent over the postMessage() interface. However, in some versions of Chrome and Safari, postMessage() will handle only a string or other simple data.
Hive also includes a subset of the jQuery API in the worker itself. The most important methods in the Hive API are $.get() and $.post(), which mirror the APIs in jQuery. If a worker needs to access the server via Ajax, for instance, using Hive will make your life much easier.
Hive also includes access to a persistent storage interface via $.storage. To set a value, use $.storage(name, value). Calling $.storage(name) without the second value parameter will return the existing value, if set.
Also included in Hive are $.decode() and $.encode(), which can be used to decode or encode JSON messages.
[1] See HTML5 Canvas by Steve Fulton and Jeff Fulton (O’Reilly) for more information on the graphics in HTML5.