Every async function in JavaScript is built on some other async function(s). It’s async functions all the way down (to native code)!
The converse is also true: any function that uses an async function has to provide the result of that operation in an async way. As we learned from Blocking the Thread, JavaScript doesn’t provide a mechanism for preventing a function from returning until an async operation has finished. In fact, until the function returns, no async events will fire.
In this section, we’ll look at some common patterns in async function design. We’ll see that functions can be mercurial, deciding to be async only some of the time. But first, let’s define exactly what an async function is.
The term async function is a bit of a misnomer: if you call a function, your program simply won’t continue until that function returns. What JavaScripters mean when they call a function “async” is that it can cause another function (called a callback when it’s passed as an argument to the function) to run later, from the event queue. So, an async function that takes a callback will never fail this test:
| var functionHasReturned = false; |
| asyncFunction(function() { |
|   console.assert(functionHasReturned); |
| }); |
| functionHasReturned = true; |
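To run this test ourselves, we need something concrete to play the part of asyncFunction. Here is a minimal sketch, assuming a hypothetical asyncFunction that does nothing but defer its callback with setTimeout. The callbackSawReturn variable is also an addition, there only to record what the callback observed.

```javascript
// Hypothetical stand-in for asyncFunction: it defers its callback
// with setTimeout, so the callback runs later, from the event queue.
function asyncFunction(callback) {
  setTimeout(callback, 0);
}

var functionHasReturned = false;
var callbackSawReturn = null; // added to record what the callback observed

asyncFunction(function() {
  // By the time this runs, the code below has already finished.
  callbackSawReturn = functionHasReturned;
  console.assert(functionHasReturned);
});
functionHasReturned = true;
```

If we swapped in a version of asyncFunction that called callback() directly, the assertion would fail, because the callback would run before the last line.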
Another term for async functions is nonblocking. The term emphasizes how speedy they are: a query made with an async MySQL driver may take an hour, but the function that sent the query will return in a matter of microseconds—a boon to web servers that need to quickly process a high volume of incoming requests.
Typically, functions that take a callback take it as their last argument. (Regrettably, the venerable setTimeout and setInterval are exceptions to this convention.) But some async functions take callbacks indirectly, by returning a Promise or using PubSub. We’ll learn about those patterns later in the book.
Unfortunately, the only way to be sure whether a function is async or not is to inspect its source code. Some functions that are synchronous have an API that looks async, either because they might become async in the future or because callbacks provide a convenient way to return multiple arguments. When in doubt, don’t depend on a function being async.
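A familiar case of a callback-taking function that is fully synchronous is Array.prototype.forEach. The short sketch below records the order of events in an array (the events variable is ours, purely for illustration) to show that every callback call happens before forEach returns.

```javascript
var events = [];

// forEach takes a callback, but it is synchronous: each callback
// invocation happens before forEach itself returns.
[1, 2].forEach(function(n) {
  events.push('callback ' + n);
});
events.push('forEach returned');

console.log(events.join(', '));
// prints "callback 1, callback 2, forEach returned"
```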
There are functions that are async sometimes but not at other times. For instance, jQuery’s eponymous function (typically aliased as $) can be used to delay a function until the DOM has finished loading. But if the DOM has already finished loading, there’s no delay; its callback fires immediately.
This unpredictable behavior can get you in a lot of trouble if you aren’t careful. One mistake I’ve seen (and made myself) is assuming that $ will run a function after other scripts on the page have loaded.
| // application.js |
| $(function() { |
|   utils.log('Ready'); |
| }); |
| |
| // utils.js |
| window.utils = { |
|   log: function() { |
|     if (window.console) console.log.apply(console, arguments); |
|   } |
| }; |
| |
| <!-- the page loads application.js first --> |
| <script src="application.js"></script> |
| <script src="utils.js"></script> |
This code works fine—unless the browser loads the page from the cache, making the DOM ready before the script runs. When that happens, the callback passed to $ runs before utils.log is set, causing an error. (We could avoid this situation by taking a more modern approach to client-side dependency management. See Chapter 6, Async Script Loading.)
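We can reproduce the sometimes-async behavior without jQuery or a DOM. The sketch below is not jQuery's implementation; whenReady, markReady, and the log array are hypothetical names used only to imitate a function that queues its callback before the "ready" moment and runs it immediately afterward.

```javascript
// Hypothetical stand-in for $(callback); not jQuery's implementation.
var isReady = false;
var readyCallbacks = [];

function whenReady(callback) {
  if (isReady) {
    callback();                    // already ready: runs synchronously
  } else {
    readyCallbacks.push(callback); // not ready yet: runs later
  }
}

// Stands in for the moment the DOM finishes loading.
function markReady() {
  isReady = true;
  readyCallbacks.forEach(function(callback) { callback(); });
  readyCallbacks = [];
}

var log = [];

whenReady(function() { log.push('first callback'); });
log.push('after first call');  // the callback has not run yet

markReady();

whenReady(function() { log.push('second callback'); });
log.push('after second call'); // the callback has already run

console.log(log.join(', '));
// prints "after first call, first callback, second callback, after second call"
```

The same call produces two different orderings, which is exactly what bites application.js when the DOM is ready early.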
Let’s look at another example.
A common variety of sometimes-async functions is async request functions that cache their results. For example, suppose we’re writing a browser-based calculator that uses web workers to run calculations in a separate thread. (We’ll learn about the Web Worker API in Chapter 5, Multithreading with Workers.) Our main script might look like this:[19]
| var calculationCache = {}, |
|     calculationCallbacks = {}, |
|     mathWorker = new Worker('calculator.js'); |
| |
| mathWorker.addEventListener('message', function(e) { |
|   var message = e.data; |
|   calculationCache[message.formula] = message.result; |
|   calculationCallbacks[message.formula](message.result); |
| }); |
| |
| function runCalculation(formula, callback) { |
|   if (formula in calculationCache) { |
|     return callback(calculationCache[formula]); |
|   } |
|   if (formula in calculationCallbacks) { |
|     return setTimeout(function() { |
|       runCalculation(formula, callback); |
|     }, 0); |
|   } |
|   mathWorker.postMessage(formula); |
|   calculationCallbacks[formula] = callback; |
| } |
Here, the runCalculation function is synchronous when the result has already been cached but is asynchronous otherwise. There are three possible scenarios.
1. The formula has already been computed, so the result is in the calculationCache. In this case, runCalculation is synchronous.
2. The formula has been sent to the worker, but the result hasn’t been received yet. In this case, runCalculation sets a timeout to call itself again; the process will repeat until the result is in calculationCache.
3. The formula hasn’t yet been sent to the worker. In this case, we’ll invoke the callback from the worker’s 'message' event listener.
Notice that in scenarios 2 and 3, we’re waiting for a task to be completed in two different ways. I wrote the example this way to illustrate common approaches when we’re waiting for something to change, like the value of the cached computation. Should we prefer one approach over the other? Let’s look at that next.
In runCalculation, we waited for the worker to finish its job by either repeating the same function call from a timeout (async recursion) or simply storing a callback.
Which approach is best? At first glance, it might seem easiest to use only async recursion, eliminating the need for the calculationCallbacks object. Newcomers to JavaScript often use setTimeout for this purpose because it resembles a common idiom of thread-based languages. A Java version of this program would probably have a loop like this:
| while (!calculationCache.get(formula)) { |
|   Thread.sleep(0); |
| } |
But timeouts aren’t free. In large numbers, they can create a significant computational load. The scary thing about async recursion is that there’s no limit to the number of timeouts that could be firing while we wait for the job to finish. Plus, it makes our application’s event structure unnecessarily complicated. For these reasons, async recursion should be regarded as an anti-pattern.
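To get a feel for the cost, we can count how many times a polling function wakes up while it waits. This is a rough sketch rather than a benchmark: pollForResult, pollCount, and polledResult are hypothetical names, and a plain 50-millisecond timeout stands in for the worker. The exact count depends on the environment's timer resolution, so none is shown.

```javascript
var cache = {};
var pollCount = 0;
var polledResult;

// Async recursion: keep rescheduling ourselves until the value shows up.
function pollForResult(key, callback) {
  pollCount++;
  if (key in cache) {
    return callback(cache[key]);
  }
  setTimeout(function() {
    pollForResult(key, callback);
  }, 0);
}

// Hypothetical slow job standing in for the worker: the result
// arrives after roughly 50 milliseconds.
setTimeout(function() {
  cache['1+1'] = 2;
}, 50);

pollForResult('1+1', function(result) {
  polledResult = result;
  console.log('result', result, 'after', pollCount, 'checks');
});
```

Every one of those checks is a timer firing that accomplished nothing, and each additional caller waiting on the same formula adds its own chain of timers.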
We can avoid async recursion in our calculator by storing an array of callbacks for each formula.
| var calculationCache = {}, |
|     calculationCallbacks = {}, |
|     mathWorker = new Worker('calculator.js'); |
| |
| mathWorker.addEventListener('message', function(e) { |
|   var message = e.data; |
|   calculationCache[message.formula] = message.result; |
|   calculationCallbacks[message.formula] |
|     .forEach(function(callback) { |
|       callback(message.result); |
|     }); |
| }); |
| |
| function runCalculation(formula, callback) { |
|   if (formula in calculationCache) { |
|     return callback(calculationCache[formula]); |
|   } |
|   if (formula in calculationCallbacks) { |
|     return calculationCallbacks[formula].push(callback); |
|   } |
|   mathWorker.postMessage(formula); |
|   calculationCallbacks[formula] = [callback]; |
| } |
Without the timeout, our code is much more straightforward, as well as more efficient.
In general, avoid async recursion. It’s necessary only when you’re dealing with a library that provides async functionality without any kind of callback mechanism. And if you’re ever in that situation, the first thing you should do is write a patch for that library. Or find a better one.
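To experiment with the callback-array approach outside a browser, we can replace the worker with a stand-in. The sketch below assumes a hypothetical fakeWorker object that "computes" two-number sums and replies from a timeout; it is not the Web Worker API, and postedFormulas and results exist only so we can inspect what happened.

```javascript
// Hypothetical stand-in for the worker: it handles only formulas of
// the form 'a+b' and replies asynchronously from a timeout.
var postedFormulas = [];
var fakeWorker = {
  onmessage: null,
  postMessage: function(formula) {
    postedFormulas.push(formula);
    var parts = formula.split('+');
    var result = Number(parts[0]) + Number(parts[1]);
    setTimeout(function() {
      fakeWorker.onmessage({data: {formula: formula, result: result}});
    }, 10);
  }
};

var calculationCache = {},
    calculationCallbacks = {};

fakeWorker.onmessage = function(e) {
  var message = e.data;
  calculationCache[message.formula] = message.result;
  // Notify everyone who asked for this formula while it was pending.
  calculationCallbacks[message.formula].forEach(function(callback) {
    callback(message.result);
  });
};

function runCalculation(formula, callback) {
  if (formula in calculationCache) {
    return callback(calculationCache[formula]);
  }
  if (formula in calculationCallbacks) {
    return calculationCallbacks[formula].push(callback);
  }
  fakeWorker.postMessage(formula);
  calculationCallbacks[formula] = [callback];
}

var results = [];
runCalculation('2+3', function(result) { results.push(result); });
runCalculation('2+3', function(result) { results.push(result); });
```

Both callers get their answer from a single message to the worker, and no timers are spent waiting.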
In both of our implementations of runCalculation, we sometimes return a value. This was an arbitrary choice made for brevity. The line
| return callback(calculationCache[formula]); |
could easily have been written as
| callback(calculationCache[formula]); |
| return; |
because the return value isn’t intended to be used. This is a common idiom in JavaScript, and it’s usually harmless.
However, some functions both return a useful value and take a callback. In those cases, it’s important to remember that the callback will be called either synchronously (before the return) or asynchronously (after the return).
Never define a potentially synchronous function that returns a value that might be useful in the callback. For example, this function that opens a WebSocket[20] connection to a given server (caching to ensure only one connection per server) violates that rule:
| var webSocketCache = {}; |
| function openWebSocket(serverAddress, callback) { |
|   var socket; |
| |
|   if (serverAddress in webSocketCache) { |
|     socket = webSocketCache[serverAddress]; |
| |
|     if (socket.readyState === WebSocket.OPEN) { |
|       callback(); |
|     } else { |
|       socket.onopen = _.compose(callback, socket.onopen); |
|     } |
|   } else { |
|     socket = new WebSocket(serverAddress); |
|     webSocketCache[serverAddress] = socket; |
|     socket.onopen = callback; |
|   } |
|   return socket; |
| } |
(This code relies on the Underscore.js library. _.compose defines a new function that runs both callback and the original socket.onopen callback.[21])
The problem with this code is that if the socket is already cached and open, then the callback will run before the function returns, breaking this code:
| var socket = openWebSocket(url, function() { |
|   socket.send('Hello, server!'); |
| }); |
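We can watch this failure happen without a real server. In the sketch below, cachedConnection and openCached are hypothetical stand-ins for an already-open cached socket and for the synchronous branch of openWebSocket; errorName is there only to capture what goes wrong.

```javascript
// Hypothetical stand-in for a cached socket that is already open.
var cachedConnection = {
  sent: [],
  send: function(message) { this.sent.push(message); }
};

// Imitates the synchronous branch: the callback runs before the return.
function openCached(callback) {
  callback();
  return cachedConnection;
}

var errorName = null;
try {
  var connection = openCached(function() {
    // openCached has not returned yet, so connection is still undefined.
    connection.send('Hello, server!');
  });
} catch (e) {
  errorName = e.name;
}

console.log(errorName); // prints "TypeError"
```

The message is never sent, because the variable the callback depends on has not been assigned yet.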
The solution? Wrap the callback in a setTimeout.
| if (socket.readyState === WebSocket.OPEN) { |
|   setTimeout(callback, 0); |
| } else { |
|   // ... |
| } |
Using a timeout here may feel like a kludge, but it’s much better than having an inconsistent API.
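Here is the timeout fix in a form we can run without a real WebSocket. As before, this is only a sketch: cachedConnection and openCachedDeferred are hypothetical stand-ins for an open cached socket and for the corrected branch of openWebSocket.

```javascript
// Hypothetical stand-in for a cached socket that is already open.
var cachedConnection = {
  sent: [],
  send: function(message) { this.sent.push(message); }
};

// The callback is deferred, so it always runs after the return.
function openCachedDeferred(callback) {
  setTimeout(callback, 0);
  return cachedConnection;
}

var connection = openCachedDeferred(function() {
  // By now, connection has been assigned.
  connection.send('Hello, server!');
  console.log(cachedConnection.sent[0]); // prints "Hello, server!"
});
```

The caller no longer needs to know whether the connection was cached; the callback behaves the same way either way.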
In this section, we’ve seen several best practices for writing async functions. Don’t rely on a function always being async, unless you’ve read its source code. Avoid using timer methods to wait for something to change. When returning a value and running a callback from the same function, make sure the callback runs after the return.
This is a lot of information to take in at once, but writing good async functions is key to writing good JavaScript.