Chapter 9. Advanced Asynchronous Recipes

Almost all the design patterns we've seen so far can be considered generic and applicable to many different areas of an application. There is, however, a set of patterns that are more specific and focused on solving well-defined problems; we can call these patterns recipes. As in real-life cooking, we have a set of well-defined steps to follow that will lead us to an expected outcome. Of course, this doesn't mean that we can't use some creativity to customize the recipes to match the taste of our guests, but the outline of the procedure is usually the one that matters. In this chapter, we are going to provide some popular recipes to solve some specific problems we encounter in our everyday Node.js development. These recipes include the following:

  • Requiring modules that are initialized asynchronously
  • Batching and caching asynchronous operations to get a performance boost in busy applications, using only minimal development effort
  • Running synchronous CPU-bound operations that can block the event loop and cripple the ability of Node.js to handle concurrent requests

Requiring asynchronously initialized modules

In Chapter 2, Node.js Essential Patterns, when we discussed the fundamental properties of the Node.js module system, we mentioned the fact that require() works synchronously and that module.exports cannot be set asynchronously.

This is one of the main reasons for the existence of synchronous APIs in the core modules and in many npm packages: they are provided as a convenient alternative, to be used primarily for initialization tasks, rather than as a substitute for asynchronous APIs.

Unfortunately, this is not always possible; a synchronous API might not always be available, especially for components using the network during their initialization phase, for example, to perform handshake protocols or to retrieve configuration parameters. This is the case for many database drivers and clients for middleware systems such as message queues.

Canonical solutions

Let's take an example: a module called db, which connects to a remote database. The db module will be able to accept requests only after the connection and the handshake with the server have been completed. In this scenario, we usually have two options:

  • Making sure that the module is initialized before starting to use it, otherwise wait for its initialization. This process has to be done every time we want to invoke an operation on the asynchronous module:
       const db = require('aDb'); //The async module 
 
       module.exports = function findAll(type, callback) { 
         if(db.connected) {  //is it initialized? 
           runFind(); 
         } else { 
           db.once('connected', runFind); 
         } 
         function runFind() { 
           db.findAll(type, callback); 
         } 
       }; 
  • Use Dependency Injection (DI) instead of directly requiring the asynchronous module. By doing this, we can delay the initialization of some modules until their asynchronous dependencies are fully initialized. This technique shifts the complexity of managing the module initialization to another component, usually the parent module. In the following example, this component is app.js:
       //in the module app.js 
       const db = require('aDb'); //The async module 
       const findAllFactory = require('./findAll'); 
       db.on('connected', function() { 
         const findAll = findAllFactory(db); 
       }); 
 
       //in the module findAll.js 
       module.exports = db => { 
         //db is guaranteed to be initialized 
         return function findAll(type, callback) { 
           db.findAll(type, callback); 
         } 
       } 

We can immediately see that the first option can become highly undesirable, considering the amount of boilerplate code involved.
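
To see the first option at work, here is a self-contained sketch in which the database driver is replaced by a stub built on EventEmitter. The stub, its connected flag, and the data it returns are assumptions for illustration only; a real driver would expose its own API:

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for an asynchronously initialized driver
const db = new EventEmitter();
db.connected = false;
db.findAll = (type, callback) => callback(null, [`${type}-1`, `${type}-2`]);

// The check that every operation on the module has to repeat
function findAll(type, callback) {
  if (db.connected) {
    return db.findAll(type, callback);
  }
  db.once('connected', () => db.findAll(type, callback));
}

const results = [];
findAll('user', (err, rows) => results.push(rows));  // deferred: not connected yet

// Simulate the end of the handshake
db.connected = true;
db.emit('connected');

findAll('order', (err, rows) => results.push(rows)); // runs immediately
```

Every function that touches db needs the same if/else guard, which is exactly the boilerplate that makes this option unattractive.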

Also, the second option, which uses DI, is sometimes undesirable, as we have seen in Chapter 7, Wiring Modules. In big projects, it can quickly become over-complicated, especially if done manually and with asynchronously initialized modules. These problems would be mitigated if we were using a DI container designed to support asynchronously initialized modules.

As we will see, though, there is a third alternative that allows us to easily isolate the module from the initialization state of its dependencies.

Preinitialization queues

A simple pattern to decouple a module from the initialization state of a dependency involves the use of queues and the Command pattern. The idea is to save all the operations received by a module while it's not yet initialized and then execute them as soon as all the initialization steps have been completed.

Implementing a module that initializes asynchronously

To demonstrate this simple but effective technique, let's build a small test application; nothing fancy, just something to verify our assumptions. Let's start by creating an asynchronously initialized module called asyncModule.js:

const asyncModule = module.exports; 
 
asyncModule.initialized = false; 
 
asyncModule.initialize = callback => { 
  setTimeout(function() { 
    asyncModule.initialized = true; 
    callback(); 
  }, 10000); 
}; 
 
asyncModule.tellMeSomething = callback => { 
  process.nextTick(() => { 
    if(!asyncModule.initialized) { 
      return callback( 
        new Error("I don't have anything to say right now") 
      ); 
    } 
    callback(null, 'Current time is: ' + new Date()); 
  }); 
}; 

In the preceding code, asyncModule demonstrates how an asynchronously initialized module works. It exposes an initialize() method which, after a delay of 10 seconds, sets the initialized flag to true and notifies its callback (10 seconds is a lot for a real application, but for us it's great for highlighting any race conditions). The other method, tellMeSomething(), returns the current time, but if the module is not yet initialized, it generates an error.

The next step is to create another module depending on the service we just created. Let's consider a simple HTTP request handler implemented in a file called routes.js:

const asyncModule = require('./asyncModule'); 
 
module.exports.say = (req, res) => { 
  asyncModule.tellMeSomething((err, something) => { 
    if(err) { 
      res.writeHead(500); 
      return res.end('Error:' + err.message); 
    } 
    res.writeHead(200); 
    res.end('I say: ' + something); 
  }); 
}; 

The handler invokes the tellMeSomething() method of asyncModule, then it writes the result into an HTTP response. As we can see, we are not performing any checks on the initialization state of asyncModule, and as we can imagine, this will likely lead to problems.

Now, let's create a very basic HTTP server using nothing but the core http module (the app.js file):

const http = require('http'); 
const routes = require('./routes'); 
const asyncModule = require('./asyncModule'); 
 
asyncModule.initialize(() => { 
  console.log('Async module initialized'); 
}); 
 
http.createServer((req, res) => { 
  if (req.method === 'GET' && req.url === '/say') { 
    return routes.say(req, res); 
  } 
  res.writeHead(404); 
  res.end('Not found'); 
}).listen(8000, () => console.log('Started')); 

The preceding small module is the entry point of our application, and all it does is trigger the initialization of asyncModule and create an HTTP server that makes use of the request handler we created previously (routes.say()).

We can now try to fire up our server by executing the app.js module as usual. After the server is started, we can try to hit the URL, http://localhost:8000/say, with a browser and see what comes back from our asyncModule.

As expected, if we send the request just after the server is started, the result will be an error as follows:

Error:I don't have anything to say right now

This means that asyncModule is not yet initialized, but we still tried to use it. Depending on the implementation details of the asynchronously initialized module, we could have received a graceful error, lost important information, or even crashed the entire application. In general, the situation we just described should always be avoided. Often, a few failing requests are not a concern, or the initialization is so fast that, in practice, the problem never shows up; however, for high-load applications and cloud servers designed to autoscale, both of these assumptions can quickly break down.

Wrapping the module with preinitialization queues

To add robustness to our server, we are now going to refactor it by applying the pattern we described at the beginning of the section. We will queue any operations invoked on asyncModule while it's not yet initialized and then flush the queue as soon as we are ready to process them. This looks like a great application for the State pattern! We will need two states: one that queues all the operations while the module is not yet initialized, and another that simply delegates each method to the original asyncModule module once the initialization is complete.

Often, we don't have the chance to modify the code of the asynchronous module; so, to add our queuing layer, we will need to create a proxy around the original asyncModule module.

Let's start working on the code by creating a new file named asyncModuleWrapper.js and building it piece by piece. The first thing we need to do is create the object that delegates the operations to the active state:

const asyncModule = require('./asyncModule'); 
 
const asyncModuleWrapper = module.exports; 
 
asyncModuleWrapper.initialized = false; 
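// Regular functions are used here because arrow functions 
// do not have their own arguments object 
asyncModuleWrapper.initialize = function() { 
  activeState.initialize.apply(activeState, arguments); 
}; 
 
asyncModuleWrapper.tellMeSomething = function() { 
  activeState.tellMeSomething.apply(activeState, arguments); 
}; 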

In the preceding code, asyncModuleWrapper simply delegates each of its methods to the currently active state. Let's see then what the two states look like, starting from notInitializedState:

let pending = [];          // let, because the queue is reassigned after the flush 
const notInitializedState = { 
 
  initialize: function(callback) { 
    asyncModule.initialize(() => { 
      asyncModuleWrapper.initialized = true; 
      activeState = initializedState;                 //[1] 
 
      pending.forEach(req => {                        //[2] 
        asyncModule[req.method].apply(null, req.args); 
      }); 
      pending = []; 
 
      callback();                                     //[3] 
    }); 
  }, 
 
  // A regular function, because arrow functions have no arguments object 
  tellMeSomething: function(callback) { 
    return pending.push({ 
      method: 'tellMeSomething', 
      args: arguments 
    }); 
  } 
}; 

When the initialize() method is invoked, we trigger the initialization of the original asyncModule module, providing a callback proxy. This allows our wrapper to know when the original module is initialized and consequently triggers the following operations:

  1. Updates the activeState variable with the next state object in our flow—initializedState.
  2. Executes all the commands that were previously stored in the pending queue.
  3. Invokes the original callback.

As the module at this point is not yet initialized, the tellMeSomething() method of this state simply creates a new Command object and adds it to the queue of the pending operations.

At this point, the pattern should already be clear: when the original asyncModule module is not yet initialized, our wrapper simply queues all the received requests. Then, when we are notified that the initialization is complete, we switch the internal state to initializedState and execute all the queued operations. Let's see what this last piece of the wrapper looks like:

let initializedState = asyncModule; 

Probably without any surprise, the initializedState object is simply a reference to the original asyncModule! In fact, when the initialization is complete, we can safely route any request directly to the original module. Nothing more is required.

At last, we have to set the initial active state, which of course will be notInitializedState:

let activeState = notInitializedState; 

We can now try to launch our test server again, but first, let's not forget to replace the references to the original asyncModule module with our new asyncModuleWrapper object; this has to be done in the app.js and routes.js modules.

After doing this, if we try to send a request to the server again, we will see that during the time the asyncModule module is not yet initialized, the requests will not fail; instead, they will hang until the initialization is completed and will only then be actually executed. We can surely affirm that this is a much more robust behavior.
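
The behavior can also be checked without an HTTP server. The following single-file sketch condenses the wrapper and pairs it with a stand-in for asyncModule that initializes after 50 milliseconds instead of 10 seconds. The shortened delay, the inlined stand-in, and the replies array are assumptions made to keep the example self-contained:

```javascript
// Stand-in for asyncModule.js, with a 50 ms delay instead of 10 seconds
const asyncModule = {
  initialized: false,
  initialize(callback) {
    setTimeout(() => {
      asyncModule.initialized = true;
      callback();
    }, 50);
  },
  tellMeSomething(callback) {
    process.nextTick(() => {
      if (!asyncModule.initialized) {
        return callback(new Error("I don't have anything to say right now"));
      }
      callback(null, 'Current time is: ' + new Date());
    });
  }
};

let pending = [];
let activeState;

const notInitializedState = {
  initialize(callback) {
    asyncModule.initialize(() => {
      activeState = asyncModule;           // switch to the initialized state
      pending.forEach(req => asyncModule[req.method](...req.args));
      pending = [];
      callback();
    });
  },
  tellMeSomething(...args) {
    pending.push({ method: 'tellMeSomething', args });  // queue the command
  }
};
activeState = notInitializedState;

const asyncModuleWrapper = {
  initialize: (...args) => activeState.initialize(...args),
  tellMeSomething: (...args) => activeState.tellMeSomething(...args)
};

// This call arrives before the initialization: it is queued, not rejected
const replies = [];
asyncModuleWrapper.tellMeSomething((err, msg) => {
  replies.push(err ? err.message : msg);
});

asyncModuleWrapper.initialize(() => {
  console.log('Async module initialized; queued calls were replayed');
});
```

The early call never sees the error path; its callback receives the current time once the stand-in finishes initializing.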

Tip

Pattern 

If a module is initialized asynchronously, queue every operation until the module is fully initialized.

Now, our server can start accepting requests immediately after it's started and it guarantees that none of these requests will ever fail because of the initialization state of its modules. We were able to obtain this result without using DI or requiring verbose and error-prone checks to verify the state of the asynchronous module.
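
When several methods need the same treatment, the wrapper can be generated rather than written by hand. The following is one possible generalization, not the code used in this chapter: withPreinitQueue is a hypothetical helper, and it assumes that the initialization callback may receive an optional error as its first argument:

```javascript
// Hypothetical helper: returns a wrapper around `target` that queues calls to
// `methodNames` until `target[initMethod]` has completed, then replays them
function withPreinitQueue(target, methodNames, initMethod = 'initialize') {
  let ready = false;
  let pending = [];
  const wrapper = {};

  wrapper[initMethod] = callback => {
    target[initMethod](err => {
      if (err) return callback(err);   // assumption: keep queueing on failure
      ready = true;
      const queued = pending;
      pending = [];
      queued.forEach(({ method, args }) => target[method](...args));
      callback();
    });
  };

  methodNames.forEach(method => {
    wrapper[method] = (...args) => {
      if (ready) return target[method](...args);
      pending.push({ method, args });  // store the call as a command
    };
  });

  return wrapper;
}

// Usage with a stub whose initialization we complete by hand
let finishInit;
const calls = [];
const service = {
  initialize: cb => { finishInit = cb; },
  greet: (name, cb) => { calls.push(name); cb(null, `hello ${name}`); }
};

const safe = withPreinitQueue(service, ['greet']);
safe.initialize(() => console.log('ready'));
safe.greet('early', (err, msg) => console.log(msg)); // queued
finishInit();                                        // prints "hello early", then "ready"
safe.greet('late', (err, msg) => console.log(msg));  // prints "hello late"
```

Listing the method names explicitly keeps the sketch simple; a Proxy-based variant could intercept every method, at the cost of also capturing calls that should not be deferred.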

In the wild

The pattern we just presented is used by many database drivers and ORM libraries. The most notable is Mongoose (http://mongoosejs.com), which is an object data modeling (ODM) library for MongoDB. With Mongoose, it's not necessary to wait for the database connection to open before sending queries, because each operation is queued and then executed later, when the connection with the database is fully established. This clearly boosts the usability of its API.

Tip

Take a look at the code of Mongoose to see how every method in the native driver is proxied to add the preinitialization queue (it also demonstrates an alternative way of implementing this pattern). You can find the code fragment responsible for implementing the pattern at https://github.com/LearnBoost/mongoose/blob/21f16c62e2f3230fe616745a40f22b4385a11b11/lib/drivers/node-mongodb-native/collection.js#L103-138.
