Chapter 3. Node.js

Outside browsers, there’s only one JavaScript runtime of note, and that’s Node.js. 1 While it started as a platform emphasizing single-threaded concurrency in servers with continuation-passing style callbacks, a lot of effort went into making a general purpose programming platform.

Many tasks performed by Node.js programs don’t fit into its traditional use case of serving web requests or handling network connections. Instead, a lot of newer Node.js programs are command-line tools acting as build systems, or parts of them, for JavaScript. Such programs are typically heavy on I/O operations, just like servers are, but they also typically do a lot of data processing.

For example, tools like Babel and TypeScript will transform your code from one language (or language version) to another. Tools like Webpack, Rollup, and Parcel will bundle and minify your code for distribution to your web frontend or to other environments where load-times are crucial, like serverless environments. In situations like these, while there’s a lot of filesystem I/O going on, there’s also a lot of data processing, which is generally done synchronously. These are the sorts of situations where parallelism is handy, and might get the job done quicker.

Parallelism can also be useful in the original Node.js use case, which is servers. Data processing may happen a lot, depending on your application. For example server side rendering (SSR) involves a lot of string manipulation where the source data is already known. This is one of many examples where we might want to add parallelism to our solutions. “Example Benchmark: Template Rendering” examines a situation where parallelism improves template rendering time.

Today, we have worker_threads for parallelizing our code. This wasn’t always the case, but that didn’t mean we were limited to single-threaded concurrency.

Before We Had Threads

Prior to threads being available in Node.js, if you wanted to take advantage of CPU cores, you needed to use processes. As discussed in Chapter 1, we don’t get some of the benefits we’d get from threads if we use processes. That being said, if shared memory isn’t important (and in many cases it isn’t!) then processes are perfectly able to solve these kinds of problems for you.

Consider Figure 1-1 from Chapter 1. In that scenario, we have threads responding to HTTP requests sent to them from a main thread, which is listening on a port. While this concept is great for handling traffic from several CPU cores, we can also use processes to achieve a similar effect. It might look something like Figure 3-1.

A web server system might offload work to processes on a round-robin basis.
Figure 3-1. Processes as they might be used in an HTTP server.

Although we could do something like this using the child_process API in Node.js, we’re better off using cluster, which was purpose-built for this use case. This module’s purpose is to spread network traffic across several worker processes. Let’s go ahead and use it in a simple “Hello, World” example.

The code in Example 3-1 is a standard HTTP server in Node.js. It simply responds to any request, regardless of path or method, with “Hello, World!”, followed by a new line character.

Example 3-1. A “Hello, World” server in Node.js
const http = require('http');

http.createServer((req, res) => {
  res.end('Hello, World!
');
}).listen(3000);

Now, let’s add four processes with cluster. With the cluster module, the common approach is to use an if block to detect whether we’re in the main listening process or one of the worker processes. If we’re in the main process, then we have to do the work of spawning the worker processes. Otherwise, we just set up an ordinary web server as before in each of the workers. This should look something like Example 3-2.

Example 3-2. A “Hello, World” server in Node.js using cluster
const http = require('http');
const cluster = require('cluster'); 1

if (cluster.isPrimary) { 2
  cluster.fork(); 3
  cluster.fork();
  cluster.fork();
  cluster.fork();
} else {
  http.createServer((req, res) => {
    res.end('Hello, World!
');
  }).listen(3000); 4
}
1

Require the cluster module.

2

Change code paths depending on whether we’re in the primary process.

3

In the primary process, create four worker processes.

4

In the worker processes, create a web server and listen, like in Example 3-1.

You may notice that we’re creating web servers that listen on the same port in four difference processes. It seems like a mistake. After all, if we try to bind a server to a port that’s already being used, we usually get an error. Don’t worry! We’re not actually listening on the same port four times. It turns out Node.js does some magic for us in cluster.

When worker processes are set up in a cluster, any call to listen() will actually cause Node.js to listen on the primary process rather than on the worker. Then, once a connection is received in the primary process, it’s handed off to a worker process via IPC. On most systems, this happens on a round-robin basis. This somewhat convoluted system is how each worker can appear to be listening on the same port, when in fact it’s just the primary process listening on that port, and passing connections off to all the workers.

Note

Historically, the isPrimary property on cluster used to be called isMaster, and for compatibility reasons, it’s still there as an alias at time of writing. The change was introduced in Node.js v16.0.0.

This change was made in an effort to reduce the amount of potentially harmful language in Node.js. The project aims to be a welcoming community, and words with a given usage that are rooted in a history of slavery are antithetical to that goal.

Processes incur some extra overhead that threads don’t, and we also don’t get shared memory which helps with faster transfer of data. For that, we need the worker_threads module.

The worker_threads Module

Node.js’s support for threads is in a built-in module called worker_threads. It provides an interface to threads that mimics a lot of what you’d find in web browsers for web workers. Since Node.js is not a web browser, not all the APIs are the same, and the environment inside these worker threads isn’t the same as what you’d find inside web workers.

Instead, inside Node.js worker threads you’ll find the usual Node.js API available via require, or import if you’re using EcmaScript Modules (ESM). There are a few differences in the API compared to the main thread though:

  • You can’t exit the program with process.exit(). Instead this will just exit the thread.

  • You can’t change working directories with process.chdir(). In fact, this function is not even available.

  • You can’t handle signals with process.on().

Another important thing to note is that the libuv worker pool is shared across worker threads. Recall “Hidden Threads” where it was noted that the libuv thread pool consists of a default of four threads to create non-blocking interfaces to low-level blocking APIs. If you’re finding yourself bound by that thread pool’s size (due, for example, to a lot of filesystem I/O), you’ll find that adding more threads via worker_threads won’t lighten the load. Instead, apart from considering various caching solutions and other optimizations, consider increasing your UV_THREADPOOL_SIZE. Likewise, you might find that you have little choice but to increase this when adding JavaScript threads via the worker_threads module, due to their usage of the libuv thread pool.

There are other caveats too, so you’re encouraged to have a look at the Node.js documentation for a full list of differences for your particular version of Node.js.

You can create a new worker thread by using the Worker constructor, like in Example 3-3.

Example 3-3. Spawning a new worker thread in Node.js
const { Worker } = require('worker_threads');

const worker = new Worker('worker-file-name.js'); 1
1

The filename here is the entrypoint file that we want to run inside the worker thread. This is similar to the entrypoint in the main file that we’d specify as an argument to node on the command-line.

workerData

It’s not sufficient to just be able to create a worker thread. We need to interact with it! The Worker constructor takes a second argument, an options object, that among other things allows us to specify a set of data to be passed immediately to the worker thread. The options object property is called workerData, and its contents will be copied into the worker thread via the Appendix. Inside the thread, we can access the cloned data via the workerData property of the worker_threads module. You can see how this works in Example 3-4.

Example 3-4. Passing data to a worker thread via workerData.
const {
  Worker,
  isMainThread,
  workerData
} = require('worker_threads');
const assert = require('assert');

if (isMainThread) { 1
  const worker = new Worker(__filename, { workerData: { num: 42 } });
} else {
  assert.strictEqual(workerData.num, 42);
}
1

Rather than using a separate file for the worker thread, we can use the current file with __filename and switch the behaviour based on isMainThread.

It’s important to note that the properties of the workerData object are cloned rather than shared between threads. Unlike in C, shared memory in JavaScript threads does not mean all the variables are visible. This means any changes you make in that object will not be visible in the other thread. They are separate objects. That being said, you can have memory that’s shared between threads via SharedArrayBuffer. These can be shared via workerData or by being sent through a MessagePort, which is covered in the next section. Additionally, SharedArrayBuffer is covered in depth in Chapter 4.

MessagePort

A MessagePort is one end of a two-way data stream. By default, one is provided to every worker thread to provide a communication channel to and from the main thread. It’s available in the worker thread as the parentPort property of the worker_threads module.

To send a message via the port, the postMesage() method is called on it. The first argument is any object that can be passed via the Appendix, which will end up being the message data being passed to the other end of the port. When a message is received on the port, the message event is fired, with the message data being the first argument to the event handler function. In the main thread, the event and the postMessage() method are on the worker instance itself, rather than having to get them from a MessagePort instance. Example 3-5 shows a simple example where messages sent to the main thread are echoed back to a worker thread.

Example 3-5. Bidirectional communication via the default MessagePorts.
const {
  Worker,
  isMainThread,
  parentPort
} = require('worker_threads');

if (isMainThread) {
  const worker = new Worker(__filename);
  worker.on('message', msg => {
    worker.postMessage(msg);
  });
} else {
  parentPort.on('message', msg => {
    console.log('We got a message from the main thread:', msg);
  });
  parentPort.postMessage('Hello, World!');
}

You can also create a pair of MessagePort instances connected to each other via the MessageChannel constructor. You can then pass one of the ports via an existing message port (like the default one) or via workerData. You might want to do this in situations where neither of two threads that need to communicate are the main thread, or even just for organizational purposes. Example 3-6 is the same as the previous example, except using ports created via MessageChannel and passed via workerData.

Example 3-6. Bidirectional communication via MessagePort created with MessageChannel.
const {
  Worker,
  isMainThread,
  MessageChannel,
  workerData
} = require('worker_threads');

if (isMainThread) {
  const { port1, port2 } = new MessageChannel();
  const worker = new Worker(__filename, {
    workerData: {
      port: port2
    },
    transferList: [port2]
  });
  port1.on('message', msg => {
    port1.postMessage(msg);
  });
} else {
  const { port } = workerData;
  port.on('message', msg => {
    console.log('We got a message from the main thread:', msg);
  });
  port.postMessage('Hello, World!');
}

You’ll notice we used the transferList option when instantiating the Worker. This is a way of transferring ownership of objects from one thread to another. This is required when sending any MessagePort, ArrayBuffer, or FileHandle objects via workerData or postMessage. Once these objects are transferred, they can no longer be used on the sending side.

Tip

In more recent versions of Node.js, WHATWG ReadableStream and WritableStream are available2 and in use by some APIs. They can be transferred via transferList over MessagePorts, to enable another way of communicating across threads. Under the hood, these are implemented using a MessagePort to send data across.

Happycoin: Revisited

Now that we’ve seen the basics of spawning threads in Node.js and having them communicate with each other, we have enough to rebuild our example from “Threads in C: Get Rich with Happycoin” in Node.js.

Recall that Happycoin is our imaginary cryptocurrency, with a completely ridiculous proof of work algorithm that goes as follows:

  1. Generate a random unsigned 64-bit integer.

  2. Determine whether or not the integer is happy.

  3. If it’s not happy, it’s not a Happycoin.

  4. If it’s not divisible by 10,000, it’s not a Happycoin.

  5. Otherwise, it’s a Happycoin.

Much like we did in C, we’ll make a single-threaded version first, and then adapt the code to run on multiple threads.

With Only the Main Thread

Let’s start with generating random numbers. First, let’s create a file called happycoin.js, in a directory called ch3-happycoin/. Fill it with the contents of Example 3-7.

Example 3-7. ch3-happycoin/happycoin.js
const crypto = require('crypto');

const big64arr = new BigUint64Array(1)
function random64() {
  crypto.randomFillSync(big64arr);
  return big64arr[0];
}

They crypto module in Node.js gives us some handy functions for getting cryptographically secure random numbers. We’ll definitely want this, since we’re building a cryptocurrency after all! Luckily, it’s less of an ordeal than it is in C.

The randomFillSync function fills a given TypedArray with random data. Since we’re looking for only a single 64-bit unsigned integer, we can use a BigUint64Array. This particular TypedArray, along with its cousin BigInt64Array, are recent additions to JavaScript that were made possible by the new bigint type, which stores arbitrarily large integers. Returning the first (and only) element of this array after we’ve filled it with random data gives us the random 64-bit unsigned integer that we’re looking for.

Now let’s add our happy number calculation. Add the contents of Example 3-8 to your file.

Example 3-8. ch3-happycoin/happycoin.js
function sumDigitsSquared(num) {
  let total = 0n;
  while (num > 0) {
    const numModBase = num % 10n;
    total += numModBase ** 2n;
    num = num / 10n;
  }
  return total;
}

function isHappy(num) {
  while (num != 1n && num != 4n) {
    num = sumDigitsSquared(num);
  }
  return num === 1n;
}

function isHappycoin(num) {
  return isHappy(num) && num % 10000n === 0n;
}

These three functions, sumDigitsSquared, isHappy, and isHappycoin, are direct translations from their C counterparts in “Threads in C: Get Rich with Happycoin”. One thing you might notice if you’re not familiar with bigint is the n suffix on all the number literals in this code. This suffix tells JavaScript that these numbers are to be treated as bigint values, rather than values of type number. This is important because while both types support mathematical operators like +, -, **, and so on, they cannot interoperate without doing an explicit conversion. For example, 1 + 1n would be invalid, because it’s an attempt to add the number 1 to the bigint 1.

Let’s finish off the file by implementing our Happycoin mining loop, and outputting the count of found Happycoins. Add Example 3-9 to your file.

Example 3-9. ch3-happycoin/happycoin.js
let count = 0;
for (let i = 1; i < 10_000_000; i++) {
  const randomNum = random64();
  if (isHappycoin(randomNum)) {
    process.stdout.write(randomNum.toString() + ' ');
    count++;
  }
}

process.stdout.write('
count ' + count + '
');

The code here is very similar to what we did in C. We loop 10,000,000 times, getting a random number and checking if it’s a Happycoin. If it is, we print it out. Note that we’re not using console.log() here because we don’t want to insert a newline character after each number found. Instead we want spaces, so we’re writing to the output stream directly. When we output the count after the loop, we need an additional newline character at the beginning of the output to separate it from the numbers above.

To run this program, use the following command in your ch3-happycoin directory:

$ node happycoin.js

Your output should be exactly the same as it was in C. That is, it should look something like this:

5503819098300300000
... 125 entries redacted for brevity ...
5273033273820010000
count 127

This takes quite a bit longer than the C example. On a run-of-the-mill machine, this took about 1 minute and 45 seconds with Node.js v16.0.0.

There are a variety of reasons why this takes so much longer. When building applications and optimizing for performance, it’s important to figure out what the source of performance overhead are. Yes, in general, JavaScript is often “slower than C” but this enormous difference can’t be explained by that alone. Yes, we’ll get better performance in the next section when we split this into multiple threads of work, but as you’ll see, it’s not nearly enough to make this implementation compelling when compared to the C example.

And on that note, let’s see what this looks like when we use worker_threads to split out the load.

With Four Worker Threads

To add worker threads, we’ll start from the code we had. Copy the contents of happycoin.js to happycoin-threads.js. Then insert the contents of Example 3-10 at the very beginning of the file, before the existing content.

Example 3-10. ch3-happycoin/happycoin-threads.js
const {
  Worker,
  isMainThread,
  parentPort
} = require('worker_threads');

We’ll need these parts of the worker_threads library, so we require them at the beginning.

Now, replace everything from let count = 0; to the end of the file with Example 3-11.

Example 3-11. ch3-happycoin/happycoin-threads.js
const THREAD_COUNT = 4;

if (isMainThread) {
  let inFlight = THREAD_COUNT;
  let count = 0;
  for (let i = 0; i < THREAD_COUNT; i++) {
    const worker = new Worker(__filename);
    worker.on('message', msg => {
      if (msg === 'done') {
        if (--inFlight === 0) {
          process.stdout.write('
count ' + count + '
');
        }
      } else if (typeof msg === 'bigint') {
        process.stdout.write(msg.toString() + ' ');
        count++;
      }
    })
  }
} else {
  for (let i = 1; i < 10_000_000/THREAD_COUNT; i++) {
    const randomNum = random64();
    if (isHappycoin(randomNum)) {
      parentPort.postMessage(randomNum);
    }
  }
  parentPort.postMessage('done');
}

We’re splitting behavior here with an if block. If we’re on the main thread, we start four worker threads, using the current file. Remember, __filename is a string containing the path and name of the current file. We then add a message handler for that worker. In the message handler, if the message is simply done then the worker has completed it’s work, and if all other workers are done, we’ll output the count. If the message is a number, or more correctly, a bigint, then we assume it’s a Happycoin, and we’ll print it out and add it to the count like we did in the single-threaded example.

On the else side of the if block, we’re running in one of the worker threads. In here, we’ll do the same sort of loop as we did in the single threaded example, except we’re only looping 1/4 of the number of times we did before, since we’re doing the same work across four threads. Also, rather than writing directly to the output stream, we’re sending found Happycoins back to the main thread via the MessagePort given to us, called parentPort. We’ve already set up the handler on the main thread for this. When the loop exists, we send a done on the parentPort to indicate to the main thread that we won’t be finding any more Happycoins on this thread.

We could have simply printed the Happycoins to the output immediately, but just like with the C example, we don’t want the different threads to clobber each other in the output, so we need to synchronize. Chapter 4 and Chapter 5 go over more advanced techniques for synchronization but for now it’s enough to just send the data back to the main thread through the parentPort and let the main thread handle output.

Now that we’re done adding threads to this example, you can run it with the following command in your ch3-happycoin directory:

$ node happycoin-threads.js

You should see output that looks something like this:

17241719184686550000
... 137 entries redacted for brevity ...
17618203841507830000
count 139

Like with the C example, this code runs quite a bit faster. In a test on the same computer and Node.js version as the single-threaded example, it ran in about 33 seconds. This is a huge improvement over the single-threaded example, so another big win for threads!

Note

This is not the only way to split this kind of problem up for thread-based computation. For example, other synchronization techniques could be used to avoid passing data between threads, or the messages could be batched. Always be sure to test and compare to find out whether threads are an ideal solution, and which thread techniques are most applicable to your problem, and the most efficient.

Worker Pools with Piscina

Many types of workloads will naturally lend themselves to using threads. In Node.js, most workloads involve processing an HTTP request. If within that code you find yourself doing a lot of math or synchronous data processing, it may make sense to offload that work to one or more threads. These types of operations involve submitting a single task to a thread, and waiting for a result from it. In much the same way a threaded web server often works, it makes sense to maintain a pool of workers that can be sent various tasks from the main thread.

This section only takes a shallow look at thread pools, adapting the familiar Happycoins application and abstracting the pooling mechanism using a package. “Thread Pool” covers thread pools extensively, building out an implementation from scratch.

Note

The concept of pooled resources isn’t unique to threads. For example, web browsers typically create pools of socket connections to web servers, so that they can multiplex all the various HTTP requests required to render a web page across those connections. Database client libraries often do a similar thing with sockets connected to the database server.

There’s a handy module available for Node.js called generic-pool which is a helper module for dealing with arbitrary pooled resources. These resources could be anything, like database connections, other sockets, local caches, threads, or pretty much anything else that might require having multiple instances of something but only accessing one at a time, without caring which one it is.

For the use case of discrete tasks sent to a pool of worker threads, we have the piscina module at our disposal. This module encapsulates the work of setting up a bunch of worker threads and allocating tasks to them. The name of the module comes from the Italian word for “pool”.

The basic usage is straightforward. You create an instance of Piscina, passing in a filename, which will be used in the worker thread. Behind the scenes, a pool of worker threads is created, and a queue is set up to handle incoming tasks. You can enqueue a task by calling .run(), passing in a value containing all the data necessary to complete this task, noting that the values will be cloned as they would be with postMessage(). This returns a promise that resolves once the tasks have been completed by a worker, giving a result value. In the file to be run in the worker, a function must be exported that takes in whatever is passed to .run() and returns the result value. This function can also be an async function, so that you can do asynchronous tasks in a worker thread if you need to. A basic example calculating square roots in worker threads is found in Example 3-12.

Example 3-12. Computing square roots with piscina.
const Piscina = require('piscina');

if (!Piscina.isWorkerThread) { 1
  const piscina = new Piscina({ filename: __filename }); 2
  piscina.run(9).then(squareRootOfNine => { 3
    console.log('The square root of nine is', squareRootOfNine);
  });
}

module.exports = num => Math.sqrt(num); 4
1

Much like cluster and worker_threads, piscina provides a handy boolean for determining whether we’re in the main thread or a worker thread.

2

We’ll use the same technique for using the same file as we did with the Happycoin example.

3

Since .run() returns a promise, we can just call .then() on it.

4

The exported function is used in the worker thread to perform the actual work. In this case, it’s just calculating a square root.

While it’s all fine and good to run one task on the pool, we need to be able to run many tasks on the pool. Let’s say we want to calculate the square roots of every number less than ten million. Let’s go ahead and loop ten million times. We’ll also replace the logging with an assertion that we’ve gotten a numeric result, since logging will be quite noisy. Have a look at Example 3-13.

Example 3-13. Computing ten million square roots with piscina.
const Piscina = require('piscina');
const assert = require('assert');

if (!Piscina.isWorkerThread) {
  const piscina = new Piscina({ filename: __filename });
  for (let i = 0; i < 10_000_000; i++) {
    piscina.run(i).then(squareRootOfI => {
      assert.ok(typeof squareRootOfI === 'number');
    });
  }
}

module.exports = num => Math.sqrt(num);

This seems like it ought to work. We’re submitting ten million numbers to be processed by the worker pool. However, if you run this code, you’ll get a non-recoverable JavaScript memory allocation error. On one trial of this with Node.sj 16.0.0, the following output was observed.

FATAL ERROR: Reached heap limit Allocation failed
    - JavaScript heap out of memory
 1: 0xb12b00 node::Abort() [node]
 2: 0xa2fe25 node::FatalError(char const*, char const*) [node]
 3: 0xcf8a9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*,
    char const*, bool) [node]
 4: 0xcf8e17 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*,
    char const*, bool) [node]
 5: 0xee2d65  [node]
[ ... 13 more lines of a not-particularly-useful C++ stacktrace ... ]
Aborted (core dumped)

What’s going on here? It turns out the underlying task queue is not infinite. By default, the task queue will keep growing and growing until we run into an allocation error like this one. To avoid having this happen, we need to set a reasonable limit. The piscina module lets you set a limit by using a maxQueue option in its constructor, which can be set to any positive integer. Through experimentation, the maintainers of piscina have found that an ideal maxQueue value is the square of the number of worker threads it’s using. Handily, you can use this number without even knowing it by setting maxQueue to auto.

Once we’ve established a bound for the queue size, we need to be able to handle it when the queue is full. There are two ways to detect that the queue is full:

  1. Compare the values of piscina.queueSize and piscina.options.maxQueue. If they’re equal, then the queue is full. This can be done prior to calling piscina.run() to avoid attempting to enqueue when it’s full. This is the recommended way to check.

  2. If piscina.run() is called when the queue is full, the returned promise will reject with an error indicated that the queue is full. This isn’t ideal because by this point we’re already in a further tick of event loop and many other attempts to enqueue may already have happened.

When we know that the queue is full, we need a way of knowing when it’ll be ready for new tasks again. Fortunately piscina pools emit a drain event once the queue is empty, which is certainly an ideal time to start adding new tasks. In Example 3-14, we put this all together with an async function around the loop that submits the tasks.

Example 3-14. Computing ten million square roots with piscina, without crashing.
const Piscina = require('piscina');
const assert = require('assert');
const { once } = require('events');

if (!Piscina.isWorkerThread) {
  const piscina = new Piscina({
    filename: __filename,
    maxQueue: 'auto' 1
  });
  (async () => { 2
    for (let i = 0; i < 10_000_000; i++) {
      if (piscina.queueSize === piscina.options.maxQueue) { 3
        await once(piscina, 'drain'); 4
      }
      piscina.run(i).then(squareRootOfI => {
        assert.ok(typeof squareRootOfI === 'number');
      });
    }
  })();
}

module.exports = num => Math.sqrt(num);
1

The maxQueue option is set to auto, which limits the queue size to the square of the number of threads that piscina is using.

2

The for loop is wrapped in an async IIFE in order to use an await within it.

3

When this check is true, the queue is full.

4

We then wait for the drain event before submitting any new tasks to the queue.

Running this code does not result in an out-of-memory crash like it did before. It takes a fairly long time to complete, but it does finally exit without issue.

As seen here, its easy to fall into a trap where using a tool in what seems like the most sensible way isn’t the best approach. It’s important to fully understand tools like piscina when building out your multithreaded applications.

On that note, let’s see what happens when we try to use piscina to mine Happycoins.

A Pool Full of Happycoins

To use piscina to produce Happycoins, we’ll use a slightly different approach from what we did in the original worker_threads implementation. Instead of getting a message back every time we have a Happycoin, we’ll batch them together and send them all at once when we’re done. This tradeoff saves us the effort of setting up a MessageChannel to send data back to the main thread with, the side effect is that we’ll only get our results in batches, rather than as soon as they’re ready. The main thread will still do the job of spawning the appropriate threads, and retrieving all the results.

Tradeoffs

To start off, copy your happycoin-threads.js file to a new one called happycoin-piscina.js. We’ll build off our old worker_threads example here. Now replace everything before the require('crypto') line with Example 3-15.

Example 3-15. ch3-happycoin/happycoin-threads.js
const Piscina = require('piscina');

Yep, that’s it! Now we’ll get to the more substantial stuff. Replace everything after the isHappycoin() function declaration with the contents of Example 3-16.

Example 3-16. ch3-happycoin/happycoin-threads.js
const THREAD_COUNT = 4;

if (!Piscina.isWorkerThread) { 1
  const piscina = new Piscina({
    filename: __filename, 2
    minThreads: THREAD_COUNT, 3
    maxThreads: THREAD_COUNT
  })
  let done = 0;
  let count = 0;
  for (let i = 0; i < THREAD_COUNT; i++) { 4
    (async () => {
      const { total, happycoins } = await piscina.run() 5
      process.stdout.write(happycoins);
      count += total;
      if (++done === THREAD_COUNT) { 6
        console.log('
count', count);
      }
    })();
  }
}
1

We’ll use the handy isWorkerThread property to check that we’re in the main thread.

2

We’re using the same technique as earlier to create worker threads using this same file.

3

We want to restrict the number of threads to be exactly four, to match our previous examples. We’ll want to time this and see what happens, so sticking with four threads reduces the number of variables here.

4

We know we have four threads, so we’ll enqueue our task four times. Each one will complete once it’s checked it’s chunk of random numbers for Happycoins.

5

We submit the task to the queue in this async IIFE, so that they all get queued in the same event loop iteration. Don’t worry, we won’t get out-of-memory errors like we did before, since we know we have exactly four threads and we’re only enqueueing four tasks. As we’ll see later, the task returns both the output string and the total count of Happycoins that the thread has found.

6

Much like we’ve done in previous Happycoin implementations, we’ll check that all threads have completed their tasks before outputting the grand total count of Happycoins that we’ve found.

Next we’ll add the code from Example 3-17, which adds the exported function that’s used in piscina’s worker threads.

Example 3-17. ch3-happycoin/happycoin-piscina.js
module.exports = () => {
  let happycoins = '';
  let total = 0;
  for (let i = 0; i < 10_000_000/THREAD_COUNT; i++) { 1
    const randomNum = random64();
    if (isHappycoin(randomNum)) {
      happycoins += randomNum.toString() + ' ';
      total++;
    }
  }
  return { total, happycoins }; 2
}
1

We’re doing our typical Happycoin-hunting loop here, but as in other parallelism examples, we’re dividing our total search space by the number of threads.

2

We’re passing the string of found Happycoins and the total count of them back to the main thread by returning a value from this function.

In order to run this, we’ll have to install piscina if you haven’t done so yet for the earlier examples. You can use the following two commands in your ch3-happycoin directory to set up an Node.js project and add the piscina dependency. The third line can then be used to run the code:

$ npm init -y
$ npm install piscina
$ node happycoin-piscina.js

You should see output the same as earlier examples, with a slight twist. Rather than seeing each Happycoin come in one-by-one, you’ll see them either roughly all at once, or in four large groupings of them. This is the tradeoff we made by returning the whole strings rather than the Happycoins one-by-one. This code should run in roughly the same time as happycoin-threads.js, since it uses the same principle, but with the abstraction layer that piscina provides us.

You can see that we’re not using piscina in the typical manner. We’re not passing it a multitude of discrete tasks that end up requiring careful queueing. The primary reason for this is performance.

If, for example, we had a loop iterating ten million times in the main thread, each time adding another task to the queue and await-ing its response, it would end up being just as slow as running all the code synchronously on the main thread. We could not await the reply and just add things to the queue as soon as we can, but it turns out the overhead of passing messages twenty million times is a lot greater than simply passing eight messages.

When dealing with raw data, like numbers or byte streams, there are usually faster ways of transferring data between threads using SharedArrayBuffers, and we’ll see more about those in the next chapter.

1 Yes, other non-browser JavaScript runtimes exist, like Deno, but Node.js has such a massive amount of popularity and market share at time of writing that it’s the only one worth talking about here. This may change by the time you’re reading this, and that’s great for the world of JavaScript! Hopefully, there’s a newer edition of this book that covers your non-browser JavaScript runtime of choice.

2 https://nodejs.org/api/webstreams.html

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.206.238.189