Chapter 7. WebAssembly

While the title of this book is Multithreaded JavaScript, modern JavaScript runtimes also support WebAssembly. For the unaware, WebAssembly (often abbreviated as WASM) is a binary-encoded instruction format that runs on a stack-based virtual machine. It’s designed with security in mind, and runs in a sandbox where the only things it has access to are memory and functions provided by the host environment. The main motivation behind having such a thing in browsers and other JavaScript runtimes is to run the performance-sensitive parts of your program in an environment where execution can happen much faster than JavaScript. Another goal is to provide a compile target for typically compiled languages like C, C++, and Rust. This opens the door for developers of those languages to develop for the web.

Generally, the memory used by WebAssembly modules is represented by ArrayBuffers, but it can also be represented by SharedArrayBuffers. In addition, there are WebAssembly instructions for atomic operations, similar to the Atomics object we have in JavaScript. With SharedArrayBuffers, atomic operations, and web workers (or worker_threads in Node.js), we have enough to do the full suite of multithreaded programming tasks using WebAssembly.

Before we jump into multithreaded WebAssembly, let’s build a “Hello, world!” example and execute it, to find the strengths and limitations of WebAssembly.

Your First WebAssembly

While WebAssembly is a binary format, a plain text format exists to represent it in human-readable form. This is comparable to how machine code can be represented in a human-readable assembly language. The language for this WebAssembly text format is simply called WebAssembly text format, but the file extension typically used is .wat, so it’s common enough to refer to this language as WAT. It uses S-expressions as its primary syntactic separator, which is helpful for both parsing and readability. S-expressions, known primarily from the Lisp family of languages, are nested lists delimited by parentheses, with whitespace between each item in the list.
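To make the "nested lists" idea concrete, here is a minimal sketch of an S-expression reader in JavaScript. The parseSExpr name is hypothetical, and the sketch assumes input with no comments and no whitespace inside string literals, so it is far from a real WAT parser. It only shows how parentheses map onto nested arrays.

```javascript
// Minimal S-expression reader (illustrative only, not a WAT parser).
// Assumes no comments and no whitespace inside quoted strings.
function parseSExpr(source) {
  // Pad parentheses with spaces so that splitting on whitespace yields tokens.
  const tokens = source
    .replace(/\(/g, ' ( ')
    .replace(/\)/g, ' ) ')
    .trim()
    .split(/\s+/);
  let pos = 0;

  function read() {
    const token = tokens[pos++];
    if (token === '(') {
      const list = [];
      while (tokens[pos] !== ')') {
        if (pos >= tokens.length) throw new SyntaxError('missing )');
        list.push(read()); // nested lists recurse
      }
      pos++; // consume the closing parenthesis
      return list;
    }
    if (token === ')') throw new SyntaxError('unexpected )');
    return token; // an atom such as func, $add, or i32
  }

  return read();
}

const tree = parseSExpr('(export "add" (func $add))');
console.log(tree); // a three-item array whose last item is itself an array
```

Each pair of parentheses becomes one array, and each whitespace-separated item becomes one element, which is why the format is straightforward for tools to parse.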

To get a feel for this format, let’s implement a simple addition function in WAT. Create a file called ch7-wasm-add/add.wat and add the contents of Example 7-1.

Example 7-1. ch7-wasm-add/add.wat
(module ;; 1
  (func $add (param $a i32) (param $b i32) (result i32) ;; 2
    local.get $a ;; 3
    local.get $b
    i32.add)
  (export "add" (func $add)) ;; 4
)
1

The first line declares a module. Every WAT file begins with this.

2

We declare a function called $add, taking in two 32-bit integers, and returning another 32-bit integer.

3

This is the start of the function body, in which we have three statements. The first two grab the function parameters and put them on the stack one after another. Recall that WebAssembly is stack-based. That means many operations will operate on the first (if unary) or first two (if binary) items on the stack. The third statement is a binary “add” operation on i32 values, so it grabs the top two values from the stack and adds them together, putting the result at the top of the stack. The return value for a function is the value at the top of the stack once it completes.

4

In order to use a function outside the module in the host environment, it needs to be exported. Here we export the $add function, giving it the external name add.
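The stack behavior described in the callouts can be simulated in a few lines of JavaScript. This is only a sketch of the idea, not how any WebAssembly engine is implemented. The runAdd function and its instruction encoding are invented for illustration.

```javascript
// Toy simulation of the three instructions in the $add function body.
// The instruction encoding here is hypothetical; real engines work on binary code.
function runAdd(a, b) {
  const locals = { $a: a, $b: b };
  const stack = [];
  const body = [
    ['local.get', '$a'],
    ['local.get', '$b'],
    ['i32.add'],
  ];

  for (const [op, arg] of body) {
    if (op === 'local.get') {
      stack.push(locals[arg]); // push the parameter onto the stack
    } else if (op === 'i32.add') {
      const right = stack.pop(); // binary operation: take the top two values
      const left = stack.pop();
      stack.push((left + right) | 0); // | 0 wraps the sum to a signed 32-bit integer
    }
  }

  // The value left on top of the stack is the function's return value.
  return stack.pop();
}

console.log(runAdd(2, 3)); // prints 5
```

After the two local.get steps the stack holds both parameters, and i32.add replaces them with a single value.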

We can convert this WAT file to WebAssembly binary by using the wat2wasm tool from the WebAssembly Binary Toolkit (WABT). This can be done with the following one-liner in the ch7-wasm-add directory.

$ npx -p wabt wat2wasm add.wat -o add.wasm

Now we have our first WebAssembly file! These files aren’t useful outside a host environment, so let’s write a bit of JavaScript to load the WebAssembly and test the add function. Add the contents of Example 7-2 to ch7-wasm-add/add.js.

Example 7-2. ch7-wasm-add/add.js
const fs = require('fs/promises'); // Needs Node.js v14 or higher.

(async () => {
  const wasm = await fs.readFile('./add.wasm');
  const { instance: { exports: { add } } } = await WebAssembly.instantiate(wasm);
  console.log(add(2, 3));
})();

Provided you’ve created the .wasm file using the wat2wasm command above, you should be able to run this in the ch7-wasm-add directory.

$ node add.js

You can verify from the output that we are, in fact, adding via our WebAssembly module.
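If you want to experiment without the wat2wasm step, the following sketch embeds a hand-assembled binary that should be equivalent to Example 7-1. The byte listing is an assumption made for illustration, and the actual output of wat2wasm may differ in detail. It also shows what happens at the JavaScript boundary: arguments are converted to 32-bit integers, so sums wrap around and fractions are truncated.

```javascript
// Hand-assembled module equivalent to Example 7-1 (an illustrative assumption;
// the file produced by wat2wasm may not match byte for byte).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function, type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" is function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no extra locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// The synchronous constructors keep the sketch short; Example 7-2 uses the
// asynchronous WebAssembly.instantiate() instead.
const wasmModule = new WebAssembly.Module(bytes);
const { add } = new WebAssembly.Instance(wasmModule).exports;

const small = add(2, 3);            // ordinary addition
const wrapped = add(2147483647, 1); // overflows the signed 32-bit range
const truncated = add(1.9, 2.9);    // arguments are truncated to 1 and 2

console.log(small, wrapped, truncated); // prints: 5 -2147483648 3
```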

Simple mathematical operations on the stack don’t make any use of linear memory, or of concepts that have no meaning in WebAssembly, such as strings. Consider strings in C. Effectively, they’re nothing more than a pointer to the start of an array of bytes, terminated by a null byte. We can’t pass whole arrays by value to WebAssembly functions, or return them, but we can pass them by reference. This means that to pass a string as an argument, we need to first allocate the bytes in the linear memory and write to them, then pass the index of the first byte to the WebAssembly function. This can get more complex since we then need ways of managing the available space in the linear memory. We basically need malloc() and free() implementations operating on the linear memory.1
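As a rough sketch of what passing a string by reference involves on the JavaScript side, the following code writes a null-terminated string into a linear memory and reads it back. The writeCString and readCString helpers and the fixed address 16 are hypothetical. A real program would get the address from an allocator rather than picking one by hand.

```javascript
// One page (64 KiB) of linear memory, standing in for a module's memory.
const memory = new WebAssembly.Memory({ initial: 1 });

// Write str as UTF-8 bytes at ptr, followed by a terminating null byte.
// Returns the number of bytes written, including the terminator.
function writeCString(memory, ptr, str) {
  const encoded = new TextEncoder().encode(str);
  const target = new Uint8Array(memory.buffer, ptr, encoded.length + 1);
  target.set(encoded);
  target[encoded.length] = 0;
  return target.length;
}

// Read bytes starting at ptr up to (not including) the first null byte.
function readCString(memory, ptr) {
  const bytes = new Uint8Array(memory.buffer);
  let end = ptr;
  while (end < bytes.length && bytes[end] !== 0) end++;
  return new TextDecoder().decode(bytes.subarray(ptr, end));
}

const ptr = 16; // hypothetical address; normally returned by something like malloc()
const written = writeCString(memory, ptr, 'Hello, world!');
const roundTripped = readCString(memory, ptr);
console.log(written, roundTripped);
```

The value a WebAssembly function would receive is just ptr, a plain integer. Everything else is a convention shared between the host and the module.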

Hand-writing WebAssembly in WAT, while clearly possible, isn’t usually the easiest path to being productive and getting performance gains with it. It was designed to be a compile target for higher-level languages, and that’s where it really shines. “Compiling C Programs to WebAssembly with Emscripten” explores that in more detail.

Atomic Operations in WebAssembly

Although a full treatment of every WebAssembly instruction would be out of place in this book, it’s worth pointing out the instructions specific to atomic operations on shared memory, since they’re key to multithreaded WebAssembly code, whether compiled from another language, or hand-written in WAT.

WebAssembly instructions often start with the type. In the case of atomic operations, the type is always i32 or i64, corresponding to 32-bit and 64-bit integers respectively. All atomic operations have .atomic. next in the instruction name. After that, you’ll find the specific instruction name.

Let’s go over some of the atomic operation instructions. We won’t go over exact syntax, but this should give you an idea of the kinds of operations available at the instruction level:

[i32|i64].atomic.[load|load8_u|load16_u|load32_u]

The load family of instructions is equivalent to Atomics.load() in JavaScript. Using one of the suffixed instructions allows you to load smaller numbers of bits, extending the result with zeros.

[i32|i64].atomic.[store|store8|store16|store32]

The store family of instructions is equivalent to Atomics.store() in JavaScript. Using one of the suffixed instructions wraps the input value to that number of bits, and stores those at the index.

[i32|i64].atomic.[rmw|rmw8|rmw16|rmw32].[add|sub|and|or|xor|xchg|cmpxchg][|_u]

The rmw family of instructions all perform read-modify-write operations, equivalent to add(), sub(), and(), or(), xor(), exchange(), and compareExchange() from the Atomics object in JavaScript, respectively. The operations are suffixed with a _u when they zero-extend, and rmw can have a suffix corresponding to the number of bits to be read.

The next two operations have a slightly different naming convention:

memory.atomic.[wait32|wait64]

These are equivalent to Atomics.wait() in JavaScript, suffixed according to the number of bits they operate on.

memory.atomic.notify

This is equivalent to Atomics.notify() in JavaScript.

These instructions are enough to perform the same atomic operations in WebAssembly as we can in JavaScript, but there’s an additional operation not available in JavaScript:

atomic.fence

This instruction takes no arguments and doesn’t return anything. It’s intended to be used by higher level languages that have ways of guaranteeing ordering of non-atomic accesses to shared memory.
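For comparison, here are the JavaScript counterparts named in the list above, run against a SharedArrayBuffer. This is a sketch of the JavaScript side only and contains no WebAssembly. The Uint8Array part loosely mirrors what the narrow store8 and load8_u instructions do, by wrapping a value to 8 bits and zero-extending it on the way back.

```javascript
const sab = new SharedArrayBuffer(16);
const i32 = new Int32Array(sab);
const u8 = new Uint8Array(sab);

// Counterpart of the store family.
Atomics.store(i32, 0, 40);

// Counterparts of the rmw family: each returns the value that was there before.
const beforeAdd = Atomics.add(i32, 0, 2);                 // slot now holds 42
const beforeSwap = Atomics.compareExchange(i32, 0, 42, 7); // matches 42, so stores 7

// Counterpart of the load family.
const current = Atomics.load(i32, 0);

// Narrow accesses: storing 0x1ff into an 8-bit slot keeps only the low 8 bits,
// and loading it back zero-extends the result.
Atomics.store(u8, 8, 0x1ff);
const narrow = Atomics.load(u8, 8);

// Counterpart of memory.atomic.notify: no threads are waiting here.
const woken = Atomics.notify(i32, 0, 1);

console.log(beforeAdd, beforeSwap, current, narrow, woken); // prints: 40 42 7 255 0
```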

All of these operations are used with the given WebAssembly module’s linear memory, which is the sandbox in which it gets to read and write values. When WebAssembly modules are initialized from JavaScript, they can be initialized with a linear memory provided as an option. This can be backed by a SharedArrayBuffer to enable usage across threads.
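A minimal sketch of creating such a memory from JavaScript follows. The env and memory import names are assumptions, since the names a module actually expects depend on how it was written or compiled. A shared memory must declare a maximum size.

```javascript
// A shared linear memory of exactly one 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });

// Hypothetical import object; a module importing its memory would name the
// module and field it expects, which may not be "env" and "memory".
const importObject = { env: { memory } };

// Because the memory is shared, its buffer is a SharedArrayBuffer.
const isShared = memory.buffer instanceof SharedArrayBuffer;

// Two independent views observe the same bytes, just as two threads would.
const viewA = new Int32Array(memory.buffer);
const viewB = new Int32Array(memory.buffer);
Atomics.store(viewA, 0, 123);
const seen = Atomics.load(viewB, 0);

console.log(isShared, memory.buffer.byteLength, seen); // prints: true 65536 123
```

The importObject would be passed as the second argument when instantiating a module that imports its memory.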

Although it’s certainly possible to use these instructions in WebAssembly, they suffer from the same drawback that the rest of WebAssembly does: it’s incredibly tedious and painstaking to write. Luckily, we can compile higher-level languages down to WebAssembly.

Compiling C Programs to WebAssembly with Emscripten

Since long before WebAssembly, Emscripten has been the go-to way to compile C and C++ programs for use in JavaScript environments. Today, it supports multithreaded C and C++ code using web workers in browsers and worker_threads in Node.js.

In fact, a large corpus of existing multithreaded code in the wild can be compiled with Emscripten without issue. In both Node.js and browsers, Emscripten emulates the system calls used by native code compiled to WebAssembly, so that programs written in compiled languages can run without many changes.

Indeed, the C code we wrote way back in Chapter 1 can be compiled without any editing! Let’s give that a try now. We’ll use a Docker image to simplify using Emscripten. For other compiler toolchains, we’d want to make sure that the toolchain aligns with the system, but since WebAssembly and JavaScript are both platform-agnostic, we can just use the Docker image wherever Docker is supported.

First, make sure Docker is installed. Then, in your ch1-c-threads directory, run the following command:

$ docker run --rm -v $(pwd):/src -u $(id -u):$(id -g) \
  emscripten/emsdk emcc happycoin-threads.c -pthread \
  -s PTHREAD_POOL_SIZE=4 -o happycoin-threads.js

There are a few things to discuss with this command. We’re running the emscripten/emsdk image, with the current directory mounted, running as the current user. Everything after and including emcc is the command we’re running inside the container. For the most part, this looks a lot like what we’d do when using cc to compile a C program. The main difference is that the output file is a JavaScript file rather than an executable binary. Don’t worry! A .wasm file is also generated. The JS file is used as a bridge to any necessary system calls, and to set up the threads, since those can’t be instantiated in WebAssembly alone.

The other extra argument is -s PTHREAD_POOL_SIZE=4. Since happycoin-threads.c uses four threads, we allocate them ahead of time here. There are a few ways to handle thread creation in Emscripten, largely because the main thread in browsers must not be blocked. It’s easiest to pre-allocate here since we know how many threads we’ll need.

Now we can run our WebAssembly version of multithreaded Happycoin. We’ll run the JavaScript file with Node.js. At time of writing, this requires Node.js v16 or higher, since that’s what the output of Emscripten supports.

$ node happycoin-threads.js

The output should look a bit like the following:

120190845798210000
... 106 entries redacted for brevity ...
14356375476580480000
count 108
Pthread 0x9017f8 exited.
Pthread 0x701500 exited.
Pthread 0xd01e08 exited.
Pthread 0xb01b10 exited.

The output looks the same as our other Happycoin examples from previous chapters, but the wrapper provided by Emscripten also informs us when the threads have exited. You’ll also need to Ctrl + C to exit the program. For extra fun, see if you can figure out what needs changing in order to make the process exit when done, and avoid those Pthread messages.

One thing you may notice when comparing against the native or JavaScript versions of Happycoin is timing. It’s clearly faster than the multithreaded JavaScript version, but also a bit slower than the native multithreaded C version. As always, it’s important to take measurements of your application to ensure that you’re getting the right benefits with the right tradeoffs.

While the Happycoin example doesn’t make use of any atomic operations, Emscripten supports the full suite of POSIX thread functionality and GCC built-in atomic operation functions. This means a great multitude of C and C++ programs can compile to WebAssembly using Emscripten.

Other WebAssembly Compilers

Emscripten isn’t the only way to compile code to WebAssembly. Indeed, WebAssembly was designed primarily as a compile target, rather than as a general-purpose language in its own right. There are myriad tools for compiling well-known languages to WebAssembly, and there are even some languages built with WebAssembly as the main target in mind, rather than machine code. Some are listed here, but it’s by no means exhaustive. You’ll notice a lot of “at time of writing” here, because this space is relatively new and the best ways of creating multithreaded WebAssembly code are still being developed! At least, at time of writing.

Clang/Clang++

The LLVM C-family compilers can target WebAssembly with the -target wasm32-unknown-unknown or -target wasm64-unknown-unknown options. This is actually what Emscripten is now based on, in which POSIX threads and atomic operations work as expected. At time of writing, this is some of the best support for multithreaded WebAssembly. While clang and clang++ support WebAssembly output, the recommended approach is to use Emscripten, to get the full suite of platform support in browsers and Node.js.

Rustc

The Rust programming language compiler rustc supports WebAssembly output. The Rust website is a great starting point on how to use rustc in this way. To make use of threads, you can use the wasm-bindgen-rayon crate, which provides a parallelism API implemented using web workers. At time of writing, Rust’s standard library thread support won’t work.

AssemblyScript

The AssemblyScript compiler takes a subset of TypeScript as input and generates WebAssembly output. While it does not support spawning threads, it does support atomic operations and using SharedArrayBuffers, so as long as you handle the threads themselves on the JavaScript side via web workers or worker_threads, you can make full use of multithreaded programming in AssemblyScript. We’ll cover it in more depth in “AssemblyScript”.

There are, of course, many more options, with new ones arriving all the time. It’s worth having a look around the web to see if your compiled language of choice can target WebAssembly, and whether or not it supports atomic operations in WebAssembly.

AssemblyScript

AssemblyScript is a subset of TypeScript that compiles to WebAssembly. Rather than compiling an existing language and providing implementations of existing system APIs, AssemblyScript was designed as a way to produce WebAssembly code with a much more familiar syntax than WAT. A major selling point of AssemblyScript is that many projects use TypeScript already, so adding some AssemblyScript code in order to take advantage of WebAssembly doesn’t require as much of a context switch or learning an entirely different programming language.

An AssemblyScript module looks a lot like a TypeScript module. If you’re unfamiliar with TypeScript, it can be thought of as ordinary JavaScript, but with some additional syntax to indicate type information. Here is a basic TypeScript module that performs addition:

export function add(a: number, b: number): number {
  return a + b
}

You’ll notice this looks almost exactly the same as a plain ECMAScript module, with the exception of type information in the form of : number after each of the function arguments and identifying the return value’s type. The TypeScript compiler can use these types to check that any code calling this function is passing in the correct types and assuming the correct type on the return value.

AssemblyScript looks much the same, except instead of using JavaScript’s number type, there are built-in types for each of the WebAssembly types. If we wanted to write the same addition module in AssemblyScript, assuming 32-bit integers everywhere for types, it would look something like Example 7-3. Go ahead and add that to a file called ch7-wasm-add/add.ts.

Example 7-3. ch7-wasm-add/add.ts
export function add(a: i32, b: i32): i32 {
  return a + b
}

Since AssemblyScript files are just TypeScript, they use the .ts extension just the same. To compile a given AssemblyScript file to WebAssembly, we can use the asc command from the assemblyscript module. Try running the following command in the ch7-wasm-add directory:

$ npx -p assemblyscript asc add.ts --binaryFile add.wasm

You can try running the WebAssembly code using the same add.js file from Example 7-2. The output should be the same since the code is the same.

If you omit the --binaryFile add.wasm you’ll get the module as translated into WAT, as shown in Example 7-4. You’ll see it’s roughly the same as Example 7-1.

Example 7-4. The WAT rendition of the AssemblyScript add function.
(module
 (type $i32_i32_=>_i32 (func (param i32 i32) (result i32)))
 (memory $0 0)
 (export "add" (func $add/add))
 (export "memory" (memory $0))
 (func $add/add (param $0 i32) (param $1 i32) (result i32)
  local.get $0
  local.get $1
  i32.add
 )
)

AssemblyScript doesn’t provide the ability to spawn threads, but threads can be spawned in the JavaScript environment, and SharedArrayBuffers can be used for the WebAssembly memory. Most importantly, it supports atomic operations via a global atomic namespace, not particularly different from regular JavaScript’s Atomics. The main difference is that rather than operating on a TypedArray, these functions operate on the linear memory of the WebAssembly module, with a pointer and an optional offset. See the AssemblyScript documentation for details.

To see this in action, let’s create one more implementation of our Happycoin example that we’ve been iterating on since Chapter 1.

Happycoin in AssemblyScript

Much like previous versions of our Happycoin example, this approach multiplexes the crunching of numbers over several threads and sends the results back. It’s a glimpse of how multithreaded AssemblyScript can work. In a real-world application, you’d want to take advantage of shared memory and atomic operations but to keep things simple, we’ll stick with just fanning the work out to the threads.

Let’s begin by creating a directory called ch7-happycoin-as and switch to that directory. We’ll initialize a new project and add some necessary dependencies as follows:

$ npm init -y
$ npm install assemblyscript
$ npm install @assemblyscript/loader

The assemblyscript package includes the AssemblyScript compiler, and the @assemblyscript/loader package gives us handy tools for interacting with the built module.

In the scripts object in the newly-created package.json, we’ll add "build" and "start" properties to simplify the compilation and running of the program:

"build": "asc happycoin.ts --binaryFile happycoin.wasm --exportRuntime",
"start": "node --no-warnings --experimental-wasi-unstable-preview1 happycoin.mjs"

The additional --exportRuntime parameter gives us some high-level tools for interacting with values from AssemblyScript. We’ll get into that a bit later.

When invoking Node.js in the "start" script, we pass the experimental WASI flag. This enables the WASI interface, giving WebAssembly access to system-level functionality that would otherwise be inaccessible. We’ll use this from AssemblyScript to generate random numbers. Because it’s experimental at time of writing, we’ll add the --no-warnings flag2 to suppress the warning we get for using WASI. The experimental status also means the flag may change in the future, so always be sure to consult the Node.js documentation for the version of Node.js you’re running.

Now, let’s write some AssemblyScript! Example 7-5 contains an AssemblyScript version of the Happycoin algorithm. Go ahead and add it to a file called happycoin.ts.

Example 7-5. ch7-happycoin-as/happycoin.ts
import 'wasi'; // 1

const randArr64 = new Uint64Array(1);
const randArr8 = Uint8Array.wrap(randArr64.buffer, 0, 8); // 2
function random64(): u64 {
  crypto.getRandomValues(randArr8); // 3
  return randArr64[0];
}

function sumDigitsSquared(num: u64): u64 {
  let total: u64 = 0;
  while (num > 0) {
    const numModBase = num % 10;
    total += numModBase ** 2;
    num = num / 10;
  }
  return total;
}

function isHappy(num: u64): boolean {
  while (num != 1 && num != 4) {
    num = sumDigitsSquared(num);
  }
  return num === 1;
}

function isHappycoin(num: u64): boolean {
  return isHappy(num) && num % 10000 === 0;
}

export function getHappycoins(num: u32): Array<u64> {
  const result = new Array<u64>();
  for (let i: u32 = 1; i < num; i++) {
    const randomNum = random64();
    if (isHappycoin(randomNum)) {
      result.push(randomNum);
    }
  }
  return result;
}
1

The wasi module is imported here to ensure that the appropriate WASI-enabled globals are loaded.

2

We initialized a Uint64Array for our random numbers, but crypto.getRandomValues() only works with Uint8Array, so we’ll create one of those here as a view on the same buffer. Also, the TypedArray constructors in AssemblyScript aren’t overloaded, so instead there’s a static wrap() method available to construct new TypedArray instances from ArrayBuffer instances.

3

This method is the one we enabled WASI for.

If you’re familiar with TypeScript, this file looks very close to just being a TypeScript port of “Happycoin: Revisited”. You’d be correct! This is one of the major advantages of AssemblyScript. We’re not writing in a brand-new language, and yet we’re writing code that maps very closely to WebAssembly. Note that the return value of the exported function is of type Array<u64>. Exported functions in WebAssembly can’t return arrays of any kind, but they can return an index into the module’s memory (a pointer, really), which is exactly what’s happening here. We could deal with this manually, but as we’ll see, the AssemblyScript loader makes it much easier.
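To see why a pointer is enough to get an array back out, here is a simplified JavaScript sketch that reads 64-bit values from a linear memory at a given address. It assumes the pointer refers directly to contiguous u64 values and that the length is known. AssemblyScript’s real array layout is more involved, which is part of what the loader handles for us. The writeU64Array and readU64Array helpers and the address 64 are hypothetical.

```javascript
// Stand-in for a module's linear memory (one 64 KiB page).
const memory = new WebAssembly.Memory({ initial: 1 });

// Simulates the WebAssembly side placing values in memory and returning a pointer.
// The pointer must be a multiple of 8 to be viewed as 64-bit values.
function writeU64Array(memory, ptr, values) {
  new BigUint64Array(memory.buffer, ptr, values.length).set(values);
  return ptr;
}

// What the host does with the returned pointer: view the bytes as u64 values
// and copy them into an ordinary JavaScript array of BigInts.
function readU64Array(memory, ptr, length) {
  return Array.from(new BigUint64Array(memory.buffer, ptr, length));
}

const values = [7641056713284760000n, 10495060512882410000n];
const pointer = writeU64Array(memory, 64, values); // 64 is an arbitrary aligned address
const copied = readU64Array(memory, pointer, values.length);
console.log(copied.join(' '));
```

Only the integer pointer crosses the function boundary. The data itself stays in linear memory until the host copies it out.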

Of course, since AssemblyScript doesn’t provide a way of spawning threads on its own, we’ll need to do that from JavaScript. For this example, we’ll use ECMAScript modules to take advantage of top-level await, so go ahead and put the contents of Example 7-6 into a file called happycoin.mjs.

Example 7-6. ch7-happycoin-as/happycoin.mjs
import { WASI } from 'wasi'; // 1
import fs from 'fs/promises';
import loader from '@assemblyscript/loader';
import { Worker, isMainThread, parentPort } from 'worker_threads';

const THREAD_COUNT = 4;

if (isMainThread) {
  let inFlight = THREAD_COUNT;
  let count = 0;
  for (let i = 0; i < THREAD_COUNT; i++) {
    const worker = new Worker(new URL(import.meta.url)); // 2
    worker.on('message', msg => {
      count += msg.length;
      process.stdout.write(msg.join(' ') + ' ');
      if (--inFlight === 0) {
        process.stdout.write('\ncount ' + count + '\n');
      }
    });
  }
} else {
  const wasi = new WASI();
  const importObject = { wasi_snapshot_preview1: wasi.wasiImport };
  const wasmFile = await fs.readFile('./happycoin.wasm');
  const happycoinModule = await loader.instantiate(wasmFile, importObject);
  wasi.start(happycoinModule);

  const happycoinsWasmArray =
    happycoinModule.exports.getHappycoins(10_000_000/THREAD_COUNT);
  const happycoins = happycoinModule.exports.__getArray(happycoinsWasmArray);
  parentPort.postMessage(happycoins);
}
1

This can’t be done without the --experimental-wasi-unstable-preview1 flag.

2

If you’re new to ESM, this might look strange. We don’t get the __filename variable available to us like we do in CommonJS modules. Instead the import.meta.url property gives us the full path as a file URL string. We need to pass that to the URL constructor for it to be usable as an input to the Worker constructor.

Adapted from “Happycoin: Revisited”, we’re again checking whether we’re in the main thread or not, and spawning four worker threads from the main thread. In the main thread, we’re expecting only one message on the default MessagePort, containing an array of found Happycoins. We simply log those, and a count of all of them once all the worker threads have sent the message.

On the else side, in the worker threads, we initialize a WASI instance to pass to the WebAssembly module, and then instantiate the module using @assemblyscript/loader, giving us what we need to handle the array return value we get from the getHappycoins function. We call the getHappycoins() method exported by the module, which gives us a pointer to an array in the WebAssembly linear memory. The __getArray function, provided by the loader, converts that pointer into a JavaScript Array, which we can then use as normal. We pass that to the main thread for output.

To run this example, run the following two commands. The first will compile the AssemblyScript to WebAssembly, and the second will run it via the JavaScript we just put together:

$ npm run build
$ npm start

The output will look roughly the same as with previous Happycoin examples. Here is the output from one local run:

7641056713284760000
... 134 entries redacted for brevity ...
10495060512882410000
count 136

As with all of these solutions, it’s important to evaluate the tradeoffs made with proper benchmarks. As an exercise, try timing this example against the other Happycoin implementations in this book. Is it faster or slower? Can you figure out why? What improvements can be made?
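One possible starting point for that exercise is a small timing helper like the following sketch. The timeMs, timeCommandMs, and median names are hypothetical, and wall-clock timing of whole processes is only a rough measure, since it includes startup and module compilation time. Because Happycoin works on random numbers, individual runs vary, so taking the median of several runs is a reasonable precaution.

```javascript
const { spawnSync } = require('child_process');

// Run fn once and return the elapsed wall-clock time in milliseconds.
function timeMs(fn) {
  const start = process.hrtime.bigint();
  fn();
  const end = process.hrtime.bigint();
  return Number(end - start) / 1e6; // nanoseconds to milliseconds
}

// Time a whole command, discarding its output. Throws if it fails to run.
function timeCommandMs(command, args) {
  return timeMs(() => {
    const result = spawnSync(command, args, { stdio: 'ignore' });
    if (result.error) throw result.error;
    if (result.status !== 0) {
      throw new Error(`${command} exited with status ${result.status}`);
    }
  });
}

// Median of a list of numbers, which is less sensitive to outliers than the mean.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[middle]
    : (sorted[middle - 1] + sorted[middle]) / 2;
}

// Example usage, to be run from the ch7-happycoin-as directory after building:
// const runs = [1, 2, 3, 4, 5].map(() => timeCommandMs('node', [
//   '--no-warnings', '--experimental-wasi-unstable-preview1', 'happycoin.mjs',
// ]));
// console.log('median ms:', median(runs));
```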

1 In C and other languages without automatic memory management, memory must be allocated for use with allocation functions like malloc() and then freed for later allocation with functions like free(). Memory management techniques like garbage collection make it easier to write programs in higher-level languages like JavaScript, but aren’t a built-in feature of WebAssembly.

2 In general, this isn’t a flag you want to have enabled for a production application. Hopefully by the time you read this, WASI support will no longer be experimental. If that’s the case, adjust these arguments accordingly.
