Node Worker Threads With Shared Array Buffers and Rust WebAssembly
WebAssembly enables Rust code to run as part of a JavaScript application, which can improve performance and stability. However, not every Rust application can run under WebAssembly, because it was originally designed to run inside a browser and communicate with browser-based JavaScript. That environment raises security concerns and the potential for bad behavior, which is why most OS-level functionality is blocked, including Rust threading and multiprocessing. What's left is essentially the pure language with the plain standard library, plus web_sys, the main crate for accessing browser APIs from Rust WebAssembly.
Unlike browser-based JavaScript, Node.js has access to all sorts of OS-level functionality. But sadly, there is no version of WebAssembly designed specifically around Node.js capabilities. To work around this, you can build a Rust-based WebAssembly project and call back and forth between Node.js and Rust, delegating the compute-heavy operations in your project to Rust.
WebAssembly was originally designed to work as an isolated component inside JavaScript and communicate through event-based messages, much like a Web Worker does today. Later implementations moved away from that model, and today WebAssembly behaves more like a compiled library that exposes a lower-level API.

Threading with WebAssembly

The point of having Rust-based WebAssembly in Node.js is to offload compute-heavy parts from Node.js to Rust, which runs significantly faster for tasks that require heavy algorithmic or memory optimization, because the Rust compiler can aggressively optimize the generated code. However, combining single-threaded Node.js with Rust-based WebAssembly that also runs without threading won't save you much time or many resources.
The idea is to use the Node.js worker_threads module to run the Rust WebAssembly computation in a separate thread, so no synchronous operation is left waiting in the main thread. Spinning up a Node.js worker thread for a JavaScript WebAssembly wrapper looks similar to this:
const { Worker } = require('worker_threads');

...
// Inside an enclosing Promise, so resolve and reject are in scope
const worker = new Worker('wasm-wrapper.js', { workerData: someDataIfWeNeed });
worker.on('message', resolve);
worker.on('error', reject);
worker.on('exit', (code) => {
  if (code !== 0) {
    console.log(`Worker stopped with exit code ${code}`);
  }
});
wasm-wrapper.js is not the actual Wasm file; worker_threads spins up only JavaScript files, which it can parse and execute as the worker's entry point. A Wasm file itself is just a library that exports functions to be called from JavaScript, which is why we need a JS wrapper, sketched below.
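A minimal sketch of such a wrapper might look like the following, assuming a wasm-pack-generated package that exports a greet function (built in the next section); the exact pkg path depends on your project name:
// wasm-wrapper.js - hypothetical worker entry point
const { parentPort, workerData } = require('worker_threads');

// The wasm-pack-generated JS wrapper exports plain functions;
// the path and the greet function are taken from the next section
const { greet } = require('./pkg/<project_name>.js');

// Run the Wasm computation off the main thread and send the result back
parentPort.postMessage({ input: workerData, result: greet() });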

How to make a real WebAssembly integration in Rust

Before Rust got involved with WebAssembly, it was very hard to compile a project into a Wasm file due to the lack of compilation tooling. The Rust community has made it amazingly simple to jump into WebAssembly.
To start, install Rust with Cargo, then install wasm-pack (for example, with cargo install wasm-pack). Once the base installation is done, you're ready to start coding.
mod utils;

use wasm_bindgen::prelude::*;

// When the `wee_alloc` feature is enabled, use `wee_alloc` as the global
// allocator.
#[cfg(feature = "wee_alloc")]
#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

#[wasm_bindgen]
pub fn greet() -> String {
    String::from("Hello World!")
}
This is a basic “Hello, World!” for Rust WebAssembly. By compiling it with wasm-pack, we get a full JS wrapper and Wasm file.
~# wasm-pack build --target=nodejs

# Output files
pkg/<project_name>_bg.wasm
pkg/<project_name>.js
We are not going to work with the Wasm file directly, because the generated JS file already wraps it with helper functions.
const {greet} = require('./pkg/<project_name>.js');
console.log(greet());

// This will print "Hello World!"
This basic example shows how easy it can be to integrate WebAssembly with Node.js. Now let's connect these two pieces with a shared buffer variable inside a worker thread.

WebAssembly and worker threads

We’re at the stage where we can call a WebAssembly function within Node.js. Again, the actual Node.js worker thread is just a JS file that needs to be executed as a separate thread.
First, let’s make two JavaScript files, like this:
// main.js - the main executional file to start the program from
const { Worker } = require('worker_threads');

const worker = new Worker('./worker.js');
worker.once('message', (message) => {
  console.log(message);
});
// Kick off the worker; it replies with the Wasm result
worker.postMessage('start');

// worker.js - worker file to be called from main.js
const { parentPort } = require('worker_threads');
const { greet } = require('./pkg/<project_name>.js');

parentPort.once('message', (message) => {
  parentPort.postMessage(greet());
});
Our greeting message will be printed from the main thread, but the actual WebAssembly is executed in the worker thread. Using this basic principle, we can run Wasm code in a separate thread and wait for a message from it, as in the sketch below.
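To wait for the worker's result without blocking, the message handshake can be wrapped in a Promise. This is a minimal sketch reusing the worker.js file above; the runWasmGreeting helper name is made up for the example:
// main.js - hypothetical promisified variant of the example above
const { Worker } = require('worker_threads');

function runWasmGreeting() {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./worker.js');
    worker.once('message', resolve); // resolves with the Wasm greeting
    worker.once('error', reject);
    worker.postMessage('start');     // worker.js replies with greet()
  });
}

runWasmGreeting().then((greeting) => console.log(greeting));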
A few companies are doing computationally heavy 3D rendering with WebAssembly across many worker threads. This helps keep JavaScript's main event loop nonblocking while scaling across many CPU cores.
What if you want to share a variable between Wasm and JS? This is a bit more complicated in theory than in practice, because Rust's borrowing and mutable references usually do their job. However, you can't simply hand an arbitrary JS variable to Rust: the communication channel is shared, plain memory, which is just a buffer of bytes. On the JS side this usually takes the form of a typed array over an ArrayBuffer (or SharedArrayBuffer), which helps transfer data between the different JavaScript and Rust data models.
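Before bringing Wasm into it, it's worth seeing the shared buffer idea on its own: a SharedArrayBuffer gives the main thread and a worker thread views onto the same bytes rather than copied messages. The following is a minimal sketch; the file name shared-worker.js is made up for the example:
// main.js - hypothetical example of sharing memory with a worker thread
const { Worker } = require('worker_threads');

const shared = new SharedArrayBuffer(16);
const view = new Uint8Array(shared);

// workerData is structured-cloned, but a SharedArrayBuffer is shared, not copied
const worker = new Worker('./shared-worker.js', { workerData: shared });
worker.once('message', () => {
  console.log(view[0]); // 10 - written by the worker, visible here
});

// shared-worker.js - writes into the shared memory it was given
const { parentPort, workerData } = require('worker_threads');

const workerView = new Uint8Array(workerData);
workerView[0] = 10;            // mutation is visible to the main thread
parentPort.postMessage('done');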

Shared array buffers from JS to Wasm

Only a few array types are supported by the Rust Wasm implementation. The most common is the &[u8] byte slice, which is a plain byte-based representation of data. As you know from basic computer science courses, all data ultimately consists of bytes, so you can pass a byte array, representing complex objects encoded in some format, between Wasm and JS.
For example, let's modify our Rust function to take a mutable array as an argument.
...
#[wasm_bindgen]
pub fn greet(input: &mut [u8]) -> Vec<u8> {
    input[0] = 10; // just changing some value here
    Vec::from(input)
}
The Rust code expects to receive a mutable reference to an array coming from JS memory, and because everything runs inside the same process, that memory is accessible on both sides. Since the function gets a mutable view of the array rather than an independent copy, we can change values in it and the changes are reflected in the original JS memory.
const {greet} = require('./pkg/noders');

const arr = new Uint8Array(11);

console.log(greet(arr)); // [10, 0, 0...]

console.log(arr);  // [10, 0, 0...]
This basic principle enables you to pass plain data arrays between Wasm-compiled code and JavaScript. Of course, you could potentially build an entire shared type system for WebAssembly, because everything can be represented as a plain byte array. Remember in C when you had to make memcpy(...) calls with a pointer to an actual structure? This is a similar scenario, but there is no specific use case for it yet. Usually, plain byte array messaging will do, as in the sketch below.
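As a rough illustration of byte array messaging, a structured object can be serialized to bytes on the JS side before being handed to a Wasm function. The snippet below reuses the greet function from above and picks JSON plus TextEncoder as one possible encoding; a real Rust counterpart could decode the bytes with something like serde_json instead of just poking at them:
// Hypothetical example: shipping a structured object to Wasm as plain bytes
const { greet } = require('./pkg/noders');

const payload = { user: 'alice', scores: [1, 2, 3] };

// Encode the object into a Uint8Array; JSON is only one possible format
const bytes = new TextEncoder().encode(JSON.stringify(payload));

// The Wasm side only ever sees bytes, so any format both sides agree on works
console.log(greet(bytes));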

Conclusion

WebAssembly is going to take over some heavy-load tasks, and the more tools we build around it, the more seriously we'll take it, especially now that we have the ability to share memory between JS and Wasm. Combine that with Node.js worker threads and we have the power to scale JavaScript code across many CPU cores, and even GPUs, since we can access the GPU through Rust WebAssembly.
