Threads in Node

The quick and simple answer is: to help Node excel in the one area where it has suffered in the past, which is dealing with heavy, CPU-intensive computations. This is mainly why Node.js is not strong in areas such as AI, machine learning, data science and similar. There are a lot of efforts in progress to solve that, but Node is still not as performant there as it is when deploying microservices, for instance.

So I’m going to try and simplify the technical documentation provided by the initial PR and the official docs into a more practical and simple set of examples. Hopefully that’ll be enough to get you started.

So how do we use the new Threads module?

To start with, you’ll be requiring the module called “worker_threads”.

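Note that on the Node releases where the module first shipped, this will only work if you use the --experimental-worker flag when executing the script; otherwise the module will not be found. From Node 11.7 on, the module loads without the flag.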

Notice how the flag refers to workers and not threads. This is how they’re going to be referenced throughout the documentation: worker threads, or simply workers.
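
If you are not sure whether your Node build exposes the module, a small check like the one below can help. It is only a sketch: loadWorkerThreads is a made-up helper name, and the hint it prints assumes you are on one of the releases that still need the flag.

```javascript
// Sketch: check that worker_threads loads on this Node build before relying on it.
// loadWorkerThreads is a hypothetical helper name, not part of Node.
function loadWorkerThreads() {
  try {
    return require('worker_threads');
  } catch (err) {
    // require() throws when the module cannot be found,
    // e.g. on an older build started without the flag.
    return null;
  }
}

const wt = loadWorkerThreads();
if (wt) {
  console.log('worker_threads loaded; running in main thread:', wt.isMainThread);
} else {
  console.log('worker_threads not found; try: node --experimental-worker <script>');
}
```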

If you’ve used multi-processing in the past, you’ll see a lot of similarities with this approach, but if you haven’t, don’t worry, I’ll explain as much as I can.

What can you do with them?

Like I mentioned before, worker threads are meant for CPU-intensive tasks. Using them for I/O would be a waste of resources: according to the official documentation, the internal mechanism Node already provides to handle async I/O is much more efficient than using a worker thread for that, so… don’t bother.
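
To make the CPU-bound case concrete, here is a minimal sketch that pushes a deliberately slow Fibonacci function into a worker while a timer keeps firing in the main thread. The fibInWorker helper and the inline worker source are assumptions made for illustration; the eval: true option is used only so the sketch fits in a single file.

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) only so the sketch is self-contained.
const fibWorkerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Deliberately slow recursive Fibonacci: pure CPU work, no I/O.
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  parentPort.postMessage(fib(workerData));
`;

// Hypothetical helper: compute fib(n) in a worker, Node-style callback.
function fibInWorker(n, cb) {
  const worker = new Worker(fibWorkerSource, { eval: true, workerData: n });
  worker.once('message', (result) => cb(null, result));
  worker.once('error', cb);
}

// The main thread's event loop is free to run timers while the worker computes.
let ticks = 0;
const ticker = setInterval(() => { ticks += 1; }, 10);

fibInWorker(32, (err, result) => {
  clearInterval(ticker);
  if (err) return console.error(err);
  console.log('fib(32) =', result);
  console.log('main thread timer fired', ticks, 'times in the meantime');
});
```

Had the same function been called directly in the main thread, the timer could not have fired until the computation was done.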

Let’s start with a simple example of how you would go about creating a worker and using it.

Example 1:

const { Worker, isMainThread, workerData } = require('worker_threads');

let currentVal = 0;
let intervals = [100, 1000, 500];

// Prints the current counter value for the given thread id and returns it
function counter(id, i) {
	console.log("[", id, "]", i);
	return i;
}

if (isMainThread) {
	console.log("this is the main thread");
	// Spawn 2 workers from this same file; each gets its index as workerData
	for (let i = 0; i < 2; i++) {
		let w = new Worker(__filename, { workerData: i });
	}

	setInterval((a) => currentVal = counter(a, currentVal + 1), intervals[2], "MainThread");
} else {
	console.log("this isn't");

	// workerData (0 or 1) picks both this worker's interval and its label
	setInterval((a) => currentVal = counter(a, currentVal + 1), intervals[workerData], workerData);
}

The above example will simply output a set of lines showing incremental counters, which will increase their values at different speeds.

Let’s break it down:

The code inside the IF statement creates 2 worker threads. The code for them is taken from the same file, due to the __filename parameter passed. Workers need the full path to the files right now and can’t handle relative paths, which is why this value is used.
The 2 workers are sent a value as a global parameter, in the form of the workerData attribute you see as part of the second argument. That value can then be accessed through a constant with the same name (see how the constant is created in the first line of the file and used later on in the last line).
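
One detail worth knowing about workerData is that the worker gets a copy of the value (it is cloned on the way in), not a shared reference. The sketch below illustrates that under a few assumptions: roundTrip is a made-up helper, and the worker source is passed inline with eval: true purely to keep everything in one file.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source: bump the counter on the worker's copy and send it back.
const source = `
  const { parentPort, workerData } = require('worker_threads');
  workerData.count += 1; // changes the worker's own copy only
  parentPort.postMessage(workerData);
`;

// Hypothetical helper: start a worker with some data, resolve with its reply.
function roundTrip(data) {
  return new Promise((resolve, reject) => {
    const w = new Worker(source, { eval: true, workerData: data });
    w.once('message', resolve);
    w.once('error', reject);
  });
}

const original = { id: 'w0', count: 1 };
roundTrip(original).then((fromWorker) => {
  console.log('worker sent back count:', fromWorker.count);
  console.log('parent object still has count:', original.count);
});
```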

This example is one of the most basic things you can do with this module, but it’s not really that fun, is it? Let’s look at another example.

Example 2: Actually doing something

Let’s try now to do some “heavy” computation while at the same time, doing some async stuff in the main thread.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const request = require("request");

if (isMainThread) {
  console.log("This is the main thread");

  let w = new Worker(__filename, { workerData: null });
  w.on('message', (msg) => { // A message from the worker!
    console.log("First value is: ", msg.val);
    console.log("Took: ", (msg.timeDiff / 1000), " seconds");
  });
  w.on('error', console.error);
  w.on('exit', (code) => {
    if (code !== 0)
      console.error(new Error(`Worker stopped with exit code ${code}`));
  });

  request.get('http://www.google.com', (err, resp) => {
    if (err) {
      return console.error(err);
    }
    console.log("Total bytes received: ", resp.body.length);
  });

} else { // the worker's code

  function random(min, max) {
    return Math.random() * (max - min) + min;
  }

  const sorter = require("./test2-worker");

  const start = Date.now();
  let bigList = Array(1000000).fill().map((_) => random(1, 10000));

  sorter.sort(bigList);
  parentPort.postMessage({ val: sorter.firstValue, timeDiff: Date.now() - start });

}

This time around, we’re requesting the homepage for Google.com and at the same time, sorting a randomly generated array of 1 million numbers. This is going to take a few seconds, so it’s perfect for us to show how well this behaves. We’re also going to measure the time it takes for the worker thread to perform the sorting and we’re going to send that value (along with the first sorted value) to the main thread, where we’ll display the results.

The main takeaway from this example is the communication between threads.

The main thread receives messages from a worker through the on method of the Worker instance. The events we can listen to are the ones shown in the code: message, error and exit. The message event is triggered whenever the worker’s code sends a message using the parentPort.postMessage method. You can also go the other way: call postMessage on your worker instance in the main thread, and catch those messages inside the worker’s code using the parentPort object.
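
As a rough illustration of both directions at once, the sketch below sends a message to a worker and waits for its reply. The askWorker helper and the message shapes ({ text } and { reply }) are invented for this example, and the worker source is inline (eval: true) only to keep it in one file.

```javascript
const { Worker } = require('worker_threads');

// Inline worker source: reply to every message with its text in upper case.
const echoSource = `
  const { parentPort } = require('worker_threads');
  // Messages sent with worker.postMessage() in the main thread arrive here.
  parentPort.on('message', (msg) => {
    parentPort.postMessage({ reply: msg.text.toUpperCase() });
  });
`;

// Hypothetical helper: send one message to a fresh worker, resolve with the reply.
function askWorker(text) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(echoSource, { eval: true });
    worker.once('message', (msg) => {
      // The worker keeps listening on parentPort, so stop it explicitly.
      worker.terminate();
      resolve(msg.reply);
    });
    worker.once('error', reject);
    worker.postMessage({ text });
  });
}

askWorker('hello from main').then((reply) => console.log('worker replied:', reply));
```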

In case you’re wondering, the code for the helper module I used is here, although there is nothing noteworthy about it.
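
If you only want the example to run and don’t have that helper at hand, a stand-in along these lines should be enough. To be clear, this is a hypothetical replacement and not the original module: its shape is inferred only from the sort and firstValue names the example uses, and it would be saved as test2-worker.js next to the script.

```javascript
// Hypothetical stand-in for ./test2-worker.js (NOT the original helper module).
const sorter = {
  firstValue: null,

  // Sorts the list in place, in ascending numeric order,
  // and remembers the smallest element in firstValue.
  sort(list) {
    // A numeric comparator is needed: the default sort compares items as strings.
    list.sort((a, b) => a - b);
    this.firstValue = list[0];
    return list;
  },
};

module.exports = sorter;
```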

Let’s now look at a very similar example, but with cleaner code, giving you a final idea of how you could structure your worker thread’s code.

Example 3: bringing it all together

As a final example, I’m going to stick to the same functionality, but show you how you could clean it up a bit and end up with a more maintainable version.

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');
const request = require("request");

function startWorker(path, cb) {
  let w = new Worker(path, { workerData: null });
  w.on('message', (msg) => {
    cb(null, msg);
  });
  w.on('error', cb);
  w.on('exit', (code) => {
    if (code !== 0)
      console.error(new Error(`Worker stopped with exit code ${code}`));
  });
  return w;
}

console.log("this is the main thread");

let myWorker = startWorker(__dirname + '/workerCode.js', (err, result) => {
  if (err) return console.error(err);
  console.log("[[Heavy computation function finished]]");
  console.log("First value is: ", result.val);
  console.log("Took: ", (result.timeDiff / 1000), " seconds");
});

const start = Date.now();
request.get('http://www.google.com', (err, resp) => {
  if (err) {
    return console.error(err);
  }
  console.log("Total bytes received: ", resp.body.length);
  //myWorker.postMessage({finished: true, timeDiff: Date.now() - start}) //you could send messages to your workers like this
});

And your thread code can be inside another file, such as:

const { parentPort } = require('worker_threads');

function random(min, max) {
	return Math.random() * (max - min) + min;
}

const sorter = require("./test2-worker");

const start = Date.now();
let bigList = Array(1000000).fill().map((_) => random(1, 10000));

/**
//you can receive messages from the main thread this way:
parentPort.on('message', (msg) => {
	console.log("Main thread finished on: ", (msg.timeDiff / 1000), " seconds...");
})
*/

sorter.sort(bigList);
parentPort.postMessage({ val: sorter.firstValue, timeDiff: Date.now() - start });

Breaking this one down, we see:

Main thread and worker threads now have their code inside different files. This is easier to maintain and extend.
The startWorker function returns the new instance, allowing you to later send messages to it if you so wanted.
You no longer need to worry if your main thread’s code is actually the main thread (we removed the main IF statement).
You can see in the worker’s code how you would receive a message from the main thread, allowing for a two-way asynchronous communication.
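
If you prefer promises over callbacks, the same structure can be wrapped so that several workers can be awaited together. What follows is a sketch and not part of the example above: runWorker is a hypothetical variant of startWorker, and it takes inline worker source (eval: true) instead of a file path only so the snippet runs on its own.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical Promise-based variant of startWorker.
function runWorker(source, workerData) {
  return new Promise((resolve, reject) => {
    const w = new Worker(source, { eval: true, workerData });
    w.once('message', resolve);
    w.once('error', reject);
    w.once('exit', (code) => {
      // After a successful message this reject is a no-op: the promise is settled.
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

// Inline worker source: build a random list of workerData items, sort it, report back.
const sortSource = `
  const { parentPort, workerData } = require('worker_threads');
  const start = Date.now();
  const list = Array.from({ length: workerData }, () => Math.random() * 10000);
  list.sort((a, b) => a - b);
  parentPort.postMessage({ val: list[0], timeDiff: Date.now() - start });
`;

async function main() {
  // Both workers run at the same time; await collects their results in order.
  const results = await Promise.all([
    runWorker(sortSource, 100000),
    runWorker(sortSource, 200000),
  ]);
  results.forEach((r, i) => {
    console.log(`worker ${i}: first value ${r.val}, took ${r.timeDiff / 1000} seconds`);
  });
}

main().catch(console.error);
```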

That is going to be it for this article. I hope you got enough to understand how to get started and play around with this new module. Remember that:

This is still highly experimental and things explained here can change in future releases.
Go and read the PR comments and docs. There is more information about this in there; I just focused on the basic steps to get it going.
Have fun! Play around, report bugs and suggest improvements. This is just starting.
