
Advanced Node.js: A Hands-On Guide to the Event Loop, Child Processes and Worker Threads in Node.js

What makes Node.js so performant and scalable? Why is Node the technology of choice for so many companies? In this article, we will answer these questions and look at some of the advanced concepts that make Node.js unique. We will discuss:
  1. Event Loop ➰
  2. Concurrency Model 🚈
  3. Child Process 🎛️
  4. Threads and Worker Threads 🧵
JavaScript developers with a deeper understanding of Node.js reportedly earn 20% to 30% more than their peers. If you are looking to grow your knowledge of Node.js, then this blog post is for you. Let’s dive in 🤿!!

What happens when you run a Node.js Program?

When we run our Node.js app, it creates:
  • 1 Process 🤖
  • 1 Thread 🧵
  • 1 Event Loop ➰
A process is an executing program, or a part of an executing program. An application can be made up of many processes; the Node.js runtime, however, initiates only one process.
A thread is the basic unit to which the operating system allocates processor time. Think of a thread as the unit that lets you use part of your processor.
An event loop is a continuously running loop (just like a while loop). It executes one command at a time (more on this later). For now, think of it as a while loop that keeps running until Node has executed every line of code.
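Conceptually, the event loop can be pictured as something like this. This is only a simplified mental model with made-up helper names, not Node's actual implementation:

// Simplified mental model of the event loop (not real Node internals)
while (thereIsWorkLeft()) {
  const task = getNextTask();   // pick the next queued task
  runToCompletion(task);        // execute it, one command at a time
}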
Now, let’s take a look at how our code runs inside a Node.js instance.
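The snippet discussed below looks roughly like this (a minimal sketch; the task names and the loop size are illustrative):

console.log('Task 1');
console.log('Task 2');

// A time-consuming, CPU-bound loop that keeps the event loop busy
let total = 0;
for (let i = 0; i < 3e9; i++) {
  total += i;
}

console.log('Task 3');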

What happens when we run this code? It will first print out Task 1, then Task 2, then it will run the time-consuming for loop (we won’t see anything in the terminal for a couple of seconds) and finally it will print out Task 3. Let’s look at a diagram of what’s actually happening.
[Diagram: Component 1]
Node puts all our tasks into an events queue and sends them one by one to the event loop. The event loop is single-threaded and can only run one thing at a time. So it goes through Task 1 and Task 2, then the very big for loop, and then it gets to Task 3. This is why we see a pause in the terminal after Task 2: the event loop is busy running the for loop.
Now let’s do something different. Let’s replace that for loop with an I/O event.
Pro tip: you can generate a 100 MB file on Linux or macOS just by running this command: dd if=/dev/urandom of=ridiculously_large_file.txt bs=1048576 count=100
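Replacing the loop with a file read might look like this (a minimal sketch; the file name matches the pro tip above):

const fs = require('fs');

console.log('Task 1');
console.log('Task 2');

// Asynchronous, non-blocking read of the large file
fs.readFile('./ridiculously_large_file.txt', (err, data) => {
  if (err) throw err;
  console.log('done reading file');
});

console.log('Task 3');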
We would naturally assume that this will output something similar: just like the for loop, reading a big file takes time, so execution on the event loop should pause. However, we get something totally different.

But what caused this? How did Task 3 get executed before the file was read? Let’s take a look at the visuals below to see what’s happening.
[Diagram: Component 2]
I/O tasks, network requests and database operations are classified as blocking tasks in Node.js. Whenever the event loop encounters one of these tasks, it sends it off to a different thread and moves on to the next task in the events queue. A thread from the thread pool is assigned to handle each blocking task, and when it is done, it puts the result in a callback queue. Once the event loop has finished executing everything in the events queue, it starts executing the tasks in the callback queue. That’s why we see done reading file at the end.
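A quick way to see the callback queue in action: even a zero-delay timer callback only runs after the synchronous work in the events queue has finished. A tiny sketch (the output order is the point, not the timings):

console.log('start');

// This callback goes to the callback queue, even with a 0 ms delay
setTimeout(() => console.log('timer callback'), 0);

// Synchronous work on the event loop always runs first
for (let i = 0; i < 1e8; i++) {}

console.log('end');
// Prints: start, end, timer callback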

What makes the Single-Threaded Event Loop Model Efficient? ⚙️

JavaScript was created to do simple things in web browsers, such as form validation or simple animations. This is why it was built with the single-threaded event loop model. Running everything in one thread is usually considered a disadvantage.
However, in 2009 Ryan Dahl, the creator of Node, saw this simple event loop model as an opportunity to build a lightweight web server.
To better understand what problem Node.js solves, we should look at what typical web servers were like before Node.js came into play.
This is how a traditional multi-threaded web application model handles request:
  1. It maintains a thread pool (a collection of available threads)
  2. When a client request comes in, a thread is assigned to it
  3. This thread takes care of reading the client request, processing it, performing any blocking I/O operations (if required) and preparing the response
  4. The thread is not freed until the response is sent back
The main drawback of this model is handling concurrent users. If more users visit our site than there are available threads, some users will need to wait until a thread frees up before they get a response. If a lot of users are performing blocking I/O tasks, this wait time increases further. The model is also very resource-heavy: if we are expecting one million concurrent users, we had better make sure we have enough threads to handle those requests.
Moreover, the server itself starts to slow down because of the increasing load. There is also the overhead of context switching between threads, and writing applications that optimize resource sharing between threads can be painful.
Because of its single-threaded model, Node.js doesn’t need to spin up a new thread for every single request. Node.js also delegates blocking tasks to other components, as we saw earlier. Since we don’t have to manage many threads, Node.js is very lightweight and ideal for microservice-based architectures.

Drawbacks of Node’s Single-Threaded Model !!!

The single-threaded event loop architecture uses resources efficiently, but it is not without drawbacks. A Node.js instance cannot immediately benefit from the multiple cores in your CPU. A Java application, for example, can immediately take advantage of more memory as we upgrade our hardware, but Node runs on a single thread.
This is 2020 😄 and we are seeing more and more complicated web applications. What if our application needs to do complex computation or run a machine learning algorithm? Or what if we want to run a complicated crypto algorithm? In these cases we have to harness the power of multiple cores to increase performance.
Languages like Java and C# can programmatically initiate threads and harness the power of multiple cores. In Node.js that is not an option, as we saw earlier. Node’s way of solving this problem is child_process.

Child Process in Node

The child_process module gives Node the ability to spawn child processes, either by running operating system commands or by forking new Node.js instances.
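For example, running an operating system command from Node might look like this (a small sketch using the module's exec method):

const { exec } = require('child_process');

// Run an OS command in a separate child process
exec('ls -la', (err, stdout, stderr) => {
  if (err) return console.error(err);
  console.log(stdout);
});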
Let’s assume we have a REST endpoint that has a long-running function and we would like to use multiple cores in our processor to execute this function.
Here’s our code
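The snippet below is a minimal sketch of such an endpoint, using Node's built-in http module and fork; the file names and the long-running loop are illustrative:

// server.js: the parent process
const http = require('http');
const { fork } = require('child_process');

const server = http.createServer((req, res) => {
  if (req.url === '/compute') {
    // Fork a separate Node.js process so the heavy work can use another core
    const child = fork('./long-computation.js');
    child.send({ count: 1e9 });            // pass input to the child
    child.on('message', (sum) => {         // receive the result back
      res.end(`Sum is ${sum}\n`);
    });
  } else {
    res.end('OK\n');
  }
});

server.listen(3000);

// long-computation.js: runs inside the forked child process
process.on('message', ({ count }) => {
  let sum = 0;
  for (let i = 0; i < count; i++) sum += i;
  process.send(sum);                       // send the result to the parent
  process.exit();
});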


In the example above we demonstrate how we can spin off a new process and share data between the parent and the child. Using the forked process, we can take advantage of multiple CPU cores.
You can take a look at all the child_process methods in the official Node docs.
Here is a diagram of how child processes work.
[Diagram: Component 3]
child_process is a good solution, but there’s another option. The child_process module spins off new instances of Node to distribute the workload, and each of these instances has 1 process, 1 thread and 1 event loop. In 2018 Node.js introduced worker_threads. This module gives Node the ability to have:
  • 1 Process
  • Multiple threads
  • 1 Event Loop per thread
Yes!! You read that right 😄.
[Diagram: Component 4]
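A minimal sketch of worker_threads usage (the worker names and the workload are illustrative):

const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  // Main thread: create two workers and pass each one some data
  for (const name of ['worker 1', 'worker 2']) {
    const worker = new Worker(__filename, { workerData: name });
    worker.on('message', (msg) => console.log(`${name}: ${msg}`));
  }
} else {
  // Worker thread: do some work, then post the result back to the main thread
  let sum = 0;
  for (let i = 0; i < 1e8; i++) sum += i;
  parentPort.postMessage(`done, sum = ${sum} (got "${workerData}")`);
}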


We check whether we are on the main thread and, if so, create two workers and pass messages to them. Inside the worker thread, the worker executes its task and the data gets passed back through the postMessage method.
Since worker_threads creates new threads inside the same process, it requires fewer resources. We are also able to pass data between these threads because they can have shared memory space.
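As an illustration of that shared memory space, the main thread and a worker can operate on the same SharedArrayBuffer (a small sketch; the counter and the value added are illustrative):

const { Worker, isMainThread, workerData } = require('worker_threads');

if (isMainThread) {
  // Both threads see the same underlying memory
  const shared = new SharedArrayBuffer(4);
  const counter = new Int32Array(shared);

  const worker = new Worker(__filename, { workerData: shared });
  worker.on('exit', () => console.log('counter =', Atomics.load(counter, 0)));
} else {
  // The worker updates the shared counter in place
  const counter = new Int32Array(workerData);
  Atomics.add(counter, 0, 42);
}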
As of January 2020, worker_threads are fully supported in the Node LTS version 12. I highly recommend reading the following post if you want to learn more about worker_threads.

And that’s it!!!

