
Event Loop and the Big Picture — NodeJS Event Loop

What distinguishes NodeJS from other programming platforms is how it handles I/O. We hear this all the time when NodeJS is introduced: “a non-blocking, event-driven platform based on Google’s V8 JavaScript engine”. What do ‘non-blocking’ and ‘event-driven’ actually mean? The answer to all of these lies at the heart of NodeJS: the Event Loop. In this series of posts, I’m going to describe what the event loop is, how it works, how it affects our applications, how to get the best out of it and much more. In this first post, I will describe how NodeJS works, how it accesses I/O and how it can work across different platforms.

Reactor Pattern

NodeJS works in an event-driven model that involves an Event Demultiplexer and an Event Queue. Every I/O request will eventually generate an event of completion/failure, or some other trigger, which is called an Event. These events are processed according to the following algorithm.
  1. Event demultiplexer receives I/O requests and delegates these requests to the appropriate hardware.
  2. Once the I/O request is processed (e.g., data from a file is available to be read, data from a socket is available to be read, etc.), the event demultiplexer will then add the registered callback handler for the particular action to a queue to be processed. These callbacks are called events, and the queue where events are added is called the Event Queue.
  3. When events are available to be processed in the event queue, they are executed sequentially in the order they were received until the queue is empty.
  4. If there are no events in the event queue or the Event Demultiplexer has no pending requests, the program will complete. Otherwise, the process will continue from the first step.
The program which orchestrates this entire mechanism is called the Event Loop.
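To make the algorithm above concrete, here is a minimal, hypothetical sketch of such a loop in plain JavaScript. The names `demultiplexer` and `eventQueue` are purely illustrative, not real NodeJS APIs, and the real implementation (which we will see shortly) is far more involved.

```javascript
// Illustrative pseudocode only — `demultiplexer` and `eventQueue` are made-up
// objects used to sketch the reactor pattern, not actual NodeJS/libuv APIs.
while (demultiplexer.hasPendingRequests() || eventQueue.length > 0) {
  // Wait for completed I/O and enqueue the registered callbacks as events
  demultiplexer.pollAndEnqueue(eventQueue);

  // Process the queued events sequentially until the queue is empty
  while (eventQueue.length > 0) {
    const event = eventQueue.shift();
    event.callback();
  }
}
// No pending requests and no queued events: the loop (and the program) exits
```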
The Event Loop is a single-threaded and semi-infinite loop. It is called a semi-infinite loop because it actually quits at some point when there is no more work to be done. From the developer’s perspective, this is where the program exits.
Don’t confuse the event loop with the NodeJS EventEmitter. The EventEmitter is a totally different concept from the Event Loop.
The above is a high-level overview of how NodeJS works and shows the main components of a design pattern called the Reactor Pattern. But the reality is much more complex than this. So how complex is it?
The event demultiplexer is not a single component that handles all types of I/O on all OS platforms.
The event queue is not a single queue where all types of events are queued and dequeued. And I/O is not the only type of event that gets queued.
So let’s dig deep.

Event Demultiplexer

The Event Demultiplexer is not a component that exists in the real world, but an abstract concept in the reactor pattern. In the real world, the event demultiplexer has been implemented in different systems under different names, such as epoll on Linux, kqueue on BSD systems (macOS), event ports in Solaris and IOCP (Input Output Completion Port) in Windows. NodeJS consumes the low-level non-blocking, asynchronous hardware I/O functionality provided by these implementations.
But the confusing fact is, not all types of I/O can be performed using these implementations. Even on the same OS platform, there are complexities in supporting different types of I/O. Typically, network I/O can be performed in a non-blocking way using epoll, kqueue, event ports and IOCP, but file I/O is much more complex. Certain systems, such as Linux, do not support complete asynchrony for file system access, and there are limitations in file system event notifications/signalling with kqueue on macOS systems (you can read more about these complications here). It is very complex, nearly impossible, to address all these file system complexities and still provide complete asynchrony.
Similar to file I/O, certain DNS functions provided by the Node API also have complexities. Since NodeJS DNS functions such as dns.lookup access system configuration files such as nsswitch.conf, resolv.conf and /etc/hosts, the file system complexities described above also apply to the dns.lookup function.
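As a quick illustration of the difference this makes, the sketch below compares dns.lookup, which goes through the system resolver (and therefore the thread pool), with dns.resolve4, which performs an actual DNS query over the network. The hostname is an arbitrary example.

```javascript
const dns = require('dns');

// dns.lookup uses the system resolver (getaddrinfo), which reads the configuration
// files mentioned above — this work is done on the libuv thread pool.
dns.lookup('nodejs.org', (err, address) => {
  if (err) throw err;
  console.log('lookup (thread pool):', address);
});

// dns.resolve4 performs a real DNS query over the network (via c-ares),
// so it does not use the thread pool.
dns.resolve4('nodejs.org', (err, addresses) => {
  if (err) throw err;
  console.log('resolve4 (network I/O):', addresses);
});
```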
Therefore, a thread pool has been introduced to support I/O functions which cannot be directly addressed by asynchronous hardware I/O utilities such as epoll/kqueue/event ports or IOCP. Now we know that not all I/O functions happen in the thread pool. NodeJS has done its best to do most of the I/O using non-blocking and asynchronous hardware I/O, but for the I/O types which block or are too complex to address, it uses the thread pool.
🤔 However, I/O is not the only type of task performed on the thread pool. There are some NodeJS crypto functions such as crypto.pbkdf2, the async versions of crypto.randomBytes and crypto.randomFill, and the async versions of the zlib functions which run on the libuv thread pool because they are highly CPU intensive. Running them on the thread pool prevents blocking of the event loop.
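A rough way to observe this yourself is the sketch below, which assumes the default libuv thread pool size of 4. Timings will vary by machine; the point is only that the hashing happens off the event loop.

```javascript
const crypto = require('crypto');

const start = Date.now();

// Queue six CPU-intensive pbkdf2 jobs; they are executed on the libuv thread pool.
for (let i = 1; i <= 6; i++) {
  crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', (err) => {
    if (err) throw err;
    console.log(`pbkdf2 #${i} finished after ${Date.now() - start} ms`);
  });
}

// With the default pool of 4 threads, the first four callbacks typically complete
// around the same time, and the last two only after a thread frees up. The pool
// size can be tuned with the UV_THREADPOOL_SIZE environment variable.
```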
As we saw, in the real world it is really difficult to support all the different types of I/O (file I/O, network I/O, DNS, etc.) on all the different OS platforms. Some I/O can be performed using native hardware implementations while preserving complete asynchrony, and there are certain I/O types which must be performed in the thread pool so that the asynchronous nature can be guaranteed.
A common misconception among developers is that Node performs all I/O in the thread pool.
To govern this entire process while supporting cross-platform I/O, there should be an abstraction layer that encapsulates these inter-platform and intra-platform complexities and exposes a generalized API to the upper layers of Node.
So who does that? Please welcome….
Official libuv logo (https://github.com/libuv/libuv)
libuv is a cross-platform support library which was originally written for NodeJS. It’s designed around the event-driven asynchronous I/O model.
The library provides much more than a simple abstraction over different I/O polling mechanisms: ‘handles’ and ‘streams’ provide a high level abstraction for sockets and other entities; cross-platform file I/O and threading functionality is also provided, amongst other things.
Now let’s see how libuv is composed. The following diagram is from the official libuv docs and describes how the different types of I/O are handled while exposing a generalized API.
Source: http://docs.libuv.org/en/v1.x/_images/architecture.png
Now we know that the Event Demultiplexer is not an atomic entity, but a collection of I/O processing APIs abstracted by libuv and exposed to the upper layers of NodeJS. And it’s not only the event demultiplexer that libuv provides for Node. libuv provides the entire event loop functionality to NodeJS, including the event queuing mechanism.
Now let’s look at the Event Queue.

Event Queue

The event queue is supposed to be a data structure where all the events get enqueued and are processed by the event loop sequentially until the queue is empty. But how this happens in Node is entirely different from how the abstract reactor pattern describes it. So how does it differ?
There is more than one queue in NodeJS, and different types of events get queued in their own queues.
After processing one phase and before moving to the next phase, the event loop will process two intermediate queues until no items remain in them.
So how many queues are there? What are the intermediate queues?
There are 4 main types of queues that are processed by the native libuv event loop.
  • Expired timers and intervals queue — consists of callbacks of expired timers added using setTimeout or interval functions added using setInterval.
  • IO Events Queue — Completed IO events
  • Immediates Queue — Callbacks added using setImmediate function
  • Close Handlers Queue — Any close event handlers.
Please note that although I refer to all of these as “queues” for simplicity, some of them are actually different types of data structures (e.g., timers are stored in a min-heap).
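A small experiment that shows these queues in action: inside an I/O callback, a setImmediate callback fires before a zero-delay setTimeout callback, because after the I/O events queue the loop processes the immediates queue, whereas the timer has to wait for the timers queue on the next iteration.

```javascript
const fs = require('fs');

// Reading this file puts us inside an I/O callback (the I/O events queue).
fs.readFile(__filename, () => {
  setTimeout(() => console.log('timeout'), 0);   // goes to the timers queue
  setImmediate(() => console.log('immediate'));  // goes to the immediates queue
});

// Prints "immediate" first, then "timeout": after the I/O phase, the loop reaches
// the immediates queue before it comes back around to the timers queue.
```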
Besides these 4 main queues, there are two additional interesting queues which I previously referred to as ‘intermediate queues’ and which are processed by Node. These queues are not part of libuv itself, but are part of NodeJS. They are,
  • Next Ticks Queue — Callbacks added using the process.nextTick function
  • Other Microtasks Queue — Includes other microtasks such as resolved promise callbacks
Node starts the event loop by checking for any expired timers in the timers queue, and goes through each queue in each step while maintaining a reference counter of the total items to be processed. After processing the close handlers queue, if there are no items to be processed in any queue and there are no pending operations, the loop will exit. The processing of each queue in the event loop can be considered a phase of the event loop.
What’s interesting about the intermediate queues is that, as soon as one phase is complete, the event loop will check these two queues for any available items. If there are any items available in the intermediate queues, the event loop will immediately start processing them until both are emptied. Once they are empty, the event loop will continue to the next phase.
E.g., suppose the event loop is currently processing the immediates queue, which has 5 handlers to be processed. Meanwhile, two handlers are added to the next tick queue. Once the event loop completes the 5 handlers in the immediates queue, it will detect that there are two items to be processed in the next tick queue before moving to the close handlers queue. It will then execute all the handlers in the next tick queue and only then move on to process the close handlers queue.
The next tick queue has even higher priority than the Other Microtasks queue, although both are processed between two phases of the event loop, when libuv communicates back to the higher layers of Node at the end of a phase. The next tick queue is emptied before the event loop starts processing resolved promises in the microtasks queue.
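You can verify this priority with a few lines of code using native V8 promises:

```javascript
process.nextTick(() => console.log('2. next tick queue'));
Promise.resolve().then(() => console.log('3. microtasks queue (resolved promise)'));
console.log('1. synchronous code');

// Output:
// 1. synchronous code
// 2. next tick queue
// 3. microtasks queue (resolved promise)
```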
The priority of the next tick queue over resolved promises only applies to the native JS promises provided by V8. If you are using a library such as q or bluebird, you will observe an entirely different result, because they predate native promises and have different semantics.
q and bluebird also differ in their own ways of handling resolved promises, which I will explain in a later blog post.
The convention of these so-called ‘intermediate’ queues introduces a new problem: IO starvation. Extensively filling up the next tick queue using the process.nextTick function will force the event loop to keep processing the next tick queue indefinitely without moving forward. This causes IO starvation because the event loop cannot continue without emptying the next tick queue.
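The following deliberately bad sketch demonstrates the problem: the recursive process.nextTick call keeps the next tick queue from ever becoming empty, so the event loop never reaches the IO events queue and the readFile callback is starved.

```javascript
const fs = require('fs');

fs.readFile(__filename, () => {
  console.log('file read complete'); // starved — never printed
});

function spin() {
  process.nextTick(spin); // re-queues itself forever, never letting the loop move on
}
spin();
```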
To prevent this, there used to be a maximum limit for the next tick queue which could be set using the process.maxTickDepth parameter, but it has been removed since NodeJS v0.12 for some reason.
I will describe each of these queues in-depth in later posts with examples.
Finally, now you know what the event loop is, how it is implemented and how Node handles asynchronous I/O. Let’s now look at where libuv sits in the NodeJS architecture.
