
The Ultimate Node.js Production Checklist

You can use this as a checklist when you're deploying Node apps to production. Since this is an article about production-ready practices, many of them won't apply when you're developing apps on your local system.

Run node in cluster mode/separate node processes

Remember that Node is single threaded. It can delegate a lot of things (like HTTP requests and filesystem reads/writes) to the OS, which handles them in a multithreaded environment. But the code YOU write, the application logic, always runs in a single thread.

By running in a single thread, your Node process is always limited to only a single core on your machine. So if you have a server with multiple cores, you're wasting computation power running Node just once on your server.

What does "running Node just once" mean? You see, operating systems have a scheduler built into them which is responsible for how the execution of processes is distributed across the CPUs of the machine. When you run only 2 processes on a 2-core machine, the OS determines it is best to run both of the processes on separate cores to squeeze out maximum performance.

A similar thing needs to be done with Node. You have two options at this point:

  1. Run Node in cluster mode - Cluster mode is an architecture which comes baked into Node itself. In simple words, Node forks more processes of its own and distributes load through a single master process (see the sketch after this list).
  2. Run Node processes independently - This option is slightly different from the above in the sense that you no longer have a master process controlling the child Node processes. This means that when you spawn different Node processes, they run completely independently of each other. No shared memory, no IPC, no communication, nada.
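
As an illustration of option 1, here is a minimal sketch of cluster mode using only Node's built-in cluster, http, and os modules; the port and the trivial request handler are placeholders for your real app:

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) { // called cluster.isPrimary in newer Node versions
  // Master process: fork one worker per CPU core.
  const cpuCount = os.cpus().length;
  for (let i = 0; i < cpuCount; i++) {
    cluster.fork();
  }

  // Replace a worker if it dies, so capacity stays constant.
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, forking a new one`);
    cluster.fork();
  });
} else {
  // Workers share the same port; the master distributes incoming connections.
  http.createServer((req, res) => {
    res.end(`Handled by worker ${process.pid}\n`);
  }).listen(3000);
}
```

Because the master hands connections out to the workers (round-robin on most platforms), all workers can listen on the same port, which sidesteps the port-binding problem described below.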

According to a Stack Overflow answer, the latter (point 2) performs far better than the former (point 1) but is a little trickier to set up.

Why? Because a Node app contains more than application logic: almost always, when you're setting up servers in Node code, you need to bind ports, and two separate processes cannot bind the same port on the same OS.

This problem is, however, easily fixable. Environment variables, Docker containers, an NGiNX frontend proxy, and so on are some of the solutions for it.
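
As a small sketch of the environment-variable route, each independent process can simply read its port from the environment; the variable name PORT and the default of 3000 are just conventions, not anything Node requires:

```js
const http = require('http');

// Start each independent process with its own port, for example:
//   PORT=3001 node server.js
//   PORT=3002 node server.js
// A frontend proxy (NGiNX, a load balancer, etc.) then fans traffic out to them.
const port = process.env.PORT || 3000;

http.createServer((req, res) => {
  res.end(`Served by process ${process.pid} on port ${port}\n`);
}).listen(port, () => {
  console.log(`Listening on port ${port}`);
});
```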

Rate Limiting your endpoints

Let's face it: not everybody in the world has the best intentions for your architecture. Sure, attacks like DDoS are very complicated to mitigate, and even giants like GitHub go down when something like that happens.

But the least you can do is prevent a script kiddie from taking down your server just because you expose an expensive API endpoint without any rate limiting in place.

If you use Express with Node, there are two beautiful packages which work seamlessly together to rate limit traffic at Layer 7:

  1. Express Rate Limit - https://www.npmjs.com/package/express-rate-limit
  2. Express Slow Down - https://www.npmjs.com/package/express-slow-down

Express Slow Down adds an incremental delay to requests instead of dropping them. This way, legit users who flood you by accident (say, by excitedly clicking buttons here and there) are simply slowed down and not rate limited.

On the other hand, if there's a script kiddie running scripts to take down the server, Express Rate Limit monitors and rate limits that particular user, based on the user's IP, user account, or anything else you want.
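
Wiring both packages into an Express app might look something like the sketch below. The window sizes, thresholds, and delays are purely illustrative, and the exact option shapes differ a little between major versions of these packages, so treat this as a starting point rather than a drop-in config:

```js
const express = require('express');
const rateLimit = require('express-rate-limit');
const slowDown = require('express-slow-down');

const app = express();

// After 50 requests per 15 minutes from one IP, add 500 ms of delay to each
// further request instead of rejecting it.
const speedLimiter = slowDown({
  windowMs: 15 * 60 * 1000,
  delayAfter: 50,
  delayMs: () => 500,
});

// Hard cap: after 100 requests per 15 minutes from one IP, respond with 429.
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
});

// Protect the expensive endpoint (or use app.use() to protect everything).
app.get('/api/expensive', speedLimiter, limiter, (req, res) => {
  res.json({ ok: true });
});

app.listen(3000);
```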

Rate limiting could (should!) also be applied at Layer 4, that is, by blocking traffic by IP address before ever inspecting its contents (HTTP). For example, you can set up an NGiNX rule which blocks at Layer 4 and rejects a flood of traffic coming from a single IP, saving your server processes from being overwhelmed.

Use a frontend server for SSL termination

Node provides out-of-the-box support for SSL handshakes with the browser through the built-in https server module, combined with the required SSL certs.
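
For reference, terminating SSL inside Node itself looks roughly like this (the certificate paths are placeholders); the rest of this section argues you should hand this job to a frontend proxy instead:

```js
const https = require('https');
const fs = require('fs');

// Placeholder paths: point these at your actual private key and certificate.
const options = {
  key: fs.readFileSync('/etc/ssl/private/example.key'),
  cert: fs.readFileSync('/etc/ssl/certs/example.crt'),
};

https.createServer(options, (req, res) => {
  res.end('Hello over HTTPS\n');
}).listen(443);
```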

But let's be honest here, your application should not be concerned with SSL in the first place anyway. This is not something the application logic should do. Your Node code should only be responsible for what happens with the request, not the pre-processing and post-processing of data coming in and out of your server.

SSL termination refers to converting traffic from HTTPS to HTTP. And there are much better tools available than Node for that. I recommend NGiNX or HAProxy for it. Both have free versions available which get the job done and offload SSL termination from Node.

Use a frontend server for static file serving

Again, instead of using built-in methods like express.static to serve static files, use a frontend reverse proxy server like NGiNX to serve static files from disk.

First of all, NGiNX can do that faster than Node (because it is built from the ground up to do exactly that). It also offloads file serving from a single-threaded Node process, which could spend its clock cycles on something better.

Not only this: frontend proxy servers like NGiNX can also help you deliver content faster using GZIP compression. You can also set expiry headers, cache data, and much more, which is not something we should expect Node to do (although Node can still do it).
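
For comparison, this is roughly the in-Node approach the section advises moving away from: express.static for files plus the compression middleware for GZIP (the directory name and cache age below are just examples):

```js
const express = require('express');
const compression = require('compression');

const app = express();

// GZIP responses in Node -- work a frontend proxy like NGiNX would
// normally take off your hands.
app.use(compression());

// Serve files from ./public with a one-week Cache-Control max-age.
app.use(express.static('public', { maxAge: '7d' }));

app.listen(3000);
```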

Configure error handling

Proper error handling can save you from hours of debugging and trying to reproduce difficult bugs. On the server, it is especially easy to set up an architecture for error handling because you're the one running it. I recommend a tool like Sentry with Node, which records, reports, and emails you whenever the server crashes due to an error in the source code.
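
Getting Sentry reporting into a Node app takes only a few lines. A minimal sketch, with a placeholder DSN (use the one from your own Sentry project settings):

```js
const Sentry = require('@sentry/node');

// Placeholder DSN: copy the real one from your Sentry project.
Sentry.init({ dsn: 'https://examplePublicKey@o0.ingest.sentry.io/0' });

// Report handled errors explicitly...
try {
  JSON.parse('{ not valid json');
} catch (err) {
  Sentry.captureException(err);
}

// ...while uncaught exceptions and unhandled rejections that crash the process
// are picked up by Sentry's default integrations.
```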

Once that is in place, it is time to restart the server automatically when it crashes so the whole site doesn't just go down for hours until you manually bring it up again.

For this, you can use a process manager like PM2. Or even better, use a Dockerized container environment with a policy like restart: always and proper memory and disk limits set up.

A Docker setup ensures that even if your container goes OOM (out of memory), the process spins up again (which might not happen in a PM2 environment, as the OS might kill PM2 itself if there's a memory leak somewhere in a running process).

Configure logs properly

All the answers lie in logs: server hacks, server crashes, suspicious user behavior, and so on. For that, you have to make sure that:

  1. Each and every request attempt is logged with the IP address, request method, and path accessed; basically as much information as you can log (except private information like passwords and credit card details, of course).
  2. This can be achieved through the morgan package (see the sketch after this list).
  3. Set up file-stream logs in production instead of console output. They are faster, easier to inspect, and allow you to export logs to online log-viewing services.
  4. Not all log messages have equal weight. Some logs are just there for debugging, while the presence of others might indicate a pants-on-fire situation (like a server hack or unauthorized access). Use winston for logging at different levels.
  5. Set up log rotation so that you don't end up with log files in the GBs after a month or so when you check the server.
  6. GZIP your log files after rotation. Text is cheap, highly compressible, and easy to store. You should never face a problem with text logs as long as they are compressed and you're running a server with decent disk space (25 GB+).
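
A sketch tying these pieces together, with morgan handling per-request logs and winston handling leveled application logs; the file names and levels are just examples:

```js
const express = require('express');
const fs = require('fs');
const path = require('path');
const morgan = require('morgan');
const winston = require('winston');

const app = express();

// Request logging: append every request (IP, method, path, status, ...) to a file stream.
const accessLogStream = fs.createWriteStream(path.join(__dirname, 'access.log'), { flags: 'a' });
app.use(morgan('combined', { stream: accessLogStream }));

// Application logging: errors go to their own file, everything at info and above to another.
const logger = winston.createLogger({
  level: 'info',
  format: winston.format.combine(winston.format.timestamp(), winston.format.json()),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' }),
  ],
});

app.get('/', (req, res) => {
  logger.info('Home page requested');
  res.send('ok');
});

app.listen(3000, () => logger.info('Server started on port 3000'));
```

Rotation and compression of these files can then be handled outside the process, for example with the standard logrotate utility.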

Conclusion

It is easy to take note of a few practices in production which could save you tears and hours of debugging later on. Make sure you follow these best practices, and let me know what you think by saying hi on Twitter.
