Skip to main content

How to generate PDFs with Node.js

How to generate PDFs with Node.js

  1. PhantomJS
  2. Puppeteer
  3. Html-pdf-chrome

What would we do without PDFs? It’s the safest, most secure way of storing visual information, and it’s accessible on every major platform in the world. Dynamically creating a printer-friendly version of a webpage, a user’s invoice or event tickets all require creating secure information that a user can download, email, or print. With such practical functionality, the possibility of using PDFS in the foreseeable future remains high for both end-users and web application developers.

Technically implementing a PDF generator is surprisingly difficult. There’s no standardized tool for it, and the operation is more complex than one would assume. As much as we’d love to tell you the definitive method for generating PDFs, there simply isn’t one.<

If you are lacking time and patience to do it yourself, there are paid options out there. These include:

All hassle-free, PDF Crowd is the only one without a free option and, PDF Generator doesn’t have any paid options between free and $59/month.

On the other hand, if you’d rather implement a solution yourself, the best option is to create a headless browser using Node JS. A headless browser is simply a GUI-less browser that can run in the background of a webpage and continue to implement solutions that a browser normally does. There are several headless Node browsers that generate PDFs.

generating PDFs

PhantomJS

One headless browser that’s been around for some time is PhantomJS. It’s an open-source JavaScript API that’s been in use since 2011. Unfortunately, as of 2018, it is no longer in active development, so it may not continue to be a viable option for much longer. However, it’s syntax is simple to learn even for beginners, so for now it remains a staple among headless browsers.

The easiest way to create a PDF using PhantomJS is the render method. This function allows the headless browser to create a file out of any webpage; not just PDFs, but JPEGs, PNGs, and more. The method for doing this is incredibly simple.

var webPage = require('webpage');
var page = webPage.create();

page.viewportSize = { width: 1920, height: 1080 };
page.open("http://www.google.com", function start(status) {
  page.render('google_home.pdf, {format: 'pdf', quality: '100'});
  phantom.exit();
});

In this example, PhantomJS asynchronously opens the Google homepage in the background of your normal browser, and then renders the entire page as a PDF when it’s finished. If you need to give your users a printer-friendly PDF of your page, you could copy and paste this code with minimal alteration. If you need to render something else, like an invoice or a receipt, you could use a string template, and load it as a web page with PhantomJS in the same manner.

const puppeteer = require("puppeteer");
(async () {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({ width: 1440, height: 900, deviceScaleFactor: 2 });
await page.goto("file:///practice/resumeapp/resume.html", {
waitUntil: "networkidle2"
});
await page.pdf({
path: "resume.pdf",
pageRanges: "1",
format: "A4",
printBackground: true
});

await browser.close();
})(); 

Puppeteer

Perhaps the most common headless browser for Node is Puppeteer, an API that is run by Google Chrome. Thanks to that Google backing, Puppeteer is extremely powerful, and theoretically can do anything that Chrome can do. The advantages Puppeteer has over Phantom is that it’s better documented, supported, and more flexible. It is a bit more difficult for a newcomer to create PDFs, but there is plenty of information on how to use it on it’s GitHub page.
Puppeteer’s method for creating a PDF out of a webpage is much the same as Phantom’s. It makes a series of asynchronous calls in the background of your browser. Here is an example of the same code as above, done with Puppeteer.

const puppeteer = require('puppeteer');

(async () => {

  const browser = await puppeteer.launch();

  const page = await browser.newPage();

  await page.goto('https://google.com', {waitUntil: 'networkidle2'});

  await page.pdf({path: 'hn.pdf', format: 'A4'});

  await browser.close();

})();

As you can see, it’s nearly the same as the PhantomJS code, but a bit more complicated. However, the advantage of Puppeteer is its versatility. The Puppeteer API has a wide range of options involving PDF generation that you can find here. The above code produces this result, vs the original page:

undefined

Html-pdf-chrome

A third option for generating PDFs from HTML in Node is by using a library called html-pdf-chrome. This isn’t a separate headless browser, but a library that utilizes Chrome’s built-in headless browsing capabilities for the sole purpose of generating PDFs. Of the methods for conversion, this is the simplest to use that we’ve seen. The only limitation is that you must have a recent version of Chrome or Chromium installed on your machine in order to use it. Like Puppeteer and Phantom, you can convert an external page to a PDF, or load a page from a template to make something more customized. Here is an example of how to use html-pdf-chrome to convert an external page into a PDF.

import * as htmlPdf from 'html-pdf-chrome';
const options: htmlPdf.CreateOptions = {
  port: 9222, // po
};
const url = 'https://github.com/westy92/html-pdf-chrome';
const pdf = await htmlPdf.create(url, options);

As you can see, it’s very simple to use. Unfortunately, it is not as well-documented as Puppeteer and doesn’t seem to have as many options as Puppeteer or Phantom. Html-pdf-chrome is best for simpler jobs, perhaps, while Puppeteer may be a better option for more complex tasks like creating invoices.

These are just a few of the possible solutions available right now. As long as people need PDFs from the web, the field will keep evolving, and new tools will be created. Perhaps one will become the industry standard. In the meantime, one of these options can provide you with a good place to start. If you have found other options for generating PDFs with Node, please let us know! We’re always looking for new solutions.

Comments

Popular posts from this blog

How to use Ngx-Charts in Angular ?

Charts helps us to visualize large amount of data in an easy to understand and interactive way. This helps businesses to grow more by taking important decisions from the data. For example, e-commerce can have charts or reports for product sales, with various categories like product type, year, etc. In angular, we have various charting libraries to create charts.  Ngx-charts  is one of them. Check out the list of  best angular chart libraries .  In this article, we will see data visualization with ngx-charts and how to use ngx-charts in angular application ? We will see, How to install ngx-charts in angular ? Create a vertical bar chart Create a pie chart, advanced pie chart and pie chart grid Introduction ngx-charts  is an open-source and declarative charting framework for angular2+. It is maintained by  Swimlane . It is using Angular to render and animate the SVG elements with all of its binding and speed goodness and uses d3 for the excellent math functio...

Understand Angular’s forRoot and forChild

  forRoot   /   forChild   is a pattern for singleton services that most of us know from routing. Routing is actually the main use case for it and as it is not commonly used outside of it, I wouldn’t be surprised if most Angular developers haven’t given it a second thought. However, as the official Angular documentation puts it: “Understanding how  forRoot()  works to make sure a service is a singleton will inform your development at a deeper level.” So let’s go. Providers & Injectors Angular comes with a dependency injection (DI) mechanism. When a component depends on a service, you don’t manually create an instance of the service. You  inject  the service and the dependency injection system takes care of providing an instance. import { Component, OnInit } from '@angular/core'; import { TestService } from 'src/app/services/test.service'; @Component({ selector: 'app-test', templateUrl: './test.component.html', styleUrls: ['./test.compon...

How to solve Puppeteer TimeoutError: Navigation timeout of 30000 ms exceeded

During the automation of multiple tasks on my job and personal projects, i decided to move on  Puppeteer  instead of the old school PhantomJS. One of the most usual problems with pages that contain a lot of content, because of the ads, images etc. is the load time, an exception is thrown (specifically the TimeoutError) after a page takes more than 30000ms (30 seconds) to load totally. To solve this problem, you will have 2 options, either to increase this timeout in the configuration or remove it at all. Personally, i prefer to remove the limit as i know that the pages that i work with will end up loading someday. In this article, i'll explain you briefly 2 ways to bypass this limitation. A. Globally on the tab The option that i prefer, as i browse multiple pages in the same tab, is to remove the timeout limit on the tab that i use to browse. For example, to remove the limit you should add: await page . setDefaultNavigationTimeout ( 0 ) ;  COPY SNIPPET The setDefaultNav...