
MONGODB AGGREGATION FRAMEWORK FOR BEGINNERS

The MongoDB aggregation framework is one part of MongoDB that used to scare me a lot when I was starting out, and I think the same is true for many beginners. The concept is a bit tricky to wrap your head around, and people tend to avoid aggregations altogether.
In this article I’ll explain the aggregation framework in the simplest words possible and hopefully remove this fear of aggregation from some people’s minds.

What is aggregation?

Aggregation is just another query method that you can use to find your documents in MongoDB. What is special about it is the way it processes the data: it takes the data through stages, filtering at each stage and passing the result onward (more on this later).
What you need to get into your mind is that aggregation is simply another way to find records.

How do aggregations work?

This is where aggregation is really special and operates differently from normal find queries. The aggregation framework works in stages: it runs the first operation on the collection, gets a list of documents as the result, and then runs the next operation only on the result of the first.
Think of it as a pipeline: the output of one stage is the only input to the next.
The first operation (usually a match operation) is the one that runs on the entire collection. Also note that the first stage of the pipeline is the one that benefits from any indexes you have created on the collection. The output of the first operation then serves as the input for the next, and so on, until we get the final output documents.
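The staged flow above can be sketched as a toy, in-memory simulation in plain JavaScript. This is not the MongoDB engine, and the collection and field names ("people", name, age) are made up for illustration; the point is only that each stage sees nothing but the previous stage's output:

```javascript
// A toy simulation of a pipeline: each stage consumes only the
// output of the stage before it, just like aggregation stages do.
const people = [
  { name: "Asha", age: 25 },
  { name: "Ben", age: 31 },
  { name: "Cara", age: 25 },
];

// Stage 1: a $match-like filter runs against the whole collection.
const matchStage = (docs) => docs.filter((d) => d.age === 25);

// Stage 2: a $project-like reshape sees only stage 1's output.
const projectStage = (docs) => docs.map((d) => ({ name: d.name }));

// Run the "pipeline": thread the documents through each stage in order.
const pipeline = [matchStage, projectStage];
const result = pipeline.reduce((docs, stage) => stage(docs), people);

console.log(result); // [ { name: 'Asha' }, { name: 'Cara' } ]
```

Note that if we swapped the two stages, the filter would fail, because the age field would already have been projected away: stage order matters in a pipeline.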

groupBy with aggregation

One of the biggest advantages of the aggregation framework is its ability to give us SQL's GROUP BY feature in MongoDB. For a deeper look at how the aggregation framework uses $group to group documents, you can take a look at this article -> https://easyontheweb.com/group-by-in-mongoose-with-lookup/
With $group come many other powers that the aggregation framework gives us. We can add custom fields or values to the intermediate results as they pass through the stages, for the next operation to operate on.
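As a sketch, here is what a $group stage that counts documents per age could look like (the collection and field names are hypothetical), followed by the same grouping logic written as a plain-JavaScript reduce, just to show conceptually what $group is doing:

```javascript
// Hypothetical $group stage: count people per age.
// In mongosh it would be passed to db.people.aggregate([...]).
const groupStage = {
  $group: {
    _id: "$age",        // grouping key: each document's age field
    count: { $sum: 1 }, // add 1 for every document in the group
  },
};

// The equivalent grouping logic in plain JavaScript:
const people = [
  { name: "Asha", age: 25 },
  { name: "Ben", age: 31 },
  { name: "Cara", age: 25 },
];

const counts = people.reduce((acc, d) => {
  acc[d.age] = (acc[d.age] || 0) + 1;
  return acc;
}, {});

console.log(counts); // { '25': 2, '31': 1 }
```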

How to use aggregate?

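A query along these lines, assuming a hypothetical people collection with name, gender, and age fields:

```javascript
// Hypothetical example -- the collection ("people") and field names
// (gender, age, name) are assumptions for illustration.
// In mongosh you would run: db.people.aggregate(pipeline)
const pipeline = [
  // Stage 1: the same filter we would write inside a normal find()
  { $match: { gender: "female" } },
  // Stage 2: group the matched documents by age; "$age" reads the
  // age field of the documents flowing out of the previous stage
  {
    $group: {
      _id: "$age",               // grouping key
      women: { $push: "$name" }, // collect the names in each group
      total: { $sum: 1 },        // how many women at this age
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```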
This query shows the three most important aspects of the aggregation framework. First we write a $match stage, which is the same query we would write inside a plain find(). Next, we group the documents matching the $match conditions by "age". Note that we use $ to access the fields of the previous stage's result in the pipeline; that is why $age is used.
The third thing we use is the addition of new fields to the resulting documents: we create the fields "women" and "total", where we store the names and the count of the women found up to that stage.
What I would recommend is to start by writing small, easy aggregations and then move on to advanced concepts and complex queries with $unwind and the like. The aggregation framework is great, and it is a must for grouping data, so you will have to face it one day or another.
