Skip to main content

GROUP BY IN MONGOOSE WITH LOOKUP

Group ? Populate ? Lookup ? If you too are like me and come from the SQL world of JOINS and GROUP BY , you might have a bit of a trouble getting used to these operations in MongoDB or maybe this is the first time you are looking into something like this. Either way, let us in this article see how Group By and Lookup work in Mongoose.

GROUP

So, first of all I assume that you guys know what Group or Group By (In SQL) actually is. This is an operator that allows you to group the documents or model instances in the case of mongoose using some field and display rest of the information based on that information.
Let us take an example here. Suppose there is a Post model which has many comments, these comments are stored in a comments array as Object IDs. Also, this Post model has other attributes like category , author , views etc.
What we want to achieve is to for example display the total number of views for a particular author and also load all the comments that that particular author’s posts have received. So , here we will look to group the posts by authors , match the author name you want the information for (if there is an Author model, who knows ?) . We will then use the lookup operator that will fetch the comments from the comment model. 
Note -> The lookup operator works like the populate method in mongoose, the difference being the lookup method is used in aggregates.
Let’s see how do we achieve this.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
models.Post.aggregate([
  {"$group": { "_id": "$author"},
       "totalAuthorViews" : {$sum : "$views"},
       "comments" : {$push : "$comments"}
  },
  {
    "$lookup": {
      "from": "comments",
      "localField": "comments",
      "foreignField": "_id",
      "as": "comments"
    }
  },
  {
    "$unwind": {"path": "$comments"}
  },
  {
    "$limit": 10
  }
])
Now, there are tens of other methods and options that we could have added to the aggregate above but we have more or less achieved what we set out to. To use others you may refer this link -> https://docs.mongodb.com/manual/reference/method/db.collection.aggregate/
What you could also add is a match with a query, add skip with a well… a number to be added as a skip value.
Actually, this was the thing that made me use aggregate as well, the mongoose methods distinct and skip do not work together. Therefore, I used group to achieve it using aggregate.

Lookup

Now we know what Group does, what does Lookup do? If you have used MongoDB or other document based databases you must be knowing that we often store the ObjectID of some other document as a field value. To get the fields for the other document while querying for the first document we use the lookup operator.
Actually, this is what even populate does under the hood for mongoose. And for all my SQL friends, this is the replacement for JOINS in Mongo.

Code Explanation

We saw a piece of code up that used group in aggregate instead of mongoose methods and lookup instead of populate.
Let us break down that code line by line and see what it does.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
models.Post.aggregate([
  {"$group": { "_id": "$author"},
       "totalAuthorViews" : {$sum : "$views"},
       "comments" : {$push : "$comments"}
  },
  {
    "$lookup": {
      "from": "comments",
      "localField": "comments",
      "foreignField": "_id",
      "as": "comments"
    }
  },
  {
    "$unwind": {"path": "$comments"}
  },
  {
    "$limit": 10
  }
])
Here is the code for your convenience yet again.
1. The first block inside the aggregate is the group block, what this does is create a group and determines what fields will that group project.
The _id of the group is to be set to the field you want to GROUP BY.
The other fields are which you want to project.
In the example, we have author as the _id because we want to group using the author. The other two fields being projected are views and comments. Pay close attention to comments here.
2. The next block is that of lookup. What lookup does is take four arguments -> from , localField, foreignField and as. 
from is the document you want to pick the fields from, localField is the field value you want to join using in your own document. foreignField is the same value in the ‘from’ document you want to join with. as is just an alias to be shown as in the result.
3.The unwind operator basically takes the array equivalent of the value and returns a mongo document.
A very important thing to remember is that the aggregate works as a pipeline and that the second block receives the result of the first block as the argument.
So, the first block’s result must contain the field you are using as the localField in the lookup.
This is the exact reason why we have first pushed the comment IDs to the comments field for the first block’s result.

Comments

Popular posts from this blog

4 Ways to Communicate Across Browser Tabs in Realtime

1. Local Storage Events You might have already used LocalStorage, which is accessible across Tabs within the same application origin. But do you know that it also supports events? You can use this feature to communicate across Browser Tabs, where other Tabs will receive the event once the storage is updated. For example, let’s say in one Tab, we execute the following JavaScript code. window.localStorage.setItem("loggedIn", "true"); The other Tabs which listen to the event will receive it, as shown below. window.addEventListener('storage', (event) => { if (event.storageArea != localStorage) return; if (event.key === 'loggedIn') { // Do something with event.newValue } }); 2. Broadcast Channel API The Broadcast Channel API allows communication between Tabs, Windows, Frames, Iframes, and  Web Workers . One Tab can create and post to a channel as follows. const channel = new BroadcastChannel('app-data'); channel.postMessage(data); And oth...

Certbot SSL configuration in ubuntu

  Introduction Let’s Encrypt is a Certificate Authority (CA) that provides an easy way to obtain and install free  TLS/SSL certificates , thereby enabling encrypted HTTPS on web servers. It simplifies the process by providing a software client, Certbot, that attempts to automate most (if not all) of the required steps. Currently, the entire process of obtaining and installing a certificate is fully automated on both Apache and Nginx. In this tutorial, you will use Certbot to obtain a free SSL certificate for Apache on Ubuntu 18.04 and set up your certificate to renew automatically. This tutorial will use a separate Apache virtual host file instead of the default configuration file.  We recommend  creating new Apache virtual host files for each domain because it helps to avoid common mistakes and maintains the default files as a fallback configuration. Prerequisites To follow this tutorial, you will need: One Ubuntu 18.04 server set up by following this  initial ...

Working with Node.js streams

  Introduction Streams are one of the major features that most Node.js applications rely on, especially when handling HTTP requests, reading/writing files, and making socket communications. Streams are very predictable since we can always expect data, error, and end events when using streams. This article will teach Node developers how to use streams to efficiently handle large amounts of data. This is a typical real-world challenge faced by Node developers when they have to deal with a large data source, and it may not be feasible to process this data all at once. This article will cover the following topics: Types of streams When to adopt Node.js streams Batching Composing streams in Node.js Transforming data with transform streams Piping streams Error handling Node.js streams Types of streams The following are four main types of streams in Node.js: Readable streams: The readable stream is responsible for reading data from a source file Writable streams: The writable stream is re...