
Learn MongoDB Aggregation with Real-World Examples

In this article, we will see what aggregation is in MongoDB and how to build MongoDB aggregation pipelines. I assume that you have some experience with MongoDB.

When you first start using MongoDB, you mostly write queries for CRUD (Create, Read, Update, and Delete) operations.

But, when an application gets more complex, you may need to perform several operations on the data before sending it as a response.

For example, consider that you are building an analytics dashboard where you need to show the tasks for each user. Here, the server should send all the tasks for each user in an array.

Can you guess how to achieve this with our data? This is where MongoDB aggregation comes in.

Moreover, this is one of the simpler examples; you may face far more complex requirements than this. Let's see how MongoDB solves this problem using the aggregation pipeline.

MongoDB Aggregation Pipeline

Firstly, MongoDB aggregation is a pipeline that processes the data in a series of stages. Each stage returns an output, which becomes the input for the next stage in the pipeline.

Here, the data passes through each stage of the pipeline, which filters, groups, and sorts it, and the final result is returned.
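
To make the "output of one stage is the input of the next" idea concrete, here is a rough plain JavaScript sketch. It is only an analogy, not how MongoDB works internally; the sample documents and the helper names (matchMA, sortByPop, limitOne, runPipeline) are made up for illustration.

```javascript
// Hypothetical sample documents; the field names mirror the dataset used in this article.
const docs = [
  { city: "BOSTON", state: "MA", pop: 3000 },
  { city: "AGAWAM", state: "MA", pop: 1000 },
  { city: "ALBANY", state: "NY", pop: 2000 },
];

// Each "stage" is a function from an array of documents to a new array.
const matchMA = (input) => input.filter((doc) => doc.state === "MA");
const sortByPop = (input) => [...input].sort((a, b) => a.pop - b.pop);
const limitOne = (input) => input.slice(0, 1);

// Run the stages in order: the output of one becomes the input of the next.
function runPipeline(input, stages) {
  return stages.reduce((current, stage) => stage(current), input);
}

const result = runPipeline(docs, [matchMA, sortByPop, limitOne]);
console.log(result); // the least populous MA document
```

Swapping the order of the functions in the array changes the result, which is exactly why stage order matters in a real pipeline.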

Let's see the aggregation operators that are widely used in real-world applications.

Here's an example of how we are going to build a MongoDB aggregation pipeline:

    const pipeline = [
      { $match: { /* filter conditions */ } },
      { $group: { /* _id (grouping key) and accumulators */ } },
      { $sort: { /* sort keys */ } },
      // ...more stages
    ];
    db.collectionName.aggregate(pipeline, options);

Before going further: to practice the aggregation along with the article, you can import the sample dataset from this site.

Once you have downloaded the data, you can import it into MongoDB using the mongoimport command.


    mongoimport --db aggrsample --collection test --file sample.json --jsonArray

\$match

The \$match operator is similar to find() in MongoDB, except that it works inside an aggregation pipeline. It passes along all the documents that satisfy the condition.

On the above dataset, let's match all the documents that have MA as the state.

    db.test
      .aggregate([
        {
          $match: {
            state: "MA",
          },
        },
      ])
      .pretty()

As a result, it returns every document that has MA as the state.
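
If it helps, you can think of an equality match as a filter over an array. The following is a minimal sketch in plain JavaScript, assuming only simple equality conditions; the match helper and the sample documents are hypothetical, and the real operator supports far more than this.

```javascript
// Hypothetical sample documents.
const docs = [
  { city: "BOSTON", state: "MA", pop: 3000 },
  { city: "AGAWAM", state: "MA", pop: 1000 },
  { city: "ALBANY", state: "NY", pop: 2000 },
];

// Hypothetical helper: keeps the documents whose fields equal every
// key/value pair in `condition` (simple equality only).
function match(input, condition) {
  return input.filter((doc) =>
    Object.entries(condition).every(([field, value]) => doc[field] === value)
  );
}

const matched = match(docs, { state: "MA" });
console.log(matched.map((doc) => doc.city));
```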


\$group

As the name suggests, \$group groups the documents by a particular field. The grouping key can be the _id or any other field.

On top of the \$match stage, let's group the documents by city.

    db.test.aggregate([
      {
        $match: {
          state: "MA",
        },
      },
      {
        $group: {
          _id: "$city",
        },
      },
    ])

In the result, each document contains only the grouping field.

But wait, we want to retrieve all the fields in the document. Why does it return only the grouping field?

Well, there is a reason for it. Once we group the documents by the _id field (the _id can hold any grouping field), each group is collapsed into a single output document. To get anything else out of a group, we need to provide accumulator expressions. Popular accumulator expressions are:

  • \$first - returns the value of the given expression from the first document in each group
  • \$push - appends the value of the given expression, for every document in the group, to an array
  • \$max - returns the highest value among the documents in the group.
  • \$sum - returns the sum of the numeric values among the documents in the group.
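
To get a feel for what these four accumulators compute, here is a small plain JavaScript sketch that groups by city by hand. It is only an illustration under made-up data; groupByCity and the output field names (firstPop, allPops, maxPop, totalPop) are hypothetical, and unlike this sketch, MongoDB does not promise any particular order for the output groups.

```javascript
// Hypothetical sample documents with the same fields as the dataset.
const docs = [
  { city: "BOSTON", state: "MA", pop: 100 },
  { city: "BOSTON", state: "MA", pop: 250 },
  { city: "AGAWAM", state: "MA", pop: 40 },
];

// Group by city and compute one value per accumulator for each group.
function groupByCity(input) {
  const groups = new Map();
  for (const doc of input) {
    if (!groups.has(doc.city)) {
      groups.set(doc.city, {
        _id: doc.city,
        firstPop: doc.pop, // like { $first: "$pop" }
        allPops: [],       // like { $push: "$pop" }
        maxPop: doc.pop,   // like { $max: "$pop" }
        totalPop: 0,       // like { $sum: "$pop" }
      });
    }
    const group = groups.get(doc.city);
    group.allPops.push(doc.pop);
    group.maxPop = Math.max(group.maxPop, doc.pop);
    group.totalPop += doc.pop;
  }
  return [...groups.values()];
}

const grouped = groupByCity(docs);
console.log(grouped);
```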

Here, we will see how to use the \$push expression along with \$group:

    db.test
      .aggregate([
        {
          $match: {
            state: "MA",
          },
        },
        {
          $group: {
            _id: "$city",
            data: {
              $push: "$$ROOT",
            },
          },
        },
      ])
      .pretty()

So, it returns one document per city, with all of that city's documents collected in the data array.
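
The same grouping idea also answers the dashboard question from the introduction (all tasks for each user). Here is a rough plain JavaScript sketch of the result shape, assuming a tasks collection whose documents carry a userId; the field names userId and title and the helper groupTasksByUser are made up for illustration.

```javascript
// Hypothetical task documents; `userId` and `title` are made-up field names.
const tasks = [
  { userId: "u1", title: "Write report" },
  { userId: "u2", title: "Review PR" },
  { userId: "u1", title: "Fix bug" },
];

// Similar in spirit to:
//   { $group: { _id: "$userId", tasks: { $push: "$$ROOT" } } }
function groupTasksByUser(input) {
  const byUser = new Map();
  for (const task of input) {
    if (!byUser.has(task.userId)) {
      byUser.set(task.userId, { _id: task.userId, tasks: [] });
    }
    // Push the whole document into the group's array, like $$ROOT.
    byUser.get(task.userId).tasks.push(task);
  }
  return [...byUser.values()];
}

const perUser = groupTasksByUser(tasks);
console.log(JSON.stringify(perUser, null, 2));
```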


\$project

Sometimes, you may not need all the fields in the document. You can retrieve only specific fields using the \$project operator.

    db.test
      .aggregate([
        {
          $match: {
            state: "MA",
          },
        },
        {
          $group: {
            _id: "$city",
            data: {
              $push: "$$ROOT",
            },
          },
        },
        {
          $project: {
            _id: 0,
            "data.loc": 1,
          },
        },
      ])
      .pretty()


Moreover, you specify 1 to include a field or 0 to exclude it. By default, _id is included; set _id to 0 to leave it out.
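
As a rough illustration of the include rule and the _id default, here is a plain JavaScript sketch. It assumes top-level fields and inclusion-style specs only (nested paths such as "data.loc" and exclusion-only specs are not handled); the project helper and the sample values are hypothetical.

```javascript
// Hypothetical sample document.
const doc = { _id: "01001", city: "AGAWAM", state: "MA", pop: 1500 };

// Hypothetical helper for top-level fields only: 1 keeps a field, and _id is
// kept unless it is explicitly set to 0.
function project(input, spec) {
  const output = {};
  if (spec._id !== 0 && "_id" in input) {
    output._id = input._id;
  }
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== "_id" && flag === 1 && field in input) {
      output[field] = input[field];
    }
  }
  return output;
}

const withId = project(doc, { city: 1, pop: 1 });
const withoutId = project(doc, { _id: 0, city: 1, pop: 1 });
console.log(withId, withoutId);
```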

\$sort

The \$sort operator sorts the documents in either ascending or descending order.

    db.test.aggregate([
      {
        $match: {
          state: "MA",
        },
      },
      {
        $sort: {
          pop: 1,
        },
      },
    ])

Here, it sorts the documents by the pop field in ascending order (1 for ascending, -1 for descending).
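
One way to remember the 1 / -1 convention is to treat it as a multiplier on the comparison. A minimal plain JavaScript sketch, assuming a single numeric sort field; sortBy and the sample documents are made up for illustration.

```javascript
// Hypothetical sample documents.
const docs = [
  { city: "A", pop: 30 },
  { city: "B", pop: 10 },
  { city: "C", pop: 20 },
];

// Hypothetical helper: sorts by one numeric field; direction is 1 or -1.
function sortBy(input, field, direction) {
  // Copy first so the input array is left untouched.
  return [...input].sort((a, b) => direction * (a[field] - b[field]));
}

const ascending = sortBy(docs, "pop", 1).map((doc) => doc.city);   // ["B", "C", "A"]
const descending = sortBy(docs, "pop", -1).map((doc) => doc.city); // ["A", "C", "B"]
console.log(ascending, descending);
```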


\$limit

The \$limit operator limits the number of documents passed on to the next stage.

    db.test.aggregate([
      {
        $match: {
          state: "MA",
        },
      },
      {
        $sort: {
          pop: 1,
        },
      },
      {
        $limit: 5,
      },
    ])

So, the above query returns only the five MA documents with the lowest pop.
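
Note that the position of the limit in the pipeline matters. The plain JavaScript sketch below, with made-up data and hypothetical helper names, shows that sorting and then limiting gives a different answer from limiting and then sorting.

```javascript
// Hypothetical sample documents.
const docs = [
  { city: "A", pop: 30 },
  { city: "B", pop: 10 },
  { city: "C", pop: 20 },
];

const sortByPop = (input) => [...input].sort((a, b) => a.pop - b.pop);
const limit = (input, n) => input.slice(0, n);

// Sort first, then keep two: the two least populous documents overall.
const sortThenLimit = limit(sortByPop(docs), 2).map((doc) => doc.city); // ["B", "C"]

// Keep two first, then sort: only the first two input documents, reordered.
const limitThenSort = sortByPop(limit(docs, 2)).map((doc) => doc.city); // ["B", "A"]

console.log(sortThenLimit, limitThenSort);
```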


\$addFields

Sometimes you need to create a custom field that holds computed or aggregated data. You can achieve this using the \$addFields operator.

    db.test.aggregate([
      {
        $match: {
          state: "MA",
        },
      },
      {
        $addFields: {
          stateAlias: "MAS",
        },
      },
    ])

As a result, every returned document keeps its existing fields and gains a stateAlias field set to "MAS".
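
Conceptually, this behaves like copying each document and attaching extra keys. Here is a minimal plain JavaScript sketch with made-up data; popInThousands is a hypothetical computed field added only to show that a new field can also be derived from existing ones.

```javascript
// Hypothetical sample documents.
const docs = [
  { city: "AGAWAM", state: "MA", pop: 1500 },
  { city: "BOSTON", state: "MA", pop: 3000 },
];

// Keep every existing field and attach the new ones.
const withFields = docs.map((doc) => ({
  ...doc,
  stateAlias: "MAS",              // constant value, as in the pipeline above
  popInThousands: doc.pop / 1000, // hypothetical computed field
}));

console.log(withFields);
```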


\$lookup

\$lookup is one of the popular aggregation operators in MongoDB. If you come from an SQL background, you can relate it to a JOIN query (specifically, a left outer join) in an RDBMS.

    db.universities
      .aggregate([
        { $match: { name: "USAL" } },
        { $project: { _id: 0, name: 1 } },
        {
          $lookup: {
            from: "courses",
            localField: "name",
            foreignField: "university",
            as: "courses",
          },
        },
      ])
      .pretty()

  • from - the collection to join with.
  • localField - the field from the input documents. Here, it is the name field from the universities collection.
  • foreignField - the field from the documents of the joined collection. Here, it is the university field from the courses collection.
  • as - the name of the new array field that holds the matching documents.
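
To see how the four options fit together, here is a plain JavaScript sketch of an equality join over two in-memory arrays. It is only an analogy with made-up data: in this sketch from is an array rather than a collection name, and the lookup helper is hypothetical.

```javascript
// Hypothetical sample data standing in for the two collections.
const universities = [{ name: "USAL" }, { name: "UPSA" }];
const courses = [
  { university: "USAL", name: "Computer Science" },
  { university: "USAL", name: "Mathematics" },
];

// Hypothetical helper: for each input document, collect the documents in
// `from` whose foreignField equals the input's localField, and store them
// in a new array field named by `as`.
function lookup(input, { from, localField, foreignField, as }) {
  return input.map((doc) => ({
    ...doc,
    [as]: from.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const joined = lookup(universities, {
  from: courses,
  localField: "name",
  foreignField: "university",
  as: "courses",
});

console.log(JSON.stringify(joined, null, 2));
```

An input document without any match is still returned, with an empty array in the new field, which is what makes this a left outer join.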

Summary

To sum up, these are the most common and popular aggregation operators in MongoDB. We will look at MongoDB aggregation operators in more depth in upcoming articles.
